Question

I am trying to get a solid strategy together to parse binary data that has embedded integrity symbols. Here are the construction rules in EBNF form:

Log ::= {Data};
Data ::= Key,DataList;
DataList ::= {Structure};

The issue is that Key can also appear inside the DataList, since it's not escape coded. I can't think of anything better than a brute-force method, where the algorithm is:

- Index all Key locations
- For each key, start trying to parse Structure
  - If the Structure parse fails, try the next key location // possible to lose good data
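
To make the steps above concrete, here is a rough Python sketch of that brute-force scan. The key bytes (`KEY`) and the toy Structure parser (`parse_tlv`, one tag byte `0x01`, one length byte, then the payload) are made-up placeholders, since the real Key value and Structure layout are not given here; only the scanning loop is the point.

```python
from typing import Callable, List, Optional, Tuple

KEY = b"\xAA\x55"  # hypothetical key bytes; the real Key value is not given

# A Structure parser takes (buffer, offset) and returns (value, next_offset),
# or None when the bytes at that offset do not form a valid Structure.
ParseFn = Callable[[bytes, int], Optional[Tuple[bytes, int]]]


def find_key_offsets(buf: bytes, key: bytes = KEY) -> List[int]:
    """Index every offset where the key bytes occur, false hits included."""
    offsets = []
    pos = buf.find(key)
    while pos != -1:
        offsets.append(pos)
        pos = buf.find(key, pos + 1)
    return offsets


def brute_force_parse(buf: bytes, parse_structure: ParseFn, key: bytes = KEY):
    """Try to parse a DataList after each key; skip keys inside parsed data."""
    records = []
    consumed_until = 0
    for off in find_key_offsets(buf, key):
        if off < consumed_until:
            continue  # key bytes that sit inside a DataList we already parsed
        pos = off + len(key)
        structures = []
        while pos < len(buf):
            result = parse_structure(buf, pos)
            if result is None:
                break  # not a Structure: end of this DataList, or corruption
            value, pos = result
            structures.append(value)
        if structures:
            # Keep only keys that yielded at least one Structure.
            records.append((off, structures))
            consumed_until = pos
    return records


def parse_tlv(buf: bytes, pos: int) -> Optional[Tuple[bytes, int]]:
    """Toy Structure (an assumption): tag 0x01, one length byte, payload."""
    if pos + 2 > len(buf) or buf[pos] != 0x01:
        return None
    end = pos + 2 + buf[pos + 1]
    if end > len(buf):
        return None
    return buf[pos + 2:end], end


# Two records; the first payload deliberately contains the key bytes.
sample = KEY + b"\x01\x02\xAA\x55" + b"\x01\x01\x07" + KEY + b"\x01\x00"
parsed = brute_force_parse(sample, parse_tlv)
```

The weakness is visible in the loop: when a parse fails partway through a DataList, everything up to the next key that happens to parse is dropped, and a key with an empty DataList is indistinguishable from a false hit.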

Does anyone know of a good strategy for doing something like this? I'm trying to keep data loss to a minimum if there are corrupted records.

Any insight welcome!


Solution

What I ended up doing was putting headers on the data, and not just keys. The header has a sync block, a CRC and a length. This makes it fairly fault tolerant: any corruption is limited to the message it occurs in. The parsing strategy is to locate all the sync blocks, decode the header that follows each one, and try to parse out the data. A failure indicates either a false positive on the sync block or a record that is actually corrupted.
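
A minimal sketch of that scan in Python, under assumptions of my own: the sync pattern (`SYNC`), the header layout (a little-endian 16-bit length followed by a CRC-32 of the payload), and the function names `frame` and `recover` are illustrative only, not the actual format. It uses the standard `struct` and `zlib` modules.

```python
import struct
import zlib
from typing import List

SYNC = b"\xDE\xAD\xBE\xEF"     # hypothetical sync pattern
HEADER = struct.Struct("<HI")  # assumed layout: uint16 length, uint32 CRC-32


def frame(payload: bytes) -> bytes:
    """Wrap a payload as sync block + header (length, CRC) + payload."""
    return SYNC + HEADER.pack(len(payload), zlib.crc32(payload)) + payload


def recover(buf: bytes) -> List[bytes]:
    """Return every payload whose header and CRC check out."""
    records = []
    pos = buf.find(SYNC)
    while pos != -1:
        header_at = pos + len(SYNC)
        body_at = header_at + HEADER.size
        if body_at <= len(buf):
            length, crc = HEADER.unpack_from(buf, header_at)
            payload = buf[body_at:body_at + length]
            if len(payload) == length and zlib.crc32(payload) == crc:
                records.append(payload)
                # Jump past the record, so sync bytes inside it are ignored.
                pos = buf.find(SYNC, body_at + length)
                continue
        # False positive on the sync block, or a corrupted record:
        # resume the search one byte later so the next record is still found.
        pos = buf.find(SYNC, pos + 1)
    return records


# Demo: one good record, one with a flipped payload byte, and one whose
# payload happens to contain the sync pattern.
good = frame(b"hello")
bad = bytearray(frame(b"world"))
bad[-1] ^= 0xFF
tricky = frame(b"x" + SYNC + b"y")
recovered = recover(b"junk" + good + bytes(bad) + tricky)
```

In this demo only the corrupted record is lost; the records on either side of it are still recovered, which is the property the header buys you over bare keys.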

Licensed under: CC-BY-SA with attribution