Strategy for Binary File Format Description to C++ Implementation
https://softwareengineering.stackexchange.com/questions/322867
19-12-2020
Question
I am dealing with a lot of legacy, reverse-engineered binary file formats, often with lost source code, and the code that reads and writes these files needs to be rewritten in C++.
I am wondering if there are good examples or ideas for simplifying the process of converting the documentation of a file format into code, with the goal being to load the data into a class that can be loaded/saved/processed.
From my investigation so far, I think Boost.Serialization may be one of the best options ( http://www.boost.org/doc/libs/1_61_0/libs/serialization/doc/ ), although I am not sure whether there is a simpler way using just C++ and the STL.
I am mostly concerned about the ease of describing the data, and minimizing rework for each new type of binary file format being worked on.
Solution
I am wondering if there are good examples or ideas for simplifying the process of converting the documentation of a file format into code, with the goal being to load the data into a class that can be loaded/saved/processed.
This can be solved at multiple levels:
You can use Boost.Spirit parsing, or a custom serializer/deserializer (as suggested in the comments).
You can hide the implementation behind a custom set of Boost.Iostreams (boost::iostreams) device types.
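As a rough illustration of the "custom serializer/deserializer" option using only the standard library, the sketch below describes a record's layout once and reuses that description for both reading and writing, similar in spirit to how Boost.Serialization passes an archive to a single member function. Everything here is an assumption made for illustration: the helpers read_le and write_le, the Reader and Writer types, and the ExampleHeader record with its fields are hypothetical, and the format is assumed to be little-endian.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <stdexcept>
#include <string>
#include <type_traits>

// Hypothetical helper: read an unsigned integer stored little-endian,
// independent of the host byte order.
template <typename T>
T read_le(std::istream& in)
{
    static_assert(std::is_unsigned<T>::value, "read_le expects an unsigned integer type");
    T value = 0;
    for (std::size_t i = 0; i < sizeof(T); ++i) {
        const int byte = in.get();
        if (byte == std::char_traits<char>::eof())
            throw std::runtime_error("unexpected end of input");
        value = static_cast<T>(value | (static_cast<T>(byte) << (8 * i)));
    }
    return value;
}

// Hypothetical helper: write an unsigned integer as little-endian bytes.
template <typename T>
void write_le(std::ostream& out, T value)
{
    static_assert(std::is_unsigned<T>::value, "write_le expects an unsigned integer type");
    for (std::size_t i = 0; i < sizeof(T); ++i)
        out.put(static_cast<char>(static_cast<unsigned char>((value >> (8 * i)) & 0xFF)));
}

// Two "archives" with the same call syntax, so one description serves both directions.
struct Reader {
    std::istream& in;
    template <typename T> void operator()(T& field) { field = read_le<T>(in); }
};

struct Writer {
    std::ostream& out;
    template <typename T> void operator()(T& field) { write_le(out, field); }
};

// Hypothetical record: the field names and layout are invented for illustration.
struct ExampleHeader {
    std::uint32_t magic = 0;    // offset 0, 4 bytes
    std::uint16_t version = 0;  // offset 4, 2 bytes
    std::uint16_t count = 0;    // offset 6, 2 bytes

    // The layout is written down once, in file order.
    template <typename Archive>
    void describe(Archive& ar)
    {
        ar(magic);
        ar(version);
        ar(count);
    }
};

// Usage sketch: write a header to memory, read it back, and compare.
inline void round_trip_demo()
{
    ExampleHeader original;
    original.magic = 0x04034B50u;
    original.version = 2;
    original.count = 513;

    std::stringstream buffer;
    Writer writer{buffer};
    original.describe(writer);

    ExampleHeader copy;
    Reader reader{buffer};
    copy.describe(reader);
    assert(copy.magic == original.magic);
    assert(copy.version == original.version);
    assert(copy.count == original.count);
}
```

With this shape, supporting a new format mostly means writing new describe functions that follow the format documentation field by field; the byte-level helpers stay unchanged.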
I am mostly concerned about the ease of describing the data, and minimizing rework for each new type of binary file format being worked on.
I would do this by creating some custom types that map I/O bytes to semantic information, transparently to the user:
#include <cstdint>

/// Maps the raw file header word to BlaBla information.
class BlaBlaHeaderField
{
    std::uint32_t binary_header;
public:
    explicit BlaBlaHeaderField(std::uint32_t binary_header) : binary_header(binary_header) { /* ... */ }
    /// Custom property (interprets individual bits of the header word).
    int BlaBlaParity() const { return binary_header & 0x01; }
};
This way, the code itself ends up documenting the format almost completely: each accessor names one field and shows which bits it occupies.
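To show how such a type might extend beyond a single flag, here is a minimal sketch with fields that span several bits. The extract_bits helper, the ExampleHeaderField class, and the bit layout in the comment are all hypothetical and invented for illustration; a real format would take its positions and widths from its own documentation.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: extract `width` bits of `word`, starting at bit `shift`.
inline std::uint32_t extract_bits(std::uint32_t word, unsigned shift, unsigned width)
{
    assert(width >= 1 && width <= 32 && shift + width <= 32);
    const std::uint32_t mask =
        (width == 32) ? 0xFFFFFFFFu : ((std::uint32_t{1} << width) - 1u);
    return (word >> shift) & mask;
}

// Hypothetical header layout, invented for illustration:
//   bit  0      parity flag
//   bits 1-4    format version
//   bits 8-23   payload length in bytes
class ExampleHeaderField
{
    std::uint32_t raw_;
public:
    explicit ExampleHeaderField(std::uint32_t raw) : raw_(raw) {}

    bool parity() const { return extract_bits(raw_, 0, 1) != 0; }
    unsigned version() const { return extract_bits(raw_, 1, 4); }
    unsigned payload_length() const { return extract_bits(raw_, 8, 16); }
    std::uint32_t raw() const { return raw_; }
};
```

Keeping the layout comment next to the accessors means a reader can check the code against the format description line by line, and shifts and masks behave the same on every compiler.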
You can also use a union to overlay the named fields on an integer of the matching width (an int, a long, or whatever the format calls for).
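A sketch of the union idea follows, with two caveats that matter for file formats. Reading a union member other than the one last written is formally undefined behaviour in C++ (common compilers support it as an extension), and the order and packing of bit-fields are implementation-defined, so the overlay only matches a file layout on compilers and platforms where this has been checked. The type names, the field layout, and the helpers to_bits and to_raw are hypothetical; the helpers use std::memcpy, which converts between the two views without relying on the union extension.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical field layout, invented for illustration. Which physical bits
// each member occupies depends on the compiler and target.
struct ExampleHeaderBits {
    std::uint32_t parity   : 1;
    std::uint32_t version  : 4;
    std::uint32_t reserved : 3;
    std::uint32_t length   : 16;
    std::uint32_t unused   : 8;
};

// The overlay described in the text: one raw word, one structured view.
union ExampleHeaderOverlay {
    std::uint32_t raw;
    ExampleHeaderBits bits;
};

static_assert(sizeof(ExampleHeaderBits) == sizeof(std::uint32_t),
              "bit-field struct must match the raw word size on this compiler");
static_assert(sizeof(ExampleHeaderOverlay) == sizeof(std::uint32_t),
              "overlay must match the raw word size on this compiler");

// Well-defined conversion between the views: copy the object representation.
inline ExampleHeaderBits to_bits(std::uint32_t raw)
{
    ExampleHeaderBits out;
    std::memcpy(&out, &raw, sizeof out);
    return out;
}

inline std::uint32_t to_raw(const ExampleHeaderBits& bits)
{
    std::uint32_t raw;
    std::memcpy(&raw, &bits, sizeof raw);
    return raw;
}

// Usage sketch: fields survive a round trip through the raw word.
inline void overlay_demo()
{
    ExampleHeaderBits bits{};
    bits.parity = 1;
    bits.version = 2;
    bits.length = 291;

    const ExampleHeaderBits back = to_bits(to_raw(bits));
    assert(back.parity == 1);
    assert(back.version == 2);
    assert(back.length == 291);
}
```

Because the bit placement is not portable, explicit shifts and masks are usually the safer choice when the same code must read the file on several platforms; the overlay is mainly a convenience when the toolchain is fixed.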