Protecting Structure from corruption

https://softwareengineering.stackexchange.com/questions/347478

11-01-2021

Question

I am developing a safety critical embedded system and I am programming in C.

We have a set of const structures declared in memory that hold some critical data. I want to make sure that none of these structures is corrupted.

One option is to add a CRC field to each structure. The problem is that I would have to calculate the CRC manually and add it as an element value in the structure, which is not a good idea.

How can I make sure that these structures have not been modified or corrupted and are safe to use?

Solution

Firstly, check that you comply with any legal or regulatory requirements. There may be industry-specific standards that you are required to follow, and they may specify exactly how to approach your current issue.

Secondly, even if no particular standard is legally required in your field, it may be a good idea to select an appropriate standard and comply with it anyway. Again, if it has a recommendation on your current issue, follow it.

Thirdly, "never go to sea with two chronometers". Consider two scenarios: A) the critical data is corrupted; B) the stored CRC field is corrupted. Both present the same symptom: the CRC no longer matches the data. How do you proceed? In some applications it may be possible to reload the data. In others it may be safe to gracefully terminate the program. In others, such an ambiguity could lead to loss of life.

One very simple approach is to use three separate values. If one of these values differs from the other two then, firstly, you should signal/record an error. Then, if safe to do so, gracefully abort the current activity. If not safe to terminate the program, you can take the two identical values as the most likely true value, reasoning that two simultaneous errors are significantly less likely than one, and that two simultaneous errors that produce exactly the same result are significantly less likely even than that.
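The three-value approach can be sketched as follows. This is only a minimal illustration: the names `tmr_u32_t`, `tmr_status_t` and `tmr_read` are hypothetical, and how an error is signalled or recorded is left to the caller because it depends on the application.

```c
#include <stdint.h>

/* Hypothetical status codes for a read of a triplicated value. */
typedef enum {
    TMR_OK,          /* all three copies agree */
    TMR_ONE_DIFFERS, /* two copies agree, one differs: report an error */
    TMR_ALL_DIFFER   /* no two copies agree: no trustworthy value */
} tmr_status_t;

/* Hypothetical container holding three copies of one critical value. */
typedef struct {
    uint32_t copy[3];
} tmr_u32_t;

/* Writes the agreed value to *out when at least two copies match.
 * On TMR_ALL_DIFFER, *out is left untouched. */
tmr_status_t tmr_read(const tmr_u32_t *v, uint32_t *out)
{
    uint32_t a = v->copy[0];
    uint32_t b = v->copy[1];
    uint32_t c = v->copy[2];

    if (a == b && b == c) {
        *out = a;
        return TMR_OK;
    }
    if (a == b || a == c) {
        *out = a;
        return TMR_ONE_DIFFERS;
    }
    if (b == c) {
        *out = b;
        return TMR_ONE_DIFFERS;
    }
    return TMR_ALL_DIFFER;
}
```

A caller would typically log anything other than `TMR_OK` and then decide, based on the application, whether to abort or to continue with the agreed value.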

The "majority decision" can be computed fairly efficiently using bitwise logic. For three copies A, B and C, take the AND of each of the three unordered pairs: A&B, B&C and C&A. If one copy is corrupted, one pair holds the two correct copies and evaluates to the correct value; the other two pairs each combine a correct copy with the erroneous one. Because of the AND, those two results may have the odd 0 where there should be a 1, but never a 1 where there should be a 0. ORing (or XORing) the three results therefore recovers the majority decision.

One particularly useful consequence of this approach is that there is no branching to a separate recovery code path in the error case, so there is no expected slowdown relative to the error-free case. Depending on the time-criticality of the application, a slower fixed time-step may be easier (safer) to handle than a sometimes-faster, sometimes-significantly-slower time-step.
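A minimal sketch of the bitwise vote in C, assuming 32-bit values. The function names `majority3` and `copies_disagree` are made up for this illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Branch-free bitwise majority of three redundant copies:
 * each result bit is 1 only if at least two inputs have that bit set. */
uint32_t majority3(uint32_t a, uint32_t b, uint32_t c)
{
    return (a & b) | (b & c) | (c & a);
}

/* True if any copy differs from the others, so the caller can still
 * signal or record the error even though the vote itself never branches. */
bool copies_disagree(uint32_t a, uint32_t b, uint32_t c)
{
    return (a != b) || (b != c);
}
```

Because the vote is taken per bit, it also recovers the value when two copies are each damaged in different bit positions. It cannot help when the same bit is wrong in two copies.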

(I do appreciate that you say this is for an embedded system, and that the idea of replicating your data 3-fold may seem crazy. However, for genuinely safety-critical applications, if you don't have the memory, it may be cheaper to buy larger memory now than to pay out compensation later.)

OTHER TIPS

Make sure this constant data is stored in the program image in ROM, which ensures it can't be modified at runtime. If you need to protect against flash corruption, add CRC protection to the entire program image in flash: bad things can happen if either your data or your code gets corrupted. This check should be done in your boot loader.
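As a rough sketch of such a check, the code below computes a standard CRC-32 (reflected polynomial 0xEDB88320) over a byte range and compares it with an expected value. Several things here are assumptions rather than part of the answer: the function names `crc32_compute` and `region_is_intact` are hypothetical, the expected CRC is assumed to be calculated by a build or post-link step and stored outside the checked range (which also avoids calculating it by hand), and a real boot loader may prefer a table-driven or hardware CRC.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Bit-by-bit CRC-32 (initial value 0xFFFFFFFF, final XOR 0xFFFFFFFF).
 * Slow but small; shown only to illustrate the check. */
uint32_t crc32_compute(const uint8_t *data, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;

    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++) {
            if (crc & 1u) {
                crc = (crc >> 1) ^ 0xEDB88320u;
            } else {
                crc >>= 1;
            }
        }
    }
    return crc ^ 0xFFFFFFFFu;
}

/* Returns true when the region's CRC matches the value stored at build
 * time. The stored value must not lie inside the checked range. */
bool region_is_intact(const uint8_t *start, size_t len, uint32_t expected_crc)
{
    return crc32_compute(start, len) == expected_crc;
}
```

A boot loader would call something like `region_is_intact` with the start address and length of the image (for example, taken from linker-defined symbols) and refuse to start the application on a mismatch.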

Licensed under: CC-BY-SA with attribution
