Question

For a homework assignment I created a simple compression/decompression program that makes use of a naive implementation of run-length encoding. I've gotten my program working; compressing and decompressing any text file with a pretty large number of characters (e.g. the program source) works flawlessly. As an experiment I tried to compress/decompress the binary of the compression program itself. This resulted in a file that was much smaller than the original binary, and is obviously un-runnable. What is causing this data-loss?

My assumption was that it's related to how binary files are represented, but I can't figure much out past that.


Solution

Possible issues:

  • Your program opens the binary file in text mode, which mangles '\r' and '\n' bytes through newline translation
  • Your program mishandles zero bytes, treating them as string terminators ('\0') rather than as data in their own right (see the byte-safe sketch after this list)
  • Your program uses char (which may actually be signed char) for the data bytes and only works correctly with non-negative values; this covers the ASCII characters of English text, but fails on arbitrary byte values, which may be negative
  • Your program has an overflow somewhere that only shows up on big files
  • Your program has some other data-dependent bug

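Since the original program isn't shown, here is a minimal sketch of a byte-safe RLE encoder that avoids the pitfalls above: the files are opened in binary mode, bytes are read with fgetc into an int (so every value from 0 to 255, including zero bytes, is ordinary data), and run lengths are tracked as counts rather than derived from string functions. The function name rle_encode and the (count, byte) output format are assumptions for illustration, not the assignment's required format.

```c
#include <stdio.h>

/* Hypothetical sketch: encode src into dst as (count, byte) pairs.
 * Nothing here uses C string functions, so '\0' and bytes >= 0x80
 * are handled like any other value. */
static int rle_encode(FILE *src, FILE *dst)
{
    int prev = fgetc(src);                    /* int, so EOF stays distinguishable */
    while (prev != EOF) {
        unsigned char count = 1;
        int cur;
        while ((cur = fgetc(src)) == prev && count < 255)
            count++;
        unsigned char pair[2] = { count, (unsigned char)prev };
        if (fwrite(pair, 1, 2, dst) != 2)
            return -1;
        prev = cur;                           /* next run starts here (or EOF) */
    }
    return 0;
}

int main(int argc, char **argv)
{
    if (argc != 3) return 1;
    FILE *src = fopen(argv[1], "rb");         /* "b": binary mode */
    FILE *dst = fopen(argv[2], "wb");
    if (!src || !dst) return 1;
    int rc = rle_encode(src, dst);
    fclose(src);
    fclose(dst);
    return rc == 0 ? 0 : 1;
}
```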
Other Tips

If the platform is Linux (as the question is tagged), there is no difference between binary and text modes, so that should not be the cause; even so, the files should still be opened in binary mode.
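With the C standard library that just means adding the "b" flag to fopen; it is a no-op on Linux but prevents newline translation on platforms that perform it (the file names below are placeholders):

```c
#include <stdio.h>

int main(void)
{
    FILE *in  = fopen("program.bin", "rb");   /* binary read: no CR/LF translation */
    FILE *out = fopen("program.rle", "wb");   /* binary write */
    if (!in || !out) {
        perror("fopen");
        return 1;
    }
    /* ... compress in -> out ... */
    fclose(in);
    fclose(out);
    return 0;
}
```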

I suspect that the problem is your program treats '\0' characters as terminators (or otherwise handles them specially) instead of as valid data.
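To see why that loses data, compare a length derived from string functions with the real byte count (a minimal, self-contained demonstration, not taken from the question's code): strlen stops at the first zero byte, so everything after it is silently dropped.

```c
#include <stdio.h>
#include <string.h>

int main(void)
{
    /* A buffer with an embedded zero byte, as compiled binaries routinely contain. */
    unsigned char data[] = { 'E', 'L', 'F', 0x00, 0x7f, 0x45 };

    size_t real_len = sizeof data;                  /* 6: the true byte count */
    size_t str_len  = strlen((const char *)data);   /* 3: stops at the '\0'   */

    printf("real length: %zu, strlen sees: %zu\n", real_len, str_len);
    return 0;
}
```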
