Question

I am working on a piece of software that analyzes E01 bitstream images. Basically these are forensic data files that allow a user to compress all the data on a disk into a single file. The E01 format embeds metadata about the original data, including an MD5 hash of the source and of the resulting data. If you are interested in some light reading, the EWF/E01 specification is here. On to my problem:

The E01 file contains a "table" section, which is a series of 32-bit numbers. Each number is an offset to another location within the E01 file where an actual data chunk is stored. I have successfully parsed this data out into a list by doing the following:

this.ChunkLocations = new List<int>();
// HACK: Will this overflow? We are adding two integers to a long?
long currentReadLocation = TableSectionDescriptorRef.OffsetFromFileStart + c_SECTION_DESCRIPTOR_LENGTH + c_TABLE_HEADER_LENGTH;
byte[] currReadBytes;
using (var fs = new FileStream(E01File.FullName, FileMode.Open))
{
    // Jump to the first table entry, measured from the start of the file.
    fs.Seek(currentReadLocation, SeekOrigin.Begin);
    for (int i = 0; i < NumberOfEntries; i++)
    {
        currReadBytes = new byte[c_CHUNK_DATA_OFFSET_LENGTH];
        fs.Read(currReadBytes, 0, c_CHUNK_DATA_OFFSET_LENGTH);
        // ToInt32 matches the List<int> element type; entries with the MSB set come out negative.
        this.ChunkLocations.Add(BitConverter.ToInt32(currReadBytes, 0));
    }
}

c_CHUNK_DATA_OFFSET_LENGTH is 4, because each entry is a 4-byte (32-bit) number.

According to the EWF/E01 specification, "The most significant bit in the chunk data offset indicates if the chunk is compressed (1) or uncompressed (0)". This appears to be borne out by the data: if I convert the offsets to ints, some of the results are large negative numbers (presumably the compressed chunks, since a set MSB makes a signed int negative). Most of the other offsets increase as expected, but every once in a while there is a value that looks like garbage. The data in ChunkLocations looks something like this:

346256
379028
-2147071848
444556
477328
510100

In -2147071848, the MSB appears to have been set as the compression flag.

QUESTIONS: So, if the MSB is used as a flag for compression, then really I'm dealing with a 31-bit number, right?
1. How do I ignore the MSB and compute a 31-bit number when working out the offset value?
2. This seems like a strange standard, since it would significantly limit the size of the offsets you could have, so I wonder whether I'm missing something. The offsets do seem correct when I navigate to these locations within the E01 file.

Thanks for any help!


Solution

This sort of thing is typical when dealing with binary formats. As dtb pointed out, 31 bits is probably plenty large for this application, because it can address offsets up to 2 GiB. So they use that extra bit as a flag to save space.

You can just mask off the bit with a bitwise AND:

const UInt32 COMPRESSED = 0x80000000;   // Only bit 31 on

UInt32 raw_value = 0x80004000;          // test value

bool compressed = (raw_value & COMPRESSED) > 0;
UInt32 offset = raw_value & ~COMPRESSED;

Console.WriteLine("Compressed={0}  Offset=0x{1:X}", compressed, offset);

Output:

Compressed=True  Offset=0x4000
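
As a rough sketch of how the same mask could slot into a read loop like the one in the question, the following reads raw entries with BinaryReader and splits each one into a flag and an offset. The class name TableEntryDemo and the helper ReadEntries are made up for illustration, and the in-memory stream stands in for the real E01 file. It assumes the entries are stored little-endian, which is how BinaryReader always reads multi-byte values.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

static class TableEntryDemo
{
    const uint CompressedFlag = 0x80000000; // only bit 31 set

    // Hypothetical helper: reads `count` raw 32-bit entries from the stream
    // and splits each into (compressed flag, 31-bit offset).
    public static List<(bool Compressed, uint Offset)> ReadEntries(Stream stream, int count)
    {
        var entries = new List<(bool Compressed, uint Offset)>(count);
        using (var reader = new BinaryReader(stream))
        {
            for (int i = 0; i < count; i++)
            {
                uint raw = reader.ReadUInt32(); // little-endian read
                entries.Add(((raw & CompressedFlag) != 0, raw & ~CompressedFlag));
            }
        }
        return entries;
    }

    static void Main()
    {
        // Three values from the question's sample data; 2147895448 is the
        // unsigned form of -2147071848.
        uint[] samples = { 379028, 2147895448, 444556 };

        var buffer = new MemoryStream();
        var writer = new BinaryWriter(buffer);
        foreach (uint sample in samples)
        {
            writer.Write(sample);
        }
        writer.Flush();
        buffer.Position = 0;

        foreach (var entry in ReadEntries(buffer, samples.Length))
        {
            Console.WriteLine("Compressed={0}  Offset={1}", entry.Compressed, entry.Offset);
        }
    }
}
```

With the flag stripped, the middle entry decodes to offset 411800, which falls between its neighbours 379028 and 444556, so the "crazy" value in the question is an ordinary offset with the compression bit set.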

Other answers

If you just want to strip off the leading bit, perform a bitwise AND (&) of the value with 0x7FFFFFFF.
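
If the entries are already stored as signed ints, as in the question's List<int>, the same mask can be applied directly. This is only a minimal sketch: SignedMaskDemo, OffsetOf and IsCompressed are illustrative names, not part of any library.

```csharp
using System;

static class SignedMaskDemo
{
    // Hypothetical helper: clears bit 31, leaving the 31-bit offset.
    public static int OffsetOf(int raw) => raw & 0x7FFFFFFF;

    // Hypothetical helper: for a signed 32-bit int, bit 31 set means the value is negative.
    public static bool IsCompressed(int raw) => raw < 0;

    static void Main()
    {
        int raw = -2147071848; // the odd-looking entry from the question
        Console.WriteLine("Compressed={0}  Offset={1}", IsCompressed(raw), OffsetOf(raw));
        // prints: Compressed=True  Offset=411800
    }
}
```

Because the masked result always has bit 31 clear, it is never negative and can be passed straight to Stream.Seek as an offset.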

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow