Question

I am currently writing a FLAC decoder, so I have to read two "UTF-8" coded values from the FLAC header. This is what the documentation says:

if(variable blocksize)
   <8-56>:"UTF-8" coded sample number (decoded number is 36 bits)
else
   <8-48>:"UTF-8" coded frame number (decoded number is 31 bits) 

They use a hand-written function in their bit reader file (Bitreader, line 1327) for the larger "UTF-8" value (variable blocksize).

I took a look at it, and it is not very nice code to translate into C#. So I thought about using a BinaryReader with UTF8 encoding and reading the value with ReadUInt64. Could this work? Would it give the same result, and what would be the absolute fastest solution?

Solution

No, that will not work. ReadUInt64 just reads 8 raw bytes; the encoding is only used for reading actual text - e.g. ReadChar, ReadChars and ReadString - and those will not work either, since the char type is only 16 bits wide and none of them expects a 36-bit value anyway.
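
To illustrate the point, here is a small sketch showing that the encoding passed to BinaryReader has no effect on ReadUInt64. The class and method names (ReadUInt64Demo, ReadRaw) are made up for this example:

```csharp
using System;
using System.IO;
using System.Text;

public static class ReadUInt64Demo
{
    // Hypothetical helper: reads one UInt64 from the given bytes
    // through a BinaryReader that was constructed with UTF-8 encoding.
    public static ulong ReadRaw(byte[] data)
    {
        using (var reader = new BinaryReader(new MemoryStream(data), Encoding.UTF8))
        {
            // Always consumes exactly 8 bytes, interpreted as little-endian.
            return reader.ReadUInt64();
        }
    }
}
```

Feeding it the bytes C2 A9 00 00 00 00 00 00 yields 0xA9C2 (the raw little-endian integer), not 0xA9, which is what the two-byte sequence C2 A9 stands for when decoded the UTF-8 way.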

When they write "UTF8 coded" in your documentation, that doesn't mean it's true UTF-8 - it just means they encode a number using the same principle as is used by UTF-8 to encode characters (which are, after all, also just numbers, but with more complex restrictions).

If you look at Wikipedia, you will see that they have listed exactly how UTF-8 characters are encoded, for up to 31 bits. It is very straightforward to continue this sequence for a 36-bit value - in that case, the first byte would be 11111110 in binary, followed by six continuation bytes carrying 6 bits each (7 bytes, i.e. the 56 bits from your documentation) - and that's what you're supposed to do for the sample numbers.

While you may not think the code is nice, that's pretty much the most sensible way to do it - you're not going to avoid bit manipulation anyway, because of how UTF-8 works - and while it is certainly possible to make some variations on that exact code, the basic structure is unlikely to be very different.
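
For what it's worth, a C# version of this does not have to be long. The following is only a minimal sketch of the scheme described above, not a translation of the FLAC reference code; the names FlacUtf8 and ReadCodedNumber are invented for this example, and it reads from a plain Stream rather than a bit reader, which assumes the value starts on a byte boundary:

```csharp
using System;
using System.IO;

public static class FlacUtf8
{
    // Hypothetical helper: decodes one "UTF-8"-style coded number
    // (1 to 7 bytes, up to 36 bits) from the stream.
    public static ulong ReadCodedNumber(Stream stream)
    {
        int first = stream.ReadByte();
        if (first < 0) throw new EndOfStreamException();

        // 0xxxxxxx: a single byte carrying 7 bits.
        if ((first & 0x80) == 0) return (ulong)first;

        // The number of leading 1 bits is the total length in bytes.
        int length = 0;
        for (int mask = 0x80; (first & mask) != 0; mask >>= 1) length++;

        // 10xxxxxx is a continuation byte, 11111111 is not a valid lead byte.
        if (length == 1 || length == 8)
            throw new InvalidDataException("Invalid lead byte.");

        // Keep only the payload bits below the terminating 0 of the prefix.
        ulong value = (ulong)(first & (0xFF >> (length + 1)));

        for (int i = 1; i < length; i++)
        {
            int next = stream.ReadByte();
            if (next < 0) throw new EndOfStreamException();
            if ((next & 0xC0) != 0x80)
                throw new InvalidDataException("Invalid continuation byte.");

            // Each continuation byte contributes 6 more bits.
            value = (value << 6) | (ulong)(next & 0x3F);
        }

        return value;
    }
}
```

The same method covers both cases from the documentation; for the 31-bit frame number a caller could additionally reject 7-byte sequences. As for speed, the loop runs at most six times, so it is unlikely to be the bottleneck of a decoder, but that is a guess rather than something measured.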

Licensed under: CC-BY-SA with attribution