Question

First, I apologize for any English mistakes I'll make, but being 15 and French doesn't help...

I'm trying to program a PNG decoder with the help of the file format specification (http://www.libpng.org/pub/png/spec/1.2/PNG-Contents.html), but I came across a weird problem.

The specification says that the first eight bytes of a PNG file always contain the following (decimal) values: 137 80 78 71 13 10 26 10.

When I test this simple program:

#include <iostream>
#include <fstream>
#include <string>
using namespace std;

int main()
{
    ifstream file("test.png");

    string line;
    getline(file, line);

    cout << line[0] << endl;
}

The output is "ë", which represents 137 in the ASCII table. Good, it matches the first byte.

However, when I do int ascii_value = line[0];, the output value is -119, which is not a correct ASCII value.

When I try the same thing with another character like "e", it outputs the correct ASCII value.

Could someone explain what I am doing wrong and what the solution is? I personally think it's an issue with the extended ASCII table, but I'm not sure.

Thank you, everybody! I'll cast my signed char to an unsigned one!


Solution

Your system's char type is signed, which is why values thereof can be negative.

You need to be explicit and drop the sign:

const unsigned char value = (unsigned char) line[0];

Note that -119 and 137 share the same 8-bit pattern (0x89) in two's complement, which your machine seems to be using. So the bits themselves really are correct; it's all about interpreting them properly.

OTHER TIPS

char in C++ can be either signed or unsigned 1); it's up to the implementation which it is. In the case of your compiler (as in most, actually), it appears to be signed:

Any character value greater than 127 is represented as a negative number. -119 happens to correspond to the unsigned character value 137. In other words, the following holds:

unsigned char c = 137;
assert(static_cast<signed char>(c) == -119);

But note that this is implementation-specific so you cannot in general rely on these values.


1) And is a distinct type from both signed char and unsigned char.

ASCII only covers 0 .. 127. There is no 137 in the ASCII table.

There is no such thing as "the extended ASCII table" either. There are dozens of (mutually incompatible) ASCII extensions. Heck, technically even Unicode is "extended ASCII".

You're getting -119 because in your compiler char is a signed type, covering values from -128 to 127. (-119 is 137 - 256). You can get the value you expect by explicitly casting to unsigned char:

int value = static_cast<unsigned char>(line[0]);

That's what happens when you allow sign extension. Characters in the extended ASCII table have their high bit (sign bit) set.

-119 is 0x89. 137 is also 0x89.

Try

int ascii_value = line[0] & 0x00FF;

or

int ascii_value = (unsigned char)line[0];

137 and -119 are both 0x89. If you cast (unsigned)(unsigned char)(line[0]), you'll get it to print the integer value 137.

The type char (which is the character type of std::string) is [usually] a signed type, ranging from -128 to 127. Anything with a value higher than 127 will be stored as a negative number.

C++ does not specify whether char is a signed or unsigned type. This means that "extended" ASCII characters (those outside the range 0..127, with their top bit set) might be interpreted as negative values; and it looks like that's what your compiler does.

To get the unsigned value you're expecting, you'll need to explicitly convert it to an unsigned char type:

int ascii_value = static_cast<unsigned char>(line[0]); // Should be 137
Licensed under: CC-BY-SA with attribution