Question

I believe that the TCP checksum function does the following:

  1. Break up the pseudoheader and the TCP segment header and data into 2 byte blocks.
  2. Add a one byte padding of 0s to the end of the last block if it's not 2 bytes long, to make it 2 bytes.
  3. Take the one's complement of the sum to get the TCP checksum.

Sounds simple enough. Hence I wrote my own generic checksum function:

#include <inttypes.h>
#include <arpa/inet.h>

uint16_t checksum(uint16_t * data, int size) {
    uint16_t sum = 0;

    int i = 0, length = size / 2;

    while (i < length) sum += data[i++];

    if (size % 2) sum += data[i] & 0xFF00;

    return htons(~sum);
}

However other people have written checksum functions which seem to be more complicated. For example:

uint16_t checksum(uint16_t * addr, int len) {
    int nleft = len;
    int sum = 0;

    uint16_t * w = addr;
    uint16_t answer = 0;

    while (nleft > 1) {
        sum += *w++;
        nleft -= sizeof(uint16_t);
    }

    if (nleft == 1) {
        *(uint8_t *) (&answer) = *(uint8_t *) w;
        sum += answer;
    }

    sum = (sum >> 16) + (sum & 0xFFFF);
    sum += (sum >> 16);
    answer = ~sum;
    return (answer);
}

I have a few questions regarding this code:

  1. What does the statement *(uint8_t *) (&answer) = *(uint8_t *) w; actually do?
  2. Why do we take the sum as:

    sum = (sum >> 16) + (sum & 0xFFFF);
    sum += (sum >> 16);
    
  3. Did the way to calculate the TCP checksum change?

I really don't see why we do sum = (sum >> 16) + (sum & 0xFFFF). Consider sum is 0xABCD:

0xABCD >> 16    == 0x0000

0xABCD & 0xFFFF == 0xABCD

0x0000 + 0xABCD == 0xABCD

It seems like a redundant step. Same goes for the next statement sum += (sum >> 16).


Solution

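The checksum function works on both big-endian and little-endian processors: it sums the 16-bit words in whatever order they sit in memory and returns the result in that same order, and the one's-complement sum is byte-order independent (see RFC 1071).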

The first while loop is optimized for speed: it accumulates the 16-bit words in a wider int and defers all carry handling until after the loop.

The &answer trick loads the last byte (if there were an odd number of bytes) into the first byte of answer in memory, leaving the second byte zero, so answer looks like the final data byte followed by a zero pad byte. On a big-endian processor that matches what your code does with data[i] & 0xff00. The way it works is this

1) take the address of answer      (&answer)
2) convert that to a byte pointer  (uint8_t *)
2a) a byte pointer refers to the first byte in memory: the high byte on a big-endian processor, the low byte on a little-endian one
3) overwrite that byte with the last byte of the data; the other byte keeps the 0 that answer was initialized with
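To check what these steps produce on a particular machine, here is a minimal sketch. The helper names pad_last_byte and native_word are mine, not part of the quoted code. The first repeats the cast from the question; the second builds the bytes {z, 0} and reads them back as a native uint16_t for comparison.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: builds the 16-bit word for a trailing odd byte
 * the same way the quoted checksum() does. */
uint16_t pad_last_byte(const uint8_t *last) {
    uint16_t answer = 0;              /* both bytes start as zero */
    *(uint8_t *)(&answer) = *last;    /* write the first byte in memory */
    return answer;
}

/* Hypothetical reference: the bytes {z, 0} read as a native uint16_t. */
uint16_t native_word(uint8_t z) {
    uint8_t bytes[2] = { z, 0 };
    uint16_t w;
    memcpy(&w, bytes, sizeof w);      /* copy without alignment concerns */
    return w;
}
```

For a last byte of 0xAB, both functions should return 0xAB00 on a big-endian host and 0x00AB on a little-endian one. In memory the two bytes are AB 00 either way, which is the zero-padded block.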

The checksum is supposed to be computed with the carries added back in. It's assumed here that this code is running on a machine where an int is 32-bits. Therefore, (sum & 0xffff) is the 16-bit checksum, and (sum >> 16) are the carry bits (if any) that need to be added back in. Hence, the line

sum = (sum >> 16) + (sum & 0xffff);

adjusts the sum to include the carries. However, that line of code could itself generate another carry bit. So the next line sum += (sum >> 16) adds that carry (if any) back into the checksum.
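A small sketch of the two folding lines on their own may help. The function name fold_carries is made up for illustration, and I use uint32_t for the accumulator instead of the int in the quoted code. The 0xABCD case from the question has no carries, so folding leaves it unchanged; the step only matters once the accumulator has grown past 16 bits.

```c
#include <stdint.h>

/* Hypothetical helper: fold a 32-bit accumulator down to 16 bits
 * with end-around carry, mirroring the two quoted lines. */
uint16_t fold_carries(uint32_t sum) {
    sum = (sum >> 16) + (sum & 0xFFFF);  /* add the carries to the low 16 bits */
    sum += (sum >> 16);                  /* add the carry that step may create */
    return (uint16_t)sum;                /* keep the low 16 bits */
}

/* fold_carries(0x0000ABCD) == 0xABCD   no carries, nothing changes
 * fold_carries(0x0002DDF0) == 0xDDF2   0xDDF0 + 2 carries
 * fold_carries(0x0001FFFF) == 0x0001   first line gives 0x10000, second adds 1
 */
```

The last case is the reason for the second line: adding the carries back in can itself overflow 16 bits.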

Finally, take the ones-complement of the answer. Note that htons is not needed: the words were summed in memory order, so the result is already in the right byte order to be stored directly into the checksum field on either kind of processor.

Other tips

What does the statement *(uint8_t *) (&answer) = *(uint8_t *) w; actually do?

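This casts both pointers from uint16_t * to uint8_t *, so only one byte, the first byte in memory, is copied from *w into answer. On a little-endian machine that is the 8 rightmost (least significant) bits. Consider each assignment on its own, with answer starting at 0:

uint16_t x = 0x1234;
uint16_t* w = &x; // *w == 0001001000110100

*(uint16_t *) (&answer) = *(uint16_t *) w; // answer = 0001001000110100

*(uint8_t *) (&answer) = *(uint8_t *) w;   // answer = 0000000000110100 (little-endian)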

Why do we take the sum as:

sum = (sum >> 16) + (sum & 0xFFFF);
sum += (sum >> 16);
answer = ~sum;

The sum is 32 bits wide. Since 65536 ≡ 1 (mod 65535), the end-around carry expression (sum & 0xffff) + (sum >> 16) reduces sum modulo 65535, which is how any carry out of the low 16 bits gets added back into the result.
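The modulo claim can be checked directly. The sketch below uses made-up names (fold16, fold_matches_mod) and a uint32_t accumulator. One wrinkle: a nonzero multiple of 65535 folds to 0xFFFF rather than 0, which is the other representation of zero in one's complement.

```c
#include <stdint.h>

/* Hypothetical helper: end-around-carry fold of a 32-bit sum. */
uint16_t fold16(uint32_t sum) {
    sum = (sum & 0xFFFF) + (sum >> 16);
    sum += (sum >> 16);
    return (uint16_t)sum;
}

/* Returns 1 if fold16(sum) agrees with sum modulo 65535.
 * A nonzero multiple of 65535 is expected to fold to 0xFFFF. */
int fold_matches_mod(uint32_t sum) {
    uint32_t r = sum % 65535u;
    uint16_t f = fold16(sum);
    if (r == 0 && sum != 0)
        return f == 0xFFFF;
    return f == r;
}
```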

  1. *(uint8_t *) (&answer) = *(uint8_t *) w; On the right side, it converts w to a uint8_t * and dereferences it, so only the one remaining byte is read; dereferencing w as a uint16_t * would also read a garbage byte past the end of the data. On the left side, it takes the address of answer, converts it to uint8_t *, and dereferences it. So it takes the first byte pointed to by w and assigns it to the first byte of answer, while the second byte of answer is still 0. In effect, this line does your step 2, "Add a one byte padding of 0s to the end of the last block if it's not 2 bytes long, to make it 2 bytes." Writing through a byte pointer on the left side puts the byte in the same memory position on both big-endian and little-endian systems.
  1. This statement accommodates the case (see RFC 793 or RFC 1071) where the packet has an odd number of bytes: [A,B] + [C,D] + ... + [Z,0], by adding to the sum a 16-bit quantity (answer) whose first byte is Z and whose second byte is 0. Remember that + here is always 1's complement addition.

  2. sum is a 32-bit accumulator. To add in 1's complement, we add the carry back in after accumulating the words. The 2 most significant bytes of sum contain the carry bit(s), if any.

  3. If you take a look at RFC 1071 you can see at the top which RFCs update it. There are none that supersede it.
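To tie the three points together, here is a minimal sketch of the whole calculation that reads the data one byte at a time, so it does not depend on host byte order or alignment. The name internet_checksum and its signature are my own choice, not taken from any of the code above. It assumes the input is small enough (a TCP segment is under 64 KiB) that the 32-bit accumulator cannot overflow.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical portable variant: returns the checksum as a host-order
 * value whose high byte is the first byte to put on the wire. */
uint16_t internet_checksum(const uint8_t *data, size_t len) {
    uint32_t sum = 0;
    size_t i;

    /* Sum complete 16-bit blocks, first byte as the high byte. */
    for (i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)data[i] << 8) | data[i + 1];

    /* Odd length: the last byte is padded on the right with a zero byte. */
    if (i < len)
        sum += (uint32_t)data[i] << 8;

    /* End-around carry: fold until nothing is left above bit 15. */
    while (sum >> 16)
        sum = (sum >> 16) + (sum & 0xFFFF);

    return (uint16_t)~sum;            /* one's complement of the sum */
}
```

With the example bytes from RFC 1071 (00 01 f2 03 f4 f5 f6 f7) the folded sum is 0xDDF2, so this returns 0x220D. Unlike the word-at-a-time version, the caller has to store the result high byte first (for example via htons) when writing it into a header.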

License: CC-BY-SA with attribution
Not affiliated with StackOverflow