I've made three mistakes.
The first one is that skb can be nonlinear, which will cause the checksum got from csum_partial() be incorrect.
The second one is that I use csum_tcpudp_magic() to get the checksum, but forgot to change skb->ip_summed, so the NIC will use my correct checksum as the partial checksum of tcp pseudo-header to recalculate the checksum, leading checksum incorrect in the packet.
The third mistake is that my wireshark seems to be set to ignore the tcp checksum, and it always shows the packets with wrong checksum as good ones, while tcpdump will tell me incorrect checksums.