I'm writing a program that uses the Netlink protocol to gather task statistics. I'm not getting very far because the kernel responds with an error to what I believe is a valid packet. I've used strace to compare the behaviour of my program with that of iotop, which works correctly.
The relevant bit of the strace from iotop:
socket(PF_NETLINK, SOCK_RAW, 16) = 3
setsockopt(3, SOL_SOCKET, SO_SNDBUF, [65536], 4) = 0
setsockopt(3, SOL_SOCKET, SO_RCVBUF, [65536], 4) = 0
bind(3, {sa_family=AF_NETLINK, pid=0, groups=00000000}, 12) = 0
getsockname(3, {sa_family=AF_NETLINK, pid=-4286, groups=00000000}, [12]) = 0
sendto(3, "\x24\x00\x00\x00\x10\x00\x01\x00\x01\x00\x00\x00\x42\xef\xff\xff\x03\x00\x00\x00\x0e\x00\x02\x00\x54\x41\x53\x4b\x53\x54\x41\x54\x53\x00\x00\x00", 36, 0, NULL, 0) = 36
recvfrom(3, "\x70\x00\x00\x00\x10\x00\x00\x00\x01\x00\x00\x00\x42\xef\xff\xff\x01\x02\x00\x00\x0e\x00\x02\x00\x54\x41\x53\x4b\x53\x54\x41\x54\x53\x00\x00\x00\x06\x00\x01\x00\x17\x00\x00\x00\x08\x00\x03\x00\x01\x00\x00\x00\x08\x00\x04\x00\x00\x00\x00\x00\x08\x00\x05\x00\x04\x00\x00\x00\x2c\x00\x06\x00\x14\x00\x01\x00\x08\x00\x01\x00\x01\x00\x00\x00\x08\x00\x02\x00\x0b\x00\x00\x00\x14\x00\x02\x00\x08\x00\x01\x00\x04\x00\x00\x00\x08\x00\x02\x00\x0a\x00\x00\x00", 16384, 0, {sa_family=AF_NETLINK, pid=0, groups=00000000}, [12]) = 112
The corresponding part of the strace output from my program:
bind(8, {sa_family=AF_NETLINK, pid=19156, groups=00000000}, 12) = 0
setsockopt(8, SOL_SOCKET, SO_SNDBUF, [65536], 4) = 0
setsockopt(8, SOL_SOCKET, SO_RCVBUF, [65536], 4) = 0
sendmsg(8, {msg_name(0)=NULL, msg_iov(5)=[{"\x24\x00\x00\x00\x10\x00\x01\x00\x00\x00\x00\x00\xd4\x4a\x00\x00", 16}, {"\x03\x00\x00\x00", 4}, {"\x0e\x00\x02\x00", 4}, {"\x54\x41\x53\x4b\x53\x54\x41\x54\x53\x00", 10}, {"\x00\x00", 2}], msg_controllen=0, msg_flags=0}, MSG_NOSIGNAL) = 36
recvmsg(8, {msg_name(0)=NULL, msg_iov(1)=[{"\x38\x00\x00\x00\x02\x00\x00\x00\x00\x00\x00\x00\xd4\x4a\x00\x00", 16}], msg_controllen=0, msg_flags=MSG_TRUNC}, 0) = 16
If I reformat these as hex dumps, they look like this. (Note that the two programs come from different runs, so the pid values differ, but the remainder of the reformatted strace output is the same.)
sent from iotop
24000000 10000100 01000000 42efffff
03000000 0e000200 5441534b 53544154
53000000
received by iotop
70000000 10000000 01000000 42efffff
01020000 0e000200 5441534b 53544154
53000000 06000100 17000000 08000300
01000000 08000400 00000000 08000500
04000000 2c000600 14000100 08000100
01000000 08000200 0b000000 14000200
08000100 04000000 08000200 0a000000
sent from program
24000000 10000100 00000000 d44a0000
03000000 0e000200 5441534b 53544154
53000000
received by program
38000000 02000000 00000000 d44a0000
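To double-check my reading of the dumps, here is a minimal Python sketch that decodes the 16-byte Netlink header (struct nlmsghdr: length, type, flags, sequence number, pid) at the start of each message. The names decode_header and DUMPS are my own, and the little-endian layout is an assumption based on the byte order visible in the traces.

```python
import struct

# struct nlmsghdr from <linux/netlink.h>:
#   u32 nlmsg_len, u16 nlmsg_type, u16 nlmsg_flags, u32 nlmsg_seq, u32 nlmsg_pid
# "<" assumes the dumps were captured on a little-endian machine.
NLMSGHDR = struct.Struct("<IHHII")
NLMSG_ERROR = 2  # value from <linux/netlink.h>


def decode_header(hex_words):
    """Decode the Netlink header from the first 16 bytes of a hex dump."""
    raw = bytes.fromhex(hex_words)  # fromhex() skips the spaces between words
    length, msg_type, flags, seq, pid = NLMSGHDR.unpack_from(raw)
    # strace prints the pid as a signed 32-bit value, so keep both views.
    signed_pid = pid - (1 << 32) if pid >= (1 << 31) else pid
    return {
        "len": length,
        "type": msg_type,
        "flags": flags,
        "seq": seq,
        "pid": pid,
        "signed_pid": signed_pid,
    }


# First 16 bytes of each message, copied from the dumps above.
DUMPS = {
    "sent from iotop": "24000000 10000100 01000000 42efffff",
    "received by iotop": "70000000 10000000 01000000 42efffff",
    "sent from program": "24000000 10000100 00000000 d44a0000",
    "received by program": "38000000 02000000 00000000 d44a0000",
}

for label, dump in DUMPS.items():
    print(label, decode_header(dump))
```

Decoded this way, the reply to my program has type 2 (NLMSG_ERROR) and a declared length of 56 bytes, but my receive buffer only held the 16-byte header (hence MSG_TRUNC). The error code is the 4-byte field that follows the header, so it is not visible in my trace.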
It seems to me that there are two differences:
1. iotop seems to use a negative value for the pid. I tried changing my program so that it also sends a negative number for the pid, but this made no difference.
2. I use a scatter/gather approach because it's less wasteful of memory (which might be constrained on the target PC that I have in mind). However, I suspect that some (if not all) Netlink components only support sending and receiving a single buffer per request.
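One way to test the second point in isolation would be to send the same request once as a single buffer and once as several pieces, and compare the replies. The following is only a sketch in Python (the language iotop is written in), not my actual program: build_pieces and query_family are hypothetical names, the constants are simply the values that appear in the traces above, and I am assuming a Linux host where AF_NETLINK sockets are available. The receive buffer is 16384 bytes, as in iotop's recvfrom call, so that a reply is not truncated.

```python
import socket
import struct

NETLINK_GENERIC = 16       # protocol number in iotop's socket() call
GENL_ID_CTRL = 0x10        # nlmsg_type in both traces
NLM_F_REQUEST = 0x01       # nlmsg_flags in both traces
CTRL_CMD_GETFAMILY = 3     # genlmsghdr cmd in both traces
CTRL_ATTR_FAMILY_NAME = 2  # attribute type in both traces


def build_pieces(seq, pid, family=b"TASKSTATS"):
    """Return the request as [nlmsghdr, genlmsghdr, attribute] byte strings."""
    name = family + b"\0"
    attr_len = 4 + len(name)            # attribute header + payload, unpadded
    padding = b"\0" * (-attr_len % 4)   # attributes are padded to 4 bytes
    genl = struct.pack("=BBH", CTRL_CMD_GETFAMILY, 0, 0)
    attr = struct.pack("=HH", attr_len, CTRL_ATTR_FAMILY_NAME) + name + padding
    total = 16 + len(genl) + len(attr)  # 16 is the size of struct nlmsghdr
    header = struct.pack("=IHHII", total, GENL_ID_CTRL, NLM_F_REQUEST, seq, pid)
    return [header, genl, attr]


def query_family(scatter_gather, seq=1):
    """Send the request in one buffer or in pieces and return the raw reply."""
    with socket.socket(socket.AF_NETLINK, socket.SOCK_RAW, NETLINK_GENERIC) as sock:
        sock.bind((0, 0))  # pid 0: let the kernel pick the port id, as iotop does
        pid = sock.getsockname()[0] & 0xFFFFFFFF
        pieces = build_pieces(seq, pid)
        if scatter_gather:
            sock.sendmsg(pieces)         # several iovec entries in one datagram
        else:
            sock.send(b"".join(pieces))  # one contiguous buffer
        return sock.recv(16384)          # large enough for the whole reply
```

Calling query_family(False) and query_family(True) and comparing the nlmsg_type field of the two replies (bytes 4 to 5) would show whether the way the buffers are handed to the kernel makes any difference.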
Does anyone know whether Netlink allows scatter/gather, or whether it requires all communication to be done in one large buffer at a time?