Question

So, I have an incoming UDP stream composed of 272-byte packets at a data rate of about 5.12 Gb/s (roughly 2.35 million packets per second). This data is being sent by an FPGA-based custom board. The packet size is a limit of the digital design being run, so although it would theoretically be possible to increase it to make things more efficient, doing so would require a large amount of work. At the receiving end, these packets are read and interpreted by a network thread and placed in a circular buffer shared with a buffering thread, which copies the data to a GPU for processing.
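As a sanity check on the numbers involved, the packet rate and the average time available per packet follow directly from the line rate and the packet size. The sketch below assumes the 272 bytes are what the 5.12 Gb/s figure counts (header overhead is ignored); the function names are made up for illustration.

```c
/* Packets per second for a given line rate and packet size. */
double packets_per_second(double bits_per_second, unsigned packet_bytes)
{
    return bits_per_second / (8.0 * packet_bytes);
}

/* Average time available to handle one packet, in nanoseconds. */
double ns_per_packet(double bits_per_second, unsigned packet_bytes)
{
    return 1e9 / packets_per_second(bits_per_second, packet_bytes);
}
```

For 5.12e9 bits per second and 272-byte packets this works out to about 2.35 million packets per second, or about 425 ns per packet, which is the budget the receive path has to fit into.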

The same receive-side setup could cope with 5.12 Gb/s using 4096-byte packets (from a different design) and simple recv calls. With the current packet size, however, I'm having a hard time keeping up with the packet flow: too much time is being "wasted" in context switching and in copying small data segments from kernel space to user space. I did a quick test implementation using recvmmsg, but things didn't improve by much. On average I can process about 40% of the incoming packets.
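For reference, a batched receive with recvmmsg might look roughly like the sketch below. This is not the code from my test; `open_udp_socket`, `recv_batch`, and the batch size of 64 are assumptions chosen for illustration.

```c
#define _GNU_SOURCE            /* recvmmsg() is a Linux/glibc extension */
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

#define BATCH    64            /* datagrams per recvmmsg() call (assumed value) */
#define PKT_SIZE 272           /* packet size from the question */

/* Hypothetical helper: open a UDP socket bound to the given port. */
int open_udp_socket(uint16_t port)
{
    struct sockaddr_in addr;
    int fd = socket(AF_INET, SOCK_DGRAM, 0);

    if (fd < 0)
        return -1;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);
    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Hypothetical helper: receive up to BATCH datagrams in one system call.
 * bufs must provide BATCH * PKT_SIZE bytes; lens[i] receives the size of
 * datagram i. Returns the number of datagrams read, or -1 on error. */
int recv_batch(int fd, unsigned char *bufs, unsigned int *lens)
{
    struct mmsghdr msgs[BATCH];
    struct iovec iov[BATCH];
    int i, n;

    memset(msgs, 0, sizeof(msgs));
    for (i = 0; i < BATCH; i++) {
        iov[i].iov_base = bufs + (size_t)i * PKT_SIZE;
        iov[i].iov_len = PKT_SIZE;
        msgs[i].msg_hdr.msg_iov = &iov[i];
        msgs[i].msg_hdr.msg_iovlen = 1;
    }

    /* MSG_WAITFORONE: block for the first datagram, then return whatever
     * else is already queued instead of waiting for a full batch. */
    n = recvmmsg(fd, msgs, BATCH, MSG_WAITFORONE, NULL);
    for (i = 0; i < n; i++)
        lens[i] = msgs[i].msg_len;
    return n;
}
```

Batching like this cuts the number of system calls, but the kernel still copies every datagram into user space, which fits with the modest improvement I saw.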

So I was wondering whether it is possible to get a handle on the kernel's UDP data buffer from my application (mmap style), or to use some sort of zero-copy transfer from kernel to user space. Alternatively, do you know of any other method that would reduce this overhead and be capable of performing the required processing?

This is running on a Linux machine (kernel 3.2.0-40) using C code.


Solution

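Linux supports mmap-based packet reception (the PACKET_RX_RING option on AF_PACKET sockets, also known as PACKET_MMAP).

It is not as easy to use as UDP sockets, because you receive packets as you would from a raw socket: whole packets with their protocol headers, which you have to parse and filter yourself.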

See this for more information.
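A minimal sketch of the mmap receive path, assuming TPACKET_V2, plain Ethernet/IPv4/UDP frames (no VLAN tags, no IP fragments), and ring sizes picked arbitrarily. The `rx_ring_*` and `udp_payload` names are hypothetical helpers, not a kernel API, and opening the socket needs root or CAP_NET_RAW.

```c
#include <errno.h>
#include <linux/if_ether.h>    /* ETH_HLEN, ETH_P_IP */
#include <linux/if_packet.h>   /* tpacket_req, tpacket2_hdr, TP_STATUS_* */
#include <netinet/in.h>        /* IPPROTO_UDP, htons */
#include <poll.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/socket.h>
#include <unistd.h>

/* Ring geometry: assumed values, to be tuned for the real workload. */
#define FRAME_SIZE 2048u       /* multiple of TPACKET_ALIGNMENT (16) */
#define BLOCK_SIZE 65536u      /* multiple of the page size and of FRAME_SIZE */
#define BLOCK_NR   64u
#define FRAME_NR   (BLOCK_NR * (BLOCK_SIZE / FRAME_SIZE))

struct rx_ring {
    int      fd;
    uint8_t *map;
    unsigned next;             /* index of the next frame to read */
};

/* Create an AF_PACKET socket and map its receive ring. The socket is not
 * bound here, so it sees IPv4 traffic from every interface. */
int rx_ring_open(struct rx_ring *r)
{
    int version = TPACKET_V2;
    struct tpacket_req req = {
        .tp_block_size = BLOCK_SIZE, .tp_block_nr = BLOCK_NR,
        .tp_frame_size = FRAME_SIZE, .tp_frame_nr = FRAME_NR,
    };

    r->fd = socket(AF_PACKET, SOCK_RAW, htons(ETH_P_IP));
    if (r->fd < 0)
        return -1;
    if (setsockopt(r->fd, SOL_PACKET, PACKET_VERSION, &version, sizeof(version)) < 0 ||
        setsockopt(r->fd, SOL_PACKET, PACKET_RX_RING, &req, sizeof(req)) < 0)
        goto fail;
    r->map = mmap(NULL, (size_t)BLOCK_SIZE * BLOCK_NR, PROT_READ | PROT_WRITE,
                  MAP_SHARED, r->fd, 0);
    if (r->map == MAP_FAILED)
        goto fail;
    r->next = 0;
    return 0;
fail:
    close(r->fd);
    return -1;
}

/* Block until the kernel has filled the next frame; NULL on poll() failure. */
struct tpacket2_hdr *rx_ring_next(struct rx_ring *r)
{
    struct tpacket2_hdr *hdr =
        (struct tpacket2_hdr *)(r->map + (size_t)r->next * FRAME_SIZE);

    while (!(*(volatile uint32_t *)&hdr->tp_status & TP_STATUS_USER)) {
        struct pollfd pfd = { .fd = r->fd, .events = POLLIN };
        if (poll(&pfd, 1, -1) < 0 && errno != EINTR)
            return NULL;
    }
    return hdr;
}

/* Hand the frame back to the kernel and advance to the next slot. */
void rx_ring_release(struct rx_ring *r, struct tpacket2_hdr *hdr)
{
    hdr->tp_status = TP_STATUS_KERNEL;
    r->next = (r->next + 1) % FRAME_NR;
}

/* Locate the UDP payload inside a raw Ethernet frame.
 * Returns NULL if the frame is not IPv4/UDP or is truncated. */
const uint8_t *udp_payload(const uint8_t *frame, size_t len, size_t *payload_len)
{
    const uint8_t *ip = frame + ETH_HLEN;
    size_t ihl, off, udp_len;

    if (len < ETH_HLEN + 20u || ((frame[12] << 8) | frame[13]) != ETH_P_IP)
        return NULL;
    ihl = (size_t)(ip[0] & 0x0f) * 4;          /* IPv4 header length in bytes */
    if ((ip[0] >> 4) != 4 || ihl < 20 || ip[9] != IPPROTO_UDP)
        return NULL;
    off = ETH_HLEN + ihl;
    if (len < off + 8)
        return NULL;
    udp_len = ((size_t)frame[off + 4] << 8) | frame[off + 5];  /* includes 8-byte header */
    if (udp_len < 8 || len < off + udp_len)
        return NULL;
    *payload_len = udp_len - 8;
    return frame + off + 8;
}
```

A receive loop would call `rx_ring_next`, pass `(uint8_t *)hdr + hdr->tp_mac` and `hdr->tp_snaplen` to `udp_payload`, consume the payload, and then call `rx_ring_release`. Because the kernel writes frames straight into the shared ring, there is no per-packet system call and no separate copy into a user buffer; the cost is that filtering by port and source address becomes the application's job.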

Licensed under: CC-BY-SA with attribution