Question

Is there any utility like tcpdump in Linux for capturing the traffic which is going over RDMA channel? (Infiniband/RoCE/iWARP)

Was it helpful?

Solution

Old thread, but still:

As Roland pointed out, sniffing RDMA traffic is tricky, because once the endpoints did the initial handshake, traffic goes through network card (HCA) directly to the memory. The only way to sniff this traffic w/o putting a dedicated HW sniffer on the wire is to have vendor-specific hooks in the network card, and a SW tool that uses these hooks.

If you have Mellanox HCAs, you can use the "ibdump" tool. This tool is also a part of Mellanox OFED package.

If you have other vendor's HW, you need to check with that vendor - you won't find any open-source packet sniffer for all RDMA-capable devices, sorry.

OTHER TIPS

In general, no. One of the main characteristics of RDMA is that all the network processing is done on the adapter, without involving the CPU at all. Typically work requests are queued up directly from userspace to the adapter, without any system call. So there's nowhere for a sniffer to hook in to get traffic.

With that said, for Ethernet protocols, iWARP or IBoE (aka RoCE), you can hook up a system in the middle of a connection and set it up to do forwarding in software (eg the Linux bridge module) and then run tcpdump or wireshark to capture the RDMA traffic that passes through this system. Wireshark even has dissectors for iWARP and IBoE.

For native InfiniBand it is theoretically possible to build something similar (set up an adapter to capture and forward traffic) but as far as I know, no one has done even the needed firmware or driver work to do basic packet sniffing.

Chelsio's T4 device supports a packet trace feature allowing it to replicate ingress/egress offload packets to one of the device's NIC queues. Then you can use tcpdump or whatever on that ethX interface to see the RDMA or TOE packets.

Wireshark can be the one. But the problem is you need an observing server. Enabling the mirror feature, you should be able to receive the ROCE pocket at the observer.

A sure way to capture such traffic is to duplicate it into dedicated capture ports. Those ports might be additional ethernet/IB ports (of additional adapters) in your development machine or they may be located in an additional capture machine.

There are basically 2 ways how to duplicate the traffic:

  1. Configure port-mirroring in your switch. Support for port mirroring is pretty common in managed Ethernet switches, even in cheap ones. This feature is also available in some Mellanox Infiniband switches. You can configure to mirror both directions of a port into another one, although this oversubscribes the receiver if the mirrored port receives and sends at line speed at the same time (full-duplex). In such a situation some frames can't be forwarded to the capture port then and are thus dropped. To avoid this limitation one needs to mirror each direction into a separate capture port.

  2. Connect your network cable to a TAP (target access point) device that duplicates or splits the signal. With optical networking those TAPs are often constructed in a completely passive way and thus don't add much complexity and are relatively cheap to produce (examples). You need one TAP for each fiber, i.e. you always occupy 2 capture ports if you want to capture both directions. TAP devices are available for the fibers and connectors commonly used in Ethernet networks. If your Infiniband hardware uses the same then you should be able to use the same TAP devices there, as well. At least the passive ones.

Once the mirrored/tapped traffic arrive at your capture port(s), you can use standard capture tools such as tcpdump.

For Infiniband there is ibdump, however, depending on the Infiniband software you are using (open-source OFED vs. the proprietary Mellanox OFED) and the host channel adapter (HCA) you might be able to use tcpdump to capture Infiniband traffic, as well.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top