Question

I have 4 GPUs hung off the same PCIe switch (PLX PEX 8747) on a Haswell-based system. I want to send the same data to each GPU. Is it possible for the PCIe switch to replicate the data to N targets, rather than doing N separate transfers? In effect, is it possible to broadcast data to N GPUs over the PCIe bus?

I was wondering how SLI / CrossFire handle such issues. I can imagine large amounts of data being identical for each GPU in a given scene being rendered. I remember reading somewhere that the old NVIDIA 890 Ultra SLI system included this broadcast mechanism in its switch for SLI.

http://www.nvidia.com/docs/IO/52280/NVIDIA_Broadcast_PWShort_TB.pdf

Is this possible with newer PCIe switches?

Update: It appears the PCIe standard supports multi-cast, as outlined by the answer below. I found some info on this at

www.pcisig.com/developers/main/training_materials/get_document?doc_id=31337695e3bc0310ea570c9df49e507b9d3eb4a5

Yes, I specifically wanted a CUDA or OpenCL interface to transfer the data to N devices. It seems a shame the API doesn't support this yet.


Solution

The PCI-SIG ratified a scheme for switch-level multicast over PCIe about 5 years ago, and I believe it is fully described in the PCIe 3.0 standard. However, I don't believe any of the GPU/accelerator vendors support multicast yet, and there certainly isn't any CUDA-level API support for such a feature as of CUDA 5.5.
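Without a multicast API, the usual fallback is to issue one asynchronous host-to-device copy per GPU and let them overlap. The following is a minimal sketch of that approach, not a broadcast: the data still crosses the link between host and switch once per device. The function name `broadcast_to_devices` is hypothetical; the CUDA runtime calls it uses are standard. It assumes `host_src` points to readable host memory, ideally pinned (for example allocated with `cudaMallocHost`) so that the copies can actually run asynchronously.

```cuda
#include <cstddef>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical helper: copy the same host buffer to every visible GPU.
// This issues N separate transfers (one per device); it is NOT a PCIe multicast.
// On success, dev_ptrs[d] holds a device allocation on GPU d that the caller
// must release with cudaFree. Returns the number of GPUs written, or -1 on error.
int broadcast_to_devices(const void* host_src, size_t bytes,
                         std::vector<void*>& dev_ptrs) {
    int n = 0;
    if (cudaGetDeviceCount(&n) != cudaSuccess) return -1;

    dev_ptrs.assign(n, nullptr);
    std::vector<cudaStream_t> streams(n, nullptr);
    bool ok = true;

    // Queue one async copy per device so the transfers can overlap.
    for (int d = 0; d < n && ok; ++d) {
        ok = cudaSetDevice(d) == cudaSuccess
          && cudaStreamCreate(&streams[d]) == cudaSuccess
          && cudaMalloc(&dev_ptrs[d], bytes) == cudaSuccess
          && cudaMemcpyAsync(dev_ptrs[d], host_src, bytes,
                             cudaMemcpyHostToDevice, streams[d]) == cudaSuccess;
    }

    // Wait for every queued copy, then release the streams.
    for (int d = 0; d < n; ++d) {
        if (streams[d] == nullptr) continue;
        cudaSetDevice(d);
        if (cudaStreamSynchronize(streams[d]) != cudaSuccess) ok = false;
        cudaStreamDestroy(streams[d]);
    }

    // On failure, free whatever was allocated so nothing leaks.
    if (!ok) {
        for (int d = 0; d < n; ++d) {
            if (dev_ptrs[d] != nullptr) {
                cudaSetDevice(d);
                cudaFree(dev_ptrs[d]);
                dev_ptrs[d] = nullptr;
            }
        }
        return -1;
    }
    return n;
}
```

A caller would allocate the source buffer once, call `broadcast_to_devices(src, bytes, ptrs)`, and then launch kernels on each device using `ptrs[d]`. Whether the copies truly overlap depends on the hardware and on the source buffer being pinned.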

Licensed under: CC-BY-SA with attribution