Wednesday, September 10, 2008

PCI Express bandwidth measurements

While benchmarking PCI Express transfers with CUDA, I stumbled across some odd behaviour: a 4 MB block seems to achieve the best sustainable bandwidth, at least when writing to the host.
However, transmitting more than 4 MB in 4 MB packets (let's call it a blocked copy) leaves a gap in performance.
The performance is regained at the end, when the transfer almost fills the whole GPU memory, but the question is what causes it to drop to 2 GB/s in the first place.
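
To make "blocked copy" concrete, here is a minimal sketch of the idea, not the benchmark code behind the plot. The helper names (`blocked_copy`, `bandwidth_gbs`, `measure_blocked_copy`) are made up for illustration, `std::memcpy` stands in for the real device-to-host transfer so the snippet runs without a GPU, and 1 GB is assumed to mean 1e9 bytes.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical helper: copy `total` bytes in pieces of at most `block` bytes.
// std::memcpy is a host-only stand-in; a CUDA benchmark would instead call
// cudaMemcpy(dst + off, src + off, n, cudaMemcpyDeviceToHost) at this point.
// Returns the number of individual copy calls issued.
std::size_t blocked_copy(char* dst, const char* src,
                         std::size_t total, std::size_t block) {
    assert(block > 0);
    std::size_t calls = 0;
    for (std::size_t off = 0; off < total; off += block) {
        // The last piece may be shorter than a full block.
        std::size_t n = (total - off < block) ? (total - off) : block;
        std::memcpy(dst + off, src + off, n);
        ++calls;
    }
    return calls;
}

// Bandwidth in GB/s for `bytes` moved in `seconds` (assumes 1 GB = 1e9 bytes).
double bandwidth_gbs(std::size_t bytes, double seconds) {
    return static_cast<double>(bytes) / seconds / 1e9;
}

// Time one blocked copy of `total` bytes and report its bandwidth.
double measure_blocked_copy(std::size_t total, std::size_t block) {
    std::vector<char> src(total, 1), dst(total, 0);
    auto t0 = std::chrono::steady_clock::now();
    blocked_copy(dst.data(), src.data(), total, block);
    auto t1 = std::chrono::steady_clock::now();
    return bandwidth_gbs(total, std::chrono::duration<double>(t1 - t0).count());
}
```

Calling `measure_blocked_copy(total, 4u << 20)` for increasing values of `total` gives the blocked series, while passing `block == total` gives the single-transfer series to compare it against. With the host stand-in the numbers only reflect memory bandwidth, not the PCI Express link.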

Another interesting question is the jump in performance at 1e6 bytes, possibly caused by a switch in protocols.
Figure: Performance of PCI Express transfers to NVIDIA G80 8800 GTX card
