GPU Memory Bandwidth Bottlenecks
The cluster discusses how GPU compute performance is primarily limited by memory bandwidth rather than raw processing power, highlighting issues like slow PCIe transfers from system RAM to VRAM, unified memory trade-offs, and potential direct disk-to-GPU copying solutions.
Sample Comments
It is remarkable how much GPU compute is limited by memory speed.
True, bandwidth is a concern. It might be possible to copy memory directly from disk to the GPU, which may be more efficient.
You'll be limited by memory bandwidth more than compute.
How much would the trade-offs change if GPUs shared the same main memory as the CPU?
Working with the VRAM of both GPUs on the same card is aeons faster than taking a system-RAM round trip.
Probably the processor-memory bus; GPUs have wider ones.
Your PC can't use the GPU's memory bandwidth for the CPU whatsoever, so why would you add that bandwidth?
But you don't want to do that. The bandwidth between the GPU and system RAM is roughly 10x lower than the bandwidth between the GPU and its own RAM, so the GPU would spend > 90% of its time waiting for data. In that scenario, using the CPU would most likely be faster.
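The 10x figure in the comment above is easy to sanity-check with back-of-envelope arithmetic. The bandwidth numbers below are illustrative assumptions (roughly PCIe-class vs. HBM-class), not measurements:

```python
# Back-of-envelope: how long a GPU waits when streaming data over a link
# ~10x slower than its local VRAM. All figures are assumptions for
# illustration, not benchmarks of any particular card.
vram_bw = 1000e9        # assumed GPU VRAM bandwidth, bytes/s (~1 TB/s)
link_bw = vram_bw / 10  # assumed GPU <-> system-RAM link, 10x slower

data = 80e9             # 80 GB streamed through the GPU

t_vram = data / vram_bw  # time if the data already sits in VRAM
t_link = data / link_bw  # time to pull it from system RAM instead

# Fraction of the slow transfer during which the compute units sit idle
# (assuming compute fully overlaps with the fast VRAM path):
waiting = 1 - t_vram / t_link
print(f"GPU idle fraction: {waiting:.0%}")
```

With a 10x slower link, the idle fraction comes out to 90%, matching the "> 90% of its time waiting" claim when the link is even slightly worse than 10x.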
Though GPUs are meant for highly specialized number-crunching, they don't have direct access to system memory. You also get additional latency transferring data between RAM and GPU memory. Also, the GPU is a temporary technology: wait some time and you'll see CPUs getting more GPU-like.
It's not just the GPU memory, it's also I/O memory. That speeds things up a lot: just update the pointer to where the memory is, with no copying out of I/O memory.