GPU Memory Bandwidth Bottlenecks
The cluster discusses how GPU compute performance is primarily limited by memory bandwidth rather than raw processing power, highlighting issues like slow PCIe transfers from system RAM to VRAM, unified memory trade-offs, and potential direct disk-to-GPU copying solutions.
Sample Comments
It is remarkable how much GPU compute is limited by memory speed.
True, bandwidth is a concern. It might be possible to copy memory directly from disk to the GPU, which may be more efficient.
You'll be limited by memory bandwidth more than compute.
How much would the trade-offs change if GPUs shared the same main memory as the CPU?
Working with the VRAM of both GPUs on the same card is aeons faster than taking a system-RAM round trip.
Probably the processor-memory bus; GPUs have wider ones.
Your PC can't use the GPU's memory bandwidth for the CPU whatsoever, so why would you add that bandwidth?
But you don't want to do that. The bandwidth between the GPU and system RAM is roughly 10x lower than the bandwidth between the GPU and its own RAM, so the GPU would spend > 90% of its time waiting for data. In that scenario, using the CPU would most likely be faster.
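The 10x figure in the comment above is easy to sanity-check with back-of-envelope arithmetic. The bandwidth numbers below are illustrative assumptions (roughly PCIe-class vs. HBM-class), not measurements:

```python
# Back-of-envelope: how long a GPU waits when streaming data over a link
# ~10x slower than its local VRAM. All figures are assumptions for
# illustration, not benchmarks of any particular card.
vram_bw = 1000e9        # assumed GPU VRAM bandwidth, bytes/s (~1 TB/s)
link_bw = vram_bw / 10  # assumed GPU <-> system-RAM link, 10x slower

data = 80e9             # 80 GB streamed through the GPU

t_vram = data / vram_bw  # time if the data already sits in VRAM
t_link = data / link_bw  # time to pull it from system RAM instead

# Fraction of the slow transfer during which the compute units sit idle
# (assuming compute fully overlaps with the fast VRAM path):
waiting = 1 - t_vram / t_link
print(f"GPU idle fraction: {waiting:.0%}")
```

With a 10x slower link, the idle fraction comes out to 90%, matching the "> 90% of its time waiting" claim when the link is even slightly worse than 10x.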
Though GPUs are meant for highly specialized number-crunching, they don't have direct access to system memory. You also get additional latency transferring data between RAM and GPU memory. Also, the GPU is a temporary technology: wait some time and you'll see CPUs getting more GPU-like.
It's not just the GPU memory, it's also I/O memory. That speeds things up a lot: just update the pointer to where the memory is, with no copying out of I/O memory.