GPU VRAM for LLMs

Discussions center on GPU memory (VRAM) requirements and suitable hardware (RTX 3090, RTX 4090, A100) for running large language models locally, covering inference, training, multi-GPU setups, and consumer vs. enterprise options.
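As a rough illustration of the sizing arithmetic these threads revolve around, here is a minimal sketch; the bytes-per-weight figures for the quantized formats and the 1.2x overhead factor for KV cache and runtime buffers are approximations, not measured values:

```python
# Back-of-the-envelope VRAM estimate for LLM inference.
# Weight memory is parameter count times bytes per weight; the overhead
# factor for KV cache, activations, and runtime buffers is an assumption
# and varies with context length and backend.

BYTES_PER_WEIGHT = {
    "fp16": 2.0,  # 16-bit weights
    "q8":   1.0,  # ~8-bit quantization (approximate)
    "q4":   0.5,  # ~4-bit quantization (approximate)
}

def estimate_vram_gb(params_billion: float, precision: str, overhead: float = 1.2) -> float:
    """Very rough inference VRAM estimate in GB."""
    weight_bytes = params_billion * 1e9 * BYTES_PER_WEIGHT[precision]
    return weight_bytes * overhead / 1e9

if __name__ == "__main__":
    for model_b in (7, 13, 70):
        for prec in ("fp16", "q8", "q4"):
            print(f"{model_b}B @ {prec}: ~{estimate_vram_gb(model_b, prec):.0f} GB")
```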

Stable 0.8x · AI & Machine Learning
Comments: 4,005
Years Active: 14
Top Authors: 5
Topic ID: #9588

Activity Over Time

2013: 3 · 2014: 5 · 2015: 19 · 2016: 38 · 2017: 57 · 2018: 52 · 2019: 48
2020: 133 · 2021: 99 · 2022: 334 · 2023: 1,132 · 2024: 937 · 2025: 1,095 · 2026: 53

Keywords

RAM, CPU, LLM, Q8, H100, RTX 3060, RTX, GB, VRAM, GPU, NVIDIA, memory, models, GPUs, inference, bandwidth

Sample Comments

etaioinshrdlu (May 3, 2022)

Can 3090 GPUs share their memory with one another to fit such a large model, or is enterprise-grade hardware required?

rapfaria (Aug 5, 2025)

Not sure what you mean, or if you're new to LLMs, but two RTX 3090s will work for this, and even lower-end cards (RTX 3060) will once it's GGUF'd.

wkat4242 (Sep 21, 2024)

True, but try to find a 96GB GPU.

bestouff (Feb 3, 2025)

Out of curiosity, would an A100 80GB work for this?

Rastonbury (Nov 10, 2024)

GPU VRAM is currently the bottleneck; check out r/localLlama for benchmarks and calculators showing approximately which models fit on which cards.
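In the spirit of the calculators mentioned above, a minimal fit-check sketch; the card list, the assumed ~4.4-bit quantization, and the 90% usable-VRAM fraction are illustrative assumptions:

```python
# Minimal "does it fit" check across common cards. Usable fraction and
# bytes-per-weight are rough assumptions; real headroom also depends on
# context length and backend.

CARDS_GB = {"RTX 3060": 12, "RTX 3090": 24, "RTX 4090": 24, "A100 80GB": 80, "H100 80GB": 80}

def fits(params_billion: float, bytes_per_weight: float, vram_gb: float,
         usable_fraction: float = 0.9) -> bool:
    needed_gb = params_billion * bytes_per_weight  # 1e9 params * bytes / 1e9 bytes-per-GB
    return needed_gb <= vram_gb * usable_fraction

if __name__ == "__main__":
    model_b, bpw = 70, 0.55  # a ~70B model at a ~4.4-bit quant (assumed example)
    for card, vram in CARDS_GB.items():
        print(f"{card}: {'fits' if fits(model_b, bpw, vram) else 'does not fit'}")
```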

lostmsu (Oct 25, 2021)

Training or inference? How's training performance compared to an 8GB NVIDIA card, if you have one?
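To give a sense of why the training-vs-inference distinction matters so much for VRAM, here is a minimal sketch of the common mixed-precision-with-Adam accounting; the ~16 bytes per parameter figure is the usual rule of thumb, and activation memory (which depends on batch size and sequence length) is deliberately left out:

```python
# Rough training-memory estimate for full fine-tuning with Adam in mixed
# precision: fp16 weights (2 B) + fp16 gradients (2 B) + fp32 master
# weights (4 B) + fp32 Adam moments (4 B + 4 B) = ~16 B per parameter.
# Activations are excluded and can add a large, sequence-length-dependent amount.

def training_vram_gb(params_billion: float, bytes_per_param: float = 16.0) -> float:
    return params_billion * bytes_per_param  # 1e9 params * bytes / 1e9 bytes-per-GB

if __name__ == "__main__":
    for model_b in (1, 7, 13):
        print(f"{model_b}B full fine-tune: ~{training_vram_gb(model_b):.0f} GB before activations")
```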

dharma1 (Jul 22, 2016)

Might be relevant if your model is large: two GTX 1080s would give you twice the GPU RAM.

sisjohn (Sep 27, 2021)

Any suggestions on what GPU to use to train large models?

dc443 (Jul 25, 2023)

I have 2x 3090. Do you know if it's feasible to use that 48GB total for running this?
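For the two-card question, a minimal sketch of sharding a model across 2x 24 GB GPUs with Hugging Face transformers and accelerate; the model ID is a placeholder and the 22 GiB per-GPU caps are an assumed safety margin, not a tested value:

```python
# Sketch: shard a model across two 24 GB GPUs with Hugging Face
# transformers + accelerate. "some-model-id" is a placeholder; the 22GiB
# caps leave headroom for KV cache and CUDA overhead (assumed margin).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "some-model-id"  # placeholder, not a specific recommendation
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,            # halves weight memory vs fp32
    device_map="auto",                    # let accelerate spread layers across GPUs
    max_memory={0: "22GiB", 1: "22GiB"},  # cap per-GPU usage below 24 GB
)

inputs = tokenizer("Hello", return_tensors="pt").to(model.device)
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=20)[0]))
```

Whether a given model actually fits in the combined 48 GB still depends on quantization, context length, and per-GPU overhead.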

vegesm (Jun 11, 2020)

Probably the memory requirements mean that you do need (multiple) V100s, though.