LLMs on Apple Silicon
Discussions center on running large language models (LLMs), Stable Diffusion, and other AI workloads locally on Apple Silicon Macs (M1/M2/M3/M4), highlighting performance, unified memory advantages, RAM requirements, tools like MLX and Ollama, and comparisons to NVIDIA GPUs.
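For context on what the MLX route looks like in practice, here is a minimal sketch using the mlx-lm package to load a quantized community model and generate text on the Mac's unified-memory GPU. The model name is only an example; any mlx-community checkpoint that fits in RAM should work the same way.

```python
# Minimal sketch: local generation with mlx-lm on Apple Silicon.
# Assumes `pip install mlx-lm`; the model name below is only an example
# and is downloaded from the Hugging Face mlx-community hub on first run.
from mlx_lm import load, generate

model, tokenizer = load("mlx-community/Mistral-7B-Instruct-v0.3-4bit")

response = generate(
    model,
    tokenizer,
    prompt="Explain unified memory in one sentence.",
    max_tokens=128,
)
print(response)
```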
Sample Comments
Newer Apple computers have unified GPU and CPU memory, so they are BLAZINGLY fast with local models.
Just the comment that I was looking for. Guess I can run it on my M1 Pro 32GB model
MacBook Pro M2 with 64GB of RAM. That's why I tend to be limited to Ollama and MLX - stuff that requires NVIDIA doesn't work for me locally.
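For the Ollama path mentioned in that comment, inference goes through a local HTTP server rather than CUDA, so a sketch like the following works on an Apple Silicon Mac with no NVIDIA hardware; it assumes Ollama is running and the model tag (here llama3.1) has already been pulled.

```python
# Sketch: query a locally running Ollama server (no NVIDIA GPU required).
# Assumes Ollama is installed and `ollama pull llama3.1` has been run;
# the model tag and prompt are placeholders.
import requests

resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "llama3.1",
        "prompt": "Why does unified memory help local LLM inference?",
        "stream": False,
    },
    timeout=300,
)
resp.raise_for_status()
print(resp.json()["response"])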
MacBook Pros with M3 and unified RAM/VRAM can do 70B models :)
LM Studio seems to have MLX support on Apple Silicon, so you could quickly get a feel for whether it helps in your case: https://github.com/lmstudio-ai/mlx-engine
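If you try the LM Studio route, its local server speaks an OpenAI-compatible API (port 1234 by default), so a quick comparison between the MLX and non-MLX backends can be scripted; the model name below is a placeholder for whatever is loaded in the app.

```python
# Sketch: talk to LM Studio's local OpenAI-compatible server.
# Assumes the server has been started from the LM Studio app (default
# port 1234) and a model is already loaded; the model name is a placeholder.
import requests

resp = requests.post(
    "http://localhost:1234/v1/chat/completions",
    json={
        "model": "local-model",  # LM Studio serves whatever model is loaded
        "messages": [{"role": "user", "content": "Say hello in five words."}],
        "max_tokens": 64,
    },
    timeout=300,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```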
Can it be run on an Apple M1/M2 if it has 16+ GB of RAM?
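A back-of-envelope way to answer the 16 GB question is to estimate weight size from parameter count and quantization width, with some headroom for the KV cache and runtime; the figures below are rough estimates, not measurements.

```python
# Back-of-envelope RAM estimate for quantized LLM weights.
# Rough figures only (weights plus a fudge factor for KV cache and runtime
# overhead); actual usage varies by runtime and context length.
def estimated_ram_gb(params_billion: float, bits_per_weight: float,
                     overhead_factor: float = 1.2) -> float:
    weight_bytes = params_billion * 1e9 * bits_per_weight / 8
    return weight_bytes * overhead_factor / 1e9

for params, bits in [(7, 4), (13, 4), (70, 4), (7, 8)]:
    print(f"{params}B @ {bits}-bit ~ {estimated_ram_gb(params, bits):.1f} GB")

# On a 16 GB M1/M2, macOS reserves part of memory for the system and caps
# the GPU-visible working set below the full 16 GB, so quantized 7B-13B
# models are realistic there; 70B generally is not.
```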
That's Apple Silicon territory. It's always amusing how all these LLM or diffusion models work on either some beefy NVIDIA GPUs or a MacBook Air M1 with 16 GB of RAM, because all that RAM can be utilised by the GPU or the Neural Engine.
Has anyone done something like this but with Apple Silicon instead of a graphics card? Training a small LLM on an M2-M5?
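On the training question: MLX supports gradient-based training on the Metal GPU. The sketch below shows the basic loop shape with a toy MLP on random data rather than an actual LLM; the same pattern (nn.value_and_grad plus optimizer.update) is what small-scale transformer and LoRA fine-tuning examples build on.

```python
# Sketch: the shape of a training loop in MLX on Apple Silicon.
# Trains a toy MLP on random data purely to show the API pattern;
# it is not a real LLM.
import mlx.core as mx
import mlx.nn as nn
import mlx.optimizers as optim


class TinyMLP(nn.Module):
    def __init__(self, dim: int = 64, classes: int = 8):
        super().__init__()
        self.fc1 = nn.Linear(dim, 128)
        self.fc2 = nn.Linear(128, classes)

    def __call__(self, x):
        return self.fc2(nn.relu(self.fc1(x)))


def loss_fn(model, x, y):
    return nn.losses.cross_entropy(model(x), y, reduction="mean")


model = TinyMLP()
optimizer = optim.Adam(learning_rate=1e-3)
loss_and_grad = nn.value_and_grad(model, loss_fn)

for step in range(100):
    x = mx.random.normal((32, 64))                # fake batch of features
    y = mx.random.randint(0, 8, (32,))            # fake labels
    loss, grads = loss_and_grad(model, x, y)
    optimizer.update(model, grads)
    mx.eval(model.parameters(), optimizer.state)  # force evaluation on Metal
    if step % 20 == 0:
        print(f"step {step}: loss {loss.item():.3f}")
```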
You should be able to run smaller models on an M1. I'm testing this in about 10 minutes.
Running an LLM that is larger than your GPU's VRAM on regular DDR RAM will completely slaughter your tokens/s to the point that the Mac comes out ahead again.
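The intuition behind that last comment is that token generation is largely memory-bandwidth-bound: each decoded token streams roughly the whole quantized weight set, so tokens/s is on the order of bandwidth divided by model size. The bandwidth numbers below are approximate and purely illustrative.

```python
# Back-of-envelope: decode speed is roughly memory-bandwidth-bound, so
# tokens/s ~ bandwidth / bytes read per token (~ quantized model size).
# Bandwidth figures are approximate and only for illustration.
MODEL_GB = 40  # ~70B parameters at 4-bit, weights plus overhead

bandwidth_gbps = {
    "Dual-channel DDR5 (CPU offload)": 90,
    "M2 Max unified memory": 400,
    "M2 Ultra unified memory": 800,
}

for name, bw in bandwidth_gbps.items():
    print(f"{name}: ~{bw / MODEL_GB:.1f} tokens/s")
```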