LLMs on Apple Silicon
Discussions center on running large language models (LLMs), Stable Diffusion, and other AI workloads locally on Apple Silicon Macs (M1/M2/M3/M4), highlighting performance, unified memory advantages, RAM requirements, tools like MLX and Ollama, and comparisons to NVIDIA GPUs.
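For context on what the MLX route looks like in practice, here is a minimal sketch using the mlx-lm package to load a quantized community model and generate text on the Mac's unified-memory GPU. The model name is only an example; any mlx-community checkpoint that fits in RAM should work the same way.

```python
# Minimal sketch: local generation with mlx-lm on Apple Silicon.
# Assumes `pip install mlx-lm`; the model name below is only an example
# and is downloaded from the Hugging Face mlx-community hub on first run.
from mlx_lm import load, generate

model, tokenizer = load("mlx-community/Mistral-7B-Instruct-v0.3-4bit")

response = generate(
    model,
    tokenizer,
    prompt="Explain unified memory in one sentence.",
    max_tokens=128,
)
print(response)
```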
Sample Comments
Newer Apple computers have unified GPU and CPU memory, so they are BLAZINGLY fast with local models.
Just the comment that I was looking for. Guess I can run it on my M1 Pro 32GB model
MacBook Pro M2 with 64GB of RAM. That's why I tend to be limited to Ollama and MLX - stuff that requires NVIDIA doesn't work for me locally.
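For the Ollama path mentioned in that comment, inference goes through a local HTTP server rather than CUDA, so a sketch like the following works on an Apple Silicon Mac with no NVIDIA hardware; it assumes Ollama is running and the model tag (here llama3.1) has already been pulled.

```python
# Sketch: query a locally running Ollama server (no NVIDIA GPU required).
# Assumes Ollama is installed and `ollama pull llama3.1` has been run;
# the model tag and prompt are placeholders.
import requests

resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "llama3.1",
        "prompt": "Why does unified memory help local LLM inference?",
        "stream": False,
    },
    timeout=300,
)
resp.raise_for_status()
print(resp.json()["response"])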
MacBook Pros with M3 and unified RAM/VRAM can do 70B models :)
LM Studio seems to have MLX support on Apple Silicon, so you could quickly get a feel for whether it helps in your case: https://github.com/lmstudio-ai/mlx-engine
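If you try the LM Studio route, its local server speaks an OpenAI-compatible API (port 1234 by default), so a quick comparison between the MLX and non-MLX backends can be scripted; the model name below is a placeholder for whatever is loaded in the app.

```python
# Sketch: talk to LM Studio's local OpenAI-compatible server.
# Assumes the server has been started from the LM Studio app (default
# port 1234) and a model is already loaded; the model name is a placeholder.
import requests

resp = requests.post(
    "http://localhost:1234/v1/chat/completions",
    json={
        "model": "local-model",  # LM Studio serves whatever model is loaded
        "messages": [{"role": "user", "content": "Say hello in five words."}],
        "max_tokens": 64,
    },
    timeout=300,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```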
Can it be run on an Apple M1/M2 if it has 16+ GB of RAM?
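A back-of-envelope way to answer the 16 GB question is to estimate weight size from parameter count and quantization width, with some headroom for the KV cache and runtime; the figures below are rough estimates, not measurements.

```python
# Back-of-envelope RAM estimate for quantized LLM weights.
# Rough figures only (weights plus a fudge factor for KV cache and runtime
# overhead); actual usage varies by runtime and context length.
def estimated_ram_gb(params_billion: float, bits_per_weight: float,
                     overhead_factor: float = 1.2) -> float:
    weight_bytes = params_billion * 1e9 * bits_per_weight / 8
    return weight_bytes * overhead_factor / 1e9

for params, bits in [(7, 4), (13, 4), (70, 4), (7, 8)]:
    print(f"{params}B @ {bits}-bit ~ {estimated_ram_gb(params, bits):.1f} GB")

# On a 16 GB M1/M2, macOS reserves part of memory for the system and caps
# the GPU-visible working set below the full 16 GB, so quantized 7B-13B
# models are realistic there; 70B generally is not.
```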
That's Apple Silicon territory. It's always amusing how all these LLM or diffusion models work on either some beefy NVIDIA GPUs or a MacBook Air M1 with 16 GB of RAM, because all that RAM can be utilised by the GPU or the Neural Engine.
Has anyone done something like this but with Apple Silicon instead of a graphics card? Training a small LLM on an M2-M5?
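On the training question: MLX supports gradient-based training on the Metal GPU. The sketch below shows the basic loop shape with a toy MLP on random data rather than an actual LLM; the same pattern (nn.value_and_grad plus optimizer.update) is what small-scale transformer and LoRA fine-tuning examples build on.

```python
# Sketch: the shape of a training loop in MLX on Apple Silicon.
# Trains a toy MLP on random data purely to show the API pattern;
# it is not a real LLM.
import mlx.core as mx
import mlx.nn as nn
import mlx.optimizers as optim


class TinyMLP(nn.Module):
    def __init__(self, dim: int = 64, classes: int = 8):
        super().__init__()
        self.fc1 = nn.Linear(dim, 128)
        self.fc2 = nn.Linear(128, classes)

    def __call__(self, x):
        return self.fc2(nn.relu(self.fc1(x)))


def loss_fn(model, x, y):
    return nn.losses.cross_entropy(model(x), y, reduction="mean")


model = TinyMLP()
optimizer = optim.Adam(learning_rate=1e-3)
loss_and_grad = nn.value_and_grad(model, loss_fn)

for step in range(100):
    x = mx.random.normal((32, 64))                # fake batch of features
    y = mx.random.randint(0, 8, (32,))            # fake labels
    loss, grads = loss_and_grad(model, x, y)
    optimizer.update(model, grads)
    mx.eval(model.parameters(), optimizer.state)  # force evaluation on Metal
    if step % 20 == 0:
        print(f"step {step}: loss {loss.item():.3f}")
```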
You should be able to run smaller models on an M1. I'm testing this in about 10 minutes.
Running an LLM that is larger than your GPU's VRAM on regular DDR RAM will completely slaughter your tokens/s to the point that the Mac comes out ahead again.
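The intuition behind that last comment is that token generation is largely memory-bandwidth-bound: each decoded token streams roughly the whole quantized weight set, so tokens/s is on the order of bandwidth divided by model size. The bandwidth numbers below are approximate and purely illustrative.

```python
# Back-of-envelope: decode speed is roughly memory-bandwidth-bound, so
# tokens/s ~ bandwidth / bytes read per token (~ quantized model size).
# Bandwidth figures are approximate and only for illustration.
MODEL_GB = 40  # ~70B parameters at 4-bit, weights plus overhead

bandwidth_gbps = {
    "Dual-channel DDR5 (CPU offload)": 90,
    "M2 Max unified memory": 400,
    "M2 Ultra unified memory": 800,
}

for name, bw in bandwidth_gbps.items():
    print(f"{name}: ~{bw / MODEL_GB:.1f} tokens/s")
```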