llama.cpp

llama.cpp is the path for users who want:

  • clear control over runtime behavior,
  • predictable local inference,
  • and a setup that is easier to reason about when something breaks.

It is a good fit for Mac and desktop users, for users who care about repeatability, and for users who have already outgrown one-click apps.

If you are brand new to local AI, or your goal is just to see Gemma 4 run once, start with LM Studio or AI Edge Gallery instead.