Getting Started
This site is built around one rule: start with your device, not with benchmark charts.
Before you download anything
Answer these four questions first:
- Is your target device a phone, tablet, laptop, or desktop?
- How much RAM or unified memory does it actually have?
- Do you want the easiest UI, or the most control?
- Are you optimizing for speed, quality, or just proving that it runs?
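The second question is the one people most often guess at. As a quick sanity check, the snippet below reads total physical RAM from the OS; this is a minimal sketch that works on Linux and macOS only (on Windows, check Task Manager under Performance > Memory instead).

```python
import os
import platform

def total_ram_gb() -> float:
    """Approximate total physical RAM in GiB (Linux and macOS only)."""
    if platform.system() in ("Linux", "Darwin"):
        pages = os.sysconf("SC_PHYS_PAGES")
        page_size = os.sysconf("SC_PAGE_SIZE")
        return pages * page_size / (1024 ** 3)
    # Not portable to Windows; use Task Manager there.
    raise NotImplementedError("Check RAM via Task Manager on Windows")

print(f"Total RAM: {total_ram_gb():.1f} GiB")
```

Remember that the model is not the only thing using memory: the OS and other apps need room too, so treat the number you get as an upper bound, not a budget.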
Fastest paths by device
- Android: start with AI Edge Gallery.
- iPhone and iPad: start with the iPhone and iPad guide.
- Mac: start with the Mac guide and decide between LM Studio and llama.cpp.
- Windows: start with the Windows guide and check VRAM before anything else.
Fastest paths by runtime preference
- I want the easiest desktop app: LM Studio
- I want the most control: llama.cpp
- I want a local API: Ollama
- I want a phone-first experience: AI Edge Gallery
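To show what the "local API" option looks like in practice, here is a minimal sketch of calling Ollama's HTTP endpoint, which listens on `localhost:11434` by default. The model tag passed to `generate` is an illustrative placeholder; use whatever `ollama list` reports on your machine.

```python
import json
from urllib import request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint

def build_payload(model: str, prompt: str) -> dict:
    """Build the JSON body for a non-streaming /api/generate call."""
    return {"model": model, "prompt": prompt, "stream": False}

def generate(model: str, prompt: str) -> str:
    """Send a prompt to the local Ollama server and return the response text."""
    body = json.dumps(build_payload(model, prompt)).encode("utf-8")
    req = request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

# Example (requires a running Ollama server and a pulled model):
# print(generate("your-gemma-model-tag", "Say hello in five words."))
```

Because everything goes over localhost, this works fully offline once the model is pulled; no request ever leaves your machine.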
Recommended order
1. Identify your device class and memory budget.
2. Use the model picker to narrow the Gemma 4 size.
3. Choose the easiest runtime that supports that hardware well.
4. If it fails, go straight to troubleshooting instead of trying random downloads.
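The first two steps above can be sketched as a simple mapping from memory budget to model size. The thresholds below are illustrative placeholders, not official requirements; use the model picker on this site for real numbers.

```python
def pick_gemma_size(ram_gb: float) -> str:
    """Map a memory budget (GiB) to a Gemma 4 size.

    Thresholds are illustrative assumptions for the sketch,
    not official hardware requirements.
    """
    if ram_gb >= 32:
        return "31B"
    if ram_gb >= 24:
        return "26B A4B"
    if ram_gb >= 8:
        return "E4B"
    return "E2B"

print(pick_gemma_size(16))  # a 16 GiB laptop lands on a mid-size option here
```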
What “offline” actually means here
Running Gemma 4 locally means inference happens on your device. It does not mean the model has live web access. The official Gemma 4 model card lists the training cutoff as January 2025.
Next: pick a model size. Use E2B, E4B, 26B A4B, or 31B based on hardware and intent.