Two years ago, running an LLM on-device was a science project. In 2026 it is a deployment target. Both Google and Apple ship first-party on-device models with public APIs, device RAM finally fits 2-4B-parameter models, and power draw is defensible for features you run a few times a minute.
What we ship on-device today
- Smart reply drafting in chat apps · 80-150ms, no spinner.
- Receipt / invoice field extraction · fully offline, GDPR-trivial.
- Photo captioning + search indexing · both on-device.
- Meeting transcription + bullet summary (with whisper.cpp + a local summariser).
Gemini Nano · where it wins
- Available on a much wider device matrix · Android 15+ with 8GB+ RAM.
- AICore delivers model updates without shipping a new Play Store release.
- Summarisation + rewriting APIs are stable and predictable.
- Works better with languages beyond English than Apple Intelligence on mid-range hardware.
Apple Intelligence · where it wins
- A17 Pro or newer / M-series only · narrower matrix, but the models are meaningfully better.
- The Writing Tools API is a drop-in replacement for a cloud call · zero glue code.
- Private Cloud Compute fallback is automatic and audit-friendly.
- Foundation-models API surface is more coherent · one SDK, not three.
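The two device matrices above can be written down as a small eligibility check. This is a minimal sketch in Python, used here only as neutral pseudocode for the logic: the `Device` fields, the platform labels and the returned model names are hypothetical, and treating A17 Pro as a floor (so newer A-series chips pass) is an assumption of the sketch. In a real app you would ask the platform SDK whether the model is available rather than match chip names.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Device:
    platform: str      # "android" or "apple" (hypothetical labels)
    os_major: int      # major OS version, e.g. 15 for Android 15
    ram_gb: float      # physical RAM in GB
    chip: str = ""     # SoC name, only consulted on Apple hardware


def on_device_model(device: Device) -> Optional[str]:
    """Return the first-party model this device can plausibly run, or None."""
    if device.platform == "android":
        # Floor from the list above: Android 15+ with 8GB+ RAM.
        eligible = device.os_major >= 15 and device.ram_gb >= 8
        return "gemini-nano" if eligible else None

    if device.platform == "apple":
        chip = device.chip.strip().lower()
        # Assumption: A17 Pro is a floor, so any A-series number >= 17 passes.
        a_series = re.match(r"a(\d+)", chip)
        if a_series and int(a_series.group(1)) >= 17:
            return "apple-intelligence"
        # Any M-series chip (M1, M2, ...) passes.
        if re.match(r"m\d+", chip):
            return "apple-intelligence"
        return None

    return None
```

A `None` result is the signal to skip the on-device path entirely and go straight to the cloud call.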
Hungarian-language reality check
Both models perform worse on Hungarian than on English. In our evals, Gemini Nano gives a ~85% acceptable-output rate on Hungarian summarisation and Apple Intelligence ~80%. For comparison, Claude 3.7 Haiku is ~97%. For Hungarian-heavy features we keep a cloud fallback for now.
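One way to turn those eval numbers into a routing decision is a per-feature quality bar. The sketch below is illustrative only: the rates are the approximate figures quoted above, while the function names, the backend keys and the 0.90 bar in the usage note are hypothetical.

```python
from typing import Mapping, Sequence

# Approximate acceptable-output rates for Hungarian summarisation, as quoted above.
HU_SUMMARISATION_RATES = {
    "gemini-nano": 0.85,
    "apple-intelligence": 0.80,
    "cloud": 0.97,
}


def acceptable_rate(judgements: Sequence[bool]) -> float:
    """Share of eval outputs that a reviewer marked as acceptable."""
    if not judgements:
        raise ValueError("need at least one judged output")
    return sum(judgements) / len(judgements)


def needs_cloud(rates: Mapping[str, float], local_backend: str, min_rate: float) -> bool:
    """True when the local backend's measured rate is below the feature's bar.

    A backend with no measurement is treated as failing the bar.
    """
    return rates.get(local_backend, 0.0) < min_rate
```

With a hypothetical bar of 0.90, `needs_cloud(HU_SUMMARISATION_RATES, "gemini-nano", 0.90)` is true, which matches keeping the cloud fallback for Hungarian-heavy features.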
When we still call the cloud
- Any agentic flow with tool calls · on-device tool-use is fragile.
- Long-context tasks (more than 8k effective tokens) · on-device context windows are still small.
- Safety-critical outputs (medical, legal, financial advice) · routed to a policy-gated cloud call with audit logs.
- Multilingual features where non-English quality matters for conversion.
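The four rules above fit in one small routing function. This is a sketch under stated assumptions, not our production router: the `Request` fields, the route labels and the exact 8,000-token constant are hypothetical stand-ins for the criteria in the list.

```python
from dataclasses import dataclass

# Assumption: "8k effective tokens" from the list above, taken as 8,000.
MAX_LOCAL_TOKENS = 8_000


@dataclass(frozen=True)
class Request:
    prompt_tokens: int
    uses_tools: bool = False         # agentic flow with tool calls
    safety_critical: bool = False    # medical, legal, financial advice
    language: str = "en"
    quality_sensitive: bool = False  # non-English quality matters for conversion


def route(req: Request) -> str:
    """Pick a backend label for a request, checking the strictest rule first."""
    if req.safety_critical:
        return "cloud-policy-gated"  # audit-logged path
    if req.uses_tools:
        return "cloud"
    if req.prompt_tokens > MAX_LOCAL_TOKENS:
        return "cloud"
    if req.language != "en" and req.quality_sensitive:
        return "cloud"
    return "on-device"
```

Safety-critical requests are checked first so that they always reach the audited path, even when another rule would also send them to the cloud.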
Always design the UI for a cloud fallback from day one. The right on-device feature feels instant when the model is present and works anyway when it is not.
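In code, "works anyway" usually means the call site never knows which backend answered. A minimal sketch, assuming each backend is a plain callable that takes a prompt and returns text; `ModelUnavailable` and `generate_with_fallback` are hypothetical names, not part of either vendor's SDK.

```python
from typing import Callable, Optional, Tuple


class ModelUnavailable(Exception):
    """Raised by the (hypothetical) local backend when no model is present."""


def generate_with_fallback(
    prompt: str,
    local: Optional[Callable[[str], str]],
    cloud: Callable[[str], str],
) -> Tuple[str, str]:
    """Try the on-device backend first and fall back to the cloud.

    Returns (text, source) so the UI can decide whether to show a spinner.
    """
    if local is not None:
        try:
            return local(prompt), "on-device"
        except ModelUnavailable:
            pass  # model missing or not downloaded yet: use the cloud path
    return cloud(prompt), "cloud"
```

Passing `local=None` covers devices that fail the eligibility check, and the `source` value lets the UI skip the spinner only when the answer came from the device.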

By Dezso Mezo
Founder, DField Solutions
I've shipped production products from fintech to creator-tooling · for startups and enterprises, from Budapest to San Francisco.