
Local AI vs Cloud AI: What Australians Need to Know About On-Device Models in 2026

Priya Raman, June 18, 2026


For a couple of years now the AI conversation has been split into two camps: people who treat ChatGPT and Gemini and Copilot as a kind of utility you tap into over the internet, and people who get nervous every time they send a sensitive document to a server in the United States. In 2026, that split has finally produced an actual choice. The on-device AI models built into the new generation of phones and laptops — Apple Intelligence, Microsoft Copilot+, Google’s Gemini Nano, and a growing pile of capable open-weights models you can run yourself — are good enough that “do I have to use the cloud for this?” is, for the first time, a genuine question with a genuine answer.

Priya runs our AI desk and has been bouncing between cloud and local models for the better part of a year. The honest take from her side of the office is that for most everyday tasks, on-device AI is now good enough and the privacy tradeoffs swing decisively in its favour. The trickier work — anything that needs encyclopaedic knowledge, real-time information, or genuinely hard reasoning — still belongs in the cloud, at least for now. The rest of this piece is the practical version of that conclusion, written for Australians who care about where their data ends up.

What people actually mean by “local AI” in 2026

The phrase covers a few different things and it’s worth pulling them apart, because the privacy and capability picture is very different across them.

System-level on-device models. Apple Intelligence on iPhone 15 Pro and newer iPhones and on Apple Silicon Macs, Gemini Nano on Pixel 8 Pro and newer Pixels and a growing list of Android flagships, and Microsoft’s Copilot+ stack on Snapdragon X / Intel Lunar Lake / AMD Ryzen AI laptops. These are small language models (roughly 3–8 billion parameters) baked into the operating system and tuned to run on dedicated NPU silicon.
App-level local models. Tools like Ollama, LM Studio, and Jan let you download open-weights models — Meta’s Llama 3 family, Mistral’s smaller releases, Google’s Gemma, Microsoft’s Phi-4 — and run them locally on any reasonably modern Mac or PC with enough RAM. Setup is now genuinely friendly; we’d say it’s at the “weekend learning project” level of difficulty, not the “you need a PhD” level it was in 2023.
Hybrid cloud-local stacks. Apple’s “Private Cloud Compute” is the cleanest example — when your device decides a task is too big for the local model, it offloads it to Apple’s servers, but with cryptographic guarantees about what the server can and cannot retain. Microsoft is rolling out something similar for Copilot+ in 2026. These hybrids are deliberately designed to behave like local processing from a privacy point of view, even when work happens off-device.

The boundary between these matters. A genuine on-device query never leaves the device. A hybrid query might, but under strong constraints. A cloud query typically goes to a server overseas (often in the United States), is usually logged, and is subject to that company’s privacy policy and the laws of wherever the server lives.
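To make the on-device case concrete, here is a minimal Python sketch of a query sent to a locally running Ollama server, one of the app-level tools listed above. It assumes Ollama is installed and listening on its default address (http://localhost:11434) and that a model has already been downloaded; the model name `llama3` and the helper names `build_request` and `ask_local` are illustrative choices, not a prescribed setup.

```python
import json
import urllib.request

# Ollama's default local endpoint; the request goes to the loopback address only.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(prompt: str, model: str = "llama3") -> urllib.request.Request:
    """Build a POST request for Ollama's /api/generate endpoint."""
    # "stream": False asks for one complete JSON reply instead of a token stream.
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask_local(prompt: str, model: str = "llama3") -> str:
    """Send the prompt to the local server and return the generated text."""
    with urllib.request.urlopen(build_request(prompt, model), timeout=120) as resp:
        return json.loads(resp.read())["response"]
```

With the server running, `ask_local("Summarise this note in one sentence: ...")` returns the model's reply as a string. Because the URL points at `localhost`, the prompt is handled on the same machine rather than sent to a remote provider.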

What on-device models can actually do well in 2026

The capability story has moved a long way fast. As of mid-2026, a model in the 3–8 billion parameter range running on a 2025-or-newer laptop or phone can comfortably handle:

Drafting and rewriting emails, messages, social posts and short blog drafts
Summarising long documents, PDFs, transcripts and meeting notes
Pulling structured data out of receipts, invoices and forms
Answering reference questions about your own files (with the right retrieval-augmented generation, or RAG, plumbing, which the system models increasingly do for you)
Generating code snippets in mainstream languages
Live transcription, translation between major languages, and voice cleanup
On-device photo edits and “remove the bins from the background” magic
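The "RAG plumbing" in the list above is less exotic than it sounds: find the passages in your own files that best match the question, then hand only those to the model along with the question. The Python sketch below is a deliberately simplified illustration that scores passages by word overlap; real systems typically use embedding models instead, and every name and sample note here (`top_passages`, `build_prompt`, `notes`) is made up for the example.

```python
import math
import re
from collections import Counter


def tokens(text: str) -> Counter:
    """Lowercase word counts; a crude stand-in for a real embedding."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_passages(question: str, passages: list[str], k: int = 2) -> list[str]:
    """Return the k passages most similar to the question."""
    q = tokens(question)
    return sorted(passages, key=lambda p: cosine(q, tokens(p)), reverse=True)[:k]


def build_prompt(question: str, passages: list[str]) -> str:
    """Combine the retrieved passages and the question into one prompt."""
    context = "\n".join(f"- {p}" for p in top_passages(question, passages))
    return f"Answer using only these notes:\n{context}\n\nQuestion: {question}"


# Invented sample notes standing in for your own files.
notes = [
    "The lease on the Carlton office expires on 30 September.",
    "Quarterly BAS is due on 28 October.",
    "The printer toner was replaced in May.",
]
print(build_prompt("When does the office lease expire?", notes))
```

The resulting prompt would then be passed to whichever local model you run, so neither the files nor the question need to leave the device.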

Where they still fall down: anything that needs current information (a local model has no internet by default), anything that benefits from a long chain of careful reasoning, anything requiring deep specialist knowledge that the small model just didn’t get trained on, and high-stakes generation where you really do want the smartest possible model. For those, the cloud is still the right answer.

If you’ve recently bought a flagship phone or new laptop, you almost certainly already have a capable on-device model sitting there ready to use. We covered the broader on-device AI story in our piece on getting more out of your phone in general, where the same NPU silicon doing the AI work also has implications for battery and thermal behaviour.

The privacy picture, plainly

The reason most Australians should care about local AI isn’t technical — it’s that the cloud models, by default, see every prompt you send them and most have some form of retention and review policy. For the typical home user that’s a “probably fine” situation. For anyone handling health information, legal matters, client data, employee information or anything covered by the Privacy Act, “probably fine” is not the answer your compliance officer wants.

The Office of the Australian Information Commissioner (OAIC) has grown steadily firmer in its guidance on generative AI, particularly around the obligations of APP entities (organisations bound by the Australian Privacy Principles) to understand where their data is going and what the third-party AI provider does with it. The OAIC’s plain reading is that pasting a client’s personal information into a third-party chatbot is, in many cases, a disclosure to an overseas recipient, with all the obligations that implies.

For consumers, the eSafety Commissioner has highlighted the parallel risk: kids and teens chatting freely with cloud AI assistants are producing transcripts that exist on servers somewhere, are searchable, and have already shown up in court matters overseas. None of that means cloud AI is bad. It means it has consequences a lot of people don’t think about, and that on-device alternatives are a meaningful mitigation.

The practical heuristic Priya uses: if you would be uncomfortable with your prompt being read out at a tribunal hearing, run it locally or don’t run it at all.

What this changes for small businesses

This is the part of the conversation that has shifted fastest. A year ago, a small accounting firm or law practice wanting to use AI faced an awkward choice: cloud tools that were efficient but introduced data-handling questions, or no AI at all. In 2026 there’s a credible middle path.

The shape of it tends to look like this: a couple of reasonably specced laptops in the office (Apple M4 Macs or any of the new Copilot+ Windows machines), each running a local model via Ollama or the built-in system stack. Document drafting, contract summarisation, basic research, meeting transcription and reply-drafting all run locally. Only the harder tasks (research that needs the open web, complex multi-step reasoning) get sent to a cloud provider, and even then often through a paid tier with explicit no-training guarantees.
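One way to picture that split is as a simple routing rule. The Python sketch below is an illustration only: the task labels, the `Task` fields and the `route` function are hypothetical names invented here, and the rule encodes this article's heuristic rather than the behaviour of any actual product.

```python
from dataclasses import dataclass

# Task types treated here as comfortable for an on-device model (illustrative labels).
LOCAL_TASKS = {"draft", "summarise", "transcribe", "extract"}


@dataclass
class Task:
    kind: str                # e.g. "draft", "summarise", "web_research", "hard_reasoning"
    has_personal_info: bool  # client, health, legal or employee data
    needs_web: bool = False  # requires current information from the internet


def route(task: Task) -> str:
    """Return 'local', 'cloud' or 'review' for a task, preferring local by default."""
    if task.kind in LOCAL_TASKS and not task.needs_web:
        return "local"
    if task.has_personal_info:
        # Hard task plus sensitive data: pause and decide deliberately,
        # for example by redacting first or using a no-training paid tier.
        return "review"
    return "cloud"


print(route(Task("summarise", has_personal_info=True)))       # everyday task stays local
print(route(Task("web_research", False, needs_web=True)))     # needs the open web
print(route(Task("hard_reasoning", has_personal_info=True)))  # flagged for a human decision
```

The design choice worth noting is the middle branch: a hard task that also carries personal information is not silently sent to the cloud but flagged, which mirrors the "considered exception" posture described here.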

The compliance story for that setup is much easier to write. Data stays on the device, the device stays in the office, and the audit trail is a series of local files instead of a series of server-side transcripts. None of this is a magic shield — endpoint security still matters, backups still need to be encrypted, and the user still needs to behave sensibly — but it’s a far easier compliance posture than “we paste client matters into ChatGPT and trust their privacy policy”.

The hardware reality: what you need to run this stuff

The practical question most people land on is whether their existing hardware can do this. The rough lines as of mid-2026:

iPhone 15 Pro or newer, plus iPads and Macs with an M1 chip or later: Apple Intelligence runs locally on these for most features, with hybrid offload for the harder ones. On a Mac, 16GB RAM or more is the sensible floor if you also plan to run open-weights models.
On Android, the Pixel 8 Pro and newer Pixels and the Samsung Galaxy S24 and newer: Gemini Nano runs on these directly. Older or cheaper Android phones get cloud Gemini, not on-device.
Copilot+ PCs (Snapdragon X, Intel Lunar Lake, AMD Ryzen AI 300): These have the 40+ TOPS NPU required for Microsoft’s on-device features. Older Windows laptops will run smaller open-weights models in Ollama but won’t get the built-in Copilot+ experience.
Any modern Mac or PC with 16GB+ RAM and a halfway-decent GPU: You can run Llama 3 8B or Mistral 7B locally through Ollama at usable speeds. 32GB lets you step up to bigger and smarter models.
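A rough way to sanity-check those RAM figures is to estimate how much memory a model's weights occupy: parameter count times bits per weight, divided by eight, plus some headroom for the runtime and context. The Python sketch below assumes 4-bit quantisation, 20 per cent overhead and 6GB reserved for the operating system and other apps; all three numbers are illustrative assumptions, not measurements.

```python
def model_memory_gb(params_billion: float, bits_per_weight: int = 4,
                    overhead: float = 0.2) -> float:
    """Approximate memory needed to load a model, in gigabytes (1 GB = 1e9 bytes)."""
    weight_bytes = params_billion * 1e9 * bits_per_weight / 8
    return weight_bytes * (1 + overhead) / 1e9


def fits(params_billion: float, ram_gb: float, reserved_gb: float = 6.0) -> bool:
    """True if the model fits after reserving RAM for the OS and other apps."""
    return model_memory_gb(params_billion) <= ram_gb - reserved_gb


# Compare a few common model sizes against 16GB and 32GB machines.
for size in (3, 8, 14, 32, 70):
    print(f"{size}B: ~{model_memory_gb(size):.1f} GB, "
          f"fits in 16GB: {fits(size, 16)}, fits in 32GB: {fits(size, 32)}")
```

Under these assumptions an 8-billion-parameter model needs roughly 5GB, which is why a 16GB machine handles it, while models around 30 billion parameters only become practical at 32GB.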

The cheapest credible way to get started today is honestly a Mac mini M4 with 16GB RAM, which sits around $1,200 at JB Hi-Fi and runs local models very competently. For the Windows side, any of the Copilot+ laptops at the $1,500–$2,000 mark are excellent. The chip story behind all of this is worth understanding if you want to spend wisely — we went deep on that in our recent guide to how the productivity tablet category has evolved, which covers a fair amount of the new-NPU silicon picture.

Cloud AI still has its place

None of this is a “ditch the cloud” argument. The frontier cloud models (the latest Claude, the latest GPT, the latest Gemini) are simply smarter than anything you can run locally, and for a fair chunk of work that matters. Legal research over a complex matter, debugging a tricky codebase, writing a thoughtful long-form piece, working through an unfamiliar technical problem: cloud is still where we’d go for any of those.

The shift in 2026 is that you don’t have to use the cloud for everything. The everyday stuff — the seventeen-emails-a-day, the meeting summaries, the quick rewrites, the document scans — can all happen locally. That’s a meaningful reduction in data exposure with essentially no loss of capability for those tasks. Save the cloud for the hard stuff and run everything else on the laptop in front of you.

For context on the related security question — if your local device is doing all this work, you really do want to be confident it isn’t compromised — our piece on spotting whether your phone has been hacked covers the practical hygiene that matters even more once your device is processing sensitive prompts.

Final thoughts

Local AI has crossed from interesting to credible in 2026, and that matters for Australians in particular. The mix of stronger OAIC enforcement, growing consumer awareness of data sovereignty, and meaningful jumps in on-device capability has produced a real choice where one didn’t exist a year ago. For most everyday work, the on-device models built into your phone and laptop are now good enough — and they offer a privacy story the cloud genuinely can’t match. Save the cloud for the genuinely hard problems. Use it deliberately, knowing what you’re sending and to where. And if you’re in a small business or handling anything covered by the Privacy Act, treat the local-AI option as the default and the cloud as the considered exception. That’s the cleanest posture available right now, and it’s the one our team has quietly moved to over the past few months. Priya reckons we’re early. We don’t think we’re early enough.
