Essay

AI just became a single point of failure — the fix is local

The U.S. government just switched off the most powerful AI model on Earth, for everyone, overnight. Here’s why that’s the strongest argument yet for running AI you actually own.

By Jasvant Singh Dosanjh · June 13, 2026

The overnight shutdown

On June 12–13, 2026, the U.S. Commerce Department ordered Anthropic to suspend access to its two most capable models, Fable 5 and Mythos 5, for all foreign nationals, citing national security (reportedly tied to a jailbreaking method). Anthropic could not restrict access selectively; to comply, it had to disable the models for every customer worldwide. It was the first time the U.S. government had switched off a commercial AI model globally. People who had built on those models, and paid for them, woke up to a product that no longer existed, with refunds to follow.

Set aside whether the order was right. The mechanism is the lesson: the most powerful AI on the planet was revoked for everyone, with effectively no notice, by a party that was neither the buyer nor the vendor.

The pattern: three ways your AI dependency fails

I work in security and compliance, and this is a textbook concentration-risk event. If your product, workflow, or business runs on a remote frontier model, you carry three live risks:

Priced out. Flagship tiers now sit at $200/month (Google’s top plan, ChatGPT Pro), and the most capable features increasingly live behind enterprise contracts and rate limits. For most small businesses, the best AI is drifting out of reach on cost alone.
Locked out. Access already varies by geography and citizenship — whole regions and user categories can be in or out depending on policy or export rules.
Shut off. As Fable just showed, a single government letter can take a model offline for everyone, instantly.

Any one of these belongs in a risk register. A single dependency that carries all three is not a foundation for a compliance program, or for a company.

Meanwhile, capable AI quietly moved onto your device

While frontier access gets more concentrated and conditional, high-quality AI got small, cheap, and open enough to run on hardware you already own.

Open weights went credible. This month, JetBrains open-sourced Mellum2 — a 12B mixture-of-experts coding model — under the permissive Apache 2.0 license, free to self-host. It joins a deep bench: Qwen3, Google’s Gemma 3 (in sizes that run on a laptop, with long context and multimodal input), Meta’s Llama 3.3, and DeepSeek. For a large share of real tasks, “run it yourself” no longer means a meaningful quality drop.
The plumbing got trivial. Ollama, LM Studio, and llama.cpp, plus Apple’s MLX on Apple Silicon, turned self-hosting into a one-line install. With 4-bit quantization, which stores each weight in roughly half a byte, a modern laptop runs a genuinely capable model offline, with no API key and no per-token meter.
Apple made local-first mainstream — with a telling caveat. Apple Intelligence already runs on-device foundation models across roughly a billion devices, and those models are open to developers through its Foundation Models framework. Yet for its biggest swing — the rebuilt Siri unveiled at WWDC 2026 and shipping in iOS 27 this fall (opt-in beta) — Apple reached for a large custom cloud model to handle the hardest reasoning. That’s not a contradiction; it’s the thesis in miniature: local for the everyday, cloud for the heavy lifting.
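
To put rough numbers on the laptop claim, here is a back-of-envelope sketch of how much memory a model’s weights need at different precisions. The function name and the parameter counts are illustrative assumptions, not measurements of any particular model file. Real runtimes also need room for the KV cache, activations, and quantization metadata, so read these figures as lower bounds.

```python
def weight_memory_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate memory for the weights alone, in gigabytes (10^9 bytes)."""
    total_bytes = params_billions * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


# Illustrative parameter counts only (assumed for the example).
for params in (4, 12, 27):
    fp16 = weight_memory_gb(params, 16)
    q4 = weight_memory_gb(params, 4)
    print(f"{params}B params: {fp16:.1f} GB at 16-bit, {q4:.1f} GB at 4-bit")
```

By this estimate, a 12B-parameter model needs about 24 GB for its weights at 16-bit precision and about 6 GB at 4-bit. That is roughly the gap between needing a workstation and fitting in a laptop’s memory.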

Why this is a resilience story, not just a privacy one

The usual case for local AI is privacy and cost. Both are real. But the Fable shutdown reframes it as something a security or governance, risk, and compliance (GRC) leader cares about more: resilience and control. When the model runs on your device or your own infrastructure:

The data never leaves it. Privacy, data residency, and a large class of compliance obligations are addressed by architecture rather than by a vendor’s terms of service or a data processing agreement (DPA) you have to trust. A clinic can run a model over patient data; a defense supplier can use AI without controlled information leaving the building.
It can’t be revoked out from under you. No letter, price hike, region block, or deprecation notice takes it away mid-project. Your baseline capability is yours.
Vendor concentration risk drops. You’re no longer one policy change away from an outage you can’t fix.

This is the same principle I design my own products around: the assessment tools I build keep regulated data entirely on the user’s machine, precisely so nothing can be exfiltrated — or switched off.

What used to be a niche requirement for regulated industries is becoming a mainstream expectation.

What I’d actually do

This isn’t “abandon the cloud.” Frontier models are extraordinary, and for genuinely massive jobs you’ll still reach for one. The move is to stop treating a remote model as your only option:

Individuals and small teams: install Ollama, pull a model like Gemma 3 or Qwen3, and run your everyday tasks locally. Keep a cloud subscription for the heavy lifting, not the basics.
Regulated organizations: put “model-access revocation / region lock” in your risk register alongside your other third-party dependencies, and stand up a local/open-weight option for anything touching sensitive data. Treat it as business continuity.
Builders: design so your product degrades gracefully, with a local model as the floor and a cloud model as the ceiling. If your roadmap assumes a specific remote model will always be available to everyone, the Fable shutdown showed why that’s fragile.
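
One way to picture the “floor and ceiling” design for builders is a simple ordered fallback: try the cloud model first, and drop to the local one if the call fails for any reason. The sketch below is a minimal illustration, not a production pattern. `complete_with_fallback`, `cloud_model`, and `local_model` are hypothetical names, and the two model functions are stand-ins for whatever real clients you would wire in.

```python
from typing import Callable, Sequence

# A backend is any callable that takes a prompt and returns text,
# raising an exception when it is unavailable.
Backend = Callable[[str], str]


def complete_with_fallback(
    prompt: str, backends: Sequence[tuple[str, Backend]]
) -> tuple[str, str]:
    """Try each (name, backend) in order; return (name, answer) from the first success."""
    errors = []
    for name, backend in backends:
        try:
            return name, backend(prompt)
        except Exception as exc:  # revoked access, network failure, rate limit, etc.
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all backends failed: " + "; ".join(errors))


# Hypothetical stand-ins for real model clients.
def cloud_model(prompt: str) -> str:
    raise ConnectionError("access suspended")  # simulate a revoked remote model


def local_model(prompt: str) -> str:
    return f"[local] {prompt}"


name, answer = complete_with_fallback(
    "Summarize this policy.",
    [("cloud", cloud_model), ("local", local_model)],  # ceiling first, floor last
)
print(name, answer)
```

The ordering is the design choice: the cloud model is an upgrade you use when it is there, and the local model is the guarantee. Anything touching sensitive data could skip the cloud entry entirely by passing only the local backend.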

The bet

The era where the best AI meant the biggest monthly bill — or the right passport — is ending. What replaces it looks more like how computing is supposed to work: capable, private, resilient, and yours.

Local by default. Cloud for the heavy lifting. But own your baseline.

The companies that win the next phase won’t be whoever has the largest model. They’ll be whoever makes a good model effortless to run on infrastructure you control, with your data staying exactly where you put it.

— Jasvant Singh Dosanjh. I build local-first, privacy-respecting security & compliance software at Dosanjh Labs.