Vendor Lock-In Is Back. It Is Just AI-Shaped Now.

For about a decade, the conventional wisdom on vendor lock-in was that the industry had mostly solved it. Containers, Kubernetes, open APIs, multi-cloud architectures, infrastructure-as-code, and standardized data formats had collectively pushed lock-in down to a low background level. Switching costs were not zero, but they were bounded, planned for, and within the negotiating leverage of most enterprise buyers.

The conversation got quieter. The CIO’s vendor-lock-in nightmare faded. The procurement teams updated their templates. Lock-in became a checkbox: “we use standard interfaces, we are portable, we are fine.”

We are not fine. The 2026 AI stack has reintroduced vendor lock-in at depths the industry has not seen since the mainframe era, and most enterprise buyers are signing into it without recognizing it as lock-in at all.

Where Lock-In Hides Now

The lock-in is no longer concentrated in the obvious places. It is not in the database vendor, the cloud provider, or the orchestration platform. Those battles have been fought, and the standards that emerged from them are still doing their job. The lock-in has moved upward, into the parts of the stack that nobody yet has standards for.

Model selection. A foundation model’s output behavior is not portable. Two models from two providers, given the same prompt, produce different outputs. Sometimes the differences are small. Sometimes they are large enough that the downstream code – the parsing, the validation, the orchestration logic – breaks completely when you swap one for the other. Production systems get tuned, often invisibly, to the specific behaviors of the model in use. Switching models means re-tuning. Re-tuning means re-validating. Re-validating against the entire production surface is a multi-quarter project that nobody planned for.

Tool surfaces. Every model provider has its own conventions for tool calling. Their SDKs assume their conventions. Their documentation assumes their conventions. The protocols that aspire to be cross-provider – MCP among them – are not yet stable enough to insulate you from these conventions. A system built natively against one provider’s tool calling has hundreds of small incompatibilities baked into it that only become visible when you try to switch.
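One way to keep those conventions from spreading through a codebase is to define each tool once, in a shape you own, and translate at the edge. The sketch below is illustrative only: `ToolSpec`, `to_nested_style`, and `to_flat_style` are hypothetical names, and the two output shapes stand in for the kind of differences providers have, not for any specific vendor's current API.

```python
from dataclasses import dataclass
from typing import Any, Dict


@dataclass(frozen=True)
class ToolSpec:
    """Provider-neutral tool definition owned by the application (hypothetical)."""
    name: str
    description: str
    parameters: Dict[str, Any]  # JSON Schema describing the tool's arguments


def to_nested_style(tool: ToolSpec) -> Dict[str, Any]:
    """Translate to an assumed wire shape that nests the definition under a key."""
    return {
        "type": "function",
        "function": {
            "name": tool.name,
            "description": tool.description,
            "parameters": tool.parameters,
        },
    }


def to_flat_style(tool: ToolSpec) -> Dict[str, Any]:
    """Translate to an assumed wire shape that is flat and names the schema differently."""
    return {
        "name": tool.name,
        "description": tool.description,
        "input_schema": tool.parameters,
    }


lookup_order = ToolSpec(
    name="lookup_order",
    description="Fetch an order by its id.",
    parameters={
        "type": "object",
        "properties": {"order_id": {"type": "string"}},
        "required": ["order_id"],
    },
)
```

The translation functions are small, but they are the only place a provider's convention appears. The incompatibilities do not disappear; they are collected where they can be counted.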

Context management. Each provider has different rules about caching, context windows, system-prompt structure, and message-format expectations. Production systems learn to play those rules. The performance characteristics – cost per million tokens, latency under load, cache hit rates – depend on those rules. A system optimized for one provider’s context model performs differently under another, sometimes by factors of three or more in cost. The switch is technically possible. It is economically a different product.

Evaluation pipelines. Production AI systems live or die by their evaluation harnesses. The harnesses are built around the models they are evaluating. Switching models breaks the harnesses. Rebuilding the harnesses costs as much as the original build, and the new harness’s results are not comparable to the old one’s results. You lose the longitudinal view of system health. Most teams cannot afford that loss.

Knowledge bases and retrieval indices. Embedding models are not interchangeable. An index built with provider A’s embeddings does not serve provider B’s queries. Re-embedding a multi-million-document corpus is expensive in compute, time, and validation. Most large indices, once built against a specific embedding model, are de facto locked to that model and the provider that ships it.
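The mismatch is easy to make explicit in code. The following is a minimal sketch, not a real vector store: `VectorIndex` and the model names are hypothetical, and the dot-product search is a toy. The point is only that an index can record which embedding model built it and refuse vectors from any other.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class VectorIndex:
    """Toy in-memory index that remembers which embedding model produced it."""
    embedding_model: str
    dimensions: int
    vectors: Dict[str, List[float]] = field(default_factory=dict)

    def _check(self, vector: List[float], embedding_model: str) -> None:
        # Vectors from different embedding models live in unrelated spaces,
        # so comparing them would return results that look valid but are not.
        if embedding_model != self.embedding_model:
            raise ValueError(
                f"index built with {self.embedding_model!r}, got {embedding_model!r}"
            )
        if len(vector) != self.dimensions:
            raise ValueError(f"expected {self.dimensions} dimensions, got {len(vector)}")

    def add(self, doc_id: str, vector: List[float], embedding_model: str) -> None:
        self._check(vector, embedding_model)
        self.vectors[doc_id] = vector

    def search(self, query_vector: List[float], embedding_model: str) -> str:
        """Return the id of the stored vector with the highest dot product."""
        self._check(query_vector, embedding_model)
        if not self.vectors:
            raise ValueError("index is empty")
        return max(
            self.vectors,
            key=lambda d: sum(a * b for a, b in zip(self.vectors[d], query_vector)),
        )


index = VectorIndex(embedding_model="model-a", dimensions=2)
index.add("doc-1", [1.0, 0.0], embedding_model="model-a")
index.add("doc-2", [0.0, 1.0], embedding_model="model-a")
```

A query embedded with `"model-a"` works; a query embedded with anything else fails loudly instead of silently returning noise. Switching embedding models means rebuilding `vectors` from the source documents, which is where the re-embedding cost comes from.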

Add these together and the AI stack has more vendor-specific surface area than most enterprise stacks of the last fifteen years.

Why It Does Not Feel Like Lock-In

The lock-in does not feel like lock-in because the contracts do not look like lock-in. Most foundation model providers sell on usage-based pricing, with no minimum commitments, no multi-year terms, no penalty for switching. By the procurement team’s standard checklist, the contract is portable. The procurement team is correct about the contract. They are wrong about the system the contract supports.

The procurement-level analysis assumes that “switching” means “stopping payments to vendor A and starting payments to vendor B.” In the AI context, switching also means re-tuning prompts against the new model’s quirks, re-validating evaluation harnesses, re-building retrieval indices, re-instrumenting cost dashboards, re-training the team on the new SDK, re-pricing the workload at the new cost-per-token structure, and re-launching the entire user experience under whatever subtle behavior changes the new model introduces.

The cost of those activities is not on any contract. It is on the engineering organization’s quarterly budget. Procurement does not see it. The system architect does. The conversation between them is usually not happening at all, because the procurement team’s checklist says lock-in is solved.

How Bad Is It

In our work with mid-to-large engineering orgs, we have started doing a “switching cost audit” as part of any AI architecture review. The audit asks: if we needed to switch our primary foundation model provider in 30 days, what would it cost?

Typical answers, by org size and AI maturity:

A team running a single agent against one model, lightly tuned: two to four weeks of engineering work, and they would lose roughly a month of operational stability while the new system settled.
A team running multiple agents with custom prompts, evaluation harnesses, and a few tool integrations: two to four months of engineering work, plus a substantial dollar cost in re-validation.
A team with a production retrieval-augmented system, multiple specialized agents, and a mature evaluation pipeline: six to twelve months of work, with no realistic path to a clean 30-day switch.
A team with deeply integrated AI workflows across multiple business processes, custom fine-tuned models, and a multi-million-document retrieval index: practically locked. Switching is technically possible but would require a re-platforming project of the kind most orgs only attempt once a decade.

None of these teams thought they were locked in when they started. All of them are. The lock-in accrued silently, decision by decision, none of which felt load-bearing at the time.

What Mitigation Actually Looks Like

There is no clean abstraction layer that eliminates AI vendor lock-in. The systems are too new and the standards too immature. What exists instead is a set of architectural disciplines that bound the lock-in to a level where switching is uncomfortable but possible.

Treat the model as a tier, not a constant. Architect the system so the model is one tier among several, with a defined contract to the layers above and below it. The contract is yours. The model is the implementation. Switching the implementation is a real event, but it is a bounded one.
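As a rough illustration of what "the contract is yours" can mean in code, here is a minimal sketch. Every name in it (`Completion`, `ModelTier`, `EchoModel`, `summarize`) is hypothetical, and `EchoModel` is a stand-in rather than a real provider adapter.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class Completion:
    """Provider-neutral result; the application only ever sees this type."""
    text: str
    input_tokens: int
    output_tokens: int


class ModelTier(Protocol):
    """The contract the application owns (hypothetical)."""

    def complete(self, system: str, user: str) -> Completion: ...


class EchoModel:
    """Stand-in implementation. A real adapter would wrap one vendor's SDK
    and translate its request and response shapes into Completion."""

    def complete(self, system: str, user: str) -> Completion:
        text = f"[echo] {user}"
        # Word counts stand in for real token counts in this sketch.
        return Completion(
            text=text,
            input_tokens=len(system.split()) + len(user.split()),
            output_tokens=len(text.split()),
        )


def summarize(model: ModelTier, document: str) -> str:
    """Application code depends on ModelTier, never on a vendor SDK."""
    result = model.complete(system="Summarize in one sentence.", user=document)
    return result.text.strip()
```

Swapping providers then means writing a second class that satisfies `ModelTier`. The prompt re-tuning and re-validation still have to happen, but the blast radius of the code change is one adapter.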

Maintain shadow infrastructure for a second provider. Run a small portion of production traffic, or a representative test set, against a second model provider on a continuous basis. The cost is non-trivial. The benefit is that you always know what switching would cost, because you are already partially switched. This is the discipline most teams find too expensive to maintain, and the discipline most likely to save them when they need it.
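A minimal sketch of the traffic-splitting part, assuming each provider is wrapped as a plain function from prompt to text. The names `in_shadow_sample` and `handle` are hypothetical, and exact string equality is used as a placeholder for whatever comparison a real system would need.

```python
import hashlib
from typing import Callable, List

ModelFn = Callable[[str], str]


def in_shadow_sample(request_id: str, percent: int) -> bool:
    """Deterministically select roughly `percent`% of requests by hashing the id."""
    digest = hashlib.sha256(request_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100
    return bucket < percent


def handle(
    request_id: str,
    prompt: str,
    primary: ModelFn,
    shadow: ModelFn,
    percent: int,
    log: List[dict],
) -> str:
    """Serve from the primary model; mirror a sample to the shadow model."""
    answer = primary(prompt)
    if in_shadow_sample(request_id, percent):
        # In production this call would run off the request path (for example
        # on a queue); it is inline here only to keep the sketch short.
        try:
            shadow_answer = shadow(prompt)
            log.append({"id": request_id, "match": shadow_answer == answer, "error": None})
        except Exception as exc:  # a shadow failure must never affect the user
            log.append({"id": request_id, "match": False, "error": repr(exc)})
    return answer
```

Hashing the request id, rather than drawing a random number, keeps the sample stable across retries. The accumulated `log` is the running estimate of how far the second provider has drifted from the first.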

Own your evaluation harness, not the vendor’s. A vendor-provided evaluation pipeline is a lock-in vector. Build your own. Run it against multiple models continuously. Your evaluation results, in your harness, against your test set, are your portable asset.
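A harness you own can start very small. The sketch below assumes each model is wrapped as a function from prompt to text; `Case`, `run_suite`, and the two stub models are hypothetical and exist only to show the shape.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Case:
    """One test case: a prompt plus a check that does not depend on the model."""
    prompt: str
    check: Callable[[str], bool]


def run_suite(
    models: Dict[str, Callable[[str], str]], cases: List[Case]
) -> Dict[str, float]:
    """Run every case against every model and return the pass rate per model."""
    if not cases:
        raise ValueError("the test set is empty")
    scores: Dict[str, float] = {}
    for name, model in models.items():
        passed = sum(1 for case in cases if case.check(model(case.prompt)))
        scores[name] = passed / len(cases)
    return scores


# Stub models standing in for two wrapped providers.
ANSWERS = {"2+2": "4", "capital of France": "Paris"}


def model_a(prompt: str) -> str:
    return ANSWERS.get(prompt, "")


def model_b(prompt: str) -> str:
    return "4"


CASES = [
    Case("2+2", lambda out: "4" in out),
    Case("capital of France", lambda out: "Paris" in out),
]

scores = run_suite({"model_a": model_a, "model_b": model_b}, CASES)
```

Because the cases and checks belong to you and make no reference to a provider, the same suite produces comparable numbers before and after a switch, which is what preserves the longitudinal view.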

Embed conservatively. When choosing an embedding model for retrieval, default to models that are widely supported across providers. The performance tradeoffs are usually small; the portability benefit is large. Locking your retrieval index to a single provider’s proprietary embeddings is an expensive decision dressed up as a performance optimization.

Negotiate against the real switching cost, not the contract. When you renew or expand with a foundation model provider, the negotiation should reflect what it would actually cost you to switch. If switching would take you six months and a million dollars, the vendor knows that. The pricing they offer reflects it. The buyer who pretends switching is free is negotiating with an empty hand.

The Honest Statement

Vendor lock-in did not go away. It moved. The standards that solved lock-in at the infrastructure layer have not yet emerged at the AI layer, and may not for several more years. In the meantime, the orgs that pretend lock-in is solved are accruing switching costs they will eventually have to pay: when a vendor relationship sours, when a pricing model changes, or when a better model appears that they cannot adopt.

Acknowledge the lock-in. Architect against it. Negotiate from honesty about it. The procurement-level “portability checkmark” is a comforting fiction. The systems are not portable yet. Plan accordingly.

posted by admin

Jun 29, 2026