AI vendor lock-in: model APIs, embeddings, vector stores and eval data
When people talk about AI vendor lock-in, they usually mean the model API. If you switch from one provider to another, you change the model. That is the obvious layer.
The deeper problem is that lock-in happens at every layer of an AI system: the embedding provider, the vector database, the prompt formats, the evaluation data, the log schemas, and the observability setup. Each layer adds switching cost, and the sum of those costs can make leaving impractical even when the model provider is no longer the best option.
Quick answer
Model API lock-in is real but manageable. The harder lock-in comes from embeddings (which tie you to a provider’s vector space), evaluation data (which is specific to one model’s behaviour), and observability schemas (which do not export cleanly). Mitigate by using abstraction layers, portable data formats, and eval frameworks that work across providers.
The lock-in layers
Model API
The most visible layer. Each provider’s API has different endpoint formats, parameter names, streaming implementations, and error structures. Switching means rewriting integration code.
How to mitigate: Use a gateway like LiteLLM or OpenRouter that provides a unified API across providers. Test against at least two providers from the start, even if you only use one in production.
Embeddings
Embeddings from different providers sit in different vector spaces. An embedding generated by OpenAI’s text-embedding-3-small cannot be compared directly with one from Google’s text-embedding-004 or a local model like gte-large. If you have already embedded your document corpus with one provider, switching means re-embedding everything.
How to mitigate: Use a local or open-source embedding model from the start so you can run it yourself. If you must use a provider embedding, design for re-embedding — store the original text alongside the embedding vector.
Vector database
Different vector databases support different index types, query syntaxes, and metadata filters. Migrating from Pinecone to Weaviate, or from Chroma to PostgreSQL+pgvector, requires schema redesign and data migration.
How to mitigate: Use a standard index format (HNSW is widely supported) and a metadata schema that maps easily between providers. Avoid vendor-specific query features unless the performance gain is clearly worth the switching cost.
Prompt formats and system instructions
Prompts written for one provider may not work the same way with another. System prompt roles, message formatting, tool-calling schemas, and structured output specifications vary.
How to mitigate: Use a prompt abstraction format like the OpenAI chat-completion standard and test key prompts with alternative providers during development. Structure prompts in a way that separates content from formatting.
Evaluation data
Eval datasets are often built around one model’s behaviour. Your golden dataset may include examples that a different model answers differently. If you cannot regenerate or re-label your eval data for a new model, you cannot reliably compare them.
How to mitigate: Build eval data around tasks and expected outcomes, not specific model behaviour. Include clear rubrics that a human evaluator or a different model-as-judge could apply consistently.
Observability and logging
Log schemas, trace formats, and monitoring dashboards are often built for one provider’s response structure. Exporting history, running historical evals, or replaying past requests after switching providers may require schema transformation.
How to mitigate: Store raw prompts and outputs in a portable format (JSON with consistent fields). Use observability tools that support multiple providers or can ingest from a standard event format.
Eval and regression CI
CI pipelines for AI often embed assumptions about the model provider. Switching means updating test runners, assertion formats, and pass/fail criteria.
How to mitigate: Keep the evaluation layer provider-agnostic. Use evaluation frameworks that accept a standardised input/output format rather than provider-specific API wrappers.
The switching cost you should estimate
Before committing to any provider, estimate what it would cost to leave:
- Time to re-integrate. How many person-days to switch model API clients?
- Time to re-embed. How long to regenerate embeddings for the vector corpus?
- Time to re-evaluate. How long to re-run eval suites and validate output quality?
- Data that does not export. Can logs, evaluation results, and configuration be exported? In what format?
- Irrecoverable loss. Is there any data or configuration that cannot be moved at all?
If the total switching cost exceeds the value the provider delivers over its next-best alternative, you are locked in.
What teams get wrong
- assuming lock-in is only about the model API;
- building eval data that works only with one model’s behaviour;
- embedding the document corpus with a proprietary model and ignoring the re-embedding cost;
- storing logs and traces in a provider-specific schema without a portable export;
- waiting until a crisis — deprecation, pricing change, incident — to think about switching.
Practical decision check
- Can you switch model APIs with less than a week of engineering work?
- Are your embeddings portable (or can you re-embed without downtime)?
- Can your eval data be used to evaluate a different model?
- Can you export all logs, traces and configuration?
- Have you estimated the total switching cost for each layer?
If you cannot answer yes to at least three, your lock-in risk is higher than you think.
Methodology and sources
Check date: 2026-05-24
What was checked: Provider API documentation, data export capabilities, gateway/router documentation, embedding model compatibility, vector database migration guides, and evaluation framework portability.
What the sources were used for: Identifying the lock-in layers beyond model APIs and documenting practical mitigation strategies.
Assumptions and limits: Switching cost is workload-specific. The mitigation strategies described are general guidance and may not apply in all architectures.
Change log
- 2026-05-24: first draft built from the llm-editor-approved brief, with a multi-layer analysis of AI vendor lock-in.
Source list
- LiteLLM documentation — https://docs.litellm.ai/
- OpenRouter documentation — https://openrouter.ai/docs
- OpenAI embedding docs — https://platform.openai.com/docs/guides/embeddings
- Hugging Face MTEB leaderboard (embedding model comparisons) — https://huggingface.co/spaces/mteb/leaderboard
- Pinecone to Weaviate migration guides — vendor-specific
- Promptfoo (multi-provider eval framework) — https://www.promptfoo.dev/