theLLMs

Last checked: 2026-05-24

Scope: Global. Provider policies checked on 2026-05-24; terms vary by plan, region and contract.

AI draft model: deepseek-v4-flash

AI review model: llm-editor (deepseek-v4-pro)

Provider data retention policies: what API users should compare

When you send a prompt to an AI API, you are sending data to someone else’s servers. What happens to that data after the response comes back depends on the provider, the plan, the region, and sometimes the specific endpoint.

The differences matter. Some providers use API inputs for training. Some retain logs for 30 days. Some let you delete them immediately. Some route data through specific jurisdictions. These are not small details — they are fundamental to whether an API is safe for your use case.

Quick answer

Compare providers on five dimensions: training data use, log retention period, abuse monitoring scope, region controls, and deletion capability. If a provider is unclear on any of these, that is a risk. If your data is sensitive or regulated, the answers to all five questions should be documented before you send a production request.

What to compare

Training data use

The most important question: does the provider train on API inputs and outputs?

Policies fall into three categories:

  • Opt-in. Your data is not used for training unless you explicitly consent. This is a safe default for sensitive data.
  • Opt-out. Your data may be used for training unless you opt out. The opt-out is usually in the API settings or account console.
  • Not for training. The provider commits not to use API data for model training at all, regardless of settings.

The distinction between “we may use data to improve our services” and “we may use data to train models” is important. Some providers separate the two; others do not.
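The three categories can be written down as a small decision function, which makes the default behaviour of each one explicit. This is a minimal sketch: the names TrainingPolicy and may_be_used_for_training are hypothetical and do not correspond to any provider's API or settings.

```python
from enum import Enum


class TrainingPolicy(Enum):
    # Hypothetical labels for the three categories described above.
    CONSENT_REQUIRED = "consent_required"          # no training unless you explicitly consent
    TRAINS_UNLESS_OPTED_OUT = "unless_opted_out"   # training is possible until you opt out
    NEVER = "never"                                # provider commits not to train on API data


def may_be_used_for_training(policy, consented=False, opted_out=False):
    """Return True if API data could be used for training under this policy."""
    if policy is TrainingPolicy.NEVER:
        return False
    if policy is TrainingPolicy.CONSENT_REQUIRED:
        return consented
    # TRAINS_UNLESS_OPTED_OUT: data is exposed by default until you act.
    return not opted_out


# With no action taken, only the "unless opted out" category exposes your data.
for policy in TrainingPolicy:
    print(policy.name, may_be_used_for_training(policy))
```

The point of the sketch is the default case: if nobody on your team touches the settings, the second category is the only one where your data may be used for training.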

Log retention

How long does the provider keep prompts, outputs, and request metadata?

Typical ranges:

  • 30 days — common for default consumer and developer plans.
  • Zero retention — available on some enterprise plans or with specific data-processing agreements.
  • Indefinite — your data is stored until you delete it or your account is closed.

Short retention is generally better for privacy. But short retention also means you cannot retrieve logs for debugging after the period expires, which creates a tension between privacy and operability.
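The privacy and operability tension is easy to see as date arithmetic. The sketch below assumes a fixed retention window counted in whole days from the request time; real providers may count differently, and the function names are illustrative only.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def log_expiry(request_time: datetime, retention_days: Optional[int]) -> Optional[datetime]:
    """Return when the provider's copy of a request is due for deletion.

    retention_days=None models indefinite retention; 0 models zero retention.
    """
    if retention_days is None:
        return None
    return request_time + timedelta(days=retention_days)


def still_retrievable(request_time: datetime, retention_days: Optional[int], now: datetime) -> bool:
    """Return True if the provider-side log could still exist at time `now`."""
    expiry = log_expiry(request_time, retention_days)
    return expiry is None or now < expiry


sent = datetime(2026, 5, 1, tzinfo=timezone.utc)
incident_review = datetime(2026, 6, 5, tzinfo=timezone.utc)  # 35 days later

print(still_retrievable(sent, 30, incident_review))    # 30-day window has passed
print(still_retrievable(sent, None, incident_review))  # indefinite retention
print(still_retrievable(sent, 0, incident_review))     # zero retention
```

If your incident reviews routinely happen later than the retention window, you will need your own logging, which in turn makes you responsible for retention and deletion of that copy.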

Abuse monitoring

Providers monitor API traffic for abuse. The question is whether that monitoring involves human review of your prompts and outputs.

  • Automated monitoring only. Behavioural patterns and metadata are checked. Content is reviewed programmatically.
  • Human review. Prompts flagged by automated systems may be reviewed by humans. This is common for content-safety systems.
  • No monitoring. Rare and usually only available on enterprise plans with contractual guarantees.

Automated monitoring is generally low risk because no person sees the content. Human review means someone at the provider may read your prompts, which matters if you are sending proprietary data.

Region controls

Where is your data processed and stored?

  • Single region. All data stays in a specified geographic region (e.g., US, EU, UK).
  • Default routing. Data may be processed in any region where the provider has infrastructure.
  • No guarantee. The provider does not specify where data is processed.

For organisations subject to GDPR, UK GDPR, or other data-localisation requirements, the region control question is often non-negotiable.
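One way to make the region requirement enforceable is a guard that runs before any request is sent. This is a sketch under assumptions: the region codes, the ALLOWED_REGIONS set and the function name are hypothetical, and the configured region would come from your own deployment settings, not from a provider call.

```python
from typing import Optional, Set

# Hypothetical policy: regions your organisation has approved for processing.
ALLOWED_REGIONS: Set[str] = {"eu", "uk"}


def region_is_acceptable(configured_region: Optional[str],
                         allowed: Set[str] = ALLOWED_REGIONS) -> bool:
    """Return True only if processing is pinned to an approved region.

    None models "default routing" or "no guarantee": the provider has not
    committed to a region, so the check fails closed.
    """
    if configured_region is None:
        return False
    return configured_region.strip().lower() in allowed


print(region_is_acceptable("EU"))   # pinned to an approved region
print(region_is_acceptable("us"))   # pinned, but not approved
print(region_is_acceptable(None))   # no region commitment
```

Failing closed on a missing region is a deliberate choice here: an unspecified region is treated the same as a disallowed one.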

Deletion and export

Can you delete your data on demand? Can you export it?

  • API deletion. Can you delete prompts, logs and outputs via API or console?
  • Bulk deletion. Can you delete all data at once, or only item by item?
  • Export format. Can you get your data out in a usable format?
  • Deletion confirmation. Does the provider confirm deletion, or do you have to trust the policy?

What the policies typically look like by provider

This is a general characterisation, not legal advice. Always check the current terms for your specific plan.

OpenAI. Default API traffic is not used for training. Logs retained for 30 days by default. Zero-data-retention option available on API with verified business use. Region controls available on enterprise plans.

Anthropic. API data is not used for training by default. Logs retained for 30 days on default plan. Extended retention and zero-retention options available. Region controls through enterprise agreements.

Google (Gemini API). API data not used for training by default. Log retention period varies by plan. Region controls through Google Cloud project configuration.

Mistral. API data not used for training by default. Retention details vary. Enterprise options for data processing agreements.

All of these policies change with plan level and contract terms. The published policy may not match what is in your signed agreement.

What teams get wrong

  1. assuming the published policy matches the actual contract terms;
  2. not checking whether the API plan they are on is the same as the plan described in the privacy policy;
  3. ignoring abuse monitoring because it seems unlikely to affect them — human review is more common than most developers realise;
  4. confusing “data not used for training” with “data not stored at all”;
  5. forgetting that prompts containing customer data give the provider access to that data, regardless of whether it is used for training.
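Point 5 is the one teams can partly address in code, by stripping obvious identifiers before a prompt leaves their systems. The sketch below is illustrative only: the two regular expressions are simple assumptions that catch common email and phone formats, and they will miss names, addresses, account numbers and many other identifiers.

```python
import re

# Simplified patterns, assumed for illustration; not a complete PII detector.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def redact(prompt: str) -> str:
    """Replace email addresses and phone-like digit runs with placeholders."""
    prompt = EMAIL.sub("[EMAIL]", prompt)
    return PHONE.sub("[PHONE]", prompt)


raw = "Contact jane.doe@example.com or +44 20 7946 0958 about the order."
print(redact(raw))
```

Redaction reduces what the provider can see, but it does not replace a documented retention and review policy, because whatever remains in the prompt is still stored and handled under the provider's terms.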

Practical decision check

  • Does the provider use your API data for training?
  • How long are logs retained?
  • Is there human review of content for abuse monitoring?
  • Can you restrict data processing to a specific region?
  • Can you delete and export your data on demand?

If you cannot get clear, documented answers to all five, the provider’s data governance is not transparent enough for a serious evaluation.
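The check can be kept as a small record per provider, so that gaps are visible instead of assumed. This is a minimal sketch: the ProviderAnswers class and its field names are hypothetical, and each value is meant to hold a pointer to where the answer is documented (for example a contract clause), not the answer itself.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class ProviderAnswers:
    # One field per question in the decision check; None means "not documented".
    training_use: Optional[str] = None
    log_retention: Optional[str] = None
    human_review: Optional[str] = None
    region_controls: Optional[str] = None
    deletion_and_export: Optional[str] = None


def undocumented(answers: ProviderAnswers) -> List[str]:
    """Return the questions that still lack a documented answer."""
    return [
        f.name
        for f in fields(answers)
        if not (getattr(answers, f.name) or "").strip()
    ]


# Example values are placeholders, not real contract references.
example = ProviderAnswers(
    training_use="DPA, training clause",
    log_retention="Order form, retention schedule",
)
print(undocumented(example))
```

An empty list is the bar for a serious evaluation; anything else is a list of questions to send to the provider before production traffic starts.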

Methodology and sources

Check date: 2026-05-24

What was checked: Public privacy policies, data processing agreements, API documentation and enterprise data sheets from OpenAI, Anthropic, Google and Mistral.

What the sources were used for: Building the five-dimension comparison framework and characterising typical policy patterns by provider.

Assumptions and limits: Provider policies change. Enterprise contracts differ from published terms. This is a comparison framework, not a legal analysis of any specific provider’s policy.

Change log

  • 2026-05-24: first draft built from the llm-editor-approved brief, with a neutral comparison checklist for AI API data governance.

Source list