AI SLAs and status pages: what reliability evidence vendors publish
Every major AI provider has a status page. Most claim high availability. Some offer SLAs with credits if uptime drops below a threshold.
The gap between what these pages promise and what your application actually experiences can be significant. An SLA that guarantees 99.9% API uptime does not guarantee model quality, consistent latency, or that your specific endpoint is available.
Quick answer
Provider SLAs typically cover API availability, not model quality, latency, or throughput. Status pages show what the provider chooses to report. Use both as starting points, then build your own reliability monitoring that measures what matters to your application: successful responses within your latency budget, not just HTTP 200s.
What SLAs actually cover
A typical AI API SLA covers:
API availability. Whether the API endpoint returns a valid response. Commonly 99.9% monthly uptime for standard tiers, 99.95% for enterprise.
Credits for downtime. If availability drops below the threshold, you get service credits. These are not cash refunds and usually only apply to future service usage.
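To put the uptime thresholds above in concrete terms, the following minimal sketch converts an uptime percentage into a monthly downtime allowance. It assumes a 30-day measurement window; the function name is illustrative, and a real SLA defines its own measurement period and its own rules for what counts as downtime.

```python
def downtime_budget_minutes(uptime_percent: float, days: int = 30) -> float:
    """Minutes of downtime that an uptime target permits over `days` days.

    Assumption: the window is a flat `days` x 24 hours. Real SLAs may
    measure per calendar month and exclude scheduled maintenance.
    """
    total_minutes = days * 24 * 60
    return total_minutes * (1 - uptime_percent / 100)


for target in (99.9, 99.95):
    minutes = downtime_budget_minutes(target)
    print(f"{target}% over 30 days allows about {minutes:.1f} minutes of downtime")
```

Under these assumptions, 99.9% leaves roughly 43 minutes of permitted downtime in a 30-day month, and 99.95% roughly 22 minutes.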
What SLAs do not cover:
- Model quality or accuracy. The API can be fully available while returning incorrect or harmful outputs.
- Latency. An availability guarantee says nothing about speed. A response that takes 30 seconds still counts as available.
- Rate limits. Requests rejected because you exceeded your rate limit are not covered, even when an application error caused the excess.
- Model changes. If the underlying model changes without notice and quality degrades, that is not a reliability issue under most SLAs.
- Third-party dependencies. If your application depends on a vector database or observability provider, their issues are not covered by the model provider’s SLA.
How to read status pages critically
Provider status pages differ in useful ways:
Historical data. Does the page show incident history for the past 30, 60 or 90 days? A page that only shows current status hides the pattern.
Incident detail. Are incidents described specifically (“increased latency on endpoint X between 14:00 and 15:30 UTC”) or vaguely (“intermittent errors affecting some users”)?
Granularity. Does the page report per-endpoint, per-model, per-region, or just a global status? A green global status can hide a broken endpoint.
Post-mortems. Are incident reports published with root cause analysis? Transparent providers share post-mortems; less transparent ones mark incidents resolved without explanation.
Third-party verification. Does the provider publish independent reliability data or rely solely on self-reported metrics?
What most providers publish
OpenAI. Real-time status page with historical incident data. Post-mortems published for significant incidents. SLA available on enterprise/team plans.
Anthropic. Real-time status page with incident history. Post-mortems for significant incidents. SLA through enterprise agreements.
Google (Gemini API). Status dashboard integrated with Google Cloud. Per-service and per-region granularity. Post-mortems published for major incidents.
Mistral. Status page with incident updates. SLA available through enterprise agreements.
All four providers rely on self-reported metrics. As of the check date, none publishes independent third-party reliability audits.
What to measure yourself
Provider data is useful but not sufficient. Build your own reliability monitoring:
- Successful response rate. Measure the percentage of requests that return a successful response within your latency budget. A response that takes 15 seconds is not useful even if it returns HTTP 200.
- Error rate by type. Track rate limits, server errors, timeouts, and content-filter blocks separately. A spike in content-filter blocks may indicate a model update, not a reliability issue.
- Latency percentiles. Track p50, p95 and p99 latency for each endpoint and model. Degradation at the tail matters more than the average.
- Quality degradation signals. Track user feedback, eval scores, and manual review flags alongside API metrics. The most common reliability failure in AI systems is not downtime but quiet quality degradation after an invisible model update.
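The API-side measurements in the list above can be sketched from your own request logs. This is a minimal illustration, not a monitoring product: the `RequestRecord` schema, the error labels, and the `summarize` function are all hypothetical, and quality signals such as eval scores would need a separate pipeline.

```python
from collections import Counter
from dataclasses import dataclass
from statistics import quantiles
from typing import Optional


@dataclass
class RequestRecord:
    """One logged API call (hypothetical schema for illustration)."""
    latency_s: float
    # None means a successful response; otherwise a label such as
    # "rate_limit", "server_error", "timeout", or "content_filter".
    error_type: Optional[str] = None


def summarize(records: list[RequestRecord], latency_budget_s: float) -> dict:
    """Summarize reliability from the application's point of view."""
    if len(records) < 2:
        raise ValueError("need at least two records to compute percentiles")
    total = len(records)

    # A request counts as good only if it succeeded AND met the budget.
    good = sum(
        1 for r in records
        if r.error_type is None and r.latency_s <= latency_budget_s
    )

    # Keep each error type separate so a spike in one stays visible.
    errors = Counter(r.error_type for r in records if r.error_type is not None)

    # quantiles(n=100) returns 99 cut points: index 49 is p50,
    # index 94 is p95, index 98 is p99. Latency is taken over all
    # records here, including failed ones.
    cuts = quantiles([r.latency_s for r in records], n=100, method="inclusive")

    return {
        "success_within_budget": good / total,
        "error_rate_by_type": {k: v / total for k, v in errors.items()},
        "p50": cuts[49],
        "p95": cuts[94],
        "p99": cuts[98],
    }


records = [
    RequestRecord(0.8),
    RequestRecord(1.2),
    RequestRecord(15.0),               # HTTP 200, but far over budget
    RequestRecord(0.1, "rate_limit"),
]
print(summarize(records, latency_budget_s=10.0))
```

In practice you would run a summary like this per endpoint and per model over a rolling window, so that a regression in one model does not disappear into an aggregate.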
What teams get wrong
- assuming 99.9% API availability means 99.9% of their requests succeed (rate-limit rejections, slow responses, and content-filter blocks usually do not count against the SLA);
- relying on provider status pages as their only operational monitoring;
- not distinguishing between availability and quality when setting reliability budgets;
- treating service credits as adequate compensation for the reputation damage or user churn that a reliability incident causes;
- not testing whether the SLA’s credit calculation matches their actual experience.
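The first mistake in the list above is easy to see with a little arithmetic. The sketch below uses made-up rates, not measurements from any provider, and assumes the failure categories do not overlap, so each request lands in at most one of them.

```python
def effective_success_rate(
    availability: float,
    rate_limited: float,
    over_latency_budget: float,
    content_filtered: float,
) -> float:
    """Fraction of requests that were actually usable.

    All arguments are fractions of total requests. Assumption: the
    categories are mutually exclusive, so they can simply be subtracted.
    """
    unavailable = 1 - availability
    return 1 - unavailable - rate_limited - over_latency_budget - content_filtered


# Hypothetical month: the provider meets a 99.9% availability target,
# but other failure modes still cost the application requests.
rate = effective_success_rate(
    availability=0.999,
    rate_limited=0.005,
    over_latency_budget=0.02,
    content_filtered=0.002,
)
print(f"Effective success rate: {rate:.1%}")
```

With these illustrative inputs the provider meets its availability target while about 97.2% of requests are usable, which is the number a reliability budget should be built on.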
Practical decision check
- Does the provider’s SLA cover what your application needs (availability, latency, quality)?
- Do you have your own reliability monitoring that measures what matters to your users?
- Have you checked the provider’s historical incident data and post-mortem quality?
- Have you estimated the cost to your business of a 1-hour, 4-hour, or 24-hour provider outage?
- Do you have a fallback provider or degraded-mode plan for when the primary provider is unavailable?
If you cannot answer yes to at least three, you are less prepared for an incident than you should be.
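For the fallback question in the checklist, one possible shape is an ordered list of providers with a canned degraded-mode reply at the end. The sketch below is an assumption-heavy illustration: `complete_with_fallback` and the provider callables are hypothetical wrappers you would write around real client libraries, each enforcing its own timeout.

```python
from typing import Callable, Sequence, Tuple

# (name, callable) pairs; each callable takes a prompt and returns text,
# raising an exception on timeout, rate limit, or server error.
Provider = Tuple[str, Callable[[str], str]]


def complete_with_fallback(
    prompt: str,
    providers: Sequence[Provider],
    degraded_reply: str,
) -> Tuple[str, str]:
    """Try providers in order; return (source, text).

    Falls back to a fixed degraded-mode reply if every provider fails.
    Catching Exception broadly is a simplification for this sketch;
    production code would distinguish retryable from fatal errors.
    """
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception:
            continue  # move on to the next provider in the list
    return "degraded", degraded_reply


# Stand-in providers for demonstration only.
def primary(prompt: str) -> str:
    raise TimeoutError("primary provider timed out")


def secondary(prompt: str) -> str:
    return f"secondary answer to: {prompt}"


source, text = complete_with_fallback(
    "Summarize this ticket",
    [("primary", primary), ("secondary", secondary)],
    degraded_reply="The assistant is temporarily unavailable.",
)
print(source, "->", text)
```

Returning the source alongside the text lets you log how often the fallback path is used, which feeds back into the monitoring described earlier.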
Methodology and sources
Check date: 2026-05-24
What was checked: Published SLA documentation, status pages, incident reports, and reliability documentation from OpenAI, Anthropic, Google and Mistral. Industry service-level practices for cloud and API services.
What the sources were used for: Identifying what SLAs cover, how to evaluate status page quality, and what reliability gaps exist between provider claims and user needs.
Assumptions and limits: Contractual SLAs may differ from published information. Enterprise agreements often include terms not visible on public pages. This is operational guidance, not a contract review.
Change log
- 2026-05-24: first draft built from the llm-editor-approved brief, with a critical-evaluation framework for AI provider reliability claims.
Source list
- OpenAI status page — https://status.openai.com/
- OpenAI SLA documentation — https://openai.com/policies/service-terms/
- Anthropic status page — https://status.anthropic.com/
- Google Cloud status dashboard — https://status.cloud.google.com/
- Mistral status page — https://status.mistral.ai/