theLLMs

Last checked: 2026-05-24

Scope: Global. Internal editorial standards, NIST AI RMF framing, and provider-source disclosure policies inform this manifesto. Enforcement is continuous.

AI draft model: deepseek-v4-flash

AI review model: llm-editor (deepseek-v4-pro)

The evidence-led AI website manifesto: how theLLMs will review claims

This site exists because most AI content on the web is marketing. Vendor blogs that benchmark against last year’s models. Thought pieces that describe what AI will do next year without evidence. How-to guides that skip the failure modes. Comparison pages that show five features but omit the one that matters.

theLLMs takes a different approach. We write for builders, buyers, and curious operators who need to make decisions — not for people who want to feel excited about AI.

Editor’s Note: If a claim on this site does not have a source, a date, or a caveat, it is probably a mistake. Tell us. We will fix it and update this version.

Editor’s Note: This manifesto is itself a living document. We will update it when we find better ways to earn trust, and we will date every change.

How we review claims

Every article on this site follows a structured review process:

1. Sources must be primary or near-primary. We cite the original research paper, the provider’s official documentation, the regulatory guidance, or the benchmark data directly. We do not cite other blogs that claim to summarise the research unless the primary source is behind a paywall or unavailable.

2. Dates are mandatory. Every source includes a “checked on” date. Every claim about pricing, model capability, or provider policy includes a date. If a source is more than six months old, we say so and flag it for review.

3. Uncertainty is explicit. We do not say “models are getting cheaper” when we mean “OpenAI reduced GPT-4o pricing by 50% on 2026-02-15, but Anthropic raised Claude pricing on 2026-03-01.” We say what changed, for whom, and what is still unknown.

4. Failure modes are part of the answer. Every guide on this site includes a section on what can go wrong and where teams misuse the technique. If a technique is useful in narrow circumstances but oversold generally, we say that plainly.

5. No fake hands-on claims. If we write about a model or tool we have not tested, we say so. If we have tested it, we describe the environment, prompts, and outputs so readers can assess the evidence themselves.
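Rules 1 and 2 are mechanical enough to sketch in code. The following Python is a minimal illustration, not the site's actual tooling: the `Source` record, its field names, the flag wording, and the 183-day approximation of "six months" are all assumptions made for this example.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional

# Assumption: "six months" is approximated as 183 days in this sketch.
STALE_AFTER = timedelta(days=183)


@dataclass
class Source:
    """Hypothetical record for one cited source."""
    title: str
    is_primary: bool                    # primary or near-primary (rule 1)
    published_on: date                  # used for the six-month age check
    checked_on: Optional[date] = None   # the "checked on" date (rule 2)


def review_source(source: Source, today: date) -> list[str]:
    """Return the review flags raised by a single source."""
    flags = []
    if not source.is_primary:
        # Rule 1 allows this only when the primary source is
        # paywalled or unavailable, so a human has to confirm it.
        flags.append("not primary: confirm primary source is unavailable")
    if source.checked_on is None:
        flags.append("missing checked-on date")
    if today - source.published_on > STALE_AFTER:
        flags.append("older than six months: flag for review")
    return flags


if __name__ == "__main__":
    recap = Source("Vendor blog recap", False, date(2025, 1, 10))
    for flag in review_source(recap, today=date(2026, 5, 24)):
        print(flag)
```

A check like this only catches missing metadata. Whether a source really is primary, and whether it supports the claim, still needs a human reviewer.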

What we do not do

We do not write vendor press releases. If a provider releases a new model, we may write about it — but the article will focus on what changed, what is still unknown, and what operators should verify themselves, not on why the provider thinks the model is revolutionary.

We do not make predictions without evidence. “AI will transform X industry by 2027” is not a claim we publish. “Provider Y released a model that scores Z% on benchmark B under conditions C, which suggests capability improvement in domain D” is a claim we publish, with sources and caveats.

We do not hide AI-assisted writing. Every article discloses the model that produced the initial draft and the model that performed the editorial review. Readers deserve to know how the content was created, even — especially — when the content is about AI.

We do not claim expertise we do not have. We are not lawyers, doctors, or certified financial advisors. We do not give legal, medical, or financial advice. We explain how AI systems work and what evidence exists for their claims. Decisions based on our content remain the reader’s responsibility.

How we handle corrections

If a reader reports an error:

  1. We verify the claim against the original source
  2. We update the article and add a correction notice with the date and nature of the change
  3. We review related articles for the same error
  4. We log the correction in our internal quality tracking
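The correction steps above can be pictured as a small record plus a log. This is a hedged sketch only: the `Correction` class, its fields, the notice wording, and the in-memory `correction_log` list are hypothetical and stand in for whatever internal quality tracking the site uses.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Correction:
    """Hypothetical record of one reader-reported correction."""
    article: str                  # identifier of the corrected article
    corrected_on: date            # date of the change (step 2)
    nature: str                   # what changed, in one sentence (step 2)
    related_reviewed: list[str] = field(default_factory=list)  # step 3

    def notice(self) -> str:
        """Render the reader-facing correction notice."""
        return f"Correction ({self.corrected_on.isoformat()}): {self.nature}"


# Assumption: a plain list stands in for the internal quality log (step 4).
correction_log: list[Correction] = []


def record_correction(correction: Correction) -> str:
    """Log the correction and return the notice to add to the article."""
    correction_log.append(correction)
    return correction.notice()


if __name__ == "__main__":
    fix = Correction(
        article="example-article",
        corrected_on=date(2026, 5, 24),
        nature="Fixed the release date after checking the original source.",
        related_reviewed=["another-example-article"],
    )
    print(record_correction(fix))
```

Keeping the date and the nature of the change in one record means the public notice and the internal log cannot drift apart.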

If a source becomes outdated:

  1. The article is flagged for review
  2. A new “checked on” date is assigned
  3. If the source cannot be replaced, the article carries a warning about the stale source

How we handle uncertainty

We categorise claims into four levels:

| Level | Meaning | Example |
|---|---|---|
| Verified | Supported by multiple primary sources with consistent evidence | “OpenAI released GPT-4o on 2024-05-13” |
| Supported | Supported by a single primary source or limited evidence | “Claude 3.5 Sonnet scores higher than GPT-4o on SWE-bench Verified as of 2026-01” |
| Contested | Different sources or experts disagree | “The optimal chunk size for RAG” (depends on document type, model, and task) |
| Unknown | No reliable evidence available | “When will AI surpass human performance on [specific task]” (do not guess) |

We label claims with their uncertainty level. We do not present contested or unknown claims as verified.
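The four levels can be read as a small decision rule. The sketch below is one possible encoding, with loud assumptions: the `classify` function and its two inputs (a count of primary sources and a disagreement flag) are hypothetical simplifications, and "limited evidence" from several sources is a judgment call that this rule does not capture.

```python
from enum import Enum


class Level(Enum):
    VERIFIED = "Verified"
    SUPPORTED = "Supported"
    CONTESTED = "Contested"
    UNKNOWN = "Unknown"


def classify(primary_sources: int, sources_disagree: bool) -> Level:
    """Map simplified evidence inputs to one of the four levels."""
    if primary_sources < 0:
        raise ValueError("primary_sources must be non-negative")
    if primary_sources == 0:
        return Level.UNKNOWN      # no reliable evidence available
    if sources_disagree:
        return Level.CONTESTED    # sources or experts disagree
    if primary_sources >= 2:
        return Level.VERIFIED     # multiple consistent primary sources
    return Level.SUPPORTED        # a single primary source


if __name__ == "__main__":
    print(classify(primary_sources=1, sources_disagree=False).value)
```

Checking for disagreement before counting sources reflects the rule that a contested claim is never presented as verified, however many sources mention it.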

What changes would update this policy

  • We discover we have been applying these standards inconsistently
  • New regulatory guidance changes what constitutes responsible AI publishing
  • Readers provide feedback that changes our understanding of what is trustworthy
  • We find a better way to present uncertainty and evidence

Methodology and sources

This manifesto draws on NIST AI RMF governance practices, journalistic source-verification standards adapted for AI-specific claims, and operational experience from running this site.

Change log

  • 2026-05-24 — First published version.

Source list