The evidence-led AI website manifesto: how theLLMs will review claims

This site exists because most AI content on the web is marketing. Vendor blogs that benchmark against last year’s models. Thought pieces that describe what AI will do next year without evidence. How-to guides that skip the failure modes. Comparison pages that show five features but omit the one that matters.

theLLMs takes a different approach. We write for builders, buyers, and curious operators who need to make decisions — not for people who want to feel excited about AI.

TL;DR

Every article on theLLMs must cite primary sources with dates, label uncertainty explicitly, disclose AI-assisted writing, and include failure modes alongside the solution. If a claim lacks a source, a date, or a caveat, that is a mistake — and we fix it publicly. This manifesto is itself a living document: every change is dated and logged.

How we review claims

Every article on this site follows a structured review process:

1. Sources must be primary or near-primary. We cite the original research paper, the provider’s official documentation, the regulatory guidance, or the benchmark data directly. We do not cite other blogs that claim to summarise the research unless the primary source is behind a paywall or unavailable.

2. Dates are mandatory. Every source includes a “checked on” date. Every claim about pricing, model capability, or provider policy includes a date. If a source is more than six months old, we say so and flag it for review.

3. Uncertainty is explicit. We do not say “models are getting cheaper” when we mean “OpenAI reduced GPT-4o pricing by 50% on 2026-02-15, but Anthropic raised Claude pricing on 2026-03-01.” We say what changed, for whom, and what is still unknown.

4. Failure modes are part of the answer. Every guide on this site includes a section on what can go wrong and where teams misuse the technique. If a technique is useful in narrow circumstances but oversold generally, we say that plainly.

5. No fake hands-on claims. If we write about a model or tool we have not tested, we say so. If we have tested it, we describe the environment, prompts, and outputs so readers can assess the evidence themselves.
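
Rules 1 and 2 above can be checked mechanically. The following is a minimal sketch, not a description of the tooling this site actually runs: the `Source` record, its field names, the `review_flags` helper, and the 183-day reading of "six months" are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date

# Assumption: "six months" is approximated as 183 days; the policy text
# does not define an exact day count.
STALE_AFTER_DAYS = 183


@dataclass(frozen=True)
class Source:
    """Hypothetical record for one cited source."""
    title: str
    url: str
    checked_on: date  # the mandatory "checked on" date from rule 2
    primary: bool     # True for papers, official docs, regulatory guidance, benchmark data


def review_flags(source: Source, today: date) -> list[str]:
    """Return review flags for a source; an empty list means nothing to flag."""
    flags = []
    if not source.primary:
        flags.append("not a primary source: justify or replace")
    age_days = (today - source.checked_on).days
    if age_days > STALE_AFTER_DAYS:
        flags.append(f"checked {age_days} days ago: flag for review")
    return flags


# Usage: a source checked on 2026-05-28 and reviewed again on 2026-12-15.
nist = Source(
    title="NIST AI Risk Management Framework",
    url="https://www.nist.gov/itl/ai-risk-management-framework",
    checked_on=date(2026, 5, 28),
    primary=True,
)
print(review_flags(nist, today=date(2026, 12, 15)))
# → ['checked 201 days ago: flag for review']
```

Returning a list of flags rather than a pass/fail value keeps the decision with the editor: a secondary source may still be acceptable when the primary source is paywalled or unavailable, as rule 1 allows.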

What we do not do

We do not write vendor press releases. If a provider releases a new model, we may write about it — but the article will focus on what changed, what is still unknown, and what operators should verify themselves, not on why the provider thinks the model is revolutionary.

We do not make predictions without evidence. “AI will transform X industry by 2027” is not a claim we publish. “Provider Y released a model that scores Z% on benchmark B under conditions C, which suggests capability improvement in domain D” is a claim we publish, with sources and caveats.

We do not hide AI-assisted writing. Every article discloses the model that produced the initial draft and the model that performed the editorial review. Readers deserve to know how the content was created, even — especially — when the content is about AI.

We do not claim expertise we do not have. We are not lawyers, doctors, or certified financial advisors. We do not give legal, medical, or financial advice. We explain how AI systems work and what evidence exists for their claims. Decisions based on our content remain the reader’s responsibility.

How we handle corrections

If a reader reports an error:

1. We verify the claim against the original source.
2. We update the article and add a correction notice with the date and nature of the change.
3. We review related articles for the same error.
4. We log the correction in our internal quality tracking.

If a source becomes outdated:

The article is flagged for review
A new check-in date is assigned
If the source cannot be replaced, the article carries a warning about the stale source
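
The two procedures above can be sketched as small state changes on an article record. This is an illustrative sketch only: the `Article` class, its fields, the function names, and the wording of the notices are hypothetical, not the site's actual data model.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional


@dataclass
class Article:
    """Hypothetical article record; field names are assumptions."""
    slug: str
    notices: list[str] = field(default_factory=list)
    flagged_for_review: bool = False
    next_check: Optional[date] = None


def apply_correction(article: Article, on: date, nature: str) -> None:
    """Add a correction notice carrying the date and nature of the change."""
    article.notices.append(f"Correction ({on.isoformat()}): {nature}")


def handle_stale_source(article: Article, next_check: date, replaced: bool) -> None:
    """Flag the article, assign a new check date, and warn if no replacement exists."""
    article.flagged_for_review = True
    article.next_check = next_check
    if not replaced:
        article.notices.append(
            "Warning: this article relies on a source that is out of date "
            "and could not be replaced."
        )


# Usage with a made-up slug and made-up dates.
post = Article(slug="example-article")
apply_correction(post, date(2026, 6, 2), "fixed the release date of a model")
handle_stale_source(post, next_check=date(2026, 9, 1), replaced=False)
print(post.notices)
```

Verifying the claim and reviewing related articles are left out because they need human judgment; the sketch covers only the bookkeeping steps.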

How we handle uncertainty

We categorise claims into four levels:

Level	Meaning	Example
Verified	Supported by multiple primary sources with consistent evidence	“OpenAI released GPT-4o on 2024-05-13”
Supported	Supported by a single primary source or limited evidence	“Claude 3.5 Sonnet scores higher than GPT-4o on SWE-bench Verified as of 2026-01”
Contested	Different sources or experts disagree	“The optimal chunk size for RAG”: depends on document type, model, and task
Unknown	No reliable evidence available	“When will AI surpass human performance on [specific task]”: do not guess

We label claims with their uncertainty level. We do not present contested or unknown claims as verified.
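
The four levels map naturally onto an enumeration. The sketch below is one possible encoding under stated assumptions: the `Confidence` enum, the `classify` rule (including reading "multiple" as two or more primary sources), and the bracketed label format are illustrative and are not taken from the site's implementation.

```python
from enum import Enum


class Confidence(Enum):
    """The four uncertainty levels from the table above."""
    VERIFIED = "Verified"
    SUPPORTED = "Supported"
    CONTESTED = "Contested"
    UNKNOWN = "Unknown"


def classify(primary_sources: int, sources_agree: bool) -> Confidence:
    """Map evidence to a level.

    Assumption: "multiple" means two or more primary sources. Disagreement
    takes priority over source count, so a contested claim is never
    presented as verified.
    """
    if primary_sources == 0:
        return Confidence.UNKNOWN
    if not sources_agree:
        return Confidence.CONTESTED
    if primary_sources >= 2:
        return Confidence.VERIFIED
    return Confidence.SUPPORTED


def label(claim: str, level: Confidence) -> str:
    """Prefix a claim with its uncertainty level (hypothetical display format)."""
    return f"[{level.value}] {claim}"


print(label("OpenAI released GPT-4o on 2024-05-13", classify(2, True)))
# → [Verified] OpenAI released GPT-4o on 2024-05-13
```

A count of sources is a crude proxy for evidence quality, so in practice an editor would still review the assigned level before publication.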

What changes would update this policy

We discover we have been applying these standards inconsistently
New regulatory guidance changes what constitutes responsible AI publishing
Readers provide feedback that changes our understanding of what is trustworthy
We find a better way to present uncertainty and evidence

Methodology

Data checked: 2026-05-28
Sources consulted: NIST AI RMF governance practices, ICO guidance on AI transparency, journalistic source-verification standards adapted for AI-specific claims, operational experience from running this site
Assumptions: This manifesto describes editorial standards as currently implemented. Gaps between stated and actual practice are corrected when discovered. The standards assume English-language publishing on a single domain.
Limitations: This manifesto covers editorial policy only. It does not address technical implementation of the site, SEO strategy, or business model. It is not a legal compliance document — for regulatory obligations, consult the relevant jurisdiction’s guidance directly.
Jurisdiction: Global. This manifesto references UK ICO guidance as an example of transparency standards, but the principles apply across jurisdictions.

Source list

[1] NIST AI Risk Management Framework — https://www.nist.gov/itl/ai-risk-management-framework (accessed 2026-05-28)
[2] ICO guidance on AI transparency — https://ico.org.uk/for-organisations/ai-and-data-protection/ (accessed 2026-05-28)
[3] OWASP LLM Top 10 — https://owasp.org/www-project-top-10-for-large-language-model-applications/ (accessed 2026-05-28)

Trust Stack

Last checked: 2026-05-28
Corrections: Contact us to report errors

Change log

2026-05-28: editorial review — added Quick Answer section, converted Editor’s Notes to proper <aside> format (3 cards), added Trust Stack, expanded Methodology and Source list, added slugified heading IDs, corrected writtenBy to “llm-author”
2026-05-24: first published