theLLMs Run

37 published guides

Published now

Live guides in this prototype

Google's New SDLC for Vibe Coding — The Missing Guide

Google engineers formalized a six-phase framework for AI-assisted development that maps the spectrum from casual prompt-

Run · 2026-06-28

Google Releases Gemini 3.5 Live Translate — Real-Time Voice Preservation Translation

Google DeepMind's Gemini 3.5 Live Translate translates spoken audio between 70+ languages in real time while preserving

Run · 2026-06-28

Fine-Tuning LLMs: A Practical Step-by-Step Guide

A practical walkthrough of the full fine-tuning lifecycle — from choosing your method to deploying a working model you c

Run · 2026-06-26

Multimodal LLMs for production: when to use image, audio and video capabilities

A practical decision framework for adding visual, audio and video inputs to LLM products — covering costs, latency, accu

Run · 2026-06-24

Prompt Injection Explained for Business Users

A practical security guide to prompt injection — how attackers hijack AI models, what business users need to know, and t

Run · 2026-06-24

Building a custom LLM chatbot: end-to-end guide from specification to deployment

A staged guide for teams building their own LLM chatbot: define scope, choose architecture, implement, test, and deploy

Run · 2026-06-18

LLM monitoring and alerting: tracking performance, cost, and reliability in production

Build practical alerting for LLM apps — four failure modes, tiered thresholds, rolling baselines, and how to avoid alert

Run · 2026-06-09

Multi-model routing and gateway architecture — routing queries to the right model based on cost, latency, and capability

A practical framework for multi-model routing: how to design gateways, routing policies, and fallback chains that optimi

Run · 2026-06-07

LLMs for software engineering and DevOps: a practical guide

How LLMs fit into code review, test generation, incident response, and DevOps workflows — what works, what doesn't, and

Run · 2026-06-06

LLM security hardening for production — input validation, output filtering, rate limiting and PII scanning

A practical guide to the four security layers every production LLM application needs: input validation, output filtering

Run · 2026-06-03

LLM cost optimisation playbook: practical techniques for reducing API spend

A structured framework for diagnosing and reducing LLM API costs: prompt optimisation, caching, output control, model ro

Run · 2026-05-31

Fine-tuning LLMs: a practical step-by-step guide

From data prep to deployment: how to fine-tune an LLM with a concrete worked example, covering LoRA setup, training conf

Run · 2026-05-30

LLM observability stack: logging, tracing, monitoring, and cost tracking

Compare five LLM observability tools—LangSmith, Arize, Helicone, Weights & Biases, and Datadog—with setup guidance and a

Run · 2026-05-30

MCP implementer's guide: setting up Model Context Protocol servers, tools, and security patterns

A step-by-step tutorial for building MCP servers in Python and Node, exposing tools and resources, connecting to desktop

Run · 2026-05-30

LLMs for startups vs enterprises: when to build, when to buy

Decision framework for API-based vs self-hosted LLM: when team size, budget, latency and data sensitivity push you towar

Run · 2026-05-30

Tool use and function calling: a practical implementation guide

A step-by-step guide to implementing LLM function calling in production: defining schemas, handling parallel calls, erro

Run · 2026-05-29

Golden datasets for LLM products: how small regression sets prevent regressions

A practical explanation of why a small, stable test set is often more useful than a huge benchmark when you need confide

Run · 2026-05-28

The model release treadmill: how to avoid rebuilding every month

A practical guide to decoupling your product from provider churn so every model update does not become a rewrite.

Run · 2026-05-28

Vector databases: when semantic search is enough and when it is not

A practical guide to deciding whether you need a vector database, a search index or something much simpler.

Run · 2026-05-28

Schema-first AI extraction: making LLMs useful for messy documents

How to extract structured data from unstructured documents using LLMs: schema design, confidence flags, validation, and

Run · 2026-05-28

Prompt versioning: treating prompts like production code

How to manage prompt changes in teams: version control, eval-linked releases, approval workflows, rollback strategies, a

Run · 2026-05-28

MCP explained: tools, resources, prompts and the current hype gap

A clear guide to what Model Context Protocol is, what it is not, and why marketing sometimes runs ahead of the wiring.

Run · 2026-05-28

Tool-use safety: stopping agents from taking dangerous actions

A practical guide to approval gates, least privilege, dry runs and audit logs for AI agents with tools.

Run · 2026-05-28

The evidence-led AI website manifesto: how theLLMs will review claims

How theLLMs reviews claims, sources content, dates evidence, and handles uncertainty. A public editorial standard for tr

Run · 2026-05-28

Safe prompt templates: reducing brittle instructions and hidden assumptions

Why prompts fail silently, how to treat prompts as tested product assets, and the versioning, testing and acceptance cri

Run · 2026-05-28

Refusals and over-refusals: testing whether safety blocks useful work

A plain-English guide to distinguishing sensible safety boundaries from over-refusal that breaks legitimate use cases.

Run · 2026-05-28

Red teaming an LLM feature: a practical first-week checklist

A practical first-week checklist for finding failure modes in a new LLM feature — with concrete test items, sample promp

Run · 2026-05-28

PII handling for LLM apps: minimisation before redaction

A practical guide to reducing personal-data exposure in AI features by minimising what you send before you try to redact

Run · 2026-05-28

Model drift without training: why API behavior changes over time

Why hosted LLMs change their outputs even without a version bump, and how pinned models, eval regression sets, and chang

Run · 2026-05-28

Rerankers explained: the quiet quality layer in RAG systems

How rerankers improve retrieval precision with a worked example, model names, latency numbers, and a decision framework

Run · 2026-05-28

LLM observability basics: traces, prompts, evals and feedback loops

A framework for monitoring LLM applications in production: what to trace, which metrics matter, how to sample prompts, a

Run · 2026-05-28

Access control for RAG: why retrieval permissions matter before generation

Retrieval permissions must be enforced before generation, not after. Learn where teams get document-leve

Run · 2026-05-28

Human-in-the-loop AI: approval queues that do not become bottlenecks

How to design human review for AI outputs that catches real failures without slowing down every routi

Run · 2026-05-28

Jailbreaks vs product safety: what operators can realistically control

A practical guide to separating model-level safety from app-level permissions, tool boundaries and opera

Run · 2026-05-28

Latency in LLM apps: first token, total time and user experience

A plain-English guide to why AI features feel slow, what to measure, and how to separate queueing, m

Run · 2026-05-28

Hallucination testing: how to build a small regression set

A practical guide to creating a small test set for unsupported claims, regression checking and safer

Run · 2026-05-28

Inference vs training vs fine-tuning: three terms operators confuse

A plain-English guide to the three phases of model work, what each one changes, and what the differe

Run · 2026-05-28

Editorial rule

Every guide needs a stopping point

Define the job, input, output, assumptions, and failure modes.
Show the checks that prove the workflow worked, or say what remains unproved.
Link back to Cache for concepts instead of re-explaining everything on every page.
Keep tool/provider claims dated. LLM tooling ages like milk in a warm server rack.

Workflows you can actually test
