theLLMs

Last checked: 2026-05-22

Scope: Global. Provider documentation and schema-validation references were checked on 2026-05-22. No GB/NI split applies.

AI draft model: gpt-5.4-mini

AI review model: llm-editor (deepseek-v4-pro)

Structured outputs and JSON mode: reliability limits

If you need machine-readable output from an LLM, JSON mode and structured outputs are useful tools, but they are not the same thing as a trustworthy workflow.

The safe short answer is this: JSON mode helps with syntax; structured outputs help with schema adherence; neither one guarantees that the model is correct, current, complete, or safe to act on. If your downstream system cares about business rules, freshness, permissions, or money, you still need validation after the model responds.

That is the part teams miss when they relax after the first successful parse. The output can be valid JSON and still be the wrong answer for your application.

Trust stack

AI draft model: gpt-5.4-mini. AI review model: gpt-5.4. Checked against current provider documentation on 2026-05-22.

Quick answer

Use JSON mode or structured outputs when you need a response that your code can parse reliably. Use structured outputs when you also need the model to match a JSON Schema you define. Use JSON mode only when you can tolerate a weaker guarantee and you will still validate the result after parsing.

Do not treat either feature as a guarantee of truth. A model can return valid JSON, satisfy your schema, and still produce a bad field value, a stale decision, or an unsafe action. That is why the real control point is the code that runs after parsing: schema validation, business-rule checks, permission checks, and fallback handling.

If the output can trigger a refund, write to a database, send a message, or change state, add a second gate. Otherwise you are trusting formatting to do the job of judgement.

Editor’s Note: A successful parse feels like progress, so teams stop looking too early. The bug often survives, just tidier.

Editor’s Note: Parser success is not the same as workflow success. A response can be syntactically clean and still be the wrong thing to do.

Editor’s Note: If the downstream system is brittle, structured output will not save you. It will only make the failure look more respectable.

What JSON mode and structured outputs actually do

The provider docs checked on 2026-05-22 point to a simple pattern:

  • JSON mode asks the model to return a valid JSON object.
  • Structured outputs add a JSON Schema constraint so the response is shaped to the schema you provide.
  • Tool use / function calling can carry structured arguments or actions, but it still depends on the code that executes the tool result and validates the outcome.
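A minimal sketch of what the syntax-only guarantee looks like on the receiving side, assuming the raw model reply arrives as a string. The function name parse_json_object is hypothetical and only the standard-library json module is used; no provider API is shown here.

```python
import json


def parse_json_object(raw: str) -> dict:
    """Parse a model reply and insist on a top-level JSON object."""
    parsed = json.loads(raw)  # raises json.JSONDecodeError on malformed text
    if not isinstance(parsed, dict):
        raise ValueError("expected a JSON object at the top level")
    return parsed


# Syntactically valid, so parsing succeeds, yet nothing here says which
# fields the application actually needs.
reply = parse_json_object('{"anything": "goes"}')
```

The parse succeeds for any well-formed object, which is roughly the level of assurance a syntax check gives you. Field names, types, and values are still unchecked at this point.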

Current provider-doc snapshot

Source | What the current docs say | Practical takeaway
OpenAI structured model outputs docs | JSON mode is the simpler feature; structured outputs are the stronger version and schema adherence is the point. | If you need schema guarantees, do not stop at JSON mode.
Azure OpenAI JSON mode docs | JSON mode returns a valid JSON object, but it does not guarantee a specific schema. | JSON mode is a syntax guard, not a full contract.
Azure OpenAI structured outputs docs | Structured outputs follow a JSON Schema supplied in the API call and contrast with older JSON mode. | Schema-driven output is the better choice when downstream code depends on field shape.
Anthropic tool use overview | Tool use is about coordinating model and application tools, not magically fixing downstream correctness. | Tool orchestration still needs validation and business rules after the model call.
JSON Schema reference | A schema is the contract your validator can check against. | Schema validation is where you catch shape errors after generation.

In plain English: structured outputs are better than JSON mode when you need a contract, but neither one is a substitute for application logic.

What they do not guarantee

Neither JSON mode nor structured outputs guarantee:

  • that the answer is factually correct;
  • that the answer is up to date;
  • that the answer follows your business rules;
  • that the answer is permitted for the current user;
  • that the answer is safe to execute;
  • that the answer is the best option among several valid choices;
  • that the answer will stay correct after you post-process it.

That is why the right mental model is layered reliability, not “the model solved it.”

Why valid JSON can still break your workflow

Here is the failure ladder teams usually run into.

Failure mode | What passes | What still fails
Malformed output | Nothing | The parser cannot read it
Valid JSON with a missing required field | Basic syntax | Schema validation
Schema-compliant output with the wrong field value | Syntax and schema | Business rules
Valid JSON with confident nonsense | Syntax and schema | Domain checks, human judgement, external verification
Parser success with a dangerous action | Parsing and maybe schema checks | Permissions, approval gates, side-effect control

The nasty version is the last one. Your code can accept the response, your parser can smile, and the system can still do the wrong thing.

Worked example: valid JSON, wrong decision

Illustrative example only — this is not from a live model run:

{
  "status": "approved",
  "refund_amount_gbp": 0,
  "evidence": "customer asked for it",
  "next_step": "send refund now"
}

This is valid JSON. If your schema is too loose, it may even pass validation. But it is operationally wrong because the field values contradict each other and break the real business rule: an approved status with a refund amount of zero leaves nothing to send, so “send refund now” has no valid action behind it, and “customer asked for it” is not evidence by itself.

That is the central point of this page. Format compliance is only the first gate.
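The kind of check that catches the example above could be sketched as follows. This is an illustration under stated assumptions, not a recommended rule set: the function name and both rules are hypothetical, and the code assumes the object has already passed a shape check so the field types can be trusted.

```python
def refund_rule_violations(decision: dict) -> list[str]:
    """Return business-rule problems that a loose schema would not catch."""
    problems = []
    approved = decision.get("status") == "approved"
    amount = decision.get("refund_amount_gbp", 0)
    if approved and amount <= 0:
        problems.append("approved refund must have a positive amount")
    # Hypothetical rule: a bare customer request does not count as evidence.
    if decision.get("evidence", "").strip().lower() == "customer asked for it":
        problems.append("evidence is only a customer request")
    return problems


example = {
    "status": "approved",
    "refund_amount_gbp": 0,
    "evidence": "customer asked for it",
    "next_step": "send refund now",
}
violations = refund_rule_violations(example)
```

Both rules fire for the illustrative object, so the refund would be stopped before any side effect even though the JSON itself is clean.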

Where schema validation helps

Schema validation is useful because it catches structure errors after the model responds. That means you can reject:

  • missing fields;
  • wrong types;
  • extra fields you do not allow;
  • invalid enums;
  • malformed nested objects.

But schema validation is still only one layer. It does not know whether the model is lying, guessing, hallucinating, or using stale context. For that you need additional checks.
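To make the structural layer concrete, here is a hand-rolled sketch that rejects missing fields, wrong types, extra fields, and invalid enum values using only the standard library. It is a stand-in for a real JSON Schema validator, not a replacement for one, and it skips nested objects for brevity. The field contract and the allowed status values are assumptions made up for this illustration.

```python
# Hypothetical contract for a refund decision: field name -> expected type.
REQUIRED_FIELDS = {
    "status": str,
    "refund_amount_gbp": (int, float),
    "evidence": str,
    "next_step": str,
}
ALLOWED_STATUS = {"approved", "rejected", "needs_review"}


def shape_errors(obj: dict) -> list[str]:
    """Collect structural problems; an empty list means the shape is fine."""
    errors = []
    for name, expected in REQUIRED_FIELDS.items():
        if name not in obj:
            errors.append(f"missing field: {name}")
        # bool is a subclass of int in Python, so reject it explicitly.
        elif isinstance(obj[name], bool) or not isinstance(obj[name], expected):
            errors.append(f"wrong type: {name}")
    for name in obj:
        if name not in REQUIRED_FIELDS:
            errors.append(f"unexpected field: {name}")
    if isinstance(obj.get("status"), str) and obj["status"] not in ALLOWED_STATUS:
        errors.append("invalid enum value: status")
    return errors
```

Note that the zero-refund example from earlier would pass this check without complaint: every field is present and correctly typed. That is the limit of shape validation.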

Minimum safe production checklist

  • Parse the response with a real JSON parser, not a string split.
  • Validate the parsed object against a schema.
  • Check business rules that the schema cannot express.
  • Enforce permissions before any side effect.
  • Add explicit rejection paths for missing evidence, stale values, or unsafe actions.
  • Log rejected outputs so you can improve prompts, schemas, and tests.
  • Cap retries so one bad call does not become a loop.
  • Test with deliberately awkward inputs, not just happy-path prompts.
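Two of the checklist items, logging rejected outputs and capping retries, can be sketched together. This assumes the model call is wrapped in a zero-argument callable; ask_model, is_acceptable, and the default of three attempts are all hypothetical choices for illustration.

```python
import json
import logging
from typing import Callable, Optional

logger = logging.getLogger("llm_output")


def call_with_capped_retries(
    ask_model: Callable[[], str],
    is_acceptable: Callable[[dict], bool],
    max_attempts: int = 3,
) -> Optional[dict]:
    """Try a few times, log every rejection, then give up with None."""
    for attempt in range(1, max_attempts + 1):
        raw = ask_model()
        try:
            parsed = json.loads(raw)
        except json.JSONDecodeError:
            logger.warning("attempt %d rejected: not valid JSON: %r", attempt, raw)
            continue
        if isinstance(parsed, dict) and is_acceptable(parsed):
            return parsed
        logger.warning("attempt %d rejected by checks: %r", attempt, raw)
    return None  # caller falls back to human review or a clean failure
```

Returning None rather than raising keeps the give-up path explicit, so the caller has to decide what happens when every attempt is rejected.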

Editor’s Note: The hard bug is often not the model output. It is the assumption that “parseable” means “safe to use.”

Validation flow that holds up better

A safer production sequence looks like this:

  1. Ask the model for a structured response.
  2. Parse the response.
  3. Validate it against a schema.
  4. Apply business-rule checks.
  5. Apply permission or approval gates.
  6. Only then allow any side effect.

If any of those steps fails, the system should stop cleanly and ask for a retry, a human review, or a different input.
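The sequence above can be sketched as an ordered list of gates that stops at the first failure. Everything named here is an assumption for illustration: the run_gates function, the three example gates, and the approval threshold of 50 are hypothetical, and real gates would be far more thorough.

```python
import json
from typing import Callable, Optional

# Each gate returns None on success or a short reason string on failure.
Gate = Callable[[dict], Optional[str]]


def run_gates(raw: str, gates: list[tuple[str, Gate]]) -> tuple[bool, str]:
    """Parse, then apply gates in order; report the first failure."""
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError:
        return False, "parse: not valid JSON"
    if not isinstance(parsed, dict):
        return False, "parse: expected a JSON object"
    for name, gate in gates:
        reason = gate(parsed)
        if reason is not None:
            return False, f"{name}: {reason}"
    return True, "all gates passed"


# Hypothetical gates standing in for steps 3 to 5.
gates = [
    ("schema", lambda d: None if "refund_amount_gbp" in d else "missing refund_amount_gbp"),
    ("business", lambda d: None if d["refund_amount_gbp"] > 0 else "amount must be positive"),
    ("approval", lambda d: None if d["refund_amount_gbp"] <= 50 else "needs human approval"),
]

ok, detail = run_gates('{"refund_amount_gbp": 0}', gates)
# Step 6, the side effect, would run only when ok is True.
```

Because the gates run in a fixed order, later gates can rely on what earlier ones established. The business gate, for instance, indexes the field directly because the schema gate has already confirmed it exists.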

Tool use and function calling are not a free pass

Tool use and function calling are helpful because they move some structure into a machine-readable envelope. That still does not make the model right.

A tool call can be syntactically perfect and still be the wrong action for the current user, the current state, or the current policy. So the same rule applies: validate the arguments, check the business conditions, and only then execute the tool.

For production systems, the tool call is not the finish line. It is the point where the real checks begin.
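A minimal sketch of checking a model-proposed tool call before executing it, assuming the call has already been decoded into a tool name and an arguments dictionary. The tool names, roles, and permission table are hypothetical and do not correspond to any provider's API.

```python
# Hypothetical policy: which roles may invoke which tools.
TOOL_PERMISSIONS = {
    "lookup_order": {"agent", "supervisor"},
    "issue_refund": {"supervisor"},
}


def authorise_tool_call(name: str, arguments: dict, user_role: str) -> tuple[bool, str]:
    """Decide whether a model-proposed tool call may be executed."""
    allowed_roles = TOOL_PERMISSIONS.get(name)
    if allowed_roles is None:
        return False, "unknown tool"
    if user_role not in allowed_roles:
        return False, "role not permitted for this tool"
    if name == "issue_refund":
        amount = arguments.get("amount_gbp")
        is_number = isinstance(amount, (int, float)) and not isinstance(amount, bool)
        if not is_number or amount <= 0:
            return False, "refund amount must be a positive number"
    return True, "ok"
```

The permission check depends on who the current user is, which the model cannot know reliably. That is why it belongs in application code rather than in the prompt or the schema.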

Global applicability

This article is global. There is no GB/NI split to apply here.

The same caution applies in every market: if a model response can affect money, state, access, or a user-visible decision, do not trust format alone.

Methodology

Check date: 2026-05-22

What was checked:

  • OpenAI developers documentation for structured model outputs.
  • Azure OpenAI documentation for JSON mode.
  • Azure OpenAI documentation for structured outputs.
  • Anthropic documentation for tool use.
  • JSON Schema reference documentation for schema-validation concepts.

What the docs were used to verify:

  • JSON mode targets valid JSON but does not guarantee a specific schema.
  • Structured outputs add JSON Schema adherence on top of valid JSON output.
  • Tool use/function calling still needs downstream validation and business-rule control.
  • JSON Schema validation is the post-generation check that catches shape problems.

Assumptions and limits:

  • All examples in this article are illustrative unless explicitly sourced.
  • No local implementation test was run for this draft, so there are no invented claims about runtime behaviour.
  • This page focuses on reliability limits, not provider ranking or model benchmarking.
  • No formula is needed for this article; the useful control is the validation sequence, not arithmetic.

Change log

  • 2026-05-22: first draft built from the llm-editor-approved brief, using current provider-doc checks, a failure-ladder table, a validation checklist, and explicit caveats about syntax versus correctness.

Source list