PII handling for LLM apps: minimisation before redaction
PII handling goes wrong when teams start with redaction instead of scope. If you can avoid sending a name, address, account number or support transcript, you have already removed more risk than any after-the-fact masking can recover.
The first question is not “how do we redact this?” It is “do we need to send this at all?” If the answer is yes, then minimise, classify, segment and retain for as little time as the use case allows.
This is not legal advice. Privacy duties depend on the exact data, processing purpose, region, vendor terms and retention model.
Trust stack
AI draft model: gpt-5.4-mini. AI review model: gpt-5.4. Checked against the originating brief and current primary/near-primary sources on 2026-05-24.
What this means
PII minimisation for LLM apps is a data-design problem, not a redaction-engineering problem. If you classify each field in your application schema as “needed for this task” or “not needed”, you can strip unnecessary PII before it reaches the model. Redaction, meaning masking or replacing identifiers after the fact, is a fallback for content the task genuinely needs but that still carries identifiers you cannot remove at the source. Many teams build an elaborate redaction pipeline when they could simply have not sent the data in the first place.
The data minimisation principle in UK GDPR Article 5(1)(c), as explained in ICO guidance, requires personal data to be “adequate, relevant and limited to what is necessary” for the purpose it is processed for. For LLM features, this means: if the model only needs to answer “what is the customer’s account tier?”, it should receive the account tier, not the full customer profile with name, address, phone number and transaction history.
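The account-tier example can be sketched as an allow-list applied before the prompt is built. This is a minimal illustration, not a reference implementation: the task name, the `TASK_FIELDS` mapping and the record fields are all hypothetical.

```python
from typing import Any, Mapping

# Hypothetical allow-list: the only fields each task may send to the model.
TASK_FIELDS: dict[str, frozenset[str]] = {
    "account_tier_question": frozenset({"account_tier"}),
}


def minimise(record: Mapping[str, Any], task: str) -> dict[str, Any]:
    """Return only the fields the named task is allowed to send."""
    allowed = TASK_FIELDS[task]  # raises KeyError for a task nobody classified
    return {key: value for key, value in record.items() if key in allowed}


# Illustrative record; the values are made up.
customer = {
    "name": "Alex Example",
    "address": "1 Example Road",
    "phone": "555-0100",
    "account_tier": "gold",
    "transactions": ["order-1", "order-2"],
}

context = minimise(customer, "account_tier_question")
print(context)  # {'account_tier': 'gold'}
```

Raising on an unclassified task is a deliberate choice in this sketch: a new workflow has to declare its fields before it can send anything.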
Where teams misuse it
- Sending the full customer record when only a field is needed. A billing chatbot that answers “when is my next payment?” does not need the customer’s name and address; it needs only the payment schedule. Teams dump the entire customer object into the prompt because it is available, not because it is necessary.
- Building redaction after the fact instead of classification up front. A team builds a regex-based redaction pipeline that strips names and emails from the prompt before sending it to the API, but the prompt still contains account numbers, transaction IDs, support ticket text and internal notes. The pipeline was built without classifying which PII types actually exist in the data.
- Assuming minimised input guarantees clean output. A model generates a response that includes a customer’s full name because the name was in the retrieved context and the model used it. Even though the user prompt was minimised, the output leaked the name. Output-layer minimisation, whether by telling the model not to include names in its response or by validating the output, is a separate step.
- Skipping PII classification for retrieval content. Teams that build RAG systems often classify and redact PII in the user prompt but forget that the retrieved documents may also contain PII. A vector database returns whatever chunks were stored, including customer names, account numbers or health data that was embedded without classification.
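For the retrieval case in the last item, one possible shape is to record PII categories on each chunk at ingestion and filter on them before anything reaches the prompt. The sketch below assumes a hypothetical `Chunk` type with a `pii_categories` field; it does not reflect any particular vector database's API, and it treats unclassified chunks as unsafe.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Chunk:
    text: str
    # PII categories recorded at ingestion; None means never classified.
    pii_categories: Optional[frozenset[str]]


def filter_chunks(chunks: list[Chunk], permitted: frozenset[str]) -> list[Chunk]:
    """Keep chunks whose PII categories are all permitted for this workflow."""
    kept = []
    for chunk in chunks:
        if chunk.pii_categories is None:
            continue  # unclassified content fails closed
        if chunk.pii_categories <= permitted:
            kept.append(chunk)
    return kept


# Illustrative retrieval results; the text is made up.
retrieved = [
    Chunk("Standard delivery takes 3 to 5 working days.", frozenset()),
    Chunk("Customer Alex Example reported a damaged parcel.", frozenset({"name"})),
    Chunk("Legacy ticket imported without review.", None),
]

safe = filter_chunks(retrieved, permitted=frozenset())
print([chunk.text for chunk in safe])
```

With an empty `permitted` set, only the chunk classified as containing no PII survives. The filter is only as good as the ingestion-time classification, which this sketch takes as given.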
Real scenario
A team builds a customer-support chatbot for an e-commerce platform. The product database has fields for: name, shipping address, email, phone, order history, payment method, account tier, support notes.
The team writes a system prompt: “You are a helpful support agent. Use the customer profile to answer questions.” They pass the full customer object as context. A user asks: “When is my next delivery?” The model replies with the delivery date — correct — but also includes: “I see you live at 123 Oak Street and your Visa card ending in 4242 was charged for this order.”
The model was not prompt-injected or jailbroken. It simply received a customer profile with more information than the question needed, and it generated a helpful response that disclosed PII along the way. The fix was not a better redaction regex; it was a data-classification schema that sent only shipping-related fields for delivery questions.
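One way to read that fix is as a per-workflow view that derives the narrow value the question needs instead of passing raw fields. The sketch below is an assumption about how such a schema could look: the `order_history` structure, the `delivery_view` function and the `VIEWS` registry are hypothetical, not the team's actual code.

```python
from datetime import date
from typing import Any, Callable


def delivery_view(profile: dict[str, Any], today: date) -> dict[str, Any]:
    """Derive only what a delivery question needs from the order history."""
    upcoming = [
        order["delivery_date"]
        for order in profile["order_history"]
        if order["delivery_date"] >= today
    ]
    return {"next_delivery_date": min(upcoming).isoformat() if upcoming else None}


# Hypothetical workflow registry; an unknown workflow gets no customer data.
VIEWS: dict[str, Callable[[dict[str, Any], date], dict[str, Any]]] = {
    "delivery": delivery_view,
}


def build_context(profile: dict[str, Any], workflow: str, today: date) -> dict[str, Any]:
    view = VIEWS.get(workflow)
    return view(profile, today) if view else {}


# Illustrative profile based on the scenario; values are made up.
profile = {
    "name": "Alex Example",
    "shipping_address": "123 Oak Street",
    "payment_method": "Visa ending 4242",
    "order_history": [
        {"id": "A1", "delivery_date": date(2026, 5, 20)},
        {"id": "A2", "delivery_date": date(2026, 6, 2)},
    ],
}

context = build_context(profile, "delivery", date(2026, 5, 24))
print(context)  # {'next_delivery_date': '2026-06-02'}
```

Because the address and card details never enter the context, the model cannot repeat them, however helpful it tries to be.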
Practical decision check
Before connecting a customer-facing LLM feature to production data, ask:
- Which fields does the model genuinely need to answer this specific task? Not “which fields are available”, but which fields are necessary. Define the minimum viable data schema for each workflow.
- What PII categories exist in your data? Names, emails, addresses, phone numbers, account numbers, payment details, health information and internal notes: classify every field that could reach the model.
- Is PII classification applied to your retrieval content, not just your prompts? If RAG sources include customer records, support tickets or product databases, classify what those sources contain.
- Does the output validator check for re-identified PII? Even when names were successfully stripped from the prompt, the model may still generate a response that includes the customer’s name from retrieved context. Validate the output for identifiers that were not in the sanitised input.
- What changes when the data-processing purpose changes? If a “billing AI” is later used for “personalised recommendations”, the minimum viable data changes. Redo the minimisation analysis.
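The output check in the list above can be sketched as a comparison between the full record and what was actually sent. This is a deliberately naive illustration with hypothetical function names: it only catches exact, case-insensitive matches of known values, so it will miss paraphrases and partial identifiers and may flag short, common values.

```python
from typing import Any, Mapping


def withheld_values(full: Mapping[str, Any], sent: Mapping[str, Any]) -> set[str]:
    """Lower-cased values from the full record that were not sent to the model."""
    sent_values = {str(value).lower() for value in sent.values()}
    withheld = set()
    for key, value in full.items():
        text = str(value).strip().lower()
        if key not in sent and text and text not in sent_values:
            withheld.add(text)
    return withheld


def find_leaks(output: str, full: Mapping[str, Any], sent: Mapping[str, Any]) -> list[str]:
    """Return withheld values that appear verbatim in the model output."""
    haystack = output.lower()
    return sorted(value for value in withheld_values(full, sent) if value in haystack)


# Illustrative data; the values are made up.
full_record = {
    "name": "Alex Example",
    "shipping_address": "123 Oak Street",
    "next_delivery_date": "2026-06-02",
}
sent = {"next_delivery_date": "2026-06-02"}
reply = "Your parcel arrives on 2026-06-02 at 123 Oak Street."

print(find_leaks(reply, full_record, sent))  # ['123 oak street']
```

A non-empty result means an identifier reached the output by some route other than the sanitised input, for example through retrieved context, and the response should be blocked or regenerated before it reaches the user.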
Evidence and caveats
- Originating brief: 065-pii-handling-for-llm-apps-minimisation-before-redaction.md
- Check date: 2026-05-24
- This draft uses current primary or near-primary sources only for the gap-fill citations requested by the brief.
- No hands-on product claim is made unless the source path is explicit in the text.
- If provider policy, retention, tool-use or citation docs change, this page should be re-checked before promotion.
Source and evidence notes
- ICO UK GDPR guidance — https://ico.org.uk/for-organisations/uk-gdpr-guidance-and-resources/
- NIST AI RMF — https://www.nist.gov/itl/ai-risk-management-framework
- OpenAI data usage / retention docs — https://platform.openai.com/docs/guides/your-data
- CISA and OWASP privacy/security guidance — https://www.cisa.gov/ and https://owasp.org/
Related reading
- chat-history-is-not-memory-how-llm-apps-remember-users
- data-leakage-in-llm-apps-logs-prompts-files-and-vendor-retention
- ai-output-monitoring-what-to-log-sample-and-review
Methodology
What was checked: originating brief plus current provider/standards documentation relevant to the topic.
What the sources were used for:
- to keep the claims cautious and specific;
- to date the guidance where policy or operational details can move;
- to avoid turning source notes into marketing copy.
Assumptions and limits:
- This is an evergreen concept page, not a benchmark report.
- No launch, outreach, affiliate, payment or tracking changes are implied.
- The draft is public-clean and omits internal ticket IDs by design.
Related guides
- data leakage in llm apps logs prompts files and vendor retention
- citation quality in ai answers source grounded does not mean source faithful
- provider data retention policies what api users should compare
Change Log
- 2026-05-27: Added direct source URLs to all named providers and services; added Change Log section. Content unchanged.