
Confidential Data in Input

Prompts submitted to AI systems may include confidential, sensitive, or proprietary information, potentially leading to privacy violations, data leakage, or intellectual property exposure.

📋 Description

As large language models (LLMs) and other generative AI systems become integrated into workflows, users often submit prompts containing proprietary source code, personally identifiable information (PII), financial statements, or business secrets. If these inputs are stored, logged, or used for model retraining, they can resurface in outputs, leak through inference attacks, or create legal exposure under data protection laws such as GDPR and HIPAA, or under intellectual property statutes.

The risk arises both where confidential data should not be entered at all (e.g., internal users pasting sensitive data into ChatGPT) and where sensitive data must be input for a legitimate function (e.g., enterprise use cases such as summarizing internal documents). Even when a system is designed to discard inputs, vulnerabilities such as caching, misconfigured APIs, or developer misuse of APIs can result in unintended retention.
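
One common safeguard against both failure modes is to screen prompts before they leave the organization's boundary. The Python sketch below shows a minimal regex-based redaction filter; the pattern set, placeholder format, and the `redact_prompt` helper are illustrative assumptions rather than a production detector, and enterprise deployments typically pair this kind of filter with dedicated DLP tooling or NER-based classifiers.

```python
import re

# Illustrative patterns for common classes of sensitive input. These are
# deliberately simple; production detectors use more robust rules or ML models.
PATTERNS = {
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "CREDIT_CARD": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
    "API_KEY": re.compile(r"\b(?:sk|pk)-[A-Za-z0-9]{20,}\b"),
}

def redact_prompt(prompt: str) -> tuple[str, list[str]]:
    """Replace likely-sensitive substrings with typed placeholders.

    Returns the redacted prompt plus the names of the patterns that fired,
    so the event can be audited without logging the matched values.
    """
    findings: list[str] = []
    for name, pattern in PATTERNS.items():
        if pattern.search(prompt):
            findings.append(name)
            prompt = pattern.sub(f"[REDACTED_{name}]", prompt)
    return prompt, findings

if __name__ == "__main__":
    raw = "Summarize this email from jane.doe@example.com about SSN 123-45-6789."
    safe, hits = redact_prompt(raw)
    print(safe)  # Summarize this email from [REDACTED_EMAIL] about SSN [REDACTED_SSN].
    print(hits)  # ['EMAIL', 'SSN']
```

Note that the filter records only the names of the patterns that fired, not the matched values, so the audit trail does not re-create the leak inside the guardrail's own logs.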

🔍 Public Examples and Common Patterns

- Incident 768: ChatGPT Implicated in Samsung Data Leak of Source Code and Meeting Notes: Samsung engineers reportedly leaked sensitive company data, including source code and internal meeting notes, in March 2023 by using ChatGPT to assist with work tasks. The service retained the submitted data, resulting in a breach of confidentiality.

📐 External Framework Mapping

- IBM Risk Atlas: Confidential data in prompt
