AI Risk · Legal

Data Legality

Datasets should be reviewed for legal concerns, including commercial license restrictions on public datasets and privacy restrictions on user data.

📋 Description

Data legality in AI refers to the legal frameworks and requirements governing the collection, use, and storage of data, particularly copyrighted and personal data. The scope of "personal data" varies across jurisdictions and can range from government ID numbers (e.g., Social Security numbers) to facial or voice recognition data, web activity, and geolocation data.

The data used in AI systems may also come from publicly available sources. Public data sources can present significant legal risks because the datasets may contain copyrighted or trademarked material, such as text, images, video, or code. Using this material without proper permission can constitute copyright or trademark infringement and expose the organization to legal action.

Non-public data, especially user data, may also carry restrictions on use. In many cases, terms of service and privacy policies outline how the data can be used, and the organization must adhere to them. Using user data to train models without explicit permission may result in legal action, reputational damage, and loss of trust.
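
As a rough illustration of how the two concerns above (license restrictions on public data and permission to use user data) might be screened before training, the following Python sketch flags datasets for legal review. The `DatasetRecord` fields, the `legal_review_flags` function, and the short list of non-commercial license identifiers are all hypothetical and chosen for illustration. The output is a prompt for human legal review, not a legal determination.

```python
from dataclasses import dataclass
from typing import List, Optional

# Assumption: a small, illustrative set of SPDX identifiers for Creative Commons
# licenses that restrict commercial use. A real inventory would be maintained
# with legal counsel and would be far more complete.
NON_COMMERCIAL_LICENSES = {"CC-BY-NC-4.0", "CC-BY-NC-SA-4.0", "CC-BY-NC-ND-4.0"}


@dataclass
class DatasetRecord:
    """Hypothetical metadata an organization might track per dataset."""
    name: str
    public_source: bool = False        # obtained from a publicly available source
    license_id: Optional[str] = None   # SPDX identifier, or None if unknown
    contains_user_data: bool = False   # includes personal or user-provided data
    training_permitted: bool = False   # terms of service / privacy policy allow training


def legal_review_flags(ds: DatasetRecord, commercial_use: bool) -> List[str]:
    """Return reasons this dataset should be escalated for legal review."""
    flags = []
    if ds.public_source and ds.license_id is None:
        flags.append("public dataset with unknown license")
    if commercial_use and ds.license_id in NON_COMMERCIAL_LICENSES:
        flags.append("license restricts commercial use")
    if ds.contains_user_data and not ds.training_permitted:
        flags.append("user data lacks permission for model training")
    return flags


# Example usage with made-up datasets.
scraped = DatasetRecord("scraped-images", public_source=True, license_id="CC-BY-NC-4.0")
uploads = DatasetRecord("customer-uploads", contains_user_data=True)
print(legal_review_flags(scraped, commercial_use=True))
print(legal_review_flags(uploads, commercial_use=True))
```

An empty list only means that none of these three simple checks fired. It does not establish that the data is legally safe to use.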

🔍 Public Examples and Common Patterns

- Incident 199: Ever AI Reportedly Deceived Customers about FRT Use in App: Paravision AI (previously Ever AI) allegedly failed to inform customers about the use of their personal images to develop facial recognition software. The company was ordered to delete the facial recognition software by the US Federal Trade Commission.

- Shein Lawsuit: Three artists filed a lawsuit against Shein in 2023, alleging that the company used AI to replicate their art on merchandise. The artists claimed that the company's algorithm identified trending art online, created near-identical copies, and used them to design merchandise.

🛡️ Recommended Mitigations

📐 External Framework Mapping

- IBM Risk Atlas: Data usage rights restrictions risk for AI
- Databricks AI Security Framework: 1.8 Legality of data


Cite this page

Trustible. "Data Legality." Trustible AI Governance Insights Center, 2026. https://trustible.ai/ai-risks/data-legality/
