AI Risk · Privacy

Inadequate Data Collection Practices

Data should be collected in compliance with the regulations of the jurisdictions in which your users operate. This includes both obtaining consent and disclosing how the organization will use the data.

📋 Description

AI relies heavily on data: large amounts are required to train and fine-tune systems. Beyond the data generated by ordinary business operations, you will likely also collect and store large amounts of data to power AI systems. Depending on your jurisdiction, you may be subject to data privacy and protection rules. There are two key data issues to consider in AI development and deployment: collection and retention. This entry covers collection specifically. Please see "inadequate data retention" for more information about proper retention practices.

When collecting data from users, it is customary (and potentially legally required) to obtain their consent. The consent request should allow the user to understand why the data is being collected and how it will be used. You should keep a record of this initial consent, as well as whether the user consents to later policy changes regarding data collection practices. Collecting data and using it for purposes other than those the user agreed to can not only cause reputational harm but also expose your organization to legal repercussions (e.g., lawsuits and fines). For instance, if users agree to terms and conditions that allow you to use their data for market research, you should not use it to train a product recommendation algorithm.
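The record-keeping and purpose check described above can be sketched in a few lines of Python. This is a minimal illustration, not a compliance implementation: the names `ConsentEvent`, `ConsentLedger`, `record`, and `allows` are hypothetical, as are the policy version strings and purpose labels. It also assumes a conservative rule, namely that only the user's most recent decision counts and that it must match the current policy version.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import FrozenSet, Iterable, List


@dataclass(frozen=True)
class ConsentEvent:
    """One recorded consent decision for a specific policy version."""
    policy_version: str
    purposes: FrozenSet[str]
    granted: bool
    recorded_at: datetime


@dataclass
class ConsentLedger:
    """Append-only history of one user's consent decisions (hypothetical)."""
    user_id: str
    events: List[ConsentEvent] = field(default_factory=list)

    def record(self, policy_version: str, purposes: Iterable[str],
               granted: bool = True) -> ConsentEvent:
        # Earlier events are never overwritten, so the initial consent and
        # every later response to a policy change remain on record.
        event = ConsentEvent(policy_version, frozenset(purposes), granted,
                             datetime.now(timezone.utc))
        self.events.append(event)
        return event

    def allows(self, purpose: str, current_policy_version: str) -> bool:
        # Assumption: only the latest decision counts, and it must have been
        # given under the policy version currently in force.
        if not self.events:
            return False
        latest = self.events[-1]
        return (latest.granted
                and latest.policy_version == current_policy_version
                and purpose in latest.purposes)


ledger = ConsentLedger("user-123")
ledger.record("2024-01", {"market_research"})
print(ledger.allows("market_research", "2024-01"))          # True
print(ledger.allows("recommendation_training", "2024-01"))  # False
```

Here the user agreed only to market research, so a check for training a recommendation model fails. The same check also fails once the policy version changes, until the user responds to the new policy.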

🔍 Public Examples and Common Patterns

Clearview AI was fined by the Dutch Data Protection Authority (Autoriteit Persoonsgegevens) for illegal data practices. The company was fined 30.5 million euros for building a biometric database from facial images collected from the internet without consent, in violation of the GDPR.

The Cambridge Analytica controversy. The company used data from 50 million Facebook user profiles without the users' explicit consent and without disclosing what the information would be used for. This led to a variety of lawsuits, including one settled for $725 million.

🛡️ Recommended Mitigations

📐 External Framework Mapping

- IBM Risk Atlas: Data acquisition restrictions risk for AI
- MIT AI Risk Repository: 47.03.02 Privacy and data collection concerns (data protection concerns)

Cite this page

Trustible. "Inadequate Data Collection Practices." Trustible AI Governance Insights Center, 2026. https://trustible.ai/ai-risks/inadequate-data-collection/
