AI Risk · Legal

Copyright and IP Violations

Generative AI models can output content that violates copyright or IP laws.

📋 Description

Using generative AI, such as models that produce text, images, or music, may lead to copyright and intellectual property (IP) violations. These AI systems are often trained on vast datasets that include copyrighted or legally protected material, raising legal concerns over the training process and the generated outputs.

AI systems like ChatGPT and DALL-E learn from large datasets, which may include protected works used without explicit permission from the copyright holders. For example, AI can produce images that closely imitate the work of a well-known artist or generate text that closely resembles passages from popular books such as J.K. Rowling's Harry Potter series, potentially violating IP laws. The issue is compounded by the difficulty of determining whether the AI’s output is sufficiently original or constitutes a derivative work of the copyrighted material on which it was trained.

Legal experts are still debating the extent to which AI-generated works are protected under current copyright laws. In the U.S., the Copyright Office maintains that only works of human authorship are eligible for copyright protection, which means content generated entirely by AI, without sufficient human creative contribution, cannot be copyrighted. The UK, by contrast, offers limited protection for computer-generated works that have no human author under the Copyright, Designs and Patents Act 1988.

🔍 Public Examples and Common Patterns

- AIID Incident 996: Meta Allegedly Used Books3, a Dataset of 191,000 Pirated Books, to Train LLaMA AI: Meta and Bloomberg allegedly trained AI models, including LLaMA and BloombergGPT, on Books3 without author consent. Lawsuits from authors such as Sarah Silverman and Michael Chabon claim this constitutes copyright infringement. Books3 includes works from major publishers like Penguin Random House and HarperCollins. Meta argues that its AI outputs are not "substantially similar" to the original books, but legal challenges continue.

- AIID Incident 240: GitHub Copilot, Copyright Infringement and Open Source Licensing: GitHub Copilot can suggest source code that is subject to license requirements, so users may incorporate it without attributing it to the rights holder or complying with the license terms.

🛡️ Recommended Mitigations

📐 External Framework Mapping

- IBM Risk Atlas: Copyright infringement risk for AI

Cite this page

Trustible. "Copyright and IP Violations." Trustible AI Governance Insights Center, 2026. https://trustible.ai/ai-risks/copyright-ip-violations/
