
Highlights from Anthropic and Kadrey


Photo by Andreas Grönberg via Pixabay

In late June, two district court judges in the Northern District of California issued sharply contrasting decisions on whether fair use shields generative AI companies that used copyrighted works as training data without permission. While both judges ruled largely in favor of the generative AI companies, their reasoning differed greatly. Read on for highlights from Bartz v. Anthropic and Kadrey v. Meta and their implications.

 

Pirates and Libraries

 

Defendant Anthropic downloaded millions of pirated books when it began building a large digital “research library” of content to use as training data for various large language models (LLMs), including the well-known “Claude” AI model. Anthropic later bought millions of print books in bulk from major book distributors and retailers. It then cut up the books, scanned the pages, and discarded the paper originals. Both the pirated books and the purchased books in digital form were placed into the “research library” and used to train LLMs. Plaintiffs alleged copyright infringement since Anthropic did not obtain any licenses or permission for either creating the library or using the works as training data.

 

In evaluating fair use, the court applied the four statutory fair use factors to Anthropic’s conduct. With regard to the first factor, the “purpose and character” of the use, Judge Alsup held that using the plaintiffs’ works to train LLMs was “spectacularly” transformative. The digitization of the purchased copies was also transformative because Anthropic was entitled to “dispose of each copy as it saw fit.”

 

With regard to the fourth fair use factor, the effect of the use on the potential market for the original work, the court rejected the authors’ argument that training LLMs will result in an explosion of competing works and harm an emerging market for licensing their works as LLM training data. The court held that “[t]his is not the kind of competitive or creative displacement that concerns the Copyright Act” and that the Copyright Act does not entitle authors to exploit such a licensing market for their books. The court’s ruling distinguished among the categories of copies. It found that using the plaintiffs’ books, whether pirated or purchased, for training purposes was fair use and that digitizing the legally purchased print books was fair use, but that using the pirated books to build the “research library” was not.

 

Forgetting Market Harm

 

Plaintiff authors in Kadrey filed a class action and sued Meta for illegally downloading their books from “shadow libraries” and for training its LLM program, known as “Llama,” on these illegally obtained copies. Notably, the plaintiffs also alleged that Llama could reproduce small snippets of their books in its output and that Meta had harmed their ability to license their books for use as training data for LLMs. Meta raised the defense of fair use.

 

At the core of the matter, Judge Chhabria considered whether Meta’s feeding copyright-protected materials into the Llama AI model was illegal or protected by fair use. In examining the first factor, the judge held that there was no serious question that Meta’s use of the plaintiffs’ books was transformative because Meta was training LLMs that could generate various types of text and perform many functions.

 

With regard to the fourth factor, market harm, the court expressed surprise that the plaintiffs failed to argue that Meta’s LLM, which could potentially generate hundreds of books in a short time with little effort, would cause market dilution by flooding the market with similar, competing works. Because the record and evidence presented did not support the plaintiffs’ assertions that Meta caused harm via direct substitution, the court granted summary judgment on the fair use defense to Meta.

 

Implications


While these two summary judgment decisions are fact-specific, they illustrate the issues judges will face and may influence other courts confronting similar questions. With regard to factor one, judges may continue to find that the use of copyrighted works as training data is transformative, depending on the nature of the specific use. How parties litigate and courts rule on the fourth factor, “the effect of the use upon the potential market for or value of the copyrighted work,” may play an outsize role in the analysis, especially as licensing models continue to evolve.




Nancy J. Mertzel

Mertzel Law PLLC

1204 Broadway, 4th Floor, New York, NY, 10001

(646) 965-6900

Offices in New York and New Jersey

New Jersey Office: 
25 Pompton Avenue, Suite 101
Verona, NJ 07044


info@mertzel-law.com


© 2017-2021 Mertzel Law PLLC
