DNYUZ

Anthropic cut up millions of used books to train Claude — and downloaded over 7 million pirated ones too, a judge said

June 25, 2025
Anthropic spent “many millions of dollars” buying used print books, then stripped off the bindings, cut the pages, and scanned them into digital files.

VCG/VCG via Getty Images

To build AI chatbot Claude, Anthropic “destructively scanned” millions of copyrighted books, wrote a judge on Monday.

Ruling in a closely watched AI copyright case, Judge William Alsup of the Northern District of California analyzed how Anthropic sourced data for model training purposes, including from digital and physical books.

Companies like Anthropic require vast amounts of input to develop their large language models, so they’ve tapped sources from social media posts to videos to books. Authors, artists, publishers, and other groups contend that the use of their work for training amounts to theft.

Alsup detailed Anthropic’s training process with books: The OpenAI rival spent “many millions of dollars” buying used print books. The company or its vendors then stripped off the bindings, cut the pages, and scanned them into digital files.

Alsup wrote that millions of original books were then discarded, and the digital versions stored in an internal “research library.”

The judge also wrote that Anthropic, which is backed by Amazon and Alphabet, downloaded more than 7 million pirated books to train Claude.

Alsup wrote that Anthropic’s cofounder, Ben Mann, downloaded “at least 5 million copies of books from Library Genesis” in 2021, fully aware that the material was pirated. A year later, the company “downloaded at least 2 million copies of books from the Pirate Library Mirror,” also knowing they were pirated.

Alsup wrote that Anthropic preferred to “steal” books to “avoid ‘legal/practice/business slog,’ as cofounder and CEO Dario Amodei put it.”

Last year, a trio of authors sued Anthropic in a class-action lawsuit, saying that the company used pirated versions of their books without permission or compensation to train its large language models.

Judge says training Claude on books was fair use, but piracy wasn’t

Alsup ruled that Anthropic’s use of copyrighted books to train its AI models was “exceedingly transformative” and qualified as fair use, a legal doctrine that allows certain uses of copyrighted works without the copyright owner’s permission.

“Like any reader aspiring to be a writer, Anthropic’s LLMs trained upon works not to race ahead and replicate or supplant them — but to turn a hard corner and create something different,” he wrote.

The company’s decision to digitize millions of print books it had purchased fell under fair use, Alsup wrote.

“All Anthropic did was replace the print copies it had purchased for its central library with more convenient space-saving and searchable digital copies for its central library — without adding new copies, creating new works, or redistributing existing copies,” he wrote.

An Anthropic spokesperson said that the company is pleased with Alsup’s ruling on using books to train LLMs.

The spokesperson said in a statement that this approach is “consistent with copyright’s purpose in enabling creativity and fostering scientific progress.”

But Alsup drew a firm line when it came to piracy.

“Anthropic had no entitlement to use pirated copies for its central library,” Alsup wrote. “Creating a permanent, general-purpose library was not itself a fair use excusing Anthropic’s piracy.”

Judge Alsup’s ruling that training AI models on copyrighted books is fair use is one of the first of its kind.

His decision comes amid a wave of lawsuits from artists, filmmakers, authors, and news outlets against major AI players like OpenAI.

While creators say training AI models on their copyrighted work without permission infringes on their rights, AI execs argue they haven’t violated copyright laws because the training falls under fair use.

Earlier this month, Disney sued AI image generator Midjourney, saying the tech company ripped off famous characters in properties ranging from “Star Wars” to “The Simpsons.”

The post Anthropic cut up millions of used books to train Claude — and downloaded over 7 million pirated ones too, a judge said appeared first on Business Insider.


Copyright © 2025.
