Anthropic shredded millions of physical books to train its Claude AI model — and new documents suggest that it was well aware of just how bad it would look if anyone found out.
The secret initiative, called Project Panama, was unearthed last summer in a lawsuit brought by a group of authors against Anthropic, which the company eventually agreed to settle for $1.5 billion in August.
Since then, more about what happened behind the scenes has come to light after a district judge ordered that more case documents be unsealed, according to new reporting from the Washington Post.
The documents revealed how Anthropic leadership viewed books as “essential” to training its AI models, with one co-founder stating it would teach the bots “how to write well” instead of mimicking “low quality internet speak.”
Buying, scanning, and then destroying millions of used books was one way of doing this, and it had the advantage of being both cheap and very possibly legal. The destructive practice exploited a legal concept known as the first-sale doctrine, which allows buyers to do what they want with their purchase without interference from the copyright holder. (This is what allows the secondhand media market to exist.)

And a judge found in August that converting the books from paper to digital contributed to Anthropic’s use of the original texts being “transformative,” crediting the startup with not creating additional physical copies or redistributing existing ones. That was enough to qualify as fair use, and in all, the book-shredding allowed the company to avoid paying authors for their work.
From the way the lawsuit documents tell it, Anthropic turned literally ripping off books into an art form. It used a “hydraulic powered cutting machine” to “neatly cut” the millions of books it got from used book retailers, and then scanned the pages “on high speed, high quality, production level scanners.” Then a recycling company would be scheduled to pick up the eviscerated volumes — because you wouldn’t want to be wasteful, after all.
If this sounds ethically dubious to you, you’re not alone. Anthropic itself seemed self-conscious about how the destructive practice might look: a ready-made symbol of the widespread perception that the industry’s tech is destroying the arts.
“We don’t want it to be known that we are working on this,” a recently unsealed internal planning document from 2024 stated, as quoted by WaPo.
Before it turned to physical books, the company relied on digital ones. In 2021, Anthropic co-founder Ben Mann took it upon himself to download millions of books from LibGen, an online “shadow library” of freely available, pirated texts. The next year, Mann praised a new website called Pirate Library Mirror, which was upfront about the fact that it “deliberately” violated copyright law in most countries. Sharing a link to the site with other employees, Mann enthused that it had launched “just in time!!!” per WaPo. (Anthropic denied using the pirated books to train any of its commercial models. But while Anthropic’s shredding of used books was deemed legal, its use of pirated ones was not, leading to the $1.5 billion settlement.)
Anthropic wasn’t the only company turning books inside-out. In another author lawsuit, documents revealed how Mark Zuckerberg’s Meta also pilfered millions of books from shadow libraries like LibGen, a practice some of its own employees realized was a little suspect.
“Torrenting from a corporate laptop doesn’t feel right,” one Meta engineer wrote in 2023 with a grinning emoji.
Another PR-conscious employee warned about the blowback that could follow if the practice got out.
“If there is media coverage suggesting we have used a dataset we know to be pirated, such as LibGen, this may undermine our negotiating position with regulators on these issues,” they wrote in an internal communication.
More on AI: Top Anthropic Researcher No Longer Sure Whether AI Is Conscious