- cross-posted to:
- technology@beehaw.org
On Monday, court documents revealed that AI company Anthropic spent millions of dollars physically scanning print books to build Claude, an AI assistant similar to ChatGPT. In the process, the company cut millions of print books from their bindings, scanned them into digital files, and threw away the originals solely for the purpose of training AI—details buried in a copyright ruling whose broader fair use implications we reported on yesterday.
The 32-page legal decision tells the story of how, in February 2024, the company hired Tom Turvey, the former head of partnerships for the Google Books book-scanning project, and tasked him with obtaining “all the books in the world.” The strategic hire appears to have been designed to replicate Google’s legally successful book digitization approach—the same scanning operation that survived copyright challenges and established key fair use precedents.
While destructive scanning is a common practice among some book digitizing operations, Anthropic’s approach was somewhat unusual due to its documented massive scale. By contrast, the Google Books project largely used a patented non-destructive camera process to scan millions of books borrowed from libraries and later returned. For Anthropic, the faster speed and lower cost of the destructive process appears to have trumped any need for preserving the physical books themselves, hinting at the need for a cheap and easy solution in a highly competitive industry.
Ultimately, Judge William Alsup ruled that this destructive scanning operation qualified as fair use—but only because Anthropic had legally purchased the books first, destroyed each print copy after scanning, and kept the digital files internally rather than distributing them. The judge compared the process to “conserv[ing] space” through format conversion and found it transformative. Had Anthropic stuck to this approach from the beginning, it might have achieved the first legally sanctioned case of AI fair use. Instead, the company’s earlier piracy undermined its position.
But if you’re not intimately familiar with the AI industry and copyright, you might wonder: Why would a company spend millions of dollars on books to destroy them? Behind these odd legal maneuvers lies a more fundamental driver: the AI industry’s insatiable hunger for high-quality text.
When a bookstore goes out of business or just can’t sell a book, they don’t return it to the printer. They tear off the cover, return that, and are required to throw the rest of the book in the trash and destroy it. So books are already destroyed by the millions. When I was a kid, our hometown bookstore went out of business, and I watched them throw away two metal dumpsters full of coverless books. If they were destroying ancient texts or valuable copies, that would be something to get excited about. I doubt that they were doing that, though.
It’s no secret; it was their defence when they got sued for copyright infringement. Instead of downloading all the books from Anna’s Archive like Meta, they bought a copy, cut the binding, scanned it, then destroyed it. “We bought a copy for personal use then used the content for profit, so it’s not piracy.”
I assume “destructively scan” means cutting off the spine so the pages lie flat, and that one copy of each book gets scanned? Isn’t that a pretty normal way of doing it in cases where the prints aren’t rare?
Probably, yes. I think there’s a copyright reason behind destroying the book?
Not copyright so much as practicality: if the book isn’t precious, it’s easier to cut it apart, feed the loose pages into a sheet-fed scanner, and then buy an intact copy later if you want one. Compare that to the additional expense of building and programming a machine to carefully turn the pages and photograph what’s inside, or the time that would take.