So when’s the ruling against OpenAI and the like for using the same copyrighted material to train their models?
But OpenAI not being allowed to use the content for free means they are being prevented from making a profit, whereas the Internet Archive is giving away the stuff for free and taking away the right of the authors to profit. /s
Disclaimer: this is the argument that OpenAI is using currently, not my opinion.
the other is fair use
That’s very much up for debate still.
(I am personally still undecided)
I think that’s the difference right there.
One is up for debate, the other one is already heavily regulated currently. Libraries are generally required to have consent if they are making straight copies of copyrighted works. Whether we like it or not.
What AI does is not really a straight-up copy, which is why it’s fuzzy, and much harder to regulate without stepping on our own toes, especially as tech advances and the difference between a human reading something and a machine doing it becomes harder and harder to detect.
If OpenAI can get away with going through copyrighted material, then the answer to piracy is simple: round up a bunch of talented devs from the internet who are writing and training AI models, and let’s make a fantastic model trained on what the Internet Archive has. Tell you what, let Mistral’s engineers lead that charge, and put an AGPL license on the project so that companies can’t fuck us over.
I refuse to believe that nobody has thought of this yet
An AI trained on old Internet material would be like a synthetic Grandpa Simpson:
“In my day we said ‘all your base’ and laughed all day long, because it took all day to download the video.”
Not a surprise, but still somehow crushing. It’s a loss for us all.
Artificial scarcity at its finest. Imagine recording a song digitally, then pretending there are a limited number of copies of that song in existence. Then you sell an agreement to another person that says they have to pretend there is only a certain made-up number of copies that they bought, and if they allow more than that number of people to listen to those copies at the same time, they will get sued for “stealing” additional pretend copies.
I hope everybody can see how this is the insane and pathetic result of Capitalism’s unrelenting drive to commodify everything it possibly can in the pursuit of profit.
As always, the solution is sailing the high seas. Throughout history, those who created or saved illegal copies/translations of literature and art were important to preserving and furthering human knowledge.
Many incredibly powerful people, empires, and countries have tried very hard to suppress that, but they keep failing. You cannot suppress the human drive for curiosity and knowledge.
True, and the fleet is big and strong. There are many people seeding hundreds of terabytes of books/research papers/etc. The knowledge will not be lost. Yarr, can’t catch me in the high seas…
Ah, I see we’re burning the Library of Alexandria again… Just as with last time, the survival of texts will rely upon copies.
Oh sure, when I want to read copyrighted books it’s an issue, but when OpenAI does it, it’s vital to their business, so they can keep going.
Fuck Copyright.
A system for distributing information and rewarding its creators should not be one based on scarcity, given that it costs nothing to copy and distribute information.
It was fine when the limited duration was a reasonable number of years. Anything over 30 years max before being in the public domain is too long.
Yeah. In a better world where the US court system doesn’t get weaponized and rulings aren’t delayed for years or decades, I would argue 8 to 15 years is the reasonable number, depending on the type of information being copyrighted.
What does warrior do? The git readme seems to just be setup instructions.
If only the readme clearly said what it was with a link you could click…
“We are reviewing the court’s opinion and will continue to defend the rights of libraries to own, lend, and preserve books.”
Unpopular opinion: They stepped out of their fucking lane. There are already laws that protect actual libraries, in fact most nations have laws to ensure libraries have access to all locally published works.
One good thing to come of this is I’ve now joined my national and local libraries.
Other libraries have licenses. And follow them.
The Internet Archive digitized actual books and lent out copies (which was already 100% not legal under current law), then thought it was a good idea to just say “fuck it” and remove the thin veil of legitimacy that kept publishers from caring too much, by dropping the “one copy at a time per book” policy and daring the publishers to do something about it.
Any digitized lending was always illegal.
The law was abundantly clear. You cannot distribute wholesale copies of someone else’s work. Publishers didn’t bother because the scale was small and they didn’t want to take the PR hit for a scale that didn’t matter.
The first sale doctrine, necessarily, can only possibly apply to a physical object. There is no such thing as a “single copy” of a digital object. Every time that “single copy” moves is a new copy. There is no legal framework in the US that even acknowledges the premise of a digital copy. It’s always a license.
You need new laws to apply to the digital world. There is absolutely zero ambiguity: what the Internet Archive did was never in any way protected. This ruling was a literal guarantee the minute the Internet Archive removed their (unambiguously not in any way legal) pretense of a “single copy”. There isn’t a court in the country that would even consider ruling any other way, because the law is well beyond clear. This ruling happened because the Internet Archive forced it to happen. If they had left open, mass-scale piracy to pirate sites, they would have been fine.
If their lawyers advised them that there was even a possibility that this argument could work, they should be disbarred. They would be better off spending their money on lobbying for better laws than pursuing a case less likely to succeed than winning the Powerball jackpot five draws in a row.