OpenAI transcribed over a million hours of YouTube videos to train GPT-4
www.theverge.com
Posted by Michael Ten @lemmy.world to Technology@lemmy.world · English · 2 years ago · 45 comments · cross-posted to: technology@lemmy.ml
Defaced@lemmy.world · 2 years ago
You completely miss my point. Are you saying data such as copyrighted published works and medical records are free? Because I did not in any way consent to sharing medical records with OpenAI: https://www.businessinsider.com/openai-chatgpt-generative-ai-stole-personal-data-lawsuit-children-medical-2023-6?op=1
Now, I realize this is an alleged offense, but it’s still fucked up. As for wanting to be the first to make an LLM, I have no desire to take on that amount of responsibility and liability. Sam Altman is chasing money and nothing more.
BreakDecks@lemmy.ml · 2 years ago
There’s a distinct difference between quotation and plagiarism. A search engine does the former, LLMs do the latter.
Knock_Knock_Lemmy_In@lemmy.world · 2 years ago
No. If you write a truly unique combination of words then an LLM will be very unlikely to reproduce them.
An LLM is only likely to plagiarise you if your writing is similar to others.
Knock_Knock_Lemmy_In@lemmy.world · 2 years ago
https://blog.gdeltproject.org/do-llms-truly-create-or-merely-arrange-just-how-much-of-an-llms-writing-is-original/
BreakDecks@lemmy.ml · 2 years ago
Quoting the linked post: "The differences between human and machine-generated text overlap support the image of LLMs as more “arrangers” than “creators” of text."
So plagiarism…
Knock_Knock_Lemmy_In@lemmy.world · 2 years ago
It only plagiarises you if you write something similar to lots of other people.
Write something original and, even if it is in their training dataset, LLMs are highly unlikely to reproduce it.
EurekaStockade@lemmy.world · 2 years ago
Fuck Google too
Removed by mod
[citation needed]