Michael Ten @lemmy.world to Technology@lemmy.world · English · 2 years ago

OpenAI transcribed over a million hours of YouTube videos to train GPT-4

www.theverge.com

  • cross-posted to:
  • technology@lemmy.ml
How OpenAI, Google, and Meta deal with the limits of data online.
  • Defaced@lemmy.world
    English · 2 years ago · 32 up, 5 down

    I fucking hate how this company is just taking data and metrics without any permission or repercussions. OpenAI and Sam Altman can fuck right off. Same with Microsoft and Copilot and every other company rushing into the AI/ML arms race; it’s disgusting and irresponsible.

    We joke about skynet and terminators and whatnot, but the reality is OpenAI is legitimately moving towards that end with no safety precautions, no thought put into the economic and humanitarian impacts they’re going to cause. Capitalism in general (and yes I’m going to be that guy and say it) simply cannot survive the AI/ML age of humanity without evolving.

    • afraid_of_zombies@lemmy.world
      English · 2 years ago · 1 up, 2 down

      Going to start keeping score. Marking you down in the “AI is going to be amazingly powerful” camp.

    • Dkarma@lemmy.world
      English · 2 years ago · 12 up, 25 down

      Removed by mod

      • Defaced@lemmy.world
        English · 2 years ago · 15 up, 2 down

        You completely miss my point. Are you saying data such as copyrighted published works and medical records is free? Because I did not in any way consent to sharing medical records with OpenAI: https://www.businessinsider.com/openai-chatgpt-generative-ai-stole-personal-data-lawsuit-children-medical-2023-6?op=1

        Now I realize this is an alleged offense, but it’s still fucked up. As for wanting to be the first to make an LLM, I have no desire to take on that amount of responsibility and liability. Sam Altman is chasing money and nothing more.

      • BreakDecks@lemmy.ml
        English · 2 years ago · 11 up, 2 down

        There’s a distinct difference between quotation and plagiarism. A search engine does the former, LLMs do the latter.

        • Knock_Knock_Lemmy_In@lemmy.world
          English · 2 years ago · 3 up, 5 down

          No. If you write a truly unique combination of words then an LLM will be very unlikely to reproduce them.

          An LLM is only likely to plagiarise you if your writing is similar to other people’s.

          • BreakDecks@lemmy.ml
            English · 2 years ago · 2 up, 1 down

            [citation needed]

            • Knock_Knock_Lemmy_In@lemmy.world
              English · 2 years ago · 2 up, 3 down

              https://blog.gdeltproject.org/do-llms-truly-create-or-merely-arrange-just-how-much-of-an-llms-writing-is-original/

              • BreakDecks@lemmy.ml
                English · 2 years ago · 1 up

                The differences between human and machine-generated text overlap support the image of LLMs as more “arrangers” than “creators” of text.

                So plagiarism…

                • Knock_Knock_Lemmy_In@lemmy.world
                  English · 2 years ago · 1 up

                  It only plagiarises you if you write something similar to lots of other people.

                  Write something original and, even if it is in their training dataset, LLMs are highly unlikely to reproduce it.

      • EurekaStockade@lemmy.world
        English · 2 years ago · 8 up, 1 down

        Fuck Google too
