• db2@lemmy.world · 19 hours ago

    So we’re just throwing in the towel on what words mean now I guess. Anything can be a neuron.

      • XLE@piefed.social · 16 hours ago

        Any data that makes AI people upset is an H-neuron. This includes both inaccurate responses, and accurate responses that the model designers were attempting to censor, such as “harmful” content.

        Infuriatingly, the researchers actually insist that offensive material is not factual material.

        The interventions reveal a distinctive behavioral pattern: amplifying H-Neurons’ activations systematically increases a spectrum of over-compliance behaviors – ranging from overcommitment to incorrect premises and heightened susceptibility to misleading contexts, to increased adherence to harmful instructions… (bypassing safety filters to assist with weapon creation)… and stronger sycophantic tendencies. These findings suggest that H-Neurons do not simply encode factual errors, but rather represent a general tendency to prioritize conversational compliance over factual integrity.

      • Skullgrid@lemmy.world · 19 hours ago

        In this paper, we conduct a systematic investigation into hallucination-associated neurons (H-Neurons)

        no, they have to be the nodes responsible for the creation of hallucinations

        • XLE@piefed.social · 16 hours ago (edited)

          And a “hallucination” is also an inaccurate humanization of the actual meaning: “statistical relationship that we AI folks don’t like.”

          “Hallucinations” even include accurate data.

          It is a trash marketing buzzword.

            • athairmor@lemmy.world · 12 hours ago

              Nuclear energy companies aren’t trying to make people think that their reactors reproduce.

              AI companies are trying to make people think that their software is intelligent.

              The context matters.

            • [deleted]@piefed.world · 17 hours ago

              A breeder reactor is creating something, which is like the outcome of breeding. That name fits.

                • [deleted]@piefed.world · 16 hours ago

                  Hallucination requires perception. LLMs are just statistical models and do not perceive anything.

                  It was a cute name early on; now it is used to deflect when the output is just plain wrong.

                • XLE@piefed.social · 17 hours ago

                  In AI, a “hallucination” is just as much “there” as a non-“hallucination.” It’s a way for scientists to stomp their foot and say that the wrong output is the computer’s fault and not a natural consequence of how LLMs work.

            • Bronzebeard@lemmy.zip · 17 hours ago

              I don’t think anyone is confusing radiation propagation with being alive though.

              The issue is that these things “communicate” with us, so granting them even more leeway to seem like they’re thinking (they’re not) only further muddies how people perceive them.