• crater2150@feddit.org
    2 days ago

    What exactly would you checksum? All intermediate states that weren’t committed, and all test run parameters and outputs? If so, how would you use that to detect an LLM? The current agentic LLM tools also do several edits and run tests for the thing they’re writing, then edit more until their tests work.

    So the presence of test runs and intermediate states isn’t really indicative of a human writing code, and I’m skeptical that distinguishing between the steps a human would take and the steps an LLM would take is any easier or quicker than distinguishing based on the end result.

    • Jyek@sh.itjust.works
      2 days ago

      You could timestamp changes and progress to a file, record test results and output, and compute an approximate algorithmic confidence rating for how bespoke the process of writing that code was. Even agentic AI rapidly spits out code like a machine would, whereas humans take time and think about things as they go. They make typos and go back to correct them. Code tests fail, and debugging looks different between an agent and a human. We need to fingerprint how agents write code, then compare what agentic code looks like when processed through this sort of validation with what it looks like when humans do the same.
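
      A minimal sketch of what such a rating could look like, assuming the editor has already logged timestamped edit events. The `EditEvent` record, the `human_likeness_score` function, and both thresholds (15 characters per second, 10% corrections) are hypothetical and unvalidated; they only illustrate the pacing-and-typo idea and are not a working detector.

```python
from dataclasses import dataclass


@dataclass
class EditEvent:
    """One logged edit (hypothetical record format)."""
    timestamp: float    # seconds since the session started
    chars_added: int    # characters inserted by this edit
    chars_deleted: int  # characters removed by this edit


def human_likeness_score(events, fast_cps=15.0, correction_target=0.1):
    """Return a rough score in [0, 1]; higher means more human-like pacing.

    fast_cps and correction_target are arbitrary illustrative thresholds.
    """
    if len(events) < 2:
        return 0.0  # not enough data to say anything

    # Signal 1: share of edits typed no faster than fast_cps chars/second.
    slow_edits = 0
    for prev, cur in zip(events, events[1:]):
        gap = cur.timestamp - prev.timestamp
        rate = cur.chars_added / gap if gap > 0 else float("inf")
        if rate <= fast_cps:
            slow_edits += 1
    pacing = slow_edits / (len(events) - 1)

    # Signal 2: share of edits that delete something (typo corrections),
    # saturating once it reaches correction_target.
    corrections = sum(1 for e in events if e.chars_deleted > 0) / len(events)
    correction_signal = min(1.0, corrections / correction_target)

    # Equal weighting of both signals, chosen only for simplicity.
    return 0.5 * pacing + 0.5 * correction_signal


# Made-up sessions: slow typing with one correction vs. large instant dumps.
human_session = [
    EditEvent(0.0, 3, 0),
    EditEvent(1.0, 4, 0),
    EditEvent(2.5, 0, 2),
    EditEvent(4.0, 5, 0),
]
agent_session = [
    EditEvent(0.0, 400, 0),
    EditEvent(0.1, 350, 0),
    EditEvent(0.2, 500, 0),
]
print(human_likeness_score(human_session))
print(human_likeness_score(agent_session))
```

      The two signals here are only the ones named above (pacing and corrections); test failures and debugging patterns would need their own logged events.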

      • crater2150@feddit.org
        2 days ago

        This basically amounts to a key/interaction logger in the IDE. I suspect that would keep many people from contributing to projects that require something like it. At least, I wouldn’t install such a plug-in.

        • Jyek@sh.itjust.works
          2 days ago

          It would be a keylogger within the IDE. How else do you prove you were the one doing the work? Otherwise, AI slop. I guess pick your poison.