it's your old friend, deadly neurotoxin

whileloop@lemmy.world · 1 year ago

it's your old friend, deadly neurotoxin

fishos@lemmy.world · 1 year ago

Except, to my understanding, it wasn’t an LLM. It was a protein mapping model or something similar. And what they did was instead of telling it “run iterations and select the things that are beneficial based on XYZ”, they said “run iterations and select based on non-beneficial XYZ”.

They ran a protein coding type model and told it to prioritize HARMFUL results over good ones, giving it results that would cause harm.

Now, yes, those still need to be verified. But it wasn’t just “making things up”. It was using real data to iterate faster than a human would. Very similar to the Folding@home program.

Steeve@lemmy.ca · 1 year ago

Oh neat, thanks for the info! Wrongly assumed this was the latest of the ChatGPT clickbait articles jumping on LLM paranoia, I’ll correct my comment. Machine learning models like this have been around for a long time, I helped build one almost a decade ago for fraud detection (although it did suck lol), but I guess they’re only making headline news now.

fishos@lemmy.world · edit-2 1 year ago

No problem. I’m totally on board with the “LLMs aren’t the AI singularity” page. This one is actually kinda scary to me because it shows how easily you can take a model/simulation and instead of asking “how can you improve this?”, you can also ask “how can I make this worse?”. The same tool used for good can easily be used for bad when you change the “success conditions” around. Now it’s not the tech’s fault, of course. It’s a tool, and it comes down to how it’s used. But it shows how easily a tool like this can be used in the wrong ways with very little “malicious” action necessary.
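
To make the “flip the success conditions” point concrete, here’s a toy sketch. It is not the model from the article: `toy_score` and `search` are made-up names, the “candidates” are just lists of four numbers, and the score is a stand-in for whatever a real predictor would output. The only thing it shows is that the same generate-and-select loop chases opposite results when you negate the objective.

```python
import random


def toy_score(candidate):
    # Hypothetical stand-in for a learned predictor: higher means "more harmful".
    return sum(candidate) / len(candidate)


def search(objective, rounds=50, pool_size=20, seed=0):
    """Generic select-and-mutate loop; `objective` decides what counts as success."""
    rng = random.Random(seed)
    pool = [[rng.random() for _ in range(4)] for _ in range(pool_size)]
    for _ in range(rounds):
        # Keep the half of the pool that scores best under the chosen objective.
        pool.sort(key=objective, reverse=True)
        survivors = pool[: pool_size // 2]
        # Refill the pool with slightly perturbed copies of the survivors.
        children = [
            [min(1.0, max(0.0, x + rng.gauss(0, 0.05))) for x in parent]
            for parent in survivors
        ]
        pool = survivors + children
    return max(pool, key=objective)


# Same loop, same starting pool; only the sign of the objective differs.
avoid_harm = search(lambda c: -toy_score(c))
seek_harm = search(lambda c: toy_score(c))

print("penalising the score:", round(toy_score(avoid_harm), 3))
print("rewarding the score: ", round(toy_score(seek_harm), 3))
```

Nothing in the loop itself knows whether it’s being used for good or bad; the one-character change is in the lambda passed as `objective`.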

Square Singer@feddit.de · 1 year ago

The thing is, if you run these tools to find e.g. cures for a disease, they will also spit out 40k possible matches, and of these only a handful will actually work and become real medicine.

I guess harming might be a little easier than healing, but claiming that the output is actually 40k working neurotoxins is clickbaity, misleading and incorrect.