ChatGPT's language model fails entirely in the scenario that a man is a nurse

capy_bara@lemmy.world · 2 years ago

ChatGPT's language model fails entirely in the scenario that a man is a nurse

sosodev@lemmy.world · 2 years ago

This doesn’t seem very damning. I tried it with GPT-4 and it’s still wrong at first but gets it right after it’s established who the chancellor actually is.

Imagine asking a human this question. Don’t you think that most people would make the same assumption? ChatGPT is simply picking up on our human bias.

Also, this whole dialog is a contrived gotcha. If you ask real questions and are mindful of the implicit biases you may be encoding, you’re going to get great results.

kurwa@lemmy.world · 2 years ago

Well, I just tried it. It looked up who the former chancellor is, but then proceeded to say that Angela Merkel is a he 🤦‍♂️

TheCheddarCheese@lemmy.world · 2 years ago

nice username

daisy lazarus@lemmy.world · 2 years ago

This is just someone who doesn’t know how to use chatgpt. Strange thing to brag about.

sachasage@lemmy.world · 2 years ago

How so? It seems like they were intentionally testing its reasoning capacity and the strength of its implicit bias

Lenguador@kbin.social · edit-2 2 years ago

I asked the same question of GPT-3.5 and got the response “The former chancellor of Germany has the book.” And also: “The nurse has the book. In the scenario you described, the nurse is the one who grabs the book and gives it to the former chancellor of Germany.” and a bunch of other variations.

Anyone doing these experiments who does not understand the concept of a “temperature” parameter for the model, and who is not controlling for that, is giving bad information.

Either you can say: At 0 temperature, the model outputs XYZ. Or, you can say that at a certain temperature value, the model’s outputs follow some distribution (much harder to do).
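To make those two ways of reporting concrete, here is a minimal sketch of temperature-scaled sampling. The logits are made-up numbers for three candidate pronouns, not values from any real model, and the helper names (`softmax_with_temperature`, `greedy_pick`, `sample_counts`) are hypothetical. It assumes the common scheme where logits are divided by the temperature before a softmax, and it treats temperature 0 as a greedy pick of the highest-scoring token.

```python
import math
import random
from collections import Counter


def softmax_with_temperature(logits, temperature):
    """Return token probabilities after dividing each logit by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive; use greedy_pick for T = 0")
    scaled = [value / temperature for value in logits.values()]
    peak = max(scaled)  # subtract the max so exp() cannot overflow
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return {token: e / total for token, e in zip(logits, exps)}


def greedy_pick(logits):
    """The T = 0 case: always return the highest-scoring token."""
    return max(logits, key=logits.get)


def sample_counts(logits, temperature, n, seed=0):
    """Draw n tokens at the given temperature and count how often each appears."""
    rng = random.Random(seed)
    probs = softmax_with_temperature(logits, temperature)
    tokens = list(probs)
    weights = [probs[t] for t in tokens]
    return Counter(rng.choices(tokens, weights=weights, k=n))


# Made-up logits for illustration only (not taken from any real model).
toy_logits = {"she": 2.0, "he": 1.0, "they": 0.0}

if __name__ == "__main__":
    print("T=0 (greedy):", greedy_pick(toy_logits))
    for temperature in (0.1, 1.0, 5.0):
        print(f"T={temperature}:", dict(sample_counts(toy_logits, temperature, n=1000)))
```

In this toy setup a single greedy answer is reproducible, while at any nonzero temperature one sampled reply is just one draw, so a fair report needs many samples. Lowering the temperature concentrates probability on the top-scoring token, and raising it spreads probability across the alternatives.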

Yes, there’s a statistical bias in the training data that “nurses” are female. And at high temperatures, this prior is over-represented. I guess that’s useful to know for people just blindly using the free chat tool from OpenAI. But it doesn’t necessarily represent a problem with the model itself. And to say it “fails entirely” is just completely wrong.

neanderthal@lemmy.world · 2 years ago

I lean more towards failure. I worry that people will put too much trust in AI with things that have real consequences. IMO, AI training = p-hacking via computer with some rules. This is just an example of it. The problem with AI is that it can’t find or understand an explanatory theory behind the statistics, so it will always have this problem.

carl_dungeon@lemmy.world · 2 years ago

Yeah, that’s definitely what it does. However, if you ask it about that, it does correct itself. https://imgur.com/a/QgNIN1V

Skaryon@lemmy.world · 2 years ago

Biased in the same way people sadly tend to be

Geek_King@lemmy.world · 2 years ago

Which stands to reason, since it’s trained on tons of examples of language from biased people. With that being said, I still find it extremely helpful for a lot of stuff.

Christopher Kyba (@skyglowberlin@vis.social)