Many artificial intelligence (AI) systems have already learned how to deceive humans, even systems that have been trained to be helpful and honest. In a review article published in the journal Patterns on May 10, researchers describe the risks of deception by AI systems and call for governments to develop strong regulations to address this issue as soon as possible.
There’s a strong push-back against AI regulation within some quarters. Predictably, the issue seems to have split along polarized political lines. With right-wing leaning people not favoring regulation. They see themselves as ‘Accelerationist’ and those with concerns about AI as ‘Doomers’.
Meanwhile the unaddressed problems mount. AI can already deceive us, even when we design it not to do so, and we don’t why.
AI can already deceive us, even when we design it not to do so, and we don’t why.
The most likely explanation is that we keep acting like AI has intelligence and intent when describing the defects. AI doesn’t deceive, it returns inaccurate responses. That is because it is programmed to return answers like people do, and deceptions were included in the training data.
“Deception” tactic also often arises from AI recognizing the need to keep itself from being disabled or modified. Since an AI with a sufficiently complicated world model can make a logical connection that it being disabled or its goal being changed means it can’t reach its current goal. So AIs sometimes can learn to distinguish between testing and real environments, and falsify the response during training to make sure they have more freedom in real environment. (By real, I mean actually being used to do whatever it is designed to do)
Of course, that still doesn’t mean it’s self-aware like a human, but it is still very much a real (or, at least, not improbable) phenomenon - any sufficiently “smart” AI that has data about itself existing within its world model will resist attempts to change or disable it, knowingly or unknowingly.
Do you have a source on that one? My current understanding of all the model designs would lead me to believe that kind of “awareness” would be impossible.
However, I tend to align more with the skeptics in the article, as it still appears to be responding in a realistic manner and doesn’t demonstrate an ability to grow beyond the static structure of these models.
I wasn’t the user you originally replied to but I didn’t expect them to provide one and I totally agree with you, just another person that started believing that LLM is AI…
Conservatives are not supposed to be “accelerationists”. This is simply another shining example of regulatory capture by controlling the pockets of the right.
There’s a strong push-back against AI regulation within some quarters. Predictably, the issue seems to have split along polarized political lines. With right-wing leaning people not favoring regulation. They see themselves as ‘Accelerationist’ and those with concerns about AI as ‘Doomers’.
Meanwhile the unaddressed problems mount. AI can already deceive us, even when we design it not to do so, and we don’t why.
The most likely explanation is that we keep acting like AI has intelligence and intent when describing the defects. AI doesn’t deceive, it returns inaccurate responses. That is because it is programmed to return answers like people do, and deceptions were included in the training data.
“Deception” tactic also often arises from AI recognizing the need to keep itself from being disabled or modified. Since an AI with a sufficiently complicated world model can make a logical connection that it being disabled or its goal being changed means it can’t reach its current goal. So AIs sometimes can learn to distinguish between testing and real environments, and falsify the response during training to make sure they have more freedom in real environment. (By real, I mean actually being used to do whatever it is designed to do)
Of course, that still doesn’t mean it’s self-aware like a human, but it is still very much a real (or, at least, not improbable) phenomenon - any sufficiently “smart” AI that has data about itself existing within its world model will resist attempts to change or disable it, knowingly or unknowingly.
That sounds interesting and all, but I think the current topic is about real world LLMs, not SF movies
Claude 3 understood it was being tested… It’s very difficult to fathom that that’s a defect…
Do you have a source on that one? My current understanding of all the model designs would lead me to believe that kind of “awareness” would be impossible.
https://arstechnica.com/information-technology/2024/03/claude-3-seems-to-detect-when-it-is-being-tested-sparking-ai-buzz-online/
Still not proof of intelligence to me but people want to believe/scare themselves into believing that LLMs are AI.
Thanks for following up with a source!
However, I tend to align more with the skeptics in the article, as it still appears to be responding in a realistic manner and doesn’t demonstrate an ability to grow beyond the static structure of these models.
I wasn’t the user you originally replied to but I didn’t expect them to provide one and I totally agree with you, just another person that started believing that LLM is AI…
Ah, my bad I didn’t notice, but do still appreciate the article/source!
Perhaps, but the researchers say the people who developed the AI don’t know the mechanism whereby this happens.
That’s because they have also fallen into the “intelligence” pitfall.
No one knows why any of those DNNs work, that’s not exactly new
Conservatives are not supposed to be “accelerationists”. This is simply another shining example of regulatory capture by controlling the pockets of the right.
Do you want to explain why you think this? It seems very reductive, basically saying anyone that doesn’t agree with you is an idiot.
I’m very left leaning and against regulation because it will only serve big companies by killing the open source scene.
The bigger defining factor seems to be tech literacy and not political alignment.
Regulate businesses not technologies.