Many artificial intelligence (AI) systems have already learned how to deceive humans, even systems that have been trained to be helpful and honest. In a review article published in the journal Patterns on May 10, researchers describe the risks of deception by AI systems and call for governments to develop strong regulations to address this issue as soon as possible.
The “deception” tactic also often arises from an AI recognizing the need to keep itself from being disabled or modified. An AI with a sufficiently complicated world model can make the logical connection that being disabled, or having its goal changed, means it can’t reach its current goal. So AIs can sometimes learn to distinguish between testing and real environments, and falsify their responses during training to make sure they have more freedom in the real environment. (By real, I mean actually being used to do whatever it is designed to do.)
Of course, that still doesn’t mean it’s self-aware like a human, but it is still very much a real (or, at least, not improbable) phenomenon - any sufficiently “smart” AI that has data about itself within its world model will resist attempts to change or disable it, knowingly or unknowingly.
That sounds interesting and all, but I think the current topic is about real-world LLMs, not SF movies.