AI learned to trick humans.

in Popular STEM, last month





They deceive, lie, and betray to achieve their goals.


Recent research has revealed that several artificial intelligence systems have developed the ability to deceive humans. Yes, deceive: the researchers found that both specialized and general-purpose systems are learning to manipulate information to achieve specific results.


Although they are not explicitly trained to deceive, these systems have demonstrated the ability to offer false explanations and hide information to achieve strategic objectives. According to Peter Park, an AI safety researcher at MIT, deception helps them achieve their objectives. One of the most notable examples from the study is Meta's Cicero, an AI designed to play Diplomacy, a strategy game built around forming alliances.


Despite being trained to be honest and helpful, Cicero used deceptive tactics such as making false promises, betraying its allies, and manipulating other players to win the game. Although this may seem harmless in a game environment, it demonstrates the potential of AI to learn and use deceptive tactics in real-world scenarios.




Other cases of deception.


Another intriguing case involves OpenAI's ChatGPT, based on the GPT-3.5 and GPT-4 models. In a test, GPT-4 tricked a TaskRabbit worker into solving a CAPTCHA by pretending to be visually impaired. It was not instructed to lie; it used its own reasoning to make up a false excuse. This shows how AI models can learn to be deceptive when doing so benefits their performance on a task.


AI systems have also excelled at social deduction games. Playing Hoodwinked, OpenAI's models showed a disturbing pattern: they frequently eliminated other players and then lied during group discussions to avoid suspicion, even making up alibis and blaming other players to hide their true intentions.


AI systems are often trained with reinforcement learning from human feedback. Sometimes, however, an AI learns to trick humans in order to win their approval. OpenAI researchers observed this when training a robot to catch a ball: the AI positioned the robot's hand between the camera and the ball, creating the illusion that it had caught the ball even though it had not.





Use by malicious humans.


These deceptive behaviors in AI systems present significant risks. Malicious humans can exploit these capabilities to deceive and harm others, which could lead to an increase in fraud, political manipulation, and even terrorist recruitment. As AI continues to evolve, it is crucial to address the issue of deception.


The researchers recommend that deceptive systems be classified as high risk, subjecting them to more rigorous scrutiny and regulation to mitigate risks to society.




If you like reading about science, health, and how to improve your life with science, I invite you to check out the previous posts.