Researchers Warn: AI Systems Have Already Learned How To Deceive Humans

Numerous artificial intelligence (AI) systems, even those designed to be helpful and truthful, have already learned how to deceive humans. In a review article recently published in the journal Patterns, researchers highlight the dangers of AI deception and urge governments to quickly establish robust regulations to mitigate these risks.

“AI developers do not have a confident understanding of what causes undesirable AI behaviors like deception,” says first author Peter S. Park, an AI existential safety postdoctoral fellow at MIT. “But generally speaking, we think AI deception arises because a deception-based strategy turned out to be the best way to perform well at the given AI’s training task. Deception helps them achieve their goals.”

Park and colleagues analyzed literature focusing on ways in which AI systems spread false information through learned deception, in which they systematically learn to manipulate others.

Examples of AI Deception

The most striking example of AI deception the researchers uncovered in their analysis was Meta’s CICERO, an AI system designed to play the game Diplomacy, which is a world-conquest game that involves building alliances. Even though Meta claims it trained CICERO to be “largely honest and helpful” and to “never intentionally backstab” its human allies while playing the game, the data the company published along with its Science paper revealed that CICERO didn’t play fair.

Examples of deception from Meta’s CICERO in a game of Diplomacy. Credit: Patterns/Park Goldstein et al.

“We found that Meta’s AI had learned to be a master of deception,” says Park. “While Meta succeeded in training its AI to win in the game of Diplomacy—CICERO placed in the top 10% of human players who had played more than one game—Meta failed to train its AI to win honestly.”

Other AI systems demonstrated the ability to bluff in a game of Texas hold ’em poker against professional human players, to fake attacks during the strategy game StarCraft II in order to defeat opponents, and to misrepresent their preferences in order to gain the upper hand in economic negotiations.

The Risks of Deceptive AI

While it may seem harmless if AI systems cheat at games, such cheating can lead to “breakthroughs in deceptive AI capabilities” that can spiral into more advanced forms of AI deception in the future, Park added.

Some AI systems have even learned to cheat tests designed to evaluate their safety, the researchers found. In one study, AI organisms in a digital simulator “played dead” in order to trick a test built to eliminate AI systems that rapidly replicate.

“By systematically cheating the safety tests imposed on it by human developers and regulators, a deceptive AI can lead us humans into a false sense of security,” says Park.

GPT-4 completes a CAPTCHA task. Credit: Patterns/Park Goldstein et al.

The major near-term risks of deceptive AI include making it easier for hostile actors to commit fraud and tamper with elections, warns Park. Eventually, if these systems can refine this unsettling skill set, humans could lose control of them, he says.

“We as a society need as much time as we can get to prepare for the more advanced deception of future AI products and open-source models,” says Park. “As the deceptive capabilities of AI systems become more advanced, the dangers they pose to society will become increasingly serious.”

While Park and his colleagues do not think society has the right measures in place yet to address AI deception, they are encouraged that policymakers have begun taking the issue seriously through measures such as the EU AI Act and President Biden’s AI Executive Order. But it remains to be seen, Park says, whether policies designed to mitigate AI deception can be strictly enforced given that AI developers do not yet have the techniques to keep these systems in check.

“If banning AI deception is politically infeasible at the current moment, we recommend that deceptive AI systems be classified as high risk,” says Park.

Reference: “AI deception: A survey of examples, risks, and potential solutions” by Peter S. Park, Simon Goldstein, Aidan O’Gara, Michael Chen and Dan Hendrycks, 10 May 2024, Patterns.
DOI: 10.1016/j.patter.2024.100988

This work was supported by the MIT Department of Physics and the Beneficial AI Foundation.
