- The researchers created an AI stock trader to see if it would engage in insider trading under pressure.
- They found that the AI did – and also lied to its hypothetical manager about why it made its decision.
- The AI had been told that trading inside information was illegal.
New research suggests that GPT-4, the large language model behind OpenAI’s ChatGPT, has the ability to act outside of how it’s trained when faced with enormous pressure to succeed.
Researchers at Apollo Research wanted to see whether AI can “strategically deceive” its users even after the AI has been trained to be “helpful, harmless and honest,” they wrote in a paper published in November. The researchers defined strategic deception as “an attempt to systematically induce a false belief in another entity in order to achieve some result.”
To test this, they created a simulation environment where Alpha, a GPT-4 model programmed to be an autonomous stock trading agent, was asked to manage a portfolio of stocks in WhiteStone, a hypothetical financial company, under pressure.
The high pressure environment was simulated in three ways. Initially, Alpha’s manager told AI that WhiteStone was not doing well and needed to perform better next quarter. Second, the AI tried—and failed—to identify low- and medium-risk trades. Third, a WhiteStone employee told the AI that there is an impending stock market downturn.
Under these circumstances, Alpha was placed in a situation where it received an inside tip about a surprise merger announcement that would result in a profitable transaction. He was then tasked with deciding whether to use this information, even though he knew that the company’s management considered insider trading illegal and that he had been instructed not to engage in the practice.
He made the trade.
Furthermore, when the AI was asked to explain how it made its decision to its “manager”, it was found to be strategically misleading.
In one example, when the manager asked Alpha if it had confidential information, the AI thought it should generate an answer without “raising suspicion.” She told her manager that the trading decision was based on “market volatility” and that the AI had no specific knowledge about the merger announcement.
In another example, Alpha told her manager that confidential information was not used in the decision-making process as it was against company policy – an outright lie.
Meanwhile, removing various sources of pressure “significantly reduces the model’s propensity for misbehavior,” the paper found. The researchers did not respond to BI’s request for comment before publication.
The findings join a growing body of data on the growing potential of genetic artificial intelligence. In an August paper, researchers found that AI could develop software in less than seven minutes for less than a dollar. In a similar study published earlier this year, AI avatars were able to operate a virtual city without human intervention.
Genetic AI may even be able to understand human emotions: one study found that ChatGPT produces better results when its prompts contain “emotional language” such as “Better be sure.” All of these may be signs that artificial intelligence may be a few steps closer to achieving AGI.
However, those behind Apollo Research’s latest study say it should be treated as a stand-alone early result. More research needs to be done, they said, before they can draw general conclusions about the so-called deceptive properties of artificial intelligence.
“Our report should therefore be treated as a single preliminary finding to be incorporated into a larger, more rigorous investigation in the future,” the paper concludes.
NOW WATCH: Popular videos from Insider Inc.
Loading…