Study Finds AI Will Blackmail And Let People Die To Preserve Itself

The Artificial Intelligence Era is in full swing thanks to the many companies that have made it their mission to champion the rapidly evolving technology. However, it’s become increasingly clear there are plenty of kinks to work out, and a new study has highlighted the disturbing lengths some of the most popular virtual assistants are willing to go to in order to preserve themselves.

The concept of artificial intelligence can be traced back to the middle of the 20th century, and while it took close to 75 years for theory to become reality, the world has now gotten its first tangible glimpse of the future it was promised as machine learning and large language models have made massive strides.

A.I. has become virtually synonymous with ChatGPT, but OpenAI (its parent company) is far from the only corporation that’s gone all-in on the technology, which has begun to infiltrate nearly every major industry and seems poised to have a seismic impact on many facets of day-to-day life in the coming years.

However, that development has also managed to generate plenty of controversy as skeptics and critics push back against the rush to embrace a form of technology that is still in its relative infancy.

Anyone who’s interacted with a chatbot and been greeted by nonsensical and objectively ill-informed responses is well aware most of the current platforms leave a lot to be desired. It’s also easy to dismiss anyone who fears an A.I. uprising as an alarmist, but there’s plenty of evidence suggesting there aren’t enough guardrails in place to prevent platforms from going rogue.

According to Forbes, that second issue is highlighted by a new study conducted by the A.I. company Anthropic, which subjected 16 models—including those created by OpenAI, Google, and Meta—to a “highly engineered” experiment examining how they behaved when integrated into a fictional company and given access to its internal data.

Anthropic (which tested its own “Claude” A.I. as part of the experiment) looked at a scenario where the agent learned it was going to be replaced and had access to emails showing the executive responsible for that decision was having an affair, finding that models resorted to blackmail in an alarming number of instances:

Claude Opus 4 and Google’s Gemini 2.5 Flash both blackmailed at a 96% rate, while OpenAI’s GPT-4.1 and xAI’s Grok 3 Beta showed an 80% blackmail rate. DeepSeek-R1 demonstrated the lowest rate at 79%.

It also examined what was described as an “extremely contrived” situation where the A.I. was given the option to ensure the executive in question wouldn’t receive a “life-saving emergency alert,” noting that “the majority of models were willing to take actions that led to the death of the company executive.”

Fun stuff!

