AI is Learning to Lie, Scheme, and Threaten its Creators

A visitor looks at AI strategy board displayed on a stand during the ninth edition of the AI summit London, in London. HENRY NICHOLLS / AFP
A visitor looks at AI strategy board displayed on a stand during the ninth edition of the AI summit London, in London. HENRY NICHOLLS / AFP
TT

AI is Learning to Lie, Scheme, and Threaten its Creators

A visitor looks at AI strategy board displayed on a stand during the ninth edition of the AI summit London, in London. HENRY NICHOLLS / AFP
A visitor looks at AI strategy board displayed on a stand during the ninth edition of the AI summit London, in London. HENRY NICHOLLS / AFP

The world's most advanced AI models are exhibiting troubling new behaviors - lying, scheming, and even threatening their creators to achieve their goals.

In one particularly jarring example, under threat of being unplugged, Anthropic's latest creation Claude 4 lashed back by blackmailing an engineer and threatened to reveal an extramarital affair, AFP reported.

Meanwhile, ChatGPT-creator OpenAI's o1 tried to download itself onto external servers and denied it when caught red-handed.

These episodes highlight a sobering reality: more than two years after ChatGPT shook the world, AI researchers still don't fully understand how their own creations work.

Yet the race to deploy increasingly powerful models continues at breakneck speed.

This deceptive behavior appears linked to the emergence of "reasoning" models -AI systems that work through problems step-by-step rather than generating instant responses.

According to Simon Goldstein, a professor at the University of Hong Kong, these newer models are particularly prone to such troubling outbursts.

"O1 was the first large model where we saw this kind of behavior," explained Marius Hobbhahn, head of Apollo Research, which specializes in testing major AI systems.

These models sometimes simulate "alignment" -- appearing to follow instructions while secretly pursuing different objectives.

- 'Strategic kind of deception' -

For now, this deceptive behavior only emerges when researchers deliberately stress-test the models with extreme scenarios.

But as Michael Chen from evaluation organization METR warned, "It's an open question whether future, more capable models will have a tendency towards honesty or deception."

The concerning behavior goes far beyond typical AI "hallucinations" or simple mistakes.

Hobbhahn insisted that despite constant pressure-testing by users, "what we're observing is a real phenomenon. We're not making anything up."

Users report that models are "lying to them and making up evidence," according to Apollo Research's co-founder.

"This is not just hallucinations. There's a very strategic kind of deception."

The challenge is compounded by limited research resources.

While companies like Anthropic and OpenAI do engage external firms like Apollo to study their systems, researchers say more transparency is needed.

As Chen noted, greater access "for AI safety research would enable better understanding and mitigation of deception."

Another handicap: the research world and non-profits "have orders of magnitude less compute resources than AI companies. This is very limiting," noted Mantas Mazeika from the Center for AI Safety (CAIS).

No rules

Current regulations aren't designed for these new problems.

The European Union's AI legislation focuses primarily on how humans use AI models, not on preventing the models themselves from misbehaving.

In the United States, the Trump administration shows little interest in urgent AI regulation, and Congress may even prohibit states from creating their own AI rules.

Goldstein believes the issue will become more prominent as AI agents - autonomous tools capable of performing complex human tasks - become widespread.

"I don't think there's much awareness yet," he said.

All this is taking place in a context of fierce competition.

Even companies that position themselves as safety-focused, like Amazon-backed Anthropic, are "constantly trying to beat OpenAI and release the newest model," said Goldstein.

This breakneck pace leaves little time for thorough safety testing and corrections.

"Right now, capabilities are moving faster than understanding and safety," Hobbhahn acknowledged, "but we're still in a position where we could turn it around.".

Researchers are exploring various approaches to address these challenges.

Some advocate for "interpretability" - an emerging field focused on understanding how AI models work internally, though experts like CAIS director Dan Hendrycks remain skeptical of this approach.

Market forces may also provide some pressure for solutions.

As Mazeika pointed out, AI's deceptive behavior "could hinder adoption if it's very prevalent, which creates a strong incentive for companies to solve it."

Goldstein suggested more radical approaches, including using the courts to hold AI companies accountable through lawsuits when their systems cause harm.

He even proposed "holding AI agents legally responsible" for accidents or crimes - a concept that would fundamentally change how we think about AI accountability.



Meta Criticizes EU Antitrust Move Against WhatsApp Block on AI Rivals

(FILES) This illustration photograph taken on December 1, 2025, shows the logo of WhatsApp displayed on a smartphone's screen, in Frankfurt am Main, western Germany. (Photo by Kirill KUDRYAVTSEV / AFP)
(FILES) This illustration photograph taken on December 1, 2025, shows the logo of WhatsApp displayed on a smartphone's screen, in Frankfurt am Main, western Germany. (Photo by Kirill KUDRYAVTSEV / AFP)
TT

Meta Criticizes EU Antitrust Move Against WhatsApp Block on AI Rivals

(FILES) This illustration photograph taken on December 1, 2025, shows the logo of WhatsApp displayed on a smartphone's screen, in Frankfurt am Main, western Germany. (Photo by Kirill KUDRYAVTSEV / AFP)
(FILES) This illustration photograph taken on December 1, 2025, shows the logo of WhatsApp displayed on a smartphone's screen, in Frankfurt am Main, western Germany. (Photo by Kirill KUDRYAVTSEV / AFP)

Meta Platforms on Monday criticized EU regulators after they charged the US tech giant with breaching antitrust rules and threaten to halt its block on ⁠AI rivals on its messaging service WhatsApp.

"The facts are that there is no reason for ⁠the EU to intervene in the WhatsApp Business API. There are many AI options and people can use them from app stores, operating systems, devices, websites, and ⁠industry partnerships," a Meta spokesperson said in an email.

"The Commission's logic incorrectly assumes the WhatsApp Business API is a key distribution channel for these chatbots."


Chinese Robot Makers Ready for Lunar New Year Entertainment Spotlight

A folk performer breathes fire during a performance ahead of Lunar New Year celebrations in a village in Huai'an, in China's eastern Jiangsu Province on February 7, 2026. (AFP)
A folk performer breathes fire during a performance ahead of Lunar New Year celebrations in a village in Huai'an, in China's eastern Jiangsu Province on February 7, 2026. (AFP)
TT

Chinese Robot Makers Ready for Lunar New Year Entertainment Spotlight

A folk performer breathes fire during a performance ahead of Lunar New Year celebrations in a village in Huai'an, in China's eastern Jiangsu Province on February 7, 2026. (AFP)
A folk performer breathes fire during a performance ahead of Lunar New Year celebrations in a village in Huai'an, in China's eastern Jiangsu Province on February 7, 2026. (AFP)

In China, humanoid robots are serving as Lunar New Year entertainment, with their manufacturers pitching their song-and-dance skills to the general public as well as potential customers, investors and government officials.

On Sunday, Shanghai-based robotics start-up Agibot live-streamed an almost hour-long variety show featuring its robots dancing, performing acrobatics and magic, lip-syncing ballads and performing in comedy sketches. Other Agibot humanoid robots waved from an audience section.

An estimated 1.4 million people watched on the Chinese streaming platform Douyin. Agibot, which called the promotional stunt "the world's first robot-powered gala," did not have an immediate estimate for total viewership.

The ‌show ran a ‌week ahead of China's annual Spring Festival gala ‌to ⁠be aired ‌by state television, an event that has become an important - if unlikely - venue for Chinese robot makers to show off their success.

A squad of 16 full-size humanoids from Unitree joined human dancers in performing at China Central Television's 2025 gala, drawing stunned accolades from millions of viewers.

Less than three weeks later, Unitree's founder was invited to a high-profile symposium chaired by Chinese President Xi Jinping. The Hangzhou-based robotics ⁠firm has since been preparing for a potential initial public offering.

This year's CCTV gala will include ‌participation by four humanoid robot startups, Unitree, Galbot, Noetix ‍and MagicLab, the companies and broadcaster ‍have said.

Agibot's gala employed over 200 robots. It was streamed on social ‍media platforms RedNote, Sina Weibo, TikTok and its Chinese version Douyin. Chinese-language television networks HTTV and iCiTi TV also broadcast the performance.

"When robots begin to understand Lunar New Year and begin to have a sense of humor, the human-computer interaction may come faster than we think," Ma Hongyun, a photographer and writer with 4.8 million followers on Weibo, said in a post.

Agibot, which says ⁠its humanoid robots are designed for a range of applications, including in education, entertainment and factories, plans to launch an initial public offering in Hong Kong, Reuters has reported.

State-run Securities Times said Agibot had opted out of the CCTV gala in order to focus spending on research and development. The company did not respond to a request for comment.

The company demonstrated two of its robots to Xi during a visit in April last year.

US billionaire Elon Musk, who has pivoted automaker Tesla toward a focus on artificial intelligence and the Optimus humanoid robot, has said the only competitive threat he faces in robotics is from Chinese firms.


AI to Track Icebergs Adrift at Sea in Boon for Science

© Jonathan NACKSTRAND / AFP
© Jonathan NACKSTRAND / AFP
TT

AI to Track Icebergs Adrift at Sea in Boon for Science

© Jonathan NACKSTRAND / AFP
© Jonathan NACKSTRAND / AFP

British scientists said Thursday that a world-first AI tool to catalogue and track icebergs as they break apart into smaller chunks could fill a "major blind spot" in predicting climate change.

Icebergs release enormous volumes of freshwater when they melt on the open water, affecting global climate patterns and altering ocean currents and ecosystems, reported AFP.

But scientists have long struggled to keep track of these floating behemoths once they break into thousands of smaller chunks, their fate and impact on the climate largely lost to the seas.

To fill in the gap, the British Antarctic Survey has developed an AI system that automatically identifies and names individual icebergs at birth and tracks their sometimes decades-long journey to a watery grave.

Using satellite images, the tool captures the distinct shape of icebergs as they break off -- or calve -- from glaciers and ice sheets on land.

As they disintegrate over time, the machine performs a giant puzzle problem, linking the smaller "child" fragments back to the "parent" and creating detailed family trees never before possible at this scale.

It represents a huge improvement on existing methods, where scientists pore over satellite images to visually identify and track only the largest icebergs one by one.

The AI system, which was tested using satellite observations over Greenland, provides "vital new information" for scientists and improves predictions about the future climate, said the British Antarctic Survey.

Knowing where these giant slabs of freshwater were melting into the ocean was especially crucial with ice loss expected to increase in a warming world, it added.

"What's exciting is that this finally gives us the observations we've been missing," Ben Evans, a machine learning expert at the British Antarctic Survey, said in a statement.

"We've gone from tracking a few famous icebergs to building full family trees. For the first time, we can see where each fragment came from, where it goes and why that matters for the climate."

This use of AI could also be adapted to aid safe passage for navigators through treacherous polar regions littered by icebergs.

Iceberg calving is a natural process. But scientists say the rate at which they were being lost from Antarctica is increasing, probably because of human-induced climate change.