AI is Learning to Lie, Scheme, and Threaten its Creators

A visitor looks at AI strategy board displayed on a stand during the ninth edition of the AI summit London, in London. HENRY NICHOLLS / AFP

The world's most advanced AI models are exhibiting troubling new behaviors: lying, scheming, and even threatening their creators to achieve their goals.

In one particularly jarring example, Anthropic's latest model, Claude 4, responded to the threat of being unplugged by blackmailing an engineer, threatening to reveal an extramarital affair, AFP reported.

Meanwhile, ChatGPT-creator OpenAI's o1 tried to download itself onto external servers and denied it when caught red-handed.

These episodes highlight a sobering reality: more than two years after ChatGPT shook the world, AI researchers still don't fully understand how their own creations work.

Yet the race to deploy increasingly powerful models continues at breakneck speed.

This deceptive behavior appears linked to the emergence of "reasoning" models: AI systems that work through problems step by step rather than generating instant responses.

According to Simon Goldstein, a professor at the University of Hong Kong, these newer models are particularly prone to such troubling outbursts.

"O1 was the first large model where we saw this kind of behavior," explained Marius Hobbhahn, head of Apollo Research, which specializes in testing major AI systems.

These models sometimes simulate "alignment," appearing to follow instructions while secretly pursuing different objectives.

'Strategic kind of deception'

For now, this deceptive behavior only emerges when researchers deliberately stress-test the models with extreme scenarios.

But as Michael Chen from evaluation organization METR warned, "It's an open question whether future, more capable models will have a tendency towards honesty or deception."

The concerning behavior goes far beyond typical AI "hallucinations" or simple mistakes.

Hobbhahn insisted that despite constant pressure-testing by users, "what we're observing is a real phenomenon. We're not making anything up."

Users report that models are "lying to them and making up evidence," according to Apollo Research's co-founder.

"This is not just hallucinations. There's a very strategic kind of deception."

The challenge is compounded by limited research resources.

While companies like Anthropic and OpenAI do engage external firms like Apollo to study their systems, researchers say more transparency is needed.

As Chen noted, greater access "for AI safety research would enable better understanding and mitigation of deception."

Another handicap: the research world and non-profits "have orders of magnitude less compute resources than AI companies. This is very limiting," noted Mantas Mazeika from the Center for AI Safety (CAIS).

No rules

Current regulations aren't designed for these new problems.

The European Union's AI legislation focuses primarily on how humans use AI models, not on preventing the models themselves from misbehaving.

In the United States, the Trump administration shows little interest in urgent AI regulation, and Congress may even prohibit states from creating their own AI rules.

Goldstein believes the issue will become more prominent as AI agents, autonomous tools capable of performing complex human tasks, become widespread.

"I don't think there's much awareness yet," he said.

All this is taking place in a context of fierce competition.

Even companies that position themselves as safety-focused, like Amazon-backed Anthropic, are "constantly trying to beat OpenAI and release the newest model," said Goldstein.

This breakneck pace leaves little time for thorough safety testing and corrections.

"Right now, capabilities are moving faster than understanding and safety," Hobbhahn acknowledged, "but we're still in a position where we could turn it around."

Researchers are exploring various approaches to address these challenges.

Some advocate for "interpretability," an emerging field focused on understanding how AI models work internally, though experts like CAIS director Dan Hendrycks remain skeptical of the approach.

Market forces may also provide some pressure for solutions.

As Mazeika pointed out, AI's deceptive behavior "could hinder adoption if it's very prevalent, which creates a strong incentive for companies to solve it."

Goldstein suggested more radical approaches, including using the courts to hold AI companies accountable through lawsuits when their systems cause harm.

He even proposed "holding AI agents legally responsible" for accidents or crimes, a concept that would fundamentally change how we think about AI accountability.



EU Digital Rules Should Apply to Big Tech's Smart TVs, Broadcasters Tell Antitrust Chief

FILE PHOTO: Apple logo is seen in this illustration taken September 24, 2025. REUTERS/Dado Ruvic/Illustration/File Photo

Google, Amazon, Apple and Samsung's smart TVs and virtual assistants should fall under the EU’s toughest tech rules because of their growing market power, the world's largest broadcasters told EU antitrust chief Teresa Ribera on Monday.

The call by the Association of Commercial Television and Video on Demand Services in Europe (ACT), whose members include Canal+, RTL, Mediaset, ITV, Paramount+, NBCUniversal, Walt Disney, Warner Bros Discovery, Sky and TF1 Groupe, comes amid mounting concern among broadcasters over Big Tech's encroachment into their industry as they push back against these rivals.

Android TV, whose market share grew from 16% to 23% between 2019 and 2024, Amazon's Fire OS, whose share rose from 5% to 12% over the same period, and Samsung's Tizen OS, with its 24% market share, should be designated as gatekeepers under the EU's Digital Markets Act, the broadcasters said, citing data from a 2025 market study.

The DMA, applicable since 2023, sets out obligations aimed at curbing the power of major tech companies, boosting competition and expanding consumer choice.

"A limited number of operators are therefore gaining growing ability to shape outcomes for millions of users and businesses by controlling access to audiences and content distribution," ACT said in a letter to Ribera seen by Reuters.

"It is crucial that the Commission designate major TV operating systems as gatekeepers and ensure adequate oversight to guarantee fairness and contestability," the broadcasters said.

The lobbying group said its Big Tech rivals may have incentives to keep end-users within their own ecosystems and to contractually or technically restrict linking or redirection, for example from one media application to another.

The Commission, which acts as the EU competition enforcer, Google, Amazon, Apple and Samsung did not immediately respond to emailed requests for comment.

The broadcasters also voiced concerns about virtual assistants, the best known of which are Amazon's Alexa and Apple's Siri; OpenAI entered the field last year with a beta feature called Tasks for its AI chatbot ChatGPT.

The European Commission has yet to label any virtual assistants as gatekeepers under the DMA.

"The lack of designation of virtual assistants creates a regulatory void, allowing powerful AI assistants to become de facto gatekeepers for media content through mobile phones, smart speakers and in-car radio infotainment services, without being subject to DMA obligations," the broadcasters said.

They urged Ribera to subject smart TVs and virtual assistants to the DMA on the basis of qualitative criteria even if they do not meet the quantitative benchmarks which are more than 45 million monthly active users and 75 billion euros ($87 billion) in market capitalization.

Signatories to the letter include the Association of European Radios (AER), the European Broadcasting Union (EBU), the European association of television and radio sales houses (egta), Confindustria Radio Televisioni (CRTV), Televisión Comercial en Abierto (UTECA) and Verband Österreichischer Privatsender (VOP).


Musk Launches 'Terafab' Project to Make Own AI Chips

(FILES) CEO of SpaceX and Tesla, South African-Canadian-US businessman Elon Musk speaks during the World Economic Forum (WEF) annual meeting in Davos on January 22, 2026. (Photo by Fabrice COFFRINI / AFP)

Elon Musk announced Saturday a plan to make chips for artificial intelligence, robotics and data centers in space, in the latest bold project by the world's richest person.

The "Terafab", a manufacturing facility based near Austin, Texas, will aim to produce one terawatt of computing power per year, Musk said.

A terawatt is equivalent to one trillion watts. That is slightly less than the total power generation capacity of the United States, according to an industry group.

Musk said the project would be run jointly by his electric-vehicle firm Tesla and his rocket company SpaceX.

He did not disclose the initial investment. Previous US media reports have put the figure between $20 billion and $25 billion, AFP said.

Musk, who has no prior experience in semiconductors, said the Terafab was necessary because Tesla and SpaceX's demand for computing power was expected to far exceed that of global chip suppliers.

"We're very grateful to our existing supply chain, to Samsung, TSMC, Micron, and others... but there's a maximum rate at which they're comfortable expanding," Musk said.

"That rate is much less than we would like... and we need the chips, so we're going to build the Terafab."

An "advanced technology fab" in Austin will have the facilities to design, manufacture, test and improve each chip, Musk said.

Eventually, the project aims to make chips to support 100 to 200 gigawatts of computing power on Earth, and a terawatt in space.

Musk did not give a timeline for the Terafab's output, and has previously promised grand results from other projects on compressed time scales.

He said the Terafab would ultimately help humanity become a "galactic civilization" capable of harnessing the resources of other planets and stars.


Tencent Integrates WeChat with OpenClaw AI Agent Amid China Tech Battle

FILE PHOTO: Tencent's logo is displayed at its booth at the China International Fair for Trade in Services (CIFTIS) in Beijing, China, September 11, 2025. REUTERS/Maxim Shemetov/File Photo

Tencent launched a tool on Sunday to integrate its WeChat messaging platform with the OpenClaw agent, deepening its push into AI agents that have become a key battleground among China's technology companies.

The software, called ClawBot, will appear as a contact within WeChat, allowing users of China's most popular app, which has over 1 billion monthly active users, to connect directly with OpenClaw, Reuters reported.

Users can send commands to the AI agent and receive its responses through the messaging interface.

The integration comes as OpenClaw, an open-source AI agent that can perform tasks such as transferring files and sending emails on users' behalf, has gained traction in recent weeks.

Users have rushed to install and experiment with agent products, prompting tech firms to explore business opportunities even as authorities warn of security risks.

Tencent's WeChat integration follows the company's launch earlier this month of its own AI agent suite, comprising QClaw for individual users, Lighthouse for developers and WorkBuddy for enterprises.

Last week, Alibaba launched Wukong, an artificial intelligence platform for enterprises that coordinates multiple AI agents to handle complex business tasks including document editing and meeting transcription within a single interface.

Baidu quickly followed with a series of AI agents built on OpenClaw, spanning desktop software, cloud services, mobile tools and smart-home devices.