AI Experts Ready ‘Humanity’s Last Exam’ to Stump Powerful Tech

Figurines with computers and smartphones are seen in front of the words "Artificial Intelligence AI" in this illustration taken, February 19, 2024. (Reuters)

18:07-16 September 2024 AD ـ 12 Rabi’ Al-Awwal 1446 AH

A team of technology experts issued a global call on Monday seeking the toughest questions to pose to artificial intelligence systems, which have increasingly handled popular benchmark tests like child's play.

Dubbed "Humanity's Last Exam," the project seeks to determine when expert-level AI has arrived. It aims to stay relevant even as capabilities advance in future years, according to the organizers, a non-profit called the Center for AI Safety (CAIS) and the startup Scale AI.

The call comes days after the maker of ChatGPT previewed a new model, known as OpenAI o1, which "destroyed the most popular reasoning benchmarks," said Dan Hendrycks, executive director of CAIS and an advisor to Elon Musk's xAI startup.

Hendrycks co-authored two 2021 papers that proposed tests of AI systems that are now widely used, one quizzing them on undergraduate-level knowledge of topics like US history, the other probing models' ability to reason through competition-level math. The undergraduate-style test has more downloads from the online AI hub Hugging Face than any such dataset.

At the time of those papers, AI was giving almost random answers to questions on the exams. "They're now crushed," Hendrycks told Reuters.

As one example, the Claude models from the AI lab Anthropic have gone from scoring about 77% on the undergraduate-level test in 2023, to nearly 89% a year later, according to a prominent capabilities leaderboard.

These common benchmarks have less meaning as a result.

AI has appeared to score poorly on lesser-used tests involving plan formulation and visual pattern-recognition puzzles, according to Stanford University’s AI Index Report from April. OpenAI o1 scored around 21% on one version of the pattern-recognition ARC-AGI test, for instance, the ARC organizers said on Friday.

Some AI researchers argue that results like this show planning and abstract reasoning to be better measures of intelligence, though Hendrycks said the visual aspect of ARC makes it less suited to assessing language models. "Humanity’s Last Exam" will require abstract reasoning, he said.

Answers from common benchmarks may also have ended up in data used to train AI systems, industry observers have said. Hendrycks said some questions on "Humanity's Last Exam" will remain private to make sure AI systems' answers are not from memorization.

The exam will include at least 1,000 crowd-sourced questions due November 1 that are hard for non-experts to answer. These will undergo peer review, with winning submissions offered co-authorship and up to $5,000 prizes sponsored by Scale AI.

"We desperately need harder tests for expert-level models to measure the rapid progress of AI," said Alexandr Wang, Scale's CEO.

One restriction: the organizers want no questions about weapons, which some say would be too dangerous for AI to study.

Google Says to Build New Subsea Cables from India in AI Push

A logo of Google is on display at Bharat Mandapam, one of the venues for AI Impact Summit, in New Delhi, India, February 17, 2026. REUTERS/Bhawika Chhabra

Asharq Al Awsat

11:33-18 February 2026 AD ـ 01 Ramadan 1447 AH

Google announced Wednesday it would build new subsea cables from India and other locations as part of its existing $15 billion investment in the South Asian nation, which is hosting a major artificial intelligence summit this week.

The US tech giant said it would build "three subsea paths connecting India to Singapore, South Africa, and Australia; and four strategic fiber-optic routes that bolster network resilience and capacity between the United States, India, and multiple locations across the Southern Hemisphere".

Mark Zuckerberg Set to Testify in Watershed Social Media Trial

Meta's CEO Mark Zuckerberg testifies during the Senate Judiciary Committee hearing on online child sexual exploitation at the US Capitol in Washington, US, January 31, 2024. (Reuters)

09:41-18 February 2026 AD ـ 01 Ramadan 1447 AH

Mark Zuckerberg will testify in an unprecedented social media trial that questions whether Meta's platforms deliberately addict and harm children.

Meta's CEO is expected to answer tough questions on Wednesday from attorneys representing a now 20-year-old woman identified by the initials KGM, who claims her early use of social media addicted her to the technology and exacerbated depression and suicidal thoughts. Meta Platforms and Google’s YouTube are the two remaining defendants in the case, which TikTok and Snap have settled.

Zuckerberg has testified in other trials and answered questions from Congress about youth safety on Meta's platforms. At that congressional hearing, he apologized to families whose lives had been upended by tragedies they believed were caused by social media.

This trial, though, marks the first time Zuckerberg will answer similar questions in front of a jury. Again, bereaved parents are expected to be in the limited courtroom seats available to the public.

The case, along with two others, has been selected as a bellwether trial, meaning its outcome could impact how thousands of similar lawsuits against social media companies would play out.

A Meta spokesperson said the company strongly disagrees with the allegations in the lawsuit and said they are “confident the evidence will show our longstanding commitment to supporting young people.”

One of Meta's attorneys, Paul Schmidt, said in his opening statement that the company is not disputing that KGM experienced mental health struggles, but is disputing that Instagram was a substantial factor in those struggles.

He pointed to medical records that showed a turbulent home life, and both he and an attorney representing YouTube argue she turned to their platforms as a coping mechanism or a means of escaping her mental health struggles.

Zuckerberg's testimony comes a week after that of Adam Mosseri, the head of Meta's Instagram, who said in the courtroom that he disagrees with the idea that people can be clinically addicted to social media platforms.

Mosseri maintained that Instagram works hard to protect young people using the service, and said it's “not good for the company, over the long run, to make decisions that profit for us but are poor for people’s well-being.”

Much of Mosseri's questioning from the plaintiff's lawyer, Mark Lanier, centered on cosmetic filters on Instagram that changed people’s appearance — a topic that Lanier is sure to revisit with Zuckerberg.

Zuckerberg is also expected to face questions about Instagram’s algorithm, the infinite nature of Meta’s feeds and other features the plaintiffs argue are designed to get users hooked.

US Tech Giant Nvidia Announces India Deals at AI Summit

FILED - 04 February 2026, Bavaria, Munich: The NVIDIA logo is seen during a press conference at the opening of Telekom and NVIDIA's AI factory "Industrial AI Cloud". Photo: Sven Hoppe/dpa

Asharq Al Awsat

09:33-18 February 2026 AD ـ 01 Ramadan 1447 AH

US artificial intelligence chip titan Nvidia unveiled tie-ups with Indian computing firms on Wednesday as tech companies rushed to announce deals and investments at a global AI conference in New Delhi.

This week's AI Impact Summit is the fourth annual gathering to discuss how to govern the fast-evolving technology -- and also an opportunity to "define India's leadership in the AI decade ahead", organizers say.

Mumbai cloud and data center provider L&T said it was teaming up with Nvidia, the world's most valuable company, to build what it touted as "India's largest gigawatt-scale AI factory".

"We are laying the foundation for world-class AI infrastructure that will power India's growth," said Nvidia boss Jensen Huang in a statement that did not put a figure on the investment.

L&T said it would use Nvidia's powerful processors, which can train and run generative AI tech, to provide data center capacity of up to 30 megawatts in Chennai and 40 megawatts in Mumbai.

Nvidia said it was also working with other Indian AI infrastructure players such as Yotta, which will deploy more than 20,000 top-end Nvidia Blackwell processors as part of a $2 billion investment.

Dozens of world leaders and ministerial delegations have come to India for the summit to discuss the opportunities and threats, from job losses to misinformation, that AI poses.

Last year India leapt to third place -- overtaking South Korea and Japan -- in an annual global ranking of AI competitiveness calculated by Stanford University researchers.

But despite plans for large-scale infrastructure and grand ambitions for innovation, experts say the country has a long way to go before it can rival the United States and China.

The conference has also brought a flurry of deals, with IT minister Ashwini Vaishnaw saying Tuesday that India expects more than $200 billion in investments over the next two years, including roughly $90 billion already committed.

Separately, India's Adani Group said Tuesday it plans to invest $100 billion by 2035 to develop "hyperscale AI-ready data centers", a boost to New Delhi's push to become a global AI hub.

Microsoft said it was investing $50 billion this decade to boost AI adoption in developing countries, while US artificial intelligence startup Anthropic and Indian IT giant Infosys said they would work together to build AI agents for the telecoms industry.

Nvidia's Huang is not attending the AI summit but other top US tech figures joining include OpenAI's Sam Altman, Google DeepMind's Demis Hassabis and Microsoft founder Bill Gates.

Indian Prime Minister Narendra Modi and other world leaders including French President Emmanuel Macron and Brazil's Luiz Inacio Lula da Silva are expected to deliver a statement at the end of the week about how they plan to address concerns raised by AI technology.

But experts say that the broad focus of the event and vague promises made at previous global AI summits in France, South Korea and Britain mean that concrete commitments are unlikely.

Nick Patience, practice lead for AI at tech research group Futurum, told AFP that nonbinding declarations could still "set the tone for what acceptable AI governance looks like".

But "the largest AI companies deploy capabilities at a pace that makes 18-month legislative cycles look glacial," Patience said.

"So it's a case of whether governments can converge fast enough to create meaningful guardrails before de facto standards are set by the companies themselves."
