Our Future Artificial Intelligence Overlords Need a Resistance Movement

Artificial intelligence has been moving so fast that even the scientists are finding it hard to keep up. In the past year, machine learning algorithms have started to generate rudimentary movies and stunning fake photographs. They’re even writing code. In the future, we’ll probably look back on 2022 as the year AI shifted from processing information to creating content as well as many humans.

But what if we also look back on it as the year AI took a step towards the destruction of the human species? As hyperbolic and ridiculous as that sounds, public figures including Bill Gates, Elon Musk and Stephen Hawking, and going right back to Alan Turing, have expressed concerns about the fate of humans in a world where machines surpass them in intelligence, with Musk saying AI was becoming more dangerous than nuclear warheads.

After all, humans don’t treat less-intelligent species particularly well, so who’s to say that computers, trained ubiquitously on data that reflects all the facets of human behavior, won’t “place their goals ahead of ours,” as legendary computer scientist Marvin Minsky once warned?

Refreshingly, there’s some good news. More scientists are seeking to make deep learning systems more transparent and measurable. That momentum mustn’t stop. As these programs become ever more influential in financial markets, social media and supply chains, technology firms will need to start prioritizing AI safety over capability.

Last year, across the world’s major AI labs, roughly 100 full-time researchers were focused on building safe systems, according to the 2021 State of AI report produced annually by London venture capital investors Ian Hogarth and Nathan Benaich. Their report for this year found there are still only about 300 researchers working full-time on AI safety.

“It’s a very low number,” Hogarth said during a Twitter Spaces discussion with me this week on the future threat of AI. “Not only are very few people working on making these systems aligned, but it’s also kind of a Wild West.”

Hogarth was referring to how in the past year a flurry of AI tools and research has been produced by open-source groups, who say super-intelligent machines shouldn’t be controlled and built in secret by a few large companies, but created out in the open. In August 2021, for instance, the community-driven organization EleutherAI developed a public version of a powerful tool that could write realistic comments and essays on nearly any subject, called GPT-Neo. The original tool, called GPT-3, was developed by OpenAI, a company co-founded by Musk and largely funded by Microsoft Corp. that offers limited access to its powerful systems.

Then this year, several months after OpenAI impressed the AI community with a revolutionary image-generating system called DALL-E 2, the open-source firm Stability AI released its own version of the tool, called Stable Diffusion, to the public, for free.

One of the benefits of open source software is that by being out in the open, a greater number of people are constantly probing it for inefficiencies. That’s why Linux has historically been one of the most secure operating systems available to the public.

But throwing powerful AI systems out into the open also raises the risk that they’ll be misused. If AI is as potentially damaging as a virus or nuclear contamination, then perhaps it makes sense to centralize its development. After all, viruses are scrutinized in bio-safety labs and uranium is enriched in carefully constrained environments. Research into viruses and nuclear power is overseen by regulation, though, and with governments trailing the rapid pace of AI, there are still no clear guidelines for its development.

“We’ve almost got the worst of both worlds,” says Hogarth. AI risks misuse by being built out in the open, but no one is overseeing what’s happening when it’s created behind closed doors either.

For now at least, it’s encouraging to see the spotlight growing on AI alignment, an emerging field that refers to designing AI systems that are “aligned” with human goals. Leading AI companies such as Alphabet Inc.’s DeepMind and OpenAI have multiple teams working on AI alignment, and many researchers from those firms have gone on to launch their own startups, some of which are focused on making AI safe. These include San Francisco-based Anthropic, whose founding team left OpenAI and raised $580 million from investors earlier this year, and London-based Conjecture, which was recently backed by the founders of GitHub Inc., Stripe Inc. and FTX Trading Ltd.

Conjecture is operating under the assumption that AI will reach parity with human intelligence in the next five years, and that its current trajectory spells catastrophe for the human species.

But when I asked Conjecture Chief Executive Officer Connor Leahy why AI might want to hurt humans in the first place, he answered matter-of-factly. “Imagine humans want to flood a valley to build a hydroelectric dam, and there is an anthill in the valley,” he said. “This won’t stop the humans from their construction, and the anthill will promptly get flooded. At no point did any humans even think about harming the ants. They just wanted more energy, and this was the most efficient way to achieve that goal. Analogously, autonomous AI’s will need more energy, faster communication, and more intelligence to achieve their goals.”

Leahy says that to prevent that dark future, the world needs a “portfolio of bets,” including scrutinizing deep learning algorithms to better understand how they make decisions, and trying to endow AI with more human-like reasoning.

Even if Leahy’s fears seem overblown, it’s clear that AI is not on a path that’s entirely aligned with human interests. Just look at some of the recent efforts to build chatbots. Microsoft abandoned its 2016 bot Tay, which learned from interacting with Twitter users, after it posted racist and sexually charged messages within hours of being released. In August of this year, Meta Platforms Inc. released a chatbot that claimed Donald Trump was still president, having been trained on public text on the Internet.

No one knows if AI will wreak havoc on financial markets or torpedo the food supply chain one day. But it could pit human beings against one another through social media, something that’s arguably already happening. The powerful AI systems recommending posts to people on Twitter Inc. and Facebook are aimed at juicing our engagement, which inevitably means serving up content that provokes outrage or spreads misinformation. When it comes to “AI alignment,” changing those incentives would be a good place to start.

Bloomberg