From Swahili to Zulu, African Techies Develop AI Language Tools

Figurines with computers and smartphones are seen in front of the words "Artificial Intelligence AI" in this illustration taken, February 19, 2024. (Reuters)

When the Nigerian government announced plans in April to develop a multilingual AI tool to boost digital inclusion across the West African nation, 28-year-old computer science student Lwasinam Lenham Dilli was thrilled.

Dilli had struggled to scrape datasets from the internet to build a large language model (LLM), used to power AI chatbots, in his native Hausa language as part of his final-year project at university.

"I needed texts in English and their corresponding translation in Hausa but I couldn't get anything online, (there was) no clean data," Dilli told the Thomson Reuters Foundation.

"(Creating local language LLMs) is a way to ensure that our local dialects and languages will not be forgotten or left out of the AI ecosystem," he added.

The world has been swept up in a whirlwind of AI mania, with tools such as OpenAI's ChatGPT, Meta's Llama 2, and Mistral AI captivating millions globally with their ability to generate human-like text.

But for many tech-savvy Africans, the excitement has been tempered by a frustrating reality: when languages like Hausa, Amharic, or Kinyarwanda are entered into the chat, many of these advanced systems falter, often producing nonsensical responses.

Technology experts warn the lack of LLMs in African languages will lead to the exclusion of millions of people on the continent, increasing both the digital and economic divide.

The Nigerian government-led initiative to develop a multilingual LLM aims to level the playing field.

"The LLM will be trained on five low-resource languages and accented English to ensure stronger language representation ... for development of artificial intelligence solutions," said Nigeria's Digital Economy Minister Bosun Tijani in April.

The government will partner with Nigerian AI startups, and local data will be collected by volunteers who are fluent in any of five Nigerian languages: Yoruba, Hausa, Igbo, Ibibio, and Pidgin, a West African lingua franca.

To build the model, the project will also draw on the expertise of more than 7,000 fellows from Nigeria's tech talent program - a government scheme to train three million people in skills such as coding and programming.

Silas Adekunle, co-founder of Awarri, an AI startup that is part of the initiative, said building a nuanced AI tool that understood Nigeria's unique language and cultural landscape presented many challenges.

"We have so many different accents and languages, and this (LLM) will enable many people and developers to build products that leverage AI but are for the Nigerian market," said Adekunle.

"The scale of the project, especially with limited resources, has required us to be creative in how we train the model, gather the data, compute and label what we have."

CLOSING THE AI LANGUAGE GAP

Africa is home to more than 2,000 languages spoken across 54 countries, according to the United Nations Educational, Scientific and Cultural Organization (UNESCO).

However, the majority of African languages remain underrepresented on the internet. English dominates the digital space, accounting for around 50% of all websites, followed by Spanish, German, Japanese, and French.

Along with the Nigerian government initiative, there are also a small but growing number of African startups rising to the challenge of developing AI tools in languages like Swahili, Amharic, Zulu and Sesotho.

In Kenya, for instance, health tech firm Jacaranda Health has pioneered the first LLM operating in Swahili to improve maternal healthcare in East Africa.

Built on Meta's Llama 3 system, UlizaLlama (AskLlama) aims to refine Jacaranda Health's SMS service for low-income Swahili-speaking expectant mothers who have queries ranging from dietary concerns and fetal movement to exercise during pregnancy.

The platform currently provides pre-written automated responses, but once UlizaLlama is integrated by the end of June, it will tailor responses to individual needs, offering more detailed pregnancy guidance and emergency support.

"A lot of these expectant moms can't just do a Google search. UlizaLlama's goal is to make sure that we get them the accurate answers in the fastest possible time," Jay Patel, Jacaranda Health's director of technology, told the Thomson Reuters Foundation.

"We're shooting for about 85% accuracy to start with and a faster response time. At the moment, it takes a few minutes to respond, but we are hoping to get that down to less than a minute in the future."

In South Africa, the Masakhane initiative is using open-source machine learning to translate African languages.

Lelapa AI, a South African AI research lab, has pioneered VulaVula – a for-profit language processing tool that translates, transcribes and analyses languages in English, Afrikaans, Zulu and Sesotho.

DATA SCARCITY, ETHICAL CONCERNS

But AI experts say building LLMs in African languages poses significant challenges, ranging from availability of data to ethical concerns over consent, compensation and copyright.

Many African languages are low-resource languages, meaning there is a scarcity of data to train these models effectively - unlike high-resource languages such as English or French.

Michael Michie, co-founder of Everse Technology Africa, an AI startup building intelligence into data protection and privacy, said collecting the data needed to train LLMs also raised ethical questions.

In many African communities, oral tradition predominates, and certain communities may not be interested in sharing their language to train LLMs, a choice that should be respected.

"There are currently no regulations or laws in African countries that address issues related to consent, privacy and compensation to communities when collecting data to train AI tools - this needs to be addressed," said Michie.

"There are questions of who owns the language and who benefits. There needs to be guidelines to prevent exploitation and ensure the development of these LLMs benefits the people they are meant to serve," he added.

Open-source initiatives like Creative Commons, which allow creators to legally share their work with specified conditions like ensuring attribution and non-commercial use, are also not a perfect solution, said some AI experts.

"At the moment there's this push of saying everything should just be under Creative Commons," said Vukosi Marivate, associate professor of computer science at the University of Pretoria and co-founder of Lelapa AI.

But if everything is open source, it may be harder to properly reimburse and acknowledge the original contributors to these language models, he said.

"A lot of people are working on LLMs now because of the prestige, that's where the money is, but we need to make sure that our languages are actually being taken care of."



Microsoft to Invest $10 bn for Japan AI Data Centers

Microsoft's Vice Chair and President Brad Smith (4th L) and (L-R) Sakura Internet Inc President and CEO Kunihiro Tanaka, SoftBank Corp. President and CEO Junichi Miyakawa, Microsoft Japan President Miki Tsusaka, hold a meeting with Japan's Prime Minister Sanae Takaichi (2nd R) and Vice Minister of Economy, Trade and Industry Toshiro Ino (R) at the Prime Minister's Office in Tokyo on April 3, 2026. Kazuhiro NOGI / POOL/AFP

Microsoft said Friday it will invest $10 billion in Japan over the next four years to build artificial intelligence data centers and related infrastructure.

Power-hungry data centers -- warehouse-like facilities that power AI tools from chatbots to image generators -- are springing up worldwide, and the sector is growing particularly fast in Asia.

Microsoft President Brad Smith met Japanese Prime Minister Sanae Takaichi at her office on Friday to announce the investment, said AFP.

Smith said in a statement that it was a "response to Japan's growing need for cloud and AI services".

Businesses in Japan, the world's fourth-largest economy, are keen to get ahead in the fast-moving AI field.

But data center expansion there is constrained by limited space and relatively expensive electricity.

The US tech giant will collaborate with Japan's SoftBank Group and Sakura Internet to expand domestic tech infrastructure, it said in a press release.

It follows a $2.9 billion two-year investment Microsoft announced in 2024 to bolster the country's push into AI and strengthen its cyber defenses.

The investment unveiled Friday also includes funds to enhance cybersecurity partnerships with Japanese government agencies, and to train one million engineers in cooperation with telecom and tech giants NTT and NEC.

A rush to build data centers in the Asia-Pacific region, especially in India and Southeast Asia, has sparked concerns over the facilities' environmental impact.

That includes increased demand on electricity grids that are often reliant on fossil fuels, and on local water supplies used to cool the hot servers inside.

Microsoft says it has pledged to become carbon negative, zero-waste and "water positive" by 2030.

On Tuesday, the company announced plans to invest more than $1 billion in cloud and AI data center infrastructure and operations in Thailand over the next two years.


Kia to Sell Lower-priced Electric Vehicle in US

A KIA logo on an electric vehicle is seen on display at the Canadian International AutoShow in Toronto, Ontario, Canada, February 13, 2025. REUTERS/Carlos Osorio

Kia said Wednesday it will begin selling a lower-priced electric vehicle in the United States later this year as automakers work to recharge EV sales.

The Korean automaker said at the New York Auto Show it will offer the EV3 in the US market starting later this year, Reuters reported.

Automakers are facing a tougher EV market in the United States after Congress repealed the $7,500 EV tax credit last year, but higher gasoline prices in recent weeks have prompted new interest in EVs.


Passengers Stranded in Moving Traffic after Robotaxi Outage in China

This file photo taken on August 1, 2024 shows a general view of a driverless robotaxi autonomous vehicle developed as part of tech giant Baidu's Apollo Go self-driving project, in Wuhan, in central China's Hubei province. (Photo by PEDRO PARDO / AFP)

Some robotaxi passengers were left stranded in the middle of fast-moving traffic in a major Chinese city after their driverless vehicles stopped running, according to police and media reports on Wednesday.

A preliminary investigation indicates more than 100 robotaxis came to a halt because of a “system malfunction,” police in the city of Wuhan said in a statement, without elaborating. No injuries were reported.

One passenger told Chinese media that their robotaxi stopped after turning a corner. An instruction on a screen read: “Driving system malfunction. Staff are expected to arrive in 5 minutes.” After no one showed up, the passenger pushed an SOS button and was told that staff were on their way. The car door could be opened, so the passenger got out on their own.

It is the first time a mass shutdown of robotaxis has been reported in China, The Associated Press said. In December, many of Waymo’s self-driving cars came to a stop in San Francisco because of a power outage.

The taxis in Wuhan are operated by Baidu, a major Chinese internet and AI company that is expanding its Apollo Go robotaxi business to overseas locations in Europe and the Mideast.

Baidu did not have any immediate comment.

Police said reports that taxis were coming to a halt started coming in around 9 p.m., while media reports said multiple people were rescued.

While some passengers were able to exit their taxis on their own, others were afraid to get out because their vehicle had stopped in the middle lane of a ring road with other vehicles passing on both sides, the reports said. Ring roads are elevated roads without traffic lights designed to move traffic quickly in urban areas.

Baidu operates hundreds of robotaxis in Wuhan, which hosted an early pilot project for the company.