From Swahili to Zulu, African Techies Develop AI Language Tools

Figurines with computers and smartphones are seen in front of the words "Artificial Intelligence AI" in this illustration taken, February 19, 2024. (Reuters)

10:39-17 June 2024 AD ـ 10 Thul-Hijjah 1445 AH

When the Nigerian government announced plans in April to develop a multilingual AI tool to boost digital inclusion across the West African nation, 28-year-old computer science student Lwasinam Lenham Dilli was thrilled.

Dilli had struggled to scrape datasets from the internet to build a large language model (LLM), used to power AI chatbots, in his native Hausa language as part of his final-year project at university.

"I needed texts in English and their corresponding translation in Hausa but I couldn't get anything online, (there was) no clean data," Dilli told the Thomson Reuters Foundation.

"(Creating local language LLMs) is a way to ensure that our local dialects and languages will not be forgotten or left out of the AI ecosystem," he added.

The world has been swept up in a whirlwind of AI mania, with tools such as OpenAI's ChatGPT, Meta's Llama 2, and Mistral AI captivating millions globally with their ability to generate human-like text.

But for many tech-savvy Africans, the excitement has been tempered by a frustrating reality: when languages like Hausa, Amharic, or Kinyarwanda are entered into the chat, many of these advanced systems falter, often producing nonsensical responses.

Technology experts warn the lack of LLMs in African languages will lead to the exclusion of millions of people on the continent, increasing both the digital and economic divide.

The Nigerian government-led initiative to develop a multilingual LLM aims to level the playing field.

"The LLM will be trained on five low-resource languages and accented English to ensure stronger language representation ... for development of artificial intelligence solutions," said Nigeria's Digital Economy Minister Bosun Tijani in April.

The government will partner with Nigerian AI startups, and local data will be collected by volunteers who are fluent in any of five Nigerian languages: Yoruba, Hausa, Igbo, Ibibio, and Pidgin, a West African lingua franca.

To build the model, the project will also draw on the expertise of more than 7,000 fellows from Nigeria's tech talent program - a government scheme to train three million people in skills such as coding and programming.

Silas Adekunle, co-founder of Awarri, an AI startup that is part of the initiative, said building a nuanced AI tool that understood Nigeria's unique language and cultural landscape presented many challenges.

"We have so many different accents and languages, and this (LLM) will enable many people and developers to build products that leverage AI but are for the Nigerian market," said Adekunle.

"The scale of the project, especially with limited resources, has required us to be creative in how we train the model, gather the data, compute and label what we have."

CLOSING THE AI LANGUAGE GAP

Africa is home to more than 2,000 languages spoken across 54 countries, according to the United Nations Educational, Scientific and Cultural Organization (UNESCO).

However, the majority of African languages remain underrepresented on the internet. English dominates the digital space, accounting for around 50% of all websites, followed by Spanish, German, Japanese, and French.

Along with the Nigerian government initiative, there is also a small but growing number of African startups rising to the challenge of developing AI tools in languages like Swahili, Amharic, Zulu and Sesotho.

In Kenya, for instance, health tech firm Jacaranda Health has pioneered the first LLM operating in Swahili to improve maternal healthcare in East Africa.

Built on Meta's Llama 3 system, UlizaLlama (AskLlama) aims to refine Jacaranda Health's SMS service for low-income Swahili-speaking expectant mothers who have queries ranging from dietary concerns and fetal movement to exercise during pregnancy.

The platform currently provides pre-written automated responses, but once UlizaLlama is integrated by the end of June, it will tailor responses to individual needs, offering more detailed pregnancy guidance and emergency support.

"A lot of these expectant moms can't just do a Google search. UlizaLlama's goal is to make sure that we get them the accurate answers in the fastest possible time," Jay Patel, Jacaranda Health's director of technology, told the Thomson Reuters Foundation.

"We're shooting for about 85% accuracy to start with and a faster response time. At the moment, it takes a few minutes to respond, but we are hoping to get that down to less than a minute in the future."

In South Africa, the Masakhane initiative is using open-source machine learning to translate African languages.

Lelapa AI, a South African AI research lab, has pioneered VulaVula – a for-profit language processing tool that translates, transcribes and analyses languages in English, Afrikaans, Zulu and Sesotho.

DATA SCARCITY, ETHICAL CONCERNS

But AI experts say building LLMs in African languages poses significant challenges, ranging from availability of data to ethical concerns over consent, compensation and copyright.

Many African languages are low-resource languages, meaning there is a scarcity of data to train these models effectively - unlike high-resource languages such as English or French.

Michael Michie, co-founder of Everse Technology Africa, an AI startup building intelligence into data protection and privacy, said collecting the data needed to train LLMs also raised ethical questions.

In many African communities, oral tradition predominates, and certain communities may not be interested in sharing their language to train LLMs and this should be respected.

"There are currently no regulations or laws in African countries that address issues related to consent, privacy and compensation to communities when collecting data to train AI tools - this needs to be addressed," said Michie.

"There are questions of who owns the language and who benefits. There needs to be guidelines to prevent exploitation and ensure the development of these LLMs benefits the people they are meant to serve," he added.

Open-source initiatives like Creative Commons, which allow creators to legally share their work with specified conditions like ensuring attribution and non-commercial use, are also not a perfect solution, said some AI experts.

"At the moment there's this push of saying everything should just be under Creative Commons," said Vukosi Marivate, associate professor of computer science at the University of Pretoria and co-founder of Lelapa AI.

But if everything is open source, it may be harder to properly reimburse and acknowledge the original contributors to these language models, he said.

"A lot of people are working on LLMs now because of the prestige, that's where the money is, but we need to make sure that our languages are actually being taken care of."

Russia Confirms Ban on WhatsApp, Says No Plans to Block Google

Men pose with smartphones in front of a displayed WhatsApp logo in this illustration, September 14, 2017. REUTERS/Dado Ruvic/File Photo

Asharq Al Awsat

13:08-12 February 2026 AD ـ 24 Sha’ban 1447 AH

Russia has blocked the popular messaging service WhatsApp over its failure to comply with local legislation, the Kremlin said Thursday, urging its 100 million Russian users to switch to a domestic alternative.

Moscow has for months been trying to shift Russian users onto Max, a domestic messaging service that lacks end-to-end encryption and that activists have called a potential tool for surveillance.

"As for the blocking of WhatsApp ... such a decision was indeed made and implemented," Kremlin spokesman Dmitry Peskov told reporters.

Peskov said the decision was due to WhatsApp's "reluctance to comply with the norms and letter of Russian law".

"Max is an accessible alternative, a developing messenger, a national messenger. And it is an alternative available on the market for citizens," he said.

Anton Gorelkin, a member of the Russian parliament and vice chair of its IT committee, said on Thursday that there were no plans to block Google in Russia.

WhatsApp, owned by US social media giant Meta, said Wednesday that it believed Russia was attempting to fully block the service in a bid to force users onto Max.

"We continue to do everything we can to keep users connected," it said.

Samsung Starts Mass Production of Next-gen AI Memory Chip

A man walks past the logo of Samsung Electronics displayed on a glass door at the company's Seocho building in Seoul on January 29, 2026. (Photo by Jung Yeon-je / AFP)

Asharq Al Awsat

07:13-12 February 2026 AD ـ 24 Sha’ban 1447 AH

Samsung Electronics has started mass production of a next-generation memory chip to power artificial intelligence, the South Korean firm announced Thursday, touting an "industry-leading" breakthrough.

The high-bandwidth "HBM4" chips are a key component for AI data centers, with US tech giant Nvidia -- now the world's most valuable company -- widely expected to be one of Samsung's main customers.

Samsung said it had "begun mass production of its industry-leading HBM4 and has shipped commercial products to customers".

"This achievement marks a first in the industry, securing an early leadership position in the HBM4 market," AFP quoted it as saying in a statement.

A global frenzy to build AI data centers has sent orders for advanced, high-bandwidth memory microchips soaring.

South Korea's two chip giants, SK hynix and Samsung, have been racing to start HBM4 production.

Taipei-based research firm TrendForce predicts that memory chip industry revenue will surge to a global peak of more than $840 billion in 2027.

The South Korean government has pledged to become one of the world's top three AI powers, alongside the United States and China.

Samsung and SK hynix are among the leading producers of high-performance memory chips.

Siemens Energy Trebles Profit as AI Boosts Power Demand

FILED - 05 August 2025, Berlin: The "Siemens Energy" logo can be seen in the entrance area of the company. Photo: Britta Pedersen/dpa

Asharq Al Awsat

10:40-11 February 2026 AD ـ 23 Sha’ban 1447 AH

German turbine maker Siemens Energy said Wednesday that its quarterly profits had almost tripled as the firm gains from surging demand for electricity driven by the artificial intelligence boom.

The company's gas turbines are used to generate electricity for data centers that provide computing power for AI, and have been in hot demand as US tech giants like OpenAI and Meta rapidly build more of the sites.

Net profit in the group's fiscal first quarter, to end-December, climbed to 746 million euros ($889 million) from 252 million euros a year earlier.

Orders -- an indicator of future sales -- increased by a third to 17.6 billion euros.

The company's shares rose over five percent in Frankfurt trading, putting the stock up about a quarter since the start of the year and making it the best performer to date in Germany's blue-chip DAX index.

"Siemens Energy ticked all of the major boxes that investors were looking for with these results," Morgan Stanley analysts wrote in a note, adding that the company's gas turbine orders were "exceptionally strong".

US data center electricity consumption is projected to more than triple by 2035, according to the International Energy Agency, and already accounts for six to eight percent of US electricity use.

Asked about rising orders on an earnings call, Siemens Energy CEO Christian Bruch said he thought the first-quarter figures were not "particularly strong" and that further growth could be expected.

"Demand for gas turbines is extremely high," he said. "We're talking about 2029 and 2030 for delivery dates."

Siemens Energy, spun out of the broader Siemens group in 2020, said last week that it would spend $1 billion expanding its US operations, including a new equipment plant in Mississippi as part of wider plans that would create 1,500 jobs.

Its shares have increased over tenfold since 2023, when the German government had to provide the firm with credit guarantees after quality problems at its wind-turbine unit.
