From Swahili to Zulu, African Techies Develop AI Language Tools

Figurines with computers and smartphones are seen in front of the words "Artificial Intelligence AI" in this illustration taken, February 19, 2024. (Reuters)

When the Nigerian government announced plans in April to develop a multilingual AI tool to boost digital inclusion across the West African nation, 28-year-old computer science student Lwasinam Lenham Dilli was thrilled.

Dilli had struggled to scrape datasets from the internet to build a large language model (LLM), used to power AI chatbots, in his native Hausa language as part of his final-year project at university.

"I needed texts in English and their corresponding translation in Hausa but I couldn't get anything online, (there was) no clean data," Dilli told the Thomson Reuters Foundation.

"(Creating local language LLMs) is a way to ensure that our local dialects and languages will not be forgotten or left out of the AI ecosystem," he added.

The world has been swept up in a whirlwind of AI mania, with tools such as OpenAI's ChatGPT, Meta's Llama 2, and Mistral AI captivating millions globally with their ability to generate human-like text.

But for many tech-savvy Africans, the excitement has been tempered by a frustrating reality: when languages like Hausa, Amharic, or Kinyarwanda are entered into the chat, many of these advanced systems falter, often producing nonsensical responses.

Technology experts warn the lack of LLMs in African languages will lead to the exclusion of millions of people on the continent, increasing both the digital and economic divide.

The Nigerian government-led initiative to develop a multilingual LLM aims to level the playing field.

"The LLM will be trained on five low-resource languages and accented English to ensure stronger language representation ... for development of artificial intelligence solutions," said Nigeria's Digital Economy Minister Bosun Tijani in April.

The government will partner with Nigerian AI startups, and local data will be collected by volunteers who are fluent in any of five Nigerian languages: Yoruba, Hausa, Igbo, Ibibio, and Pidgin, a West African lingua franca.

To build the model, the project will also draw on the expertise of more than 7,000 fellows from Nigeria's tech talent program - a government scheme to train three million people in skills such as coding and programming.

Silas Adekunle, co-founder of Awarri, an AI startup that is part of the initiative, said building a nuanced AI tool that understood Nigeria's unique language and cultural landscape presented many challenges.

"We have so many different accents and languages, and this (LLM) will enable many people and developers to build products that leverage AI but are for the Nigerian market," said Adekunle.

"The scale of the project, especially with limited resources, has required us to be creative in how we train the model, gather the data, compute and label what we have."

CLOSING THE AI LANGUAGE GAP

Africa is home to more than 2,000 languages spoken across 54 countries, according to the United Nations Educational, Scientific and Cultural Organization (UNESCO).

However, the majority of African languages remain underrepresented on the internet. English dominates the digital space, accounting for around 50% of all websites, followed by Spanish, German, Japanese, and French.

Along with the Nigerian government initiative, a small but growing number of African startups are rising to the challenge of developing AI tools in languages like Swahili, Amharic, Zulu and Sesotho.

In Kenya, for instance, health tech firm Jacaranda Health has pioneered the first LLM operating in Swahili to improve maternal healthcare in East Africa.

Built on Meta's Llama 3 system, UlizaLlama (AskLlama) aims to refine Jacaranda Health's SMS service for low-income Swahili-speaking expectant mothers who have queries ranging from dietary concerns and fetal movement to exercise during pregnancy.

The platform currently provides pre-written automated responses, but once UlizaLlama is integrated by the end of June, it will tailor responses to individual needs, offering more detailed pregnancy guidance and emergency support.

"A lot of these expectant moms can't just do a Google search. UlizaLlama's goal is to make sure that we get them the accurate answers in the fastest possible time," Jay Patel, Jacaranda Health's director of technology, told the Thomson Reuters Foundation.

"We're shooting for about 85% accuracy to start with and a faster response time. At the moment, it takes a few minutes to respond, but we are hoping to get that down to less than a minute in the future."

In South Africa, the Masakhane initiative is using open-source machine learning to translate African languages.

Lelapa AI, a South African AI research lab, has pioneered VulaVula, a for-profit language processing tool that translates, transcribes and analyses content in English, Afrikaans, Zulu and Sesotho.

DATA SCARCITY, ETHICAL CONCERNS

But AI experts say building LLMs in African languages poses significant challenges, ranging from availability of data to ethical concerns over consent, compensation and copyright.

Many African languages are low-resource languages, meaning there is a scarcity of data to train these models effectively - unlike high-resource languages such as English or French.

Michael Michie, co-founder of Everse Technology Africa, an AI startup building intelligence into data protection and privacy, said collecting the data needed to train LLMs also raised ethical questions.

In many African communities, oral tradition predominates, and some communities may not want to share their language to train LLMs, a choice that should be respected.

"There are currently no regulations or laws in African countries that address issues related to consent, privacy and compensation to communities when collecting data to train AI tools - this needs to be addressed," said Michie.

"There are questions of who owns the language and who benefits. There needs to be guidelines to prevent exploitation and ensure the development of these LLMs benefits the people they are meant to serve," he added.

Open-source initiatives like Creative Commons, which allow creators to legally share their work with specified conditions like ensuring attribution and non-commercial use, are also not a perfect solution, said some AI experts.

"At the moment there's this push of saying everything should just be under Creative Commons," said Vukosi Marivate, associate professor of computer science at the University of Pretoria and co-founder of Lelapa AI.

But if everything is open source, it may be harder to properly reimburse and acknowledge the original contributors to these language models, he said.

"A lot of people are working on LLMs now because of the prestige, that's where the money is, but we need to make sure that our languages are actually being taken care of."



Alswaha: Saudi Arabia Leads International Indicators, Efforts to Bridge AI Gaps

Saudi Minister of Communications and Information Technology Abdullah Alswaha speaks at the event in New York. (SPA)

Saudi Minister of Communications and Information Technology Abdullah Alswaha stressed on Tuesday that the Kingdom’s achievements represent the greatest digital success story of the 21st century.

This was made possible by the support of Custodian of the Two Holy Mosques King Salman bin Abdulaziz Al Saud and the direct enablement by Prince Mohammed bin Salman bin Abdulaziz Al Saud, Crown Prince and Prime Minister, reflecting their ambitious vision for building a comprehensive technological future.

The minister made his remarks from New York during his participation in the high-level meeting of the United Nations General Assembly (UNGA) on the overall review of the implementation of the outcomes of the World Summit on the Information Society (WSIS).

Alswaha said that progress in the information society is reflected worldwide, with the number of internet users rising from around 800 million to nearly 6 billion.

The Kingdom ranked first globally on the ICT Development Index (IDI) issued by the UN International Telecommunication Union (ITU) and made remarkable progress in empowering women in the digital world, with female participation reaching approximately 36%, he revealed.

He highlighted that the foremost challenge today lies in bridging the gaps in artificial intelligence (AI), namely the computing gap, the data gap, and the algorithm gap.

Alswaha stated that the Kingdom leveraged its capabilities to boost advanced computing power and launch national language models that help close the data gap in the Arab world, including the AI model “ALLaM.”

Moreover, he noted global scientific achievements, such as Saudi scientist Omar Yaghi winning the 2025 Nobel Prize in Chemistry, reflecting Saudi Arabia’s scientific presence on the international stage.

He stressed that the achievements reflect the profound impact of the support from King Salman and Crown Prince Mohammed in consolidating the Kingdom’s global standing, enhancing its pivotal role in leading a more inclusive technological future, harnessing technologies for human benefit, supporting sustainable development, and aligning with the world’s aspirations for a more advanced and integrated era.


App Developers Urge EU Action on Apple Fee Practices 

An Apple logo adorns the façade of the downtown Brooklyn Apple store on March 14, 2020, in New York. (AP)

A coalition of 20 app developers and consumer groups on Tuesday called upon European regulators to enforce EU laws against Apple, saying the company's fee structure unfairly disadvantages European developers compared to their US rivals after a recent court decision in the United States.

The European Union's Digital Markets Act (DMA), implemented in 2023, mandates that large tech platforms labelled "gatekeepers", such as Apple, facilitate in-app transactions outside their ecosystem at no charge.

The coalition's appeal reflects concerns over a disparity following a US court ruling that restricts Apple's ability to impose fees on external transactions.

The European Commission earlier this year fined Apple 500 million euros ($588 million) for breaching the DMA by obstructing developers from guiding users to alternative payment methods.

In response to the EU ruling, Apple revised its terms to impose fees ranging from 13% for smaller businesses to up to 20% for App Store purchases, alongside penalties of 5% to 15% on external transactions.

The Coalition for App Fairness (CAF), representing firms such as Deezer and Proton, argues these revised fees still violate DMA stipulations and says that US developers benefit from more favorable terms after the court decision.

"This situation is untenable and damaging to the app economy," CAF said in a statement, accusing Apple of undermining transparency and stifling innovation.

Global Policy Counsel for CAF, Gene Burrus, said that developers in the EU have to either bear the cost of those fees or pass them down to customers.

"It is bad for European companies, and it is bad for European consumers," he said.

According to CAF, European developers remain disadvantaged six months after the Commission declared Apple's policies illegal under the DMA.

Although Apple has announced further policy changes to take effect in January, it has yet to specify what these revisions will entail, fueling dissatisfaction among developers over the lack of clarity.

"We want the EU Commission to tell Apple that the law is the law and that free of charge means free of charge," Burrus said, adding that the European authorities should consider referring the issue to the European Court of Justice if necessary.


Will OpenAI Be the Next Tech Giant or Next Netscape?

While OpenAI does not expect to be profitable before 2029, the startup's valuation keeps climbing in funding rounds, baffling some financial analysts. Kirill KUDRYAVTSEV / AFP

Three years after ChatGPT made OpenAI the leader in artificial intelligence and a household name, rivals have closed the gap and some investors are wondering if the sensation has the wherewithal to stay dominant.

Investor Michael Burry, made famous in the film "The Big Short," recently likened OpenAI to Netscape, which ruled the web browser market in the mid-1990s only to lose to Microsoft's Internet Explorer.

"OpenAI is the next Netscape, doomed and hemorrhaging cash," Burry said recently in a post on X, formerly Twitter.

Researcher Gary Marcus, known for being skeptical of AI hype, sees OpenAI as having lost the lead it captured with the launch of ChatGPT in November 2022.

The startup is "burning billions of dollars a month," Marcus said of OpenAI.

"Given how long the writing has been on the wall, I can only shake my head" as it falls.

Yet ChatGPT was a tech launch like no other, breaking all consumer product growth records and now boasting more than 800 million weekly users, counting both paid subscribers and unpaid users.

OpenAI's valuation has soared to $500 billion in funding rounds, higher than any other private company.

But the ChatGPT maker will end this year with a loss of several billion dollars and does not expect to be profitable before 2029, an eternity in the fast-moving and uncertain world of AI.

Nonetheless, the startup has committed to paying more than $1.4 trillion to computer chip makers and data center builders to build infrastructure it needs for AI.

The fierce cash burn is raising questions, especially since Google claims some 650 million people use its Gemini AI monthly and the tech giant has massive online ad revenue to back its spending on technology.

Rivals Amazon, Meta and OpenAI-investor Microsoft have deep pockets the ChatGPT-maker cannot match.

Turbulence ahead?

A charismatic salesman, OpenAI chief executive Sam Altman flashed rare annoyance when asked about the startup's multi-trillion-dollar contracts in early November.

A few days later, he warned internally that the startup is likely to face a "turbulent environment" and an "unfavorable economic climate," particularly given competitive pressure from Google.

And when Google released its latest model to positive reactions, Altman issued a "red alert," urging OpenAI teams to give ChatGPT their best efforts.

OpenAI unveiled its latest ChatGPT model last week and on the same day announced that Disney would invest in the startup and license characters for use in the bot and the Sora video-generating tool.

OpenAI's challenge is inspiring the confidence that the large sums of money it is investing will pay off, according to Foundation Capital partner Ashu Garg.

For now OpenAI is raising money at lofty valuations while returns on those investments are questionable, Garg added.

Yet OpenAI still has the faith of the world's deepest-pocketed investors.

"I'm always expecting OpenAI's valuation to come down because competition is coming and its capital structure is so obviously inappropriate," said Pluris Valuation Advisors president Espen Robak.

"But it only seems to be going up."

Opinions are mixed on whether the situation will lead OpenAI to postpone becoming a publicly traded company or instead to head to Wall Street faster to cash in on the AI euphoria.

Few AI industry analysts expect OpenAI to implode completely, since there is room in the market for several models to thrive.

"At the end of the day, it's not winner take all," said CFRA analyst Angelo Zino.

"All of these companies will take a piece of the pie, and the pie continues to get bigger," he said of AI industry frontrunners.

Also factored in is that while OpenAI has made dizzying financial commitments, terms of deals tend to be flexible and Microsoft is a major backer of the startup.