– the need to include African rural and other minorities
“While current debates around data justice and AI ethics are ongoing, we should bear in mind that we first need data. The representativity of minorities in data sets, as recognised by data justice thinkers, can only be achieved if there are sufficient data from minorities,” said Jean Louis Fendji of the School of Chemical Engineering and Mineral Industries at the University of Ngaoundéré in Cameroon.
“It’s not just about access – if your data is not included, you are excluded,” he added.
Fendji is an Iso Lomso fellow currently in his first residence at STIAS.
Digital divide
“2.7 billion! This figure represents one-third of humanity that remains offline, a strong indicator of the digital divide. Defined as the gap between demographics and regions with access to modern information and communication technology versus those without, the digital divide has long disadvantaged rural areas by limiting access to digital tools and the internet,” said Fendji.
This gap in access to ICT between demographic groups is based on income, race, geography and education, with the Global South the most excluded. Fendji noted that the share of the population using the internet ranges from just under 90% in the USA, Europe and Central Asia to just under 30% in Sub-Saharan Africa and South Asia, and that eight of the countries with the lowest internet penetration in 2021 were in Africa.
But not all connectivity is the same. A sizable portion of people in least-developed countries who are deemed to be connected have only Level-1 access, meaning limited access, usually due to poor connectivity, resource constraints, low literacy levels and a lack of relevant content. The situation is worse in rural areas and among rural populations.
“Meaningful connectivity ranges from internet to enrich day-to-day life to the internet as the backbone of society and of the future,” said Fendji.
In this regard very few African countries are likely to meet Sustainable Development Goal 9 which focuses on technological innovation.
AI divide
But when it comes to AI, the situation becomes even worse. “As technology progresses, this divide has evolved into the AI divide,” explained Fendji, “a phenomenon where disparities in access to AI technologies and the capacity to utilise them further widen the gap between urban and rural areas.”
Many of the countries with low internet access have also not been granted access to AI tools like ChatGPT. The reasons for restricting access are diverse and include technical challenges and resource limitations but, as of 2023, 16 African countries were not included. And it gets more complicated – AI models depend on the availability of training data. Generative AI (GAI) models are trained on huge data sets, from which they generate new content, including text, images, audio and video, contributing to the amplification of the datafication process.
“AI models are not trained with our data, they are trained with data from somewhere else. This leads to data bias, and possible algorithm bias.”
“This AI divide poses a greater threat to rural areas than the access divide,” said Fendji. He emphasised that what people need is not only access to the internet but relevant, context-specific information to improve their daily life. “The lived experiences and cultural artefacts and values of some communities will not be registered digitally and will be excluded from the synthetic world images generated by GAI systems,” he explained.
He illustrated the situation beautifully with the story of the goat and the paper which highlighted that even within a country different communities have very different life experiences. “In some villages in sub-Saharan Africa, goats are used to eating paper without being harmed. Such a truth is only known by a minority and may not be online since those who know this truth are mainly offline, or even online, but with a low digital literacy level that prevents them from producing content that can later be used in the development of AI systems.”
The story also emphasised that it’s not just any data but quality data that’s needed. Quoting Andrew Ng (co-founder of Google Brain, head of the AI Fund and one of the Time100 Most Influential People in AI in 2023), Fendji said: “We need better data, not bigger data.”
“There is a growing gap in terms of access, resources, capability, and knowledge to develop, adopt, and benefit from AI,” he continued. “And this is a data justice issue. We urgently need to raise awareness among marginalised communities, developers, researchers and decision-makers.”
And why does Fendji believe Africa and specifically rural areas matter? Because of the challenges and untapped potential. These areas face social, agricultural and environmental challenges that haven’t been addressed by existing AI solutions. Inclusion could lead to the development of innovative AI models specifically tailored to these contexts.
“Half of global population growth between now and 2050 is expected to occur in Africa,” he added. “Sub-Saharan Africa holds 60% of the world’s uncultivated arable land.” But improving productivity and thus food supply for the global population means understanding local information and collecting data on crop selection and rotation, soil analysis, and changing water parameters and including it in relevant, accessible AI interfaces.
Fendji’s work to date has been in agriculture and in education, where teacher shortages make it necessary to provide relevant information even via basic, ‘unintelligent’ devices – not the smartphones taken for granted in other areas of the world. He hopes to develop innovative AI interfaces that take cognisance of local realities. He has also been involved in efforts to transform multipurpose community telecentres into community networks, as well as in advocacy for a regulatory framework dedicated to community networks in Francophone countries in Africa.
Including oral languages
His STIAS project specifically looks at the language aspects. And Africa has many languages – for example, in Nigeria there are 520, in Cameroon 273 and in South Africa 20. And AI doesn’t ‘speak’ or include most of them.
Fendji pointed out that user interfaces are primarily text-based and not in local languages, which are most often oral. Recent advances in natural language processing mean that textless speech-to-speech is now possible, but up to now it has only been used with written languages. He aims to investigate the possibility of using an African oral language in a voice-based form using mobile phones. He will initially focus on the vocabulary used in agriculture but, if successful, the approach could be used in many other important areas.
He concluded with a heartfelt plea: “We need to increase collaborations. We must help rural areas survive. We don’t have a choice but to tackle the problem with small steps – and the first is collecting data. Change management is needed. Many people don’t even know what’s going on and everyone must be involved in the development of solutions if you want them to adopt them. Ownership is very important. You have to involve people from the beginning.”
“It’s about lack of infrastructure and investment and also sometimes about government vision – they could solve more problems if they wanted to. We have to try to reduce the harm and growing injustice for such communities.”
Michelle Galloway: Part-time media officer at STIAS
Photograph: Noloyiso Mtembu