Detection and Analysis of Offensive Online Content in Hausa Language

doi:10.21203/rs.3.rs-4266465/v2

Research Article

https://doi.org/10.21203/rs.3.rs-4266465/v2

This work is licensed under a CC BY 4.0 License

Version 2

Hausa, a major Chadic language spoken by over 100 million people in Africa, is widely used yet remains a low-resource language from a computational linguistics perspective: few resources and tools exist for analysing Hausa text, which makes offensive and threatening language online difficult to detect. Our study aimed to bridge this gap. We conducted two user studies (n = 180) to understand cyberbullying in Hausa, then created the first dataset of offensive and threatening Hausa phrases for training detection systems. We developed a system to flag such content and compared it with the ability of Google's translation engine to detect the same terms. Our findings show that offensive and threatening language is prevalent online, especially in discussions of religion and politics. Our detection system identified more than 70% of offensive and threatening content, although many of these items were mistranslated by Google's translation engine. We attribute these mistranslations to the subtle relationship between offensive and threatening content and idiomatic expressions in Hausa, which highlights the importance of accounting for cultural nuances and idioms. To create a safer online environment for Hausa speakers, we recommend involving diverse stakeholders who understand local contexts and demographics. Their involvement would support the development of more accurate detection systems and targeted moderation strategies.

Trigger Warning: Readers may find some of the terms in this study distressing or disturbing; all examples are for illustration only.

Low-Resource Language

Hausa Language

Offensive and

Cyberbullying

Online Social Media

The authors declare no competing interests.
