The Fallacy of Cyberbullying Detection Systems

Abstract: Automatic cyberbullying detection is a task of growing interest in the Natural Language Processing and Machine Learning communities. Not only is it challenging, but it is also important, given how social networks have become a vital part of people’s lives and how dire the consequences of cyberbullying can be. In this work, we review the current state of the art and, grounded in a theoretical background on cyberbullying as a phenomenon and an experiment to validate current practices, we infer that cyberbullying is often misrepresented in the literature, leading to systems that would have little real-world application. Additionally, there is no uniformity in the methodology used to evaluate these systems, and the natural imbalance of datasets remains an issue. This paper aims not only to be an in-depth survey of automatic cyberbullying detection, but also to direct future research on the subject towards a viewpoint that is more coherent with the definition and representation of the phenomenon, so that future systems can have a practical and impactful application.