Prevention of online harassment requires rapid detection of offensive, harassing, and negative social media posts, which in turn requires monitoring online interactions.
Current methods for collecting such social media data are either fully automated and not interpretable, or rely on a static set of keywords that can quickly become outdated.
Neither method is very effective, according to Maya Srikanth, from California Institute of Technology (Caltech) in the US.
“On the other hand, keyword searching suffers from the speed at which online conversations evolve. New terms crop up and old terms change meaning, so a keyword that was used sincerely one day might be meant sarcastically the next,” she said.
The team, including Anima Anandkumar from Caltech, used the GloVe (Global Vectors for Word Representation) model, which uses machine-learning algorithms to discover new and relevant keywords.
Machine learning is an application of AI that gives systems the ability to learn and improve from experience automatically, without being explicitly programmed.
GloVe is a word-embedding model, meaning that it represents words in a vector space, where the “distance” between two words is a measure of their linguistic or semantic similarity.
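To make the idea of "distance" concrete, here is a minimal sketch in Python. It assumes cosine similarity as the measure, which is a common choice for word embeddings but is not confirmed by the article as the one the researchers used. The three-dimensional vectors are invented for illustration and are not real GloVe values.

```python
import math


def cosine_similarity(u, v):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Hypothetical toy vectors (assumption): real GloVe vectors typically
# have tens to hundreds of dimensions and are learned from text.
toy_vectors = {
    "metoo": [0.9, 0.1, 0.2],
    "supportsurvivors": [0.8, 0.2, 0.3],
    "weather": [0.1, 0.9, 0.1],
}

related = cosine_similarity(toy_vectors["metoo"], toy_vectors["supportsurvivors"])
unrelated = cosine_similarity(toy_vectors["metoo"], toy_vectors["weather"])
print(f"metoo vs supportsurvivors: {related:.3f}")
print(f"metoo vs weather: {unrelated:.3f}")
```

A score near 1 means the two vectors point in almost the same direction, which the model treats as a sign that the words are used in similar ways.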
Starting with one keyword, this model can be used to find others that are closely related to it, revealing clusters of relevant terms that are actually in use.
For example, searching Twitter for uses of “MeToo” in conversations yielded clusters of related hashtags like “SupportSurvivors,” “ImWithHer,” and “NotSilent.”
This approach gives researchers a dynamic and ever-evolving keyword set to search.
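The keyword expansion described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the researchers' actual procedure: the function name `expand_keywords`, the similarity threshold, and the toy vectors are all hypothetical, and cosine similarity is assumed as the closeness measure.

```python
import math


def cosine_similarity(u, v):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def expand_keywords(seed, vectors, min_similarity=0.95):
    """Grow a keyword set from one seed by repeatedly adding close neighbors.

    A word joins the set if it is similar enough to any word already in it,
    so related terms can be reached in several hops from the seed.
    """
    found = {seed}
    frontier = [seed]
    while frontier:
        current = frontier.pop()
        for other, vec in vectors.items():
            if other in found:
                continue
            if cosine_similarity(vectors[current], vec) >= min_similarity:
                found.add(other)
                frontier.append(other)
    return found


# Hypothetical toy vectors (assumption), invented for illustration only.
toy_vectors = {
    "metoo": [0.9, 0.1, 0.2],
    "supportsurvivors": [0.8, 0.2, 0.3],
    "notsilent": [0.85, 0.15, 0.1],
    "imwithher": [0.7, 0.3, 0.3],
    "weather": [0.1, 0.9, 0.1],
    "football": [0.2, 0.8, 0.3],
}

print(sorted(expand_keywords("metoo", toy_vectors)))
```

Because the embedding can be retrained on fresh posts, rerunning a search like this would pick up newly popular terms. The multi-hop design is one possible choice; it widens coverage but can also drift away from the original topic if the threshold is set too low.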
However, it is not enough just to know whether a certain conversation is related to the topic of interest; context matters, the researchers said.
For that, GloVe shows the extent to which certain keywords are related, providing input on how they are being used.
For example, in an online Reddit forum dedicated to misogyny, the word “female” was used in close association with the words “sexual,” “negative,” and “intercourse.”
In Twitter posts about the #MeToo movement, the word “female” was more likely to be associated with the terms “companies,” “desire,” and “victims.”
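One way to picture this kind of comparison is to train a separate embedding on each community's posts and then list the nearest neighbors of the same word in each. The sketch below assumes that setup, which the article does not spell out. The two-dimensional vectors and the names `forum_vectors`, `twitter_vectors`, and `top_associations` are hypothetical and invented for illustration; they are not the study's data.

```python
import math


def cosine_similarity(u, v):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def top_associations(word, vectors, k=2):
    """Return the k words whose vectors are most similar to `word`."""
    target = vectors[word]
    scored = [
        (cosine_similarity(target, vec), other)
        for other, vec in vectors.items()
        if other != word
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [other for _, other in scored[:k]]


# Hypothetical toy embeddings (assumption): same vocabulary, but each table
# stands in for a model trained on a different community's posts.
forum_vectors = {
    "female": [1.0, 0.0],
    "negative": [0.9, 0.1],
    "sexual": [0.95, 0.05],
    "companies": [0.1, 1.0],
    "victims": [0.2, 0.9],
}
twitter_vectors = {
    "female": [0.1, 1.0],
    "negative": [0.9, 0.2],
    "sexual": [1.0, 0.1],
    "companies": [0.2, 0.9],
    "victims": [0.1, 0.95],
}

print("forum:", top_associations("female", forum_vectors))
print("twitter:", top_associations("female", twitter_vectors))
```

The same word ends up with different nearest neighbors in each table, which is the kind of signal that would show how a term is being used in a given community.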