I'm a Doctoral Researcher in the Computational Social Science (CSS) Department at GESIS - Leibniz Institute for the Social Sciences. I'm fortunate to be advised by Prof. Claudia Wagner. My work lies at the intersection of NLP and social science, and I am particularly interested in problems related to hate and abusive speech on online social media platforms. My research objectives and interests are outlined here.

Before GESIS, I was a research intern in Prof. Rajesh Sharma's Computational Social Science Group. If you'd like to chat with me, feel free to schedule a meeting through my Microsoft Calendar appointment link.

Click for a surprise! 🤗

NerdTests.com says I'm an Uber-Dorky High Nerd.

News

Publications

ProvocationProbe: Instigating Hate Speech Dataset from Twitter

Abhay Kumar, Vigneshwaran Shankaran, Rajesh Sharma
Preprint
Paper

In recent years, online social media platforms have been flooded with hateful remarks such as racism, sexism, and homophobia. As a result, various social media platforms have taken measures to mitigate the spread of hate speech over the internet. One particular concept within the domain of hate speech is instigating hate, which involves provoking hatred against a particular community, race, colour, gender, religion, or ethnicity. In this work, we introduce ProvocationProbe, a dataset designed to explore what distinguishes instigating hate speech from general hate speech. For this study, we collected around twenty thousand tweets from Twitter, encompassing a total of nine global controversies spanning themes including racism, politics, and religion. In this paper, i) we present an annotated dataset after a comprehensive examination of all the controversies, and ii) we highlight the difference between hate speech and instigating hate speech by identifying distinguishing features, such as targeted identity attacks and reasons for hate.

Analyzing Toxicity in Deep Conversations: A Reddit Case Study

Vigneshwaran Shankaran, Rajesh Sharma
Preprint
Paper Data

Online social media has become increasingly popular in recent years due to its ease of access and ability to connect people. One of social media's main draws is its anonymity, allowing users to share their thoughts and opinions without fear of judgment or retribution. This anonymity has also made social media prone to harmful content, which requires moderation to ensure responsible and productive use. Several methods using artificial intelligence have been employed to detect harmful content. However, conversational and contextual analysis of hate speech is still understudied: most promising works analyze only a single text at a time rather than the conversation surrounding it. In this work, we employ a tree-based approach to understand how users behave with respect to toxicity in public conversation settings. To this end, we collect both the posts and the comment sections of the top 100 posts from 8 Reddit communities that allow profanity, totaling over 1 million responses. We find that toxic comments increase the likelihood of subsequent toxic comments in online conversations. Our analysis also shows that the immediate context, rather than the original post, plays a vital role in shaping a response. We also study the effect of consensual profanity and observe overlapping similarities with non-consensual profanity in terms of user behavior and patterns.
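The tree-based framing can be made concrete with a small sketch. Assuming each comment carries a toxicity score from some external classifier (the data structure, scores, and toy thread below are purely illustrative and not the paper's actual code or data), one can walk the conversation tree and compare reply toxicity conditioned on whether the parent comment was toxic:

```python
# Illustrative sketch only: bucket each reply by whether its parent was toxic,
# then compare average reply toxicity across the two buckets.
from dataclasses import dataclass, field
from typing import List

@dataclass
class Comment:
    text: str
    toxicity: float                      # assumed score in [0, 1] from some classifier
    replies: List["Comment"] = field(default_factory=list)

def reply_toxicity_by_parent(root: Comment, threshold: float = 0.5):
    """Walk the comment tree and split replies by the toxicity of their parent."""
    after_toxic, after_clean = [], []
    stack = [root]
    while stack:
        node = stack.pop()
        for child in node.replies:
            bucket = after_toxic if node.toxicity >= threshold else after_clean
            bucket.append(child.toxicity)
            stack.append(child)
    return after_toxic, after_clean

# Tiny hypothetical thread for illustration.
thread = Comment("original post", 0.1, [
    Comment("hostile reply", 0.8, [Comment("escalating reply", 0.9)]),
    Comment("civil reply", 0.2, [Comment("civil follow-up", 0.1)]),
])

toxic_ctx, clean_ctx = reply_toxicity_by_parent(thread)
print("mean toxicity after toxic parents:", sum(toxic_ctx) / len(toxic_ctx))
print("mean toxicity after clean parents:", sum(clean_ctx) / len(clean_ctx))
```

In this toy example the replies to toxic parents score higher on average than replies to clean parents, which is the kind of conditional comparison the study's finding about immediate context refers to.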

Revisiting The Classics: A Study on Identifying and Rectifying Gender Stereotypes in Rhymes and Poems

Aditya Narayan Sankaran*, Vigneshwaran Shankaran*, Sampath Lonka, Rajesh Sharma
LREC-COLING'24
Paper Data

Rhymes and poems are a powerful medium for transmitting cultural norms and societal roles. However, the pervasive existence of gender stereotypes in these works perpetuates biased perceptions and limits the scope of individuals' identities. Past work has shown that stereotyping and prejudice emerge in early childhood, and developmental research on causal mechanisms is critical for understanding and controlling them. This work contributes by gathering a dataset of rhymes and poems to identify gender stereotypes. We then propose a model with 97% accuracy to identify such stereotypes. Gender stereotypes were rectified using a Large Language Model (LLM) and by human educators, with a survey comparing their effectiveness. The findings highlight the pervasive nature of gender stereotypes in literary works, reveal the potential of LLMs in rectifying them, and encourage further research in this area. This study raises awareness and promotes inclusivity within artistic expressions, making a significant contribution to the discourse on gender equality.

Misinformation Concierge: A proof-of-concept with curated Twitter dataset on COVID-19 vaccination

Shakshi Sharma, Anwitaman Datta, Vigneshwaran Shankaran, Rajesh Sharma
CIKM'23 Demo track
Paper

We demonstrate the Misinformation Concierge, a tool that provides actionable intelligence on misinformation prevalent in social media. Specifically, it uses language processing and machine learning tools to identify subtopics of discourse and to distinguish misleading from non-misleading posts; presents statistical reports so that policy-makers can understand the big picture of prevalent misinformation in a timely manner; and recommends rebuttal messages for specific pieces of misinformation identified within the corpus, providing a means to intervene and counter misinformation promptly. The Misinformation Concierge proof-of-concept using a curated dataset is accessible at: https://demo-frontend-uy34.onrender.com/
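For intuition, here is a minimal sketch of the kind of pipeline the abstract describes: subtopic discovery, misleading-post detection, and a per-subtopic summary. The toy data, cluster count, and classifier choice are illustrative assumptions, not the Concierge's actual implementation.

```python
# Minimal illustrative pipeline: TF-IDF features, clustering for subtopics,
# and a supervised classifier for misleading/non-misleading posts.
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

posts = [
    "vaccine side effects are being hidden",          # invented example tweets
    "clinical trial results published in a journal",
    "microchips in vaccines confirmed",
    "second dose appointment booked today",
]
labels = [1, 0, 1, 0]   # 1 = misleading, 0 = not (toy labels)

# 1) Subtopic discovery: cluster TF-IDF representations of the posts.
vectorizer = TfidfVectorizer(stop_words="english")
X = vectorizer.fit_transform(posts)
subtopics = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)

# 2) Misleading-post detection: a simple supervised classifier on the same features.
clf = LogisticRegression().fit(X, labels)
flags = clf.predict(X)

# 3) Report: share of posts flagged as misleading within each subtopic.
for topic in sorted(set(subtopics)):
    idx = [i for i, t in enumerate(subtopics) if t == topic]
    share = sum(flags[i] for i in idx) / len(idx)
    print(f"subtopic {topic}: {share:.0%} flagged as misleading")
```

The per-subtopic report is the piece aimed at policy-makers; the rebuttal-recommendation step of the actual tool is omitted here, since it depends on a curated corpus of fact-checks not shown in this sketch.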

Miscellaneous

Last Updated: 13 December 2024