Catch Me If You Can: Deceiving Stance Detection and Geotagging Models to Protect Privacy of Individuals on Twitter

07/23/2022
by Dilara Dogan, et al.

Recent advances in natural language processing have yielded many exciting developments in text analysis and language understanding models; however, these models can also be used to track people, raising severe privacy concerns. In this work, we investigate what individuals can do to avoid being detected by those models while using social media platforms. We ground our investigation in two tasks that carry a high exposure risk: stance detection and geotagging. We explore a variety of simple techniques for modifying text, such as inserting typos in salient words, paraphrasing, and adding dummy social media posts. Our experiments show that the performance of BERT-based models fine-tuned for stance detection decreases significantly due to typos, but is not affected by paraphrasing. Moreover, we find that typos have minimal impact on state-of-the-art geotagging models due to their increased reliance on social networks; however, we show that users can deceive those models by interacting with different users, reducing their performance by almost 50%.
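As an illustration of the "typos in salient words" idea, the following is a minimal Python sketch, not the authors' implementation. The abstract does not say how salient words are chosen or which kind of typo is inserted, so both are assumptions here: the salient words are passed in as a hand-picked set, and the typo is a simple swap of two adjacent interior letters. The function names `swap_typo` and `perturb` are hypothetical.

```python
import re


def swap_typo(word):
    """Swap the first pair of differing adjacent interior letters.

    The first and last letters stay in place so the word remains readable
    to a human. Words that are too short are returned unchanged.
    """
    for i in range(1, len(word) - 2):
        if word[i] != word[i + 1]:
            return word[:i] + word[i + 1] + word[i] + word[i + 2:]
    return word


def perturb(text, salient):
    """Insert a typo into every alphabetic token found in `salient`.

    `salient` is a set of lowercase words assumed to matter to a classifier;
    how to pick them is outside the scope of this sketch.
    """
    def repl(match):
        token = match.group(0)
        return swap_typo(token) if token.lower() in salient else token

    return re.sub(r"[A-Za-z]+", repl, text)


if __name__ == "__main__":
    print(perturb("Vaccine mandates are wrong", {"vaccine", "mandates"}))
```

A swap of this kind changes how a subword tokenizer splits the word, which is one plausible reason typos could hurt a BERT-based classifier; whether that matches the mechanism studied in the paper is not stated in the abstract.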


research
05/09/2023

Detection of depression on social networks using transformers and ensembles

As the impact of technology on our lives is increasing, we witness incre...
research
09/21/2022

SMTCE: A Social Media Text Classification Evaluation Benchmark and BERTology Models for Vietnamese

Text classification is a typical natural language processing or computat...
research
10/02/2019

Neural Word Decomposition Models for Abusive Language Detection

User generated text on social media often suffers from a lot of undesire...
research
12/14/2022

ReDDIT: Regret Detection and Domain Identification from Text

In this paper, we present a study of regret and its expression on social...
research
11/16/2022

#maskUp: Selective Attribute Encryption for Sensitive Vocalization for English language on Social Media Platforms

Social media has become a platform for people to stand up and raise thei...
research
06/03/2016

Using Neural Generative Models to Release Synthetic Twitter Corpora with Reduced Stylometric Identifiability of Users

We present a method for generating synthetic versions of Twitter data us...
research
08/19/2023

HICL: Hashtag-Driven In-Context Learning for Social Media Natural Language Understanding

Natural language understanding (NLU) is integral to various social media...
