Dark patterns in e-commerce: a dataset and its baseline evaluations

11/12/2022
by Yuki Yada, et al.

Dark patterns are user interface designs in online services that induce users to take actions they did not intend. They have recently been raised as an issue of privacy and fairness, which makes research on detecting them important. In this work, we constructed a dataset for dark pattern detection and established baseline detection performance on it with state-of-the-art machine learning methods. The original dataset was obtained from the 2019 study by Mathur et al. and consists of 1,818 dark pattern texts from shopping sites. We then added negative samples, i.e., non-dark pattern texts, by retrieving texts from the same websites covered by Mathur et al.'s dataset. As baselines for automatic detection, we applied BERT, RoBERTa, ALBERT, and XLNet. In 5-fold cross-validation, RoBERTa achieved the highest accuracy, 0.975. The dataset and baseline source code are available at https://github.com/yamanalab/ec-darkpattern.
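The 5-fold cross-validation protocol mentioned in the abstract can be illustrated with a minimal, standard-library-only sketch. This is not the authors' code: the helper names (`stratified_k_fold`, `cross_validate`, `train_keyword_baseline`), the toy texts, and the keyword classifier are all assumptions made for illustration, and the keyword classifier is only a stand-in for the fine-tuned transformer models (BERT, RoBERTa, ALBERT, XLNet) the paper evaluates.

```python
import random
from typing import Callable, Dict, List, Sequence

# Hypothetical toy data: 1 = dark pattern text, 0 = non-dark pattern text.
TOY_TEXTS = [f"hurry only a few left in stock offer {chr(97 + i)}" for i in range(10)]
TOY_TEXTS += [f"product ships in two days item {chr(107 + i)}" for i in range(10)]
TOY_LABELS = [1] * 10 + [0] * 10


def stratified_k_fold(labels: Sequence[int], k: int = 5, seed: int = 0) -> List[List[int]]:
    """Return k lists of test indices, spreading each class evenly over the folds."""
    rng = random.Random(seed)
    by_class: Dict[int, List[int]] = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    folds: List[List[int]] = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)
    return folds


def train_keyword_baseline(texts: Sequence[str], labels: Sequence[int]) -> Callable[[str], int]:
    """Placeholder classifier (an assumption, not a model from the paper):
    flag a text if it contains a word seen only in positive training texts."""
    positive_words = {w for t, y in zip(texts, labels) if y == 1 for w in t.split()}
    negative_words = {w for t, y in zip(texts, labels) if y == 0 for w in t.split()}
    cue_words = positive_words - negative_words
    return lambda text: int(any(w in cue_words for w in text.split()))


def cross_validate(texts, labels, train_fn, k: int = 5, seed: int = 0) -> float:
    """Train on k-1 folds, test on the held-out fold, and average the accuracies."""
    accuracies = []
    for test_idx in stratified_k_fold(labels, k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(texts)) if i not in held_out]
        predict = train_fn([texts[i] for i in train_idx], [labels[i] for i in train_idx])
        correct = sum(predict(texts[i]) == labels[i] for i in test_idx)
        accuracies.append(correct / len(test_idx))
    return sum(accuracies) / k


mean_accuracy = cross_validate(TOY_TEXTS, TOY_LABELS, train_keyword_baseline)
print(f"mean accuracy over 5 folds: {mean_accuracy:.3f}")
```

Replacing `train_keyword_baseline` with a function that fine-tunes a pretrained model and returns its prediction function would leave the evaluation loop unchanged. The fold assignment is stratified here so that every fold keeps the same class ratio; the abstract does not say whether the authors stratified their folds.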


