Frozen Binomials on the Web: Word Ordering and Language Conventions in Online Text

03/07/2020
by Katherine Van Koevering, et al.

There is inherent information captured in the order in which we write words in a list. The ordering of binomials (lists of two words separated by 'and' or 'or') has been studied for more than a century. These binomials are common across many areas of speech, in both formal and informal text. In the last century, numerous explanations have been given to describe what order people use for these binomials, from differences in semantics to differences in phonology. These rules primarily describe 'frozen' binomials, which exist in exactly one ordering, and they have lacked large-scale trials to determine their efficacy. Online text provides a unique opportunity to study these lists in the context of informal text at a very large scale. In this work, we expand the view of binomials to include a large-scale, quantitative analysis of both frozen and non-frozen binomials. Using this data, we then demonstrate that most previously proposed rules are ineffective at predicting binomial ordering. By tracking the order of these binomials across time and communities, we are able to establish additional, unexplored dimensions central to these predictions. Expanding beyond the question of individual binomials, we also explore the global structure of binomials in various communities, establishing a new model for these lists and analyzing this structure for non-frozen and frozen binomials. Additionally, a novel analysis of trinomials (lists of length three) suggests that none of the binomial analysis applies in these cases. Finally, we demonstrate how large data sets gleaned from the web can be used in conjunction with older theories to expand and improve on old questions.
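
To make the notion of frozen and non-frozen binomials concrete, the following is a minimal sketch of how ordering counts might be gathered from raw text. It is an illustration only and not the method used in the paper: the regular expression, the function names `count_binomials` and `ordering_fraction`, and the sample sentence are all assumptions introduced here. A pair whose fraction is exactly 1.0 or 0.0 appears in only one order in the sample, which is the sense in which the abstract calls a binomial frozen.

```python
import re
from collections import Counter

# Rough heuristic (an assumption, not the paper's extraction method):
# a lowercase word, then "and" or "or", then another lowercase word.
PATTERN = re.compile(r"\b([a-z]+) (and|or) ([a-z]+)\b")


def count_binomials(text):
    """Count ordered (first, second) word pairs joined by 'and' or 'or'."""
    counts = Counter()
    for first, _conj, second in PATTERN.findall(text.lower()):
        counts[(first, second)] += 1
    return counts


def ordering_fraction(counts, a, b):
    """Fraction of occurrences of the pair {a, b} written in the order (a, b).

    Returns None when the pair never occurs in either order.
    """
    ab = counts[(a, b)]
    ba = counts[(b, a)]
    total = ab + ba
    return ab / total if total else None


# Hypothetical sample text, used only to exercise the functions.
sample = (
    "Pass the salt and pepper. I like pepper and salt too. "
    "Salt and pepper again. Cats or dogs?"
)
counts = count_binomials(sample)
salt_pepper = ordering_fraction(counts, "salt", "pepper")
cats_dogs = ordering_fraction(counts, "cats", "dogs")
```

In this sample, "salt"/"pepper" occurs in both orders and so would count as non-frozen, while "cats"/"dogs" occurs in a single order. A real study would need a far larger corpus and a more careful extraction step (for example, part-of-speech filtering) before drawing any such conclusion.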

