Large language models shape and are shaped by society: A survey of arXiv publication patterns

by Rajiv Movva, et al.

There has been a steep recent increase in the number of large language model (LLM) papers, producing a dramatic shift in the scientific landscape which remains largely undocumented through bibliometric analysis. Here, we analyze 388K papers posted on the CS and Stat arXivs, focusing on changes in publication patterns in 2023 vs. 2018-2022. We analyze how the proportion of LLM papers is increasing; the LLM-related topics receiving the most attention; the authors writing LLM papers; how authors' research topics correlate with their backgrounds; the factors distinguishing highly cited LLM papers; and the patterns of international collaboration. We show that LLM research increasingly focuses on societal impacts: there has been an 18x increase in the proportion of LLM-related papers on the Computers and Society sub-arXiv, and authors newly publishing on LLMs are more likely to focus on applications and societal impacts than more experienced authors. LLM research is also shaped by social dynamics: we document gender and academic/industry disparities in the topics LLM authors focus on, and a US/China schism in the collaboration network. Overall, our analysis documents the profound ways in which LLM research both shapes and is shaped by society, attesting to the necessity of sociotechnical lenses.




