A Dataset of Enterprise-Driven Open Source Software

02/10/2020
by Diomidis Spinellis, et al.

We present a dataset of open source software developed mainly by enterprises rather than volunteers. It can be used to address known generalizability concerns and to perform research on open source business software development. Based on the premise that an enterprise's employees are likely to contribute to a project developed by their organization using the email account it provides, we mine domain names associated with enterprises from open data sources as well as through white- and blacklisting, and apply them through three heuristics to identify 17,252 enterprise GitHub projects. We provide these as a dataset detailing their provenance and properties. A manual evaluation of a dataset sample shows an identification accuracy of 89%. Through data analysis we found that projects are staffed by a plurality of enterprise insiders, who appear to be pulling more than their weight, and that in a small percentage of relatively large projects development happens exclusively through enterprise insiders.
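The abstract's core premise, that employees commit with their employer's email account, can be illustrated with a small sketch. This is not one of the paper's three heuristics, which the abstract does not spell out. The domain lists, the function names, and the 50% threshold below are all hypothetical assumptions made for illustration only.

```python
from collections import Counter

# Hypothetical lists: the dataset's real domain sources are not reproduced here.
ENTERPRISE_DOMAINS = {"example-corp.com", "example-labs.io"}
BLACKLISTED_DOMAINS = {"gmail.com", "users.noreply.github.com"}


def email_domain(email):
    """Return the lower-cased domain of an email address, or None if malformed."""
    _, sep, domain = email.strip().rpartition("@")
    return domain.lower() if sep and domain else None


def dominant_enterprise_domain(author_emails, min_share=0.5):
    """Return (domain, share) if one enterprise domain accounts for at least
    `min_share` of the commit author emails, else None.
    The threshold is an assumption, not a value taken from the paper."""
    domains = [email_domain(e) for e in author_emails]
    counts = Counter(
        d for d in domains
        if d in ENTERPRISE_DOMAINS and d not in BLACKLISTED_DOMAINS
    )
    if not counts:
        return None
    domain, hits = counts.most_common(1)[0]
    share = hits / len(author_emails)
    return (domain, share) if share >= min_share else None
```

Under these assumptions, a project would be flagged as a candidate enterprise project when `dominant_enterprise_domain` returns a domain for its list of commit author emails; a real pipeline would still need the manual validation the abstract mentions.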

research
04/22/2019

Why Software Projects need Heroes (Lessons Learned from 1100+ Projects)

A "hero" project is one where 80% [...] the 20% [...] since they might cause bottlenec...
research
07/25/2023

BotHawk: An Approach for Bots Detection in Open Source Software Projects

Social coding platforms have revolutionized collaboration in software de...
research
03/08/2018

Automatic Detection of Public Development Projects in Large Open Source Ecosystems: An Exploratory Study on GitHub

Hosting over 10 million software projects, GitHub is one of the most ...
research
03/23/2020

Characterizing the Roles of Contributors in Open-source Scientific Software Projects

The development of scientific software is, more than ever, critical to t...
research
08/16/2018

Do software firms collaborate or compete? A model of coopetition in community-initiated OSS projects

[Background] An increasing number of commercial firms are participating ...
research
04/24/2019

The VGG Image Annotator (VIA)

Manual image annotation, such as defining and labelling regions of inter...
research
10/26/2021

Measuring and Modeling Neighborhoods

With the availability of granular geographical data, social scientists a...
