CENTRIS: A Precise and Scalable Approach for Identifying Modified Open-Source Software Reuse

02/11/2021
by Seunghoon Woo, et al.

Open-source software (OSS) is widely reused because it makes software development more convenient and efficient. Despite these evident benefits, unmanaged OSS components can introduce threats such as vulnerability propagation and license violation. Identifying reused OSS components is challenging, however, because reused OSS is predominantly modified and nested. In this paper, we propose CENTRIS, a precise and scalable approach for identifying modified OSS reuse. By segmenting an OSS code base and detecting the reuse of only the unique part of the OSS, CENTRIS can precisely identify modified OSS reuse in the presence of nested OSS components. For scalability, CENTRIS eliminates redundant code comparisons and accelerates the search using hash functions. When we applied CENTRIS to 10,241 widely employed GitHub projects, comprising 229,326 versions and 80 billion lines of code, we observed that modified OSS reuse is the norm in software development, occurring 20 times more frequently than exact reuse. Nonetheless, CENTRIS identified reused OSS components with 91% precision and 94% recall in less than a minute per application on average, whereas a recent clone detection technique, which does not take into account modified and nested OSS reuse, hardly reached 10% precision and 40% recall.
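
The idea of matching only the unique part of each component through function hashes can be illustrated with a minimal sketch. This is not the CENTRIS implementation: the whitespace-stripping normalization, the use of SHA-256, the rule that a function belongs to the oldest component containing it, and the 0.1 threshold are all assumptions made for illustration, and every function and variable name here is hypothetical.

```python
import hashlib


def function_hash(source: str) -> str:
    """Hash a function body, ignoring whitespace differences (assumed normalization)."""
    normalized = "".join(source.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def unique_parts(components):
    """Attribute each function hash to the earliest component that contains it.

    `components` is a list of (name, function_sources) pairs ordered oldest
    first. This ordering rule is an assumption of the sketch.
    """
    owner = {}
    for name, functions in components:
        for src in functions:
            # setdefault keeps the first (oldest) owner of a hash.
            owner.setdefault(function_hash(src), name)
    unique = {name: set() for name, _ in components}
    for digest, name in owner.items():
        unique[name].add(digest)
    return unique


def detect_reuse(target_functions, components, threshold=0.1):
    """Return {component: score} for components whose unique part is reused."""
    target = {function_hash(src) for src in target_functions}
    scores = {}
    for name, hashes in unique_parts(components).items():
        if not hashes:
            continue  # nothing unique to match against
        score = len(target & hashes) / len(hashes)
        if score >= threshold:
            scores[name] = score
    return scores


# Hypothetical data: "app_lib" nests "parser"; the target reuses a modified parser.
parser = ["int parse(){return 1;}", "int lex(){return 2;}", "int emit(){return 3;}"]
app_lib = parser + ["int run(){return 4;}", "int stop(){return 5;}"]
components = [("parser", parser), ("app_lib", app_lib)]
target = ["int parse(){return 1;}", "int  lex() { return 2; }", "int custom(){return 9;}"]

result = detect_reuse(target, components)
print(result)
```

In this toy setup the target shares two of the three functions attributed to parser, so parser is reported, while app_lib is not: the only functions it shares with the target are the ones it inherited from parser. Comparing sets of hashes rather than source text is what keeps the per-component check cheap.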
