Do Names Echo Semantics? A Large-Scale Study of Identifiers Used in C++'s Named Casts
Developers relax restrictions on a type to reuse methods with other types. Type casts are prevalent, and in weakly typed languages such as C++ they are also extremely permissive. Type conversions performed without care can lead to software bugs. There is therefore a clear need to check whether a type conversion is essential and whether it is used in accordance with the developer's intent. In this paper, we propose a technique to judge the fidelity of type conversions from an explicit cast operation, using the identifiers in an assignment. We measure accord in the identifiers using entropy and use it to check whether the semantics of the source expression in the cast match the semantics of the variable to which it is assigned. We present the results of running our tool on 34 components of the Chromium project, which collectively account for 27 MLOC. Our tool identified 1,368 cases of discord, indicating potential anti-patterns in the usage of explicit casts. We performed a manual evaluation of a uniformly random sample of these cases. Our evaluation shows that our tool identified 25.6 and 28.04
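To make the idea of measuring accord between identifiers with entropy more concrete, here is a minimal sketch. It is not the paper's tool or its exact metric. It assumes identifiers are split into lowercase subtokens at underscores and camelCase boundaries, and that accord is approximated by the Shannon entropy of the pooled subtoken distribution of the two names. The helpers `SplitIdentifier` and `NameEntropy` are hypothetical names introduced only for this illustration.

```cpp
#include <cctype>
#include <cmath>
#include <cstddef>
#include <initializer_list>
#include <map>
#include <string>
#include <vector>

// Hypothetical helper: split an identifier into lowercase subtokens at
// underscores and at lower-to-upper camelCase boundaries.
std::vector<std::string> SplitIdentifier(const std::string& name) {
  std::vector<std::string> tokens;
  std::string current;
  for (std::size_t i = 0; i < name.size(); ++i) {
    const unsigned char c = static_cast<unsigned char>(name[i]);
    const bool boundary =
        c == '_' ||
        (std::isupper(c) && i > 0 &&
         std::islower(static_cast<unsigned char>(name[i - 1])));
    if (boundary && !current.empty()) {
      tokens.push_back(current);
      current.clear();
    }
    if (c != '_') {
      current.push_back(static_cast<char>(std::tolower(c)));
    }
  }
  if (!current.empty()) tokens.push_back(current);
  return tokens;
}

// Hypothetical metric (an assumption, not the paper's definition):
// Shannon entropy, in bits, of the subtokens pooled from both names.
// Shared subtokens concentrate the distribution and lower the entropy.
double NameEntropy(const std::string& lhs, const std::string& rhs) {
  std::map<std::string, int> counts;
  int total = 0;
  for (const std::string* name : {&lhs, &rhs}) {
    for (const std::string& token : SplitIdentifier(*name)) {
      ++counts[token];
      ++total;
    }
  }
  if (total == 0) return 0.0;
  double entropy = 0.0;
  for (const auto& entry : counts) {
    const double p = static_cast<double>(entry.second) / total;
    entropy -= p * std::log2(p);
  }
  return entropy;
}
```

Under these assumptions, an assignment such as `int frame_width = static_cast<int>(frameWidth);` pools the subtokens `frame` and `width` twice each and scores lower entropy than `int frame_width = static_cast<int>(buffer_size);`, where all four subtokens are distinct. A higher score would be read as a hint of discord worth inspecting, not as proof of a bug.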