Auto-Detection of Safety Issues in Baby Products
Every year, thousands of people receive consumer product related injuries. Research indicates that online customer reviews can be processed to autonomously identify product safety issues. Early identification of safety issues can lead to earlier recalls, and thus fewer injuries and deaths. A dataset of product reviews from Amazon.com was compiled, along with SaferProducts.gov complaints and recall descriptions from the Consumer Product Safety Commission (CPSC) and European Commission Rapid Alert system. A system was built to clean the collected text and to extract relevant features. Dimensionality reduction was performed by computing feature relevance through a Random Forest and discarding features with low information gain. Various classifiers were analyzed, including Logistic Regression, SVMs, Naïve-Bayes, Random Forests, and an Ensemble classifier. Experimentation with various features and classifier combinations resulted in a logistic regression model with 70.2% precision in the top 50 reviews surfaced. This classifier outperforms all benchmarks set by related literature and consumer product safety professionals.
READ FULL TEXT