Exploring Machine Learning Techniques to Identify Important Factors Leading to Injury in Curve Related Crashes
Many factors affect traffic crashes and crash-related injuries, including segment, crash-level, occupant-level, environmental, and vehicle-level characteristics. Several studies have examined how these factors affect crash injuries. However, few have examined the effects of pre-crash events on injuries, especially for curve-related crashes; most previous studies of curve-related crashes focused on the impact of geometric features or street design factors. The current study addresses this gap by using important factors related to pre-crash events as the selected variables and the number of vehicles with or without injury as the predicted variable. The analysis uses Crash Report Sampling System (CRSS) data from the National Highway Traffic Safety Administration (NHTSA), which includes traffic crash data from different states in the USA. The relationships are explored using several machine learning algorithms, including random forest, C5.0, CHAID, Bayesian network, neural network, C&R Tree, and QUEST, among others. Random forest and SHAP values are used to identify the most influential variables. The C5.0 algorithm, which achieved the highest accuracy among the algorithms tested, is used to develop the final model. The results show that the extent of damage, critical pre-crash event, pre-impact location, trafficway description, roadway surface condition, month of the crash, first harmful event, number of motor vehicles, attempted avoidance maneuver, and roadway grade affect the number of vehicles with or without injury in curve-related crashes.
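The abstract gives no implementation details. As a rough illustration of how a tree learner in the C4.5/C5.0 family can score candidate variables, the sketch below ranks two categorical variables by gain ratio. The records, the field names (`damage_extent`, `surface`, `injury`), and the helper functions are hypothetical and are not taken from CRSS or from the study. The study's own ranking relied on random forest and SHAP values, which this sketch does not reproduce.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def gain_ratio(rows, feature, target):
    """C4.5-style gain ratio of one categorical feature with respect to the target."""
    labels = [r[target] for r in rows]
    groups = {}
    for r in rows:
        groups.setdefault(r[feature], []).append(r[target])
    # Expected entropy of the target after splitting on the feature.
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - remainder
    # Split information penalizes features with many small categories.
    split_info = entropy([r[feature] for r in rows])
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical toy records; all values are invented for illustration only.
toy_crashes = [
    {"damage_extent": "disabling", "surface": "dry", "injury": "yes"},
    {"damage_extent": "disabling", "surface": "wet", "injury": "yes"},
    {"damage_extent": "disabling", "surface": "dry", "injury": "yes"},
    {"damage_extent": "minor", "surface": "wet", "injury": "no"},
    {"damage_extent": "minor", "surface": "dry", "injury": "no"},
    {"damage_extent": "minor", "surface": "wet", "injury": "yes"},
    {"damage_extent": "functional", "surface": "dry", "injury": "no"},
    {"damage_extent": "functional", "surface": "wet", "injury": "no"},
]

# Rank the candidate variables from most to least informative.
ranking = sorted(
    ((f, gain_ratio(toy_crashes, f, "injury")) for f in ("damage_extent", "surface")),
    key=lambda pair: pair[1],
    reverse=True,
)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

In this toy data the damage variable separates injury from non-injury records far better than the surface variable, so it ranks first. A real analysis would apply the same idea to the full set of CRSS variables and validate the resulting model on held-out data.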