An analytical process is provided that analyzes the recent GPS traces of a vehicle and assigns it a probability of having a crash in the medium-term future (typically, one month ahead). The process is based on the analysis of historical data from several vehicles, covering mobility (GPS positions), acceleration events (accelerations, decelerations, steering, etc.; optional), and crash events. Each vehicle is described by a large set of mobility-based features, which are related to crashes through a machine learning model (Random Forests by default, easily replaceable with other models).
The tool has a learning phase and a prediction phase. The learning phase requires the historical mobility traces of a set of vehicles (the larger the set and the longer the history, the better) together with their associated crash events; its output is a predictive model. In the prediction phase, the model is applied to each user to be examined, based on their most recent mobility traces. The final output is a probability score, which can be accompanied by feature importance scores that highlight the most relevant features.
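The two phases can be illustrated with a minimal sketch. It assumes scikit-learn's RandomForestClassifier as the Random Forest implementation and uses synthetic data; the feature names, the helper functions (train_crash_model, score_vehicles, rank_features) and the hyperparameters are hypothetical and are not taken from the tool itself.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Hypothetical feature names; the real feature set is much larger.
FEATURE_NAMES = ["km_per_month", "night_driving_share", "harsh_brakes_per_100km"]


def train_crash_model(X_hist, y_crash, seed=0):
    """Learning phase: fit a Random Forest on historical per-vehicle features."""
    model = RandomForestClassifier(
        n_estimators=200,
        class_weight="balanced",  # assumption: crashes are the rare class
        random_state=seed,
    )
    model.fit(X_hist, y_crash)
    return model


def score_vehicles(model, X_recent):
    """Prediction phase: return one crash probability per vehicle."""
    crash_col = list(model.classes_).index(1)  # column of the "crash" class
    return model.predict_proba(X_recent)[:, crash_col]


def rank_features(model, names):
    """Pair each feature with its importance, most important first."""
    pairs = zip(names, model.feature_importances_)
    return sorted(pairs, key=lambda p: p[1], reverse=True)


# Synthetic stand-in for historical data: the label depends on the third feature.
rng = np.random.default_rng(0)
X_hist = rng.random((400, 3))
y_crash = (X_hist[:, 2] + 0.1 * rng.standard_normal(400) > 0.7).astype(int)

model = train_crash_model(X_hist, y_crash)

# Synthetic stand-in for the most recent features of five vehicles.
X_recent = rng.random((5, 3))
probs = score_vehicles(model, X_recent)
ranking = rank_features(model, FEATURE_NAMES)
```

Here probs holds one score per vehicle and ranking lists the features by importance. Because the model is only accessed through fit, predict_proba and feature_importances_, another classifier exposing the same interface could be swapped in.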
Mobility data, in the form of GPS traces or equivalent, are becoming increasingly common in several application domains, thanks to the many modern location-based services and next-generation vehicles. Tools that process raw data and translate them into higher-level information are fundamental to extracting value from the collected information. Individual mobility networks (IMNs) provide a valuable functionality by summarizing individual mobility into a compact and intuitive graph representation.
The trained model can be useful at two levels. The first is that of decision makers, especially car insurers and policy makers, who can evaluate the risk-related costs of each vehicle (in economic terms for the former, as social costs for the latter). The second is that of individual drivers, who can use the predicted risk and the feature importance to reduce their own risk.
The process computes several sophisticated mobility indicators, based on mobility statistics, acceleration events, individual mobility structure, and contextual characteristics. These indicators are the basis for the machine learning prediction. A feature importance score is also computed, which highlights the most important risk factors according to the ML model.
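As an illustration of what a mobility statistic computed from raw GPS points might look like, the sketch below derives two simple indicators from a trace given as (latitude, longitude) pairs in degrees: the travelled distance and the radius of gyration. Both indicators, the function names and the input format are assumptions chosen for illustration; the actual indicator set of the process is not specified here.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius


def haversine_km(p, q):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, p)
    lat2, lon2 = map(math.radians, q)
    dlat, dlon = lat2 - lat1, lon2 - lon1
    a = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def travelled_km(trace):
    """Sum of the distances between consecutive GPS points."""
    return sum(haversine_km(a, b) for a, b in zip(trace, trace[1:]))


def radius_of_gyration_km(trace):
    """Root-mean-square distance of the points from their centre.

    The centre is approximated by the arithmetic mean of the coordinates,
    which is reasonable only for traces covering a small area.
    """
    if not trace:
        raise ValueError("trace must contain at least one point")
    lat_c = sum(p[0] for p in trace) / len(trace)
    lon_c = sum(p[1] for p in trace) / len(trace)
    sq = sum(haversine_km(p, (lat_c, lon_c)) ** 2 for p in trace)
    return math.sqrt(sq / len(trace))


def mobility_features(trace):
    """Collect the indicators into one feature record for a vehicle."""
    return {
        "n_points": len(trace),
        "travelled_km": travelled_km(trace),
        "radius_of_gyration_km": radius_of_gyration_km(trace),
    }
```

A record such as mobility_features([(43.70, 10.40), (43.72, 10.41), (43.77, 11.25)]) could then serve as one row of the feature table passed to the machine learning model.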