The CRM (Credit Risk Model) service offers enriched data using machine learning models to customers based on data from the euBusinessGraph.
The main focus has been to create a generic service for running autonomous and adaptive models in the cloud, rather than having to update the models manually as new data becomes available.
In the proof of concept, we have developed a credit risk model that predicts the probability of a company going bankrupt within the next 12 months. It has been developed using several algorithms, including regression, random forest, AdaBoost and neural networks. The models have been deployed to the analytic service.
The credit risk models use the following data from euBusinessGraph:
- accounting information (BRREG)
- general company information (BRREG)
- bankruptcy information (BRREG)
- external remarks (external dataset)
The variables used in the models have been selected by domain experts and examined in statistical analyses to ensure that the most significant variables are chosen. These analyses also reveal to what degree each variable influences the model's result, and whether any multicollinearity is present. External datasets that are used in combination with euBusinessGraph require an attribute that serves as a shared identifier. When combining Norwegian data from BRREG, the key identifier is the organization number.
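As an illustration of combining an external dataset with company data through a shared identifier, the following is a minimal sketch in plain Python. The field names (`org_number`, `equity_ratio`, `remark_count`) and all values are hypothetical and do not reflect the actual euBusinessGraph or BRREG schema.

```python
# Hypothetical records; field names and values are made up for illustration.
company_info = [
    {"org_number": "910000001", "name": "Example AS", "equity_ratio": 0.35},
    {"org_number": "910000002", "name": "Sample AS", "equity_ratio": 0.10},
]
external_remarks = [
    {"org_number": "910000002", "remark_count": 3},
    {"org_number": "910000003", "remark_count": 1},
]


def join_on_org_number(companies, remarks):
    """Left-join external remarks onto company records by organization number."""
    remarks_by_org = {r["org_number"]: r for r in remarks}
    joined = []
    for company in companies:
        match = remarks_by_org.get(company["org_number"], {})
        # Companies without a matching remark record get a count of zero.
        joined.append({**company, "remark_count": match.get("remark_count", 0)})
    return joined


rows = join_on_org_number(company_info, external_remarks)
for row in rows:
    print(row["org_number"], row["name"], row["remark_count"])
```

A left join is used here so that every company is kept even when the external dataset has no entry for it. Whether a missing entry should count as zero remarks is a modeling decision that this sketch simply assumes.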
The analytic service has been implemented as microservices running in Docker containers. We use Kubernetes for automated deployment, scaling and management of the microservices.
Banks, Insurance, Finance
Most of the challenges facing risk modeling today come back to a lack of business agility. The current process requires input from the innovation hub, the database department, the IT department, engineers, data analysts and risk modeling experts. Each involves separate stages with lengthy waiting times. With poor data management practices, outdated technology, and complex internal processes, banks struggle to keep up with the pace of change. Meanwhile, there is increased pressure from the business units to release more products in order to extract more business from their current customer base and build a bigger market share. It is very much in the hands of the risk modeling department to close the circle as fast as possible, so that the bank or financial institution can offer more financial products and grow its business, or at least defend its existing business against emerging competition.
One of the ways to stay ahead of the game is overhauling outdated credit modeling practices.
AI-based credit risk modeling makes use of machine learning. It allows the platform to learn as it consumes more information, and allows the bank to learn about users' spending behavior and predict the likelihood of them repaying a loan within a given timeframe.
Traditionally, the credit risk models of banks and financial institutions depended on historical data. However, advanced analytical solutions such as big data have emerged to automate the whole process and change how credit risk modeling is done. Open banking will lead customers and banks to share data and develop an entirely new financial ecosystem. The potential benefits of sharing data in credit risk modeling are substantial. Open banking and data sharing can facilitate a series of services of value to both borrowers and providers. Although the models have evolved several times in recent years to become more accurate, data sharing could help businesses gain an edge over past systems. Having a large amount of insightful data available could enable businesses to create superior credit risk models.
The main scientific and technological results accomplished are that we have created a service that is:
- Autonomous: the models are automatically trained, tested, validated and deployed,
- Adaptive: models are updated continuously based on new data, as it becomes available.
The models that are deployed to the CRM-S service are defined through a model specification: a blueprint of the model that defines the location of the data to train and test on, the model parameters, the features, how often the model should be updated, and so on. Based on this specification, the models can run continuously without any human intervention. This allows the CRM-S to support analytics on a large scale.
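To make the idea of a model specification more concrete, here is a minimal sketch of what such a blueprint could look like in Python. The class name `ModelSpec`, its fields, the `is_due` helper and the example URIs are all assumptions made for illustration; the actual CRM-S specification format is not described here.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass(frozen=True)
class ModelSpec:
    """Hypothetical blueprint for one model; all field names are assumptions."""
    name: str
    training_data_uri: str           # where the training data is located
    test_data_uri: str               # where the test data is located
    features: tuple                  # input variables used by the model
    parameters: dict = field(default_factory=dict)  # algorithm settings
    update_interval: timedelta = timedelta(days=30)  # how often to retrain

    def is_due(self, last_trained: datetime, now: datetime) -> bool:
        """Return True when the model should be retrained."""
        return now - last_trained >= self.update_interval


spec = ModelSpec(
    name="credit-risk-random-forest",
    training_data_uri="storage://example/credit-risk/train.csv",  # made-up URI
    test_data_uri="storage://example/credit-risk/test.csv",       # made-up URI
    features=("equity_ratio", "remark_count"),
    parameters={"n_trees": 200},
)

print(spec.is_due(datetime(2020, 1, 1), datetime(2020, 2, 15)))
```

With a declarative specification like this, a scheduler only needs to check `is_due` for each registered model and trigger retraining, which is one way to run models without human intervention.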
Many companies also want to use internal data from sources such as customer relationship management and ERP systems to achieve a higher degree of insight into their business. Since CRM-S is built using Docker and Kubernetes, customers can run the service in their own cloud to gain the power of CRM-S in combination with their own data.
The service is based on the following concepts:
- The euBusinessGraph consolidates company data from different companies across Europe to a single graph database.
- The cloud storage maintains all historical data that is needed to run the models. Data can come from the euBusinessGraph or from external sources. For example, a key data source for the credit risk models is external remarks, which are not available from any of the euBusinessGraph data providers. Access to external sources can be restricted so that they are only available to the service itself.
- The container orchestration is where the management of the models occurs. It is divided into three steps:
  - Data processing: The data that is used in a model is retrieved from the cloud storage and processed so that it is ready to be consumed by the models.
  - Machine learning: Training and testing of the models. The model output from this process is stored in Azure Blob Storage, while metadata is stored in a document database.
  - Deployment: The models are made available through an analytic service API and to the euBusinessGraph. A model registry maintains the different versions of a model.
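The three orchestration steps can be sketched as a small pipeline. This is only an illustration under stated assumptions: the function names, the `ModelRegistry` class and the sample rows are hypothetical, the "model" is a trivial stand-in (observed bankruptcy rate per remark flag) rather than any of the algorithms used by the service, and in-memory structures replace the cloud storage, blob storage and document database.

```python
class ModelRegistry:
    """Keeps every deployed version of each model, newest last."""

    def __init__(self):
        self._versions = {}

    def register(self, name, model):
        versions = self._versions.setdefault(name, [])
        versions.append(model)
        return len(versions)  # 1-based version number

    def latest(self, name):
        return self._versions[name][-1]


def process(raw_rows):
    """Data processing: drop rows that contain missing values."""
    return [r for r in raw_rows if None not in r.values()]


def train(rows):
    """Machine learning stand-in: bankruptcy rate per remark flag."""
    counts = {}
    for r in rows:
        total, bankrupt = counts.get(r["has_remarks"], (0, 0))
        counts[r["has_remarks"]] = (total + 1, bankrupt + r["bankrupt"])
    return {flag: bankrupt / total for flag, (total, bankrupt) in counts.items()}


def run_pipeline(name, raw_rows, registry):
    """Deployment: train on processed data and register a new model version."""
    model = train(process(raw_rows))
    return registry.register(name, model)


# Made-up sample data: one row is incomplete and gets dropped.
raw = [
    {"has_remarks": True, "bankrupt": 1},
    {"has_remarks": True, "bankrupt": 0},
    {"has_remarks": False, "bankrupt": 0},
    {"has_remarks": False, "bankrupt": None},
]

registry = ModelRegistry()
first_version = run_pipeline("credit-risk", raw, registry)
second_version = run_pipeline("credit-risk", raw, registry)
print(first_version, second_version, registry.latest("credit-risk"))
```

Each run registers a new version instead of overwriting the previous one, so earlier versions remain available for comparison or rollback.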
The analytic service can also run on-premises for customers who want to enrich the data in euBusinessGraph with their own data (e.g. from a customer relationship management system). In this case they use the same orchestration as the cloud service, but they have to extract the additional data separately.