The value of text for small business default prediction: A deep learning approach

by Matthew Stevenson, et al.

Micro, Small and Medium Enterprise (mSME) credit risk modelling is particularly challenging compared to consumer lending, because the same sources of information are often not available. To mitigate limited data availability, it is standard policy for a loan officer to provide a textual loan assessment. This statement is then analysed by a credit expert alongside any available standard credit data. In our paper, we exploit recent advances from the fields of Deep Learning and Natural Language Processing (NLP), including the BERT (Bidirectional Encoder Representations from Transformers) model, to extract information from more than 60,000 textual assessments. We measure performance in terms of AUC (Area Under the Curve) and Balanced Accuracy and find that the text alone is surprisingly effective for predicting default. Yet, when combined with traditional data, it yields no additional predictive capability. We do find, however, that deep learning with categorical embeddings produces a modest performance improvement over alternative machine learning models. We explore how the loan assessments influence predictions, explaining why, despite the text being predictive, no additional performance is gained. This exploration leads us to a series of recommendations on a new strategy for the collection of future mSME loan assessments.
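As a rough illustration of the two evaluation metrics named in the abstract, the sketch below computes AUC and Balanced Accuracy for a default classifier in plain Python. The labels, scores, and the 0.5 decision threshold are made-up toy values, and the function names are hypothetical; none of this is taken from the paper.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of recall on defaults (label 1) and recall on non-defaults (label 0)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    pos = sum(1 for t in y_true if t == 1)
    neg = len(y_true) - pos
    return 0.5 * (tp / pos + tn / neg)


def auc(y_true, scores):
    """Probability that a random defaulter is scored above a random
    non-defaulter, counting ties as half (pairwise definition of ROC AUC)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (assumed): 1 = default, 0 = no default.
y_true = [0, 0, 0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8, 0.7, 0.9]         # predicted default probabilities
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

print("AUC:", auc(y_true, scores))
print("Balanced accuracy:", balanced_accuracy(y_true, y_pred))
```

Balanced Accuracy averages the per-class recalls, so it is not inflated by the majority (non-default) class the way plain accuracy can be, while AUC does not depend on any particular threshold.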




Predicting Consumer Default: A Deep Learning Approach

We develop a model to predict consumer default based on deep learning. W...

UQ for Credit Risk Management: A deep evidence regression approach

Machine Learning has invariantly found its way into various Credit Risk ...

Neural Learning of Online Consumer Credit Risk

This paper takes a deep learning approach to understand consumer credit ...

Predicting Credit Risk for Unsecured Lending: A Machine Learning Approach

Since the 1990s, there have been significant advances in the technology ...

Bond Default Prediction with Text Embeddings, Undersampling and Deep Learning

The special and important problems of default prediction for municipal b...

Credit Default Mining Using Combined Machine Learning and Heuristic Approach

Predicting potential credit default accounts in advance is challenging. ...

Defining and comparing SICR-events for classifying impaired loans under IFRS 9

The IFRS 9 accounting standard requires the prediction of credit deterio...
