From credit scoring to regulatory scoring: comparing credit scoring models from a regulatory perspective
Abstract
Conventional credit scoring models, evaluated by predictive accuracy or profitability, typically serve financial institutions and can hardly reflect their contribution to financial stability. To remedy this, we develop a novel regulatory scoring framework to quantify and compare the regulatory capital charge errors of credit scoring models. As an application of RegTech, the proposed framework accounts for the example-dependent and cost-sensitive nature of credit scoring, which is expected to enhance the risk absorption capacity of financial institutions and thus benefit regulators. Empirical results on two real-world credit datasets reveal that credit scoring models with good predictive accuracy or profitability do not necessarily yield low capital charge requirement errors, which further highlights the importance of the regulatory scoring framework. The gradient boosting decision tree (GBDT) family provides significantly better average performance than industry benchmarks and a deep multilayer perceptron network, especially when financial stability is the primary focus. To further examine the robustness of the proposed regulatory scoring, sampling techniques, cut-off value modification, and probability calibration are employed within the framework, and the main conclusions hold in most cases. Furthermore, an interpretability analysis using the TreeSHAP algorithm alleviates concerns about the transparency of GBDT-based models and confirms the important roles of loan characteristics, borrowers’ solvency, and creditworthiness as powerful predictors in credit scoring. Finally, the managerial implications for both financial institutions and regulators are discussed.
Keywords: credit scoring, RegTech, regulatory scoring, probability of default, financial regulation, gradient boosting decision tree
This work is licensed under a Creative Commons Attribution 4.0 International License.