A dynamic credit scoring model based on survival gradient boosting decision tree approach
Abstract
Credit scoring, which is typically framed as a classification problem, is a powerful tool for managing credit risk because it forecasts the probability of default (PD) of a loan application. However, there is a growing trend of integrating survival analysis into credit scoring to provide dynamic predictions of PD over time and a clear account of censoring. A novel dynamic credit scoring model (i.e., SurvXGBoost) is proposed based on the survival gradient boosting decision tree (GBDT) approach. Our proposal, which combines survival analysis and the GBDT approach, is expected to enhance predictability relative to statistical survival models. The proposed method is compared with several common benchmark models on a real-world consumer loan dataset. The results of out-of-sample and out-of-time validation indicate that SurvXGBoost outperforms the benchmarks in terms of predictability and misclassification cost. Incorporating macroeconomic variables can further enhance the performance of survival models. SurvXGBoost also retains some interpretability, since it provides information on feature importance.
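To make the idea of a PD that evolves over time under censoring concrete, the following is a minimal sketch using the classical Kaplan-Meier estimator rather than SurvXGBoost itself. The function name `kaplan_meier_pd` and the toy portfolio are hypothetical and chosen only for illustration. Loans that have not defaulted by the end of observation are censored: they count in the at-risk set up to their last observed month but never contribute a default.

```python
from typing import List, Tuple


def kaplan_meier_pd(durations: List[float],
                    defaulted: List[bool]) -> List[Tuple[float, float]]:
    """Return (time, cumulative PD) pairs, where PD(t) = 1 - S(t)."""
    # Distinct times at which at least one default was observed
    event_times = sorted({t for t, d in zip(durations, defaulted) if d})
    survival = 1.0
    curve = []
    for t in event_times:
        # Loans still under observation just before time t (censored loans
        # remain in the risk set until their last observed time)
        at_risk = sum(1 for u in durations if u >= t)
        # Defaults occurring exactly at time t
        defaults = sum(1 for u, d in zip(durations, defaulted) if d and u == t)
        survival *= 1.0 - defaults / at_risk
        curve.append((t, 1.0 - survival))
    return curve


# Hypothetical toy portfolio: months observed and whether the loan defaulted
durations = [3, 5, 5, 8, 12, 12]
defaulted = [True, False, True, True, False, False]

pd_curve = kaplan_meier_pd(durations, defaulted)
for month, pd_t in pd_curve:
    print(f"month {month}: cumulative PD = {pd_t:.4f}")
```

A static classifier would have to label the three censored loans as plain non-defaults. The survival view instead uses them only for the months in which they were actually observed, which is the property that a survival GBDT model builds on when it replaces this nonparametric curve with a covariate-dependent one.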
First published online 14 December 2020
Keywords: credit scoring, survival analysis, survival gradient boosting decision tree, probability of default, consumer loan, machine learning
This work is licensed under a Creative Commons Attribution 4.0 International License.