Title

Research on the Factors that Influence P2P Loan Default: Analysis Based on Logistic Regression and Machine Learning

Alternative Title
P2P贷款违约影响因素研究:基于逻辑回归和机器学习的分析
Author
Name pinyin
PENG Xiaoru
School number
12032752
Degree
Master's
Discipline
0702Z1 Business Intelligence and Big Data Management
Subject category of dissertation
07 Science
Supervisor
李少波
Mentor unit
Department of Information Systems and Management Engineering
Publication Years
2022-05-07
Submission date
2022-06-26
University
Southern University of Science and Technology
Place of Publication
Shenzhen
Abstract

In recent years, P2P lending has grown rapidly in developing countries, helping to lower local poverty rates, support the development of SMEs, and improve financial inclusion. However, P2P lending also faces many problems, such as information asymmetry and a default rate significantly higher than that of bank loans, which may disrupt order in the financial market. To mitigate information asymmetry in P2P lending, this paper proposes that the borrower's mobile phone usage patterns and bank transaction patterns, both of which are accessible to P2P lending platforms, can be used to predict the borrower's default probability. Using a set of operational data from a P2P lending platform in Indonesia, the paper estimates the impact of mobile phone usage patterns and bank transaction patterns on the borrower's default probability with a Logistic Regression model. It also predicts the default probability of a transaction with three machine learning models: Random Forest, XGBoost, and a Deep Neural Network.
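The logistic regression step described above can be sketched as follows. This is a minimal illustration only: the platform data is not available here, so the sketch generates synthetic borrowers, and the feature names (`n_call_contacts`, `night_call_share`), the coefficient values used to generate labels, and the training settings are all assumptions made for the example, not estimates or methods taken from the thesis.

```python
import math
import random

# Hypothetical, standardized borrower features (names are assumptions).
FEATURES = ["n_call_contacts", "night_call_share"]

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict_proba(w, x):
    # w[0] is the intercept; w[1:] are the feature coefficients.
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))

def fit_logistic(X, y, lr=0.5, epochs=300):
    """Fit a logistic regression by batch gradient descent on the log-loss."""
    n, k = len(X), len(X[0])
    w = [0.0] * (k + 1)
    for _ in range(epochs):
        grad = [0.0] * (k + 1)
        for xi, yi in zip(X, y):
            err = predict_proba(w, xi) - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w

def make_synthetic(n=500, seed=0):
    """Draw synthetic borrowers; the generating coefficients are arbitrary."""
    rng = random.Random(seed)
    true_w = [-0.5, -1.0, 0.8]  # illustrative values, not thesis results
    X, y = [], []
    for _ in range(n):
        x = [rng.gauss(0, 1), rng.gauss(0, 1)]
        X.append(x)
        y.append(1 if rng.random() < predict_proba(true_w, x) else 0)
    return X, y

X, y = make_synthetic()
w = fit_logistic(X, y)

# A negative coefficient lowers the log-odds of default; a positive one raises them.
for name, coef in zip(["intercept"] + FEATURES, w):
    print(f"{name}: {coef:+.3f}")
```

In this setup the sign of each fitted coefficient indicates whether a feature raises or lowers the estimated log-odds of default, which is the kind of reading a logistic regression on borrower features supports.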

This paper finds that borrowers' mobile phone usage patterns and bank transaction patterns significantly influence the default rate of their P2P loans. In terms of phone calls, borrowers who receive calls from more people, have shorter average call times, and have fewer night calls are less likely to default. In terms of phone recharge patterns, borrowers who recharge their mobile phones more frequently with small amounts have a lower default risk. In terms of social media, borrowers who have installed WhatsApp are less likely to default. In terms of bank transaction patterns, borrowers who have more bank transaction records and higher average transaction amounts have a lower default risk. Moreover, with these factors, the machine learning models can predict the default risk of a loan transaction with high precision. Among them, the XGBoost model outperforms the other two in both prediction accuracy and model stability, and is therefore the preferred model.

This paper makes both a theoretical and a practical contribution. Theoretically, it introduces new predictors for the analysis of factors that influence the P2P loan default rate, so that related research is no longer limited to personal information provided by users and can also draw on information that P2P lending platforms actively collect.

From a practical point of view, the results indicate that P2P lending platforms in developing countries can anticipate the default risk of a transaction by using data on the borrower's mobile phone usage patterns and bank transaction patterns. This allows them to prevent ultra-high-risk transactions, reduce platform risk, and protect investors' funds, so that P2P platforms can develop sustainably and contribute to inclusive finance.


Language
English
Training classes
Independent training
Enrollment Year
2020
Year of Degree Awarded
2022-07
Academic Degree Assessment Subcommittee
Department of Information Systems and Management Engineering
Domestic book classification number
F830.51
Data Source
Manual submission
Document Type
Thesis
Identifier
http://kc.sustech.edu.cn/handle/2SGJ60CL/343008
Department
Department of Information Systems and Management Engineering
Recommended Citation
GB/T 7714
Peng XR. Research on the Factors that Influence P2P Loan Default: Analysis Based on Logistic Regression and Machine Learning[D]. 深圳. 南方科技大学,2022.

Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.