
Research on the Factors that Influence P2P Loan Default: Analysis Based on Logistic Regression and Machine Learning

Name pinyin
PENG Xiaoru
Subject category of dissertation
07 Science

In recent years, P2P lending has developed rapidly in developing countries, contributing substantially to reducing local poverty rates, supporting the development of SMEs, and improving financial inclusion. However, its development has also faced many problems. One is information asymmetry; another is a default rate significantly higher than that of bank loans, which may disrupt order in the financial market. To mitigate information asymmetry in P2P lending, this paper proposes that the borrower's mobile phone usage patterns and bank transaction patterns, both of which are accessible to P2P lending platforms, can predict the borrower's default probability. Using a set of operational data from a P2P lending platform in Indonesia, the paper estimates the impact of mobile phone usage patterns and bank transaction patterns on the borrower's default probability by constructing a Logistic Regression model. It also predicts the default probability of a transaction with three machine learning models: Random Forest, XGBoost, and a Deep Neural Network.
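As a rough illustration of the Logistic Regression step, the sketch below fits a two-feature model by gradient descent. It is not the thesis's code or data: the feature names (`contacts`, `night`), the synthetic data-generating process, and all function names are hypothetical, chosen only to show how a coefficient's sign maps to higher or lower default probability.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict_proba(w, x):
    # w[0] is the intercept; w[1:] are the feature coefficients.
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))

def fit_logistic(X, y, lr=0.5, epochs=500):
    """Fit weights by batch gradient descent on the mean log-loss."""
    n, d = len(X), len(X[0])
    w = [0.0] * (d + 1)
    for _ in range(epochs):
        grad = [0.0] * (d + 1)
        for xi, yi in zip(X, y):
            err = predict_proba(w, xi) - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w

def make_synthetic(n=400, seed=0):
    """Synthetic borrowers; both features are hypothetical and standardized."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        contacts = rng.gauss(0, 1)  # number of distinct callers (assumed feature)
        night = rng.gauss(0, 1)     # share of night calls (assumed feature)
        # Assumed process: more callers lowers risk, more night calls raises it.
        p_default = sigmoid(-0.5 - 1.0 * contacts + 1.0 * night)
        X.append([contacts, night])
        y.append(1 if rng.random() < p_default else 0)
    return X, y

X, y = make_synthetic()
w = fit_logistic(X, y)
print("intercept, contacts, night:", [round(v, 2) for v in w])
```

A negative fitted coefficient means the feature is associated with a lower default probability, and a positive one with a higher probability; exponentiating a coefficient gives the corresponding odds ratio.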

This paper finds that borrowers' mobile phone usage patterns and bank transaction patterns significantly influence the default rate of their P2P loans. In terms of phone calls, borrowers who receive calls from more people and who have shorter average call times and fewer night calls are less likely to default. In terms of phone recharge patterns, borrowers who recharge their mobile phones more frequently and in small amounts have a lower default risk. In terms of social media, borrowers who have installed WhatsApp are less likely to default. As for bank transaction patterns, borrowers with more bank transaction records and higher average transaction amounts have a lower default risk. Moreover, using the factors above, the machine learning models can predict the default risk of a loan transaction with high precision. Among them, the XGBoost model outperforms the other two in both prediction accuracy and model stability, and is therefore the preferred model.
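One common way to compare models on both accuracy and stability is k-fold cross-validation, reporting the mean accuracy and its spread across folds. The sketch below assumes that reading of "stability"; the abstract does not say how the thesis measured it. The two stand-in models (a majority-class baseline and a one-feature threshold rule) and the synthetic data are hypothetical placeholders for Random Forest, XGBoost, and a Deep Neural Network.

```python
import random
import statistics

def k_fold_indices(n, k, seed=0):
    # Shuffle indices once, then deal them into k disjoint folds.
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def cross_val_accuracy(fit, X, y, k=5, seed=0):
    """Return (mean, standard deviation) of held-out accuracy over k folds.

    fit(X_train, y_train) must return a function mapping x to a 0/1 label.
    """
    scores = []
    for held_out in k_fold_indices(len(X), k, seed):
        held = set(held_out)
        X_tr = [X[i] for i in range(len(X)) if i not in held]
        y_tr = [y[i] for i in range(len(X)) if i not in held]
        model = fit(X_tr, y_tr)
        correct = sum(model(X[i]) == y[i] for i in held_out)
        scores.append(correct / len(held_out))
    return statistics.mean(scores), statistics.stdev(scores)

def fit_majority(X_tr, y_tr):
    # Baseline: always predict the most common training label.
    label = 1 if sum(y_tr) * 2 >= len(y_tr) else 0
    return lambda x: label

def fit_threshold(X_tr, y_tr):
    # Stand-in model: predict default when feature 0 exceeds its training mean.
    cut = statistics.mean(x[0] for x in X_tr)
    return lambda x: 1 if x[0] > cut else 0

# Synthetic data: one hypothetical risk feature plus noise.
rng = random.Random(1)
X = [[rng.gauss(0, 1)] for _ in range(300)]
y = [1 if x[0] + rng.gauss(0, 0.5) > 0 else 0 for x in X]

candidates = {"majority": fit_majority, "threshold": fit_threshold}
results = {name: cross_val_accuracy(f, X, y) for name, f in candidates.items()}
for name, (mean_acc, std_acc) in results.items():
    print(f"{name}: mean accuracy {mean_acc:.3f}, std {std_acc:.3f}")
```

Under this reading, the preferred model would be one with a high mean accuracy and a low standard deviation across folds.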

This paper makes both a theoretical and a practical contribution. Theoretically, it introduces new predictors for analyzing the factors that influence the P2P loan default rate, so that related research need no longer be limited to personal information provided by users and can instead draw on factors that P2P lending platforms can actively collect.

From a practical point of view, the results indicate that P2P lending platforms in developing countries can anticipate the default risk of a transaction by using data on the borrower's mobile phone usage patterns and bank transaction patterns. This allows them to prevent ultra-high-risk transactions, reduce platform risk, and protect investors' funds, so that P2P platforms can develop sustainably and contribute to the realization of inclusive finance.


Document Type: Thesis
Department: Department of Information Systems and Management Engineering
Recommended Citation
GB/T 7714
Peng XR. Research on the Factors that Influence P2P Loan Default: Analysis Based on Logistic Regression and Machine Learning[D]. Shenzhen: Southern University of Science and Technology, 2022.
