Abstract
The credit scoring model is one of the most important decision-making tools for the sustainability of banking systems. This study is the first to examine whether it can be improved by using system log data that are stoed extensively for system operation. We used the log data recorded by the mobile application system of KakaoBank, a leading internet bank used by more than 14 million people in Korea. After generating candidate variables from KakaoBank’s log data, we created a credit scoring model by utilizing variables with high information values and logistic regression, the most common method for developing credit scoring models in financial institutions. To prove our hypothesis on the improvement of credit scoring model performance, we performed an independent sample t-test using the simulation results of repeated model development and performance measurement based on randomly sampled data. Consequently, the discrimination power of the proposed model using logistic regression (neural network) compared to the credit bureau-based model significantly improved by 1.84 (2.22) percentage points based on the Kolmogorov–Smirnov statistics. The results of this study suggest that a bank can utilize the accumulated log data inside the bank to improve decision-making systems, including credit scoring, at a low cost.
카카오뱅크 금융기술연구소
Financial Tech Lab