| Authors | Hamid Saadatfar, Majid Chahkandi, Hamide Badi, Maryam Esna-Ashari |
| Journal | Journal of Mathematics and Modeling in Finance |
| Pages | 175-187 |
| Serial number | 5 |
| Volume number | 1 |
| Article type | Full Paper |
| Publication date | 2025 |
| Journal type | Electronic |
| Country of publication | Iran |
| Journal indexing | ISC, Scopus |
Abstract
Accurate prediction of third-party insurance claims is critical for pricing policies and managing risk. However, the highly imbalanced nature of insurance data, where non-claim cases vastly outnumber claim cases, poses significant challenges to standard predictive models. This study explores the use of machine learning algorithms to enhance claim prediction by directly addressing this imbalance. We use real data from the Insurance Research Center of Iran, incorporating variables such as driver characteristics, vehicle features, location, and claims history. Five models are evaluated: logistic regression, decision tree, bagging, random forest, and boosting. To handle the imbalance, we apply random under-sampling, over-sampling, and SMOTE. Model performance is assessed using accuracy, sensitivity, specificity, precision, and F-score. Results indicate that when data imbalance is properly treated, ensemble methods, particularly decision trees, bagging, and random forest, significantly outperform logistic regression and boosting, especially in detecting actual claim cases. The study underscores the importance of using appropriate resampling techniques and evaluation metrics in imbalanced settings. These findings can help insurers develop more reliable models for pricing and risk classification.
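To make the resampling and evaluation steps named in the abstract concrete, the following is a minimal sketch in plain Python. It is not the authors' implementation: the function names, the toy labels, and the 95/5 class split are assumptions chosen for illustration only. It covers random under-sampling, random over-sampling, and the five reported metrics; SMOTE, which creates synthetic minority cases by interpolating between neighbouring minority points, is omitted for brevity.

```python
import random
from collections import Counter


def _indices_by_class(labels):
    """Map each class label to the list of row indices carrying it."""
    groups = {}
    for i, y in enumerate(labels):
        groups.setdefault(y, []).append(i)
    return groups


def random_under_sample(rows, labels, seed=0):
    """Randomly drop rows so every class shrinks to the smallest class size."""
    rng = random.Random(seed)
    groups = _indices_by_class(labels)
    n_min = min(len(idx) for idx in groups.values())
    keep = sorted(i for idx in groups.values() for i in rng.sample(idx, n_min))
    return [rows[i] for i in keep], [labels[i] for i in keep]


def random_over_sample(rows, labels, seed=0):
    """Randomly duplicate rows so every class grows to the largest class size."""
    rng = random.Random(seed)
    groups = _indices_by_class(labels)
    n_max = max(len(idx) for idx in groups.values())
    chosen = list(range(len(labels)))  # keep every original row
    for idx in groups.values():
        # Extra copies are drawn with replacement from the same class.
        chosen.extend(rng.choices(idx, k=n_max - len(idx)))
    return [rows[i] for i in chosen], [labels[i] for i in chosen]


def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, sensitivity, specificity, precision and F-score for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    def ratio(num, den):
        # Undefined ratios (zero denominator) are reported as 0.0 in this sketch.
        return num / den if den else 0.0

    sensitivity = ratio(tp, tp + fn)
    precision = ratio(tp, tp + fp)
    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "sensitivity": sensitivity,
        "specificity": ratio(tn, tn + fp),
        "precision": precision,
        "f_score": ratio(2 * precision * sensitivity, precision + sensitivity),
    }


# Hypothetical toy data: 95 non-claim policies (0) and 5 claim policies (1).
labels = [0] * 95 + [1] * 5
rows = list(range(len(labels)))  # stand-in for feature vectors

_, y_under = random_under_sample(rows, labels)
_, y_over = random_over_sample(rows, labels)
print(Counter(y_under), Counter(y_over))

# A model that always predicts "no claim" looks accurate but detects nothing.
print(binary_metrics(labels, [0] * len(labels)))
```

On the toy data, the always-negative predictor reaches an accuracy of 0.95 while its sensitivity is 0, which is why accuracy alone is a poor guide on imbalanced data. In practice, resampling would be applied to the training split only, so that the test set keeps the original class proportions.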
Article permalink