ML Credit Fraud Analysis

Credit card fraud poses significant financial risks, with projected global losses estimated to reach $43 billion by 2026, so preventing and addressing credit fraud is vital for safeguarding both individuals and the global economy. The example of one bank that "struggled with a low 40 percent fraud detection rate and was managing up to 1,200 false positives per day—and 99.5 percent of all cases the bank was investigating were not fraud related" shows just how important and costly this problem is.

This led me to undertake a personal project investigating how different machine learning models perform at identifying credit fraud, with the aim of building a selection of models and assessing their performance and accuracy at identifying these cases.

Project Repo:
Project Process:
The dataset was taken from the Bank Account Fraud (BAF) suite of datasets published at NeurIPS 2022, which comprises six synthetic bank account fraud tabular datasets. The data was then cleaned following the steps shown in order to maximise model performance. Optimal features were selected with the aim of describing as much of the dataset as possible without overfitting, and finally the individual models were built. A comparison between the models could then be conducted using the four methods below. For more detail, please take a look at the project on my GitHub.
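As a rough illustration of this process, the sketch below loads one of the BAF datasets, selects a reduced feature set, and fits a Random Forest classifier. The file name, feature count, and model settings here are illustrative assumptions rather than the exact choices used in the project; the full pipeline is in the repository.

```python
# Minimal sketch of the modelling pipeline, assuming the "Base" variant of
# the BAF suite saved locally as Base.csv (file name and settings are
# illustrative, not the project's exact configuration).
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.ensemble import RandomForestClassifier

# Load the dataset and one-hot encode its categorical columns.
df = pd.read_csv("Base.csv")
df = pd.get_dummies(df, drop_first=True)

# Separate the fraud label from the features.
X, y = df.drop(columns=["fraud_bool"]), df["fraud_bool"]

# Keep a subset of informative features to limit overfitting.
selector = SelectKBest(mutual_info_classif, k=15)
X_selected = selector.fit_transform(X, y)

# Hold out a stratified test set and fit one of the candidate models.
X_train, X_test, y_train, y_test = train_test_split(
    X_selected, y, test_size=0.2, stratify=y, random_state=42
)
model = RandomForestClassifier(n_estimators=200, class_weight="balanced",
                               random_state=42)
model.fit(X_train, y_train)
```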



Model Assessment:




Classification Report
A machine learning classification report is a summary that evaluates how accurately a model categorises data, reporting precision, recall, and F1 score for each class alongside overall accuracy.
Confusion Matrix
Summarises the model's performance, showing the number of correct and incorrect predictions for each class, aiding in understanding the model's strengths and weaknesses.
ROC Curve
Graphical representation illustrating the trade-off between a model's true positive rate and false positive rate, helping to assess and compare the overall performance.
Comparative ROC Curve
Combining each model's ROC curve enables the comparison of performance between the four models. It is evident here that the Random Forest model built in this exercise performed the best, slightly outperforming XGBoost. A sketch of how these four assessments can be produced follows below.
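For illustration, the sketch below produces the four assessments with scikit-learn and matplotlib, continuing from the training sketch above. The models dictionary is an assumption for the example; in the project all four trained models would be included so their ROC curves share one axis.

```python
# Sketch of the four assessment steps for a dict of fitted models,
# e.g. {"Random Forest": rf, "XGBoost": xgb, ...} (illustrative).
import matplotlib.pyplot as plt
from sklearn.metrics import (classification_report, confusion_matrix,
                             roc_curve, roc_auc_score)

models = {"Random Forest": model}  # the other trained models would be added here

for name, clf in models.items():
    y_pred = clf.predict(X_test)
    y_score = clf.predict_proba(X_test)[:, 1]

    # Classification report: precision, recall and F1 per class.
    print(name)
    print(classification_report(y_test, y_pred))

    # Confusion matrix: counts of correct and incorrect predictions per class.
    print(confusion_matrix(y_test, y_pred))

    # ROC curve: true positive rate vs false positive rate, plotted on a
    # shared axis so the models can be compared directly.
    fpr, tpr, _ = roc_curve(y_test, y_score)
    auc = roc_auc_score(y_test, y_score)
    plt.plot(fpr, tpr, label=f"{name} (AUC = {auc:.3f})")

plt.plot([0, 1], [0, 1], linestyle="--", color="grey")  # chance line
plt.xlabel("False positive rate")
plt.ylabel("True positive rate")
plt.legend()
plt.show()
```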
Outcome:
The Random Forest model emerged as the most effective performer, demonstrating a slight edge over XGBoost; this is assumed to come down to hyper-parameter selection, given that both are tree-based ensemble methods. Operating on an imbalanced dataset, the model achieved a commendable detection rate of roughly 80%, correctly identifying 2,612 of the 3,278 fraudulent cases. Subsequently, the model was packaged for reuse in the ShapGPT exercise that followed.
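The write-up does not state how the model was packaged; one common approach, shown here purely as an assumption, is to serialise the fitted estimator with joblib so it can be reloaded in a follow-up exercise such as ShapGPT.

```python
# Illustrative packaging step: persist the fitted model to disk and reload it
# later (file name is hypothetical).
import joblib

joblib.dump(model, "random_forest_fraud_model.joblib")

# In a later exercise, the same model can be restored without retraining.
loaded_model = joblib.load("random_forest_fraud_model.joblib")
```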