Graph-Based Fraud Detection with Optimized Features and Class Balance

Anisa Nur Azizah; Alven Safik Ritonga; Suryo Atmojo; Nurwahyudi Widhiyanta; Suzana Dewi; M Harist Murdani; Mamik Usniyah Sari

doi:10.61628/jsce.v6i3.2001

Anisa Nur Azizah Universitas Wijaya Putra
Alven Safik Ritonga Universitas Wijaya Putra
Suryo Atmojo Universitas Wijaya Putra
Nurwahyudi Widhiyanta Universitas Wijaya Putra
Suzana Dewi Universitas Wijaya Putra
M Harist Murdani Universitas Wijaya Putra
Mamik Usniyah Sari Universitas Wijaya Putra

Keywords: Fraud Detection, GNN, Feature Selection, Class Balance

Abstract

The growing use of digital transactions raises the risk of fraud, particularly in credit card transactions. Fraud detection is challenging because the data are highly imbalanced and the relationships among entities are complex. This study proposes a graph neural network (GNN)-based approach that integrates feature selection with class imbalance handling through class weights derived from the data distribution. Features were selected using two methods, Correlation-based Feature Selection (CFS) and Random Forest feature importance, to retain the most relevant features. Experimental results show that combining Random Forest feature selection with class weighting yielded the highest F1 score, at the cost of a slight decrease in accuracy. This indicates that feature selection and class weighting can improve the model's ability to detect rare fraudulent transactions. The approach contributes to the development of more accurate and adaptive fraud detection systems for digital transaction environments.
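As an illustration of the two ideas in the abstract, the sketch below derives class weights from the label distribution and shows how a model can gain F1 while losing a little accuracy. The abstract does not state the exact weighting formula, so the inverse-frequency rule used here (n_samples / (n_classes * class_count)) is an assumption, and the labels and predictions are toy values invented for the example, not the study's data or results.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Assumed rule: weight = n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


def f1_and_accuracy(y_true, y_pred, positive=1):
    """Return (F1 for the positive class, overall accuracy)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return f1, accuracy


# Toy data: 990 legitimate (0) and 10 fraudulent (1) transactions.
labels = [0] * 990 + [1] * 10
weights = inverse_frequency_weights(labels)

# Hypothetical model A: predicts "legitimate" for everything.
pred_a = [0] * 1000
f1_a, acc_a = f1_and_accuracy(labels, pred_a)

# Hypothetical model B: 20 false alarms, catches 8 of the 10 frauds.
pred_b = [1] * 20 + [0] * 970 + [1] * 8 + [0] * 2
f1_b, acc_b = f1_and_accuracy(labels, pred_b)

print(weights)
print(f"A: F1={f1_a:.3f} accuracy={acc_a:.3f}")
print(f"B: F1={f1_b:.3f} accuracy={acc_b:.3f}")
```

In this toy case the rare class receives a much larger weight than the majority class, and model B scores a lower accuracy than model A while achieving a far higher F1, which mirrors the trade-off the abstract reports.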
