ENHANCING MINORITY CLASS DETECTION IN INTRUSION DETECTION SYSTEMS USING GAN-BASED DATA AUGMENTATION: A HEURISTIC STUDY


Kılıç A., Özçelik I.

4. INTERNATIONAL PARIS CONGRESS ON APPLIED SCIENCES, Paris, France, 5-9 February 2025, pp. 71-77 (Full Text)

  • Publication Type: Conference Paper / Full Text
  • City: Paris
  • Country: France
  • Page Numbers: pp.71-77
  • Eskisehir Osmangazi University Affiliated: Yes

Abstract

Classifiers trained on imbalanced datasets often perform poorly on minority classes. In intrusion detection systems (IDS), a major concern in cybersecurity, this weakness means that minority-class attacks can go undetected, which can lead to serious security vulnerabilities. This study investigates the effect of Generative Adversarial Network (GAN)-based data augmentation on the detection of these minority classes. Its main objective is to examine how the performance of machine learning (ML) methods, which are affected to varying degrees by class imbalance, changes after GAN-based augmentation. The UNSW-NB15 dataset was augmented using a GAN, and the classification performance of the Random Forest Classifier and AdaBoost was compared before and after augmentation using accuracy, precision, recall, and F1-score. Following augmentation, the Random Forest Classifier, which is particularly effective on high-dimensional datasets, showed significant improvements in accuracy, recall, and F1-score: its accuracy increased from 83.57% to 98.43%, a gain of approximately 15 percentage points. AdaBoost, which generally performs better on imbalanced datasets, showed a more limited improvement once the imbalance was eliminated: its accuracy increased from 86.50% to 94.67%, a gain of approximately 8 percentage points. This study is expected to serve as a useful reference for understanding how GAN-based data augmentation affects the minority-class detection performance of IDSs.
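
The metrics named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the labels below are a hypothetical toy example (95 "normal" samples and 5 "minority_attack" samples), chosen only to show how overall accuracy can stay high while recall on the minority class is low. The second part recomputes the accuracy gains from the four figures reported in the abstract, both as absolute percentage points and relative to the baseline. The function names `accuracy` and `per_class_metrics` are illustrative assumptions.

```python
# Toy illustration with hypothetical labels; not data from the study.

def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def per_class_metrics(y_true, y_pred, positive):
    """Return (precision, recall, f1), treating `positive` as the class of interest."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1

# Hypothetical imbalanced test set: the classifier detects only 1 of 5 attacks.
y_true = ["normal"] * 95 + ["minority_attack"] * 5
y_pred = ["normal"] * 95 + ["minority_attack"] + ["normal"] * 4

acc = accuracy(y_true, y_pred)
prec, rec, f1 = per_class_metrics(y_true, y_pred, "minority_attack")
print(f"accuracy={acc:.2f} precision={prec:.2f} recall={rec:.2f} f1={f1:.2f}")

# Accuracy figures as reported in the abstract: (before, after) in percent.
reported = {"Random Forest": (83.57, 98.43), "AdaBoost": (86.50, 94.67)}
gains = {}
for name, (before, after) in reported.items():
    gain_pp = after - before            # absolute gain in percentage points
    gain_rel = 100 * gain_pp / before   # gain relative to the baseline accuracy
    gains[name] = (gain_pp, gain_rel)
    print(f"{name}: +{gain_pp:.2f} points ({gain_rel:.1f}% relative)")
```

In the toy case, accuracy is 0.96 even though only one in five minority-class attacks is detected, which is why the study reports precision, recall, and F1-score alongside accuracy.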