IEEE Access, 2024 (SCI-Expanded)
The adoption of deep learning has exposed significant vulnerabilities, especially to adversarial attacks that cause misclassifications through subtle, small perturbations. Such attacks pose a serious challenge to security-critical applications. This study addresses these vulnerabilities by proposing a novel adversarial attack detection method that leverages data reconstruction errors. We evaluate this approach against three well-known adversarial attacks, namely the Fast Gradient Sign Method (FGSM), Projected Gradient Descent (PGD), and the Basic Iterative Method (BIM), on Intrusion Detection Systems (IDS). Our method combines reconstruction error with aleatoric, epistemic, and entropy metrics to distinguish between original and adversarial samples. Experimental results show that our approach achieves a detection success rate of 92% to 100%, outperforming existing methods, particularly at low perturbation levels. By employing effective error-based metrics for adversarial detection, this research enhances the robustness and reliability of machine learning models against adversarial threats.
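To make the detection idea concrete, the following is a minimal sketch (not the authors' implementation) of how reconstruction error from an autoencoder trained on benign traffic can be combined with a predictive-entropy score to flag suspect samples. The network sizes, `feature_dim`, and the thresholds `err_thresh` and `ent_thresh` are illustrative assumptions, not values from the paper; the aleatoric and epistemic terms are omitted for brevity.

```python
# Illustrative sketch only: threshold-based adversarial detection using
# autoencoder reconstruction error plus classifier predictive entropy.
import torch
import torch.nn as nn

feature_dim = 41  # assumed IDS feature count; purely illustrative

class Autoencoder(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(dim, 16), nn.ReLU(), nn.Linear(16, 8))
        self.decoder = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, dim))

    def forward(self, x):
        return self.decoder(self.encoder(x))

def reconstruction_error(ae, x):
    # Per-sample mean squared reconstruction error.
    with torch.no_grad():
        return ((ae(x) - x) ** 2).mean(dim=1)

def predictive_entropy(probs):
    # Shannon entropy of the classifier's softmax output per sample.
    return -(probs * probs.clamp_min(1e-12).log()).sum(dim=1)

def flag_adversarial(ae, classifier, x, err_thresh=0.05, ent_thresh=0.5):
    # Flag a sample if either its reconstruction error or its predictive
    # entropy exceeds a threshold calibrated on benign data (assumed values).
    err = reconstruction_error(ae, x)
    with torch.no_grad():
        ent = predictive_entropy(torch.softmax(classifier(x), dim=1))
    return (err > err_thresh) | (ent > ent_thresh)

if __name__ == "__main__":
    ae = Autoencoder(feature_dim)
    clf = nn.Sequential(nn.Linear(feature_dim, 32), nn.ReLU(), nn.Linear(32, 2))
    x = torch.rand(4, feature_dim)  # stand-in for (possibly perturbed) IDS records
    print(flag_adversarial(ae, clf, x))
```

In practice, both thresholds would be calibrated on held-out benign samples (for example, at a fixed false-positive rate) before being applied to FGSM-, PGD-, or BIM-perturbed inputs.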