Temporal Windowed and Internal Feature (TWIF) Transformer for Attack Detection in Robotics

Yolaçan, Esra; Zaim, Hande

doi:10.1109/access.2026.3674720

Yolaçan E. N., Zaim H. C.

IEEE Access, vol. 14, pp. 42674-42690, 2026 (SCI-Expanded, Scopus)

Publication Type: Article / Full Article
Volume Number: 14
Publication Date: 2026
DOI: 10.1109/access.2026.3674720
Journal Name: IEEE Access
Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Compendex, INSPEC, Directory of Open Access Journals
Pages: pp. 42674-42690
Affiliated with Eskişehir Osmangazi University: Yes

Abstract

Cybersecurity is critical in robotic systems because a successful attack can not only disrupt operations but also cause physical damage and safety risks, which distinguishes robotics from traditional Information Technology environments. The higher accuracy of Transformer models compared with traditional learning approaches has made them a prominent and rapidly growing research focus in cybersecurity. In this study, we propose a novel Transformer-based model, the Temporal Windowed and Internal Feature (TWIF) Transformer, an enhanced version of the FPE-Transformer adapted specifically for the robotics domain. By embedding a novel sliding-window temporal attention mechanism into the encoder, the TWIF-Transformer becomes more sensitive to short-term temporal patterns and supports reliable, real-time attack detection in robotic systems. The proposed model is compared with the FPE-Transformer, the PE-Transformer, and TWIF with PE-Transformer, and consistently outperforms them in terms of accuracy and binary cross-entropy loss. The TWIF-Transformer improves accuracy to 0.934 under subscriber flood scenarios and to 0.732 under replay attacks. Reaching this level of performance against replay attacks, which are stealthy and non-volumetric, demonstrates the effectiveness of the proposed model. In addition to testing on our robotic system datasets, the model is validated on well-known network intrusion datasets, reaching accuracies of 0.927 on CICIDS2017 and 0.903 on NSL-KDD, which demonstrates that it generalizes beyond robotics. Furthermore, inference-time analysis confirms that the model meets ROS-based control-loop timing requirements. These results highlight the model's robustness, effectiveness, and suitability for real-time intrusion detection, particularly in security-critical robotic environments.
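
The abstract does not specify how the sliding-window temporal attention is implemented. As a rough illustration of the general idea only, the sketch below restricts scaled dot-product attention so that each time step attends solely to neighbours within a fixed window. The function names, the symmetric window, and the single-head, projection-free form are all assumptions made for this example, not details taken from the paper.

```python
import math

def sliding_window_mask(seq_len, window):
    """True where position j lies within `window` steps of position i.
    (Hypothetical helper; a symmetric band is an assumption.)"""
    return [[abs(i - j) <= window for j in range(seq_len)]
            for i in range(seq_len)]

def windowed_attention(q, k, v, window):
    """Scaled dot-product attention restricted to a local temporal band.
    q, k, v: lists of float vectors, one vector per time step."""
    n, d = len(q), len(q[0])
    mask = sliding_window_mask(n, window)
    outputs, all_weights = [], []
    for i in range(n):
        # Score only the keys inside the window; None marks masked positions.
        scores = [
            sum(a * b for a, b in zip(q[i], k[j])) / math.sqrt(d)
            if mask[i][j] else None
            for j in range(n)
        ]
        # Numerically stable softmax over the unmasked scores.
        top = max(s for s in scores if s is not None)
        exps = [math.exp(s - top) if s is not None else 0.0 for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        # Weighted sum of value vectors.
        outputs.append([
            sum(w * v[j][c] for j, w in enumerate(weights))
            for c in range(len(v[0]))
        ])
        all_weights.append(weights)
    return outputs, all_weights

# Toy sequence of 5 time steps with 2 features each.
x = [[float(t), 1.0] for t in range(5)]
out, weights = windowed_attention(x, x, x, window=1)
```

With `window=1`, each step distributes its attention over itself and its immediate neighbours, and every position farther away receives exactly zero weight. This is one plausible way a model could emphasise short-term temporal patterns.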