Attention-enhanced 3D residual networks for knee abnormality classification

Ashames, Mohamad; ERGİN, SEMİH; Gerek, Omer; YAVUZ, HASAN

doi:10.1016/j.eswa.2025.129858

Ashames M. M., ERGİN S., Gerek O. N., YAVUZ H. S.

Expert Systems with Applications, vol. 298, 2026 (SCI-Expanded, Scopus)

Publication Type: Article / Full Article
Volume: 298
Publication Date: 2026
DOI: 10.1016/j.eswa.2025.129858
Journal Name: Expert Systems with Applications
Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Compendex, INSPEC, Public Affairs Index
Keywords: 3D CNNs, Attention mechanisms, Knee abnormalities classification, Residual blocks, Spatial attention, Squeeze and excitation
Affiliated with Eskişehir Osmangazi University: Yes

Abstract

The advancement of deep learning technologies, particularly through Convolutional Neural Networks (CNNs), has substantially enriched medical image analysis. This study focuses on improving knee MRI diagnostics by comparing 2D and 3D CNN architectures using the MRNet and SKM-TEA datasets. Initially, modified 2D CNNs, such as ResNet50, were applied for plane-specific and integrated multi-plane analyses. Plane-specific models captured detailed anatomical features, while integrated approaches synthesized information across multiple planes, improving diagnostic capability but lacking full volumetric data utilization. To address these limitations, a novel 3D CNN architecture enhanced with residual attention blocks was developed, leveraging volumetric MRI data. These blocks integrate spatial attention and Squeeze-and-Excitation (SE) mechanisms, optimizing feature focus for accurate diagnostics. This approach improved both model precision and interpretability, which are crucial for clinical applications. Experimental evaluation on the MRNet dataset demonstrated that the proposed 3D CNN outperformed 2D models, achieving 83.58 % accuracy for abnormalities. On the SKM-TEA dataset, the model classified Meniscal Tear (71.36 %), Ligament Tear (79.84 %), Cartilage Lesion (84.28 %), and Effusion (76.74 %), demonstrating robustness in complex pathology detection. Gradient-weighted Class Activation Mapping (Grad-CAM) further enhanced interpretability by highlighting critical diagnostic regions. These findings emphasize the effectiveness of attention-guided 3D CNNs in knee abnormality classification. Future work will explore broader applications in medical imaging, refining the model’s generalizability across diverse clinical datasets.
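
The abstract names the ingredients of the residual attention block (a residual connection, Squeeze-and-Excitation channel gating, and spatial attention over a volumetric feature map) but does not give the layer details. The following is a minimal NumPy sketch of how such gates are commonly combined, not the authors' implementation. Everything specific here is an assumption made for illustration: the function names, the `(C, D, H, W)` layout, the reduction ratio `r`, the ordering of the two gates, the `transform` callable standing in for the block's 3D convolutions, and the scalar weights `w_avg` and `w_max`, which replace the learned convolution that a spatial attention module would normally apply to the pooled maps.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def se_gate(x, w1, w2):
    """Squeeze-and-Excitation channel weights for a (C, D, H, W) volume."""
    squeezed = x.mean(axis=(1, 2, 3))        # squeeze: global average pool -> (C,)
    hidden = np.maximum(w1 @ squeezed, 0.0)  # excitation bottleneck with ReLU -> (C // r,)
    return sigmoid(w2 @ hidden)              # one weight in (0, 1) per channel -> (C,)

def spatial_gate(x, w_avg, w_max):
    """Spatial attention map built from channel-wise mean and max.

    Assumption: a scalar mix of the two maps stands in for a learned 3D convolution.
    """
    avg_map = x.mean(axis=0)                 # (D, H, W)
    max_map = x.max(axis=0)                  # (D, H, W)
    return sigmoid(w_avg * avg_map + w_max * max_map)

def residual_attention_block(x, transform, w1, w2, w_avg, w_max):
    """Apply transform, reweight by channel then by voxel, add the skip path."""
    f = transform(x)                                   # hypothetical stand-in for 3D convs
    f = f * se_gate(f, w1, w2)[:, None, None, None]    # channel reweighting
    f = f * spatial_gate(f, w_avg, w_max)[None, ...]   # voxel-wise reweighting
    return np.maximum(x + f, 0.0)                      # residual sum followed by ReLU

# Toy usage on a random volume: 8 channels, 4 slices of 6 x 6 voxels.
rng = np.random.default_rng(0)
C, r = 8, 4
x = rng.standard_normal((C, 4, 6, 6))
w1 = rng.standard_normal((C // r, C)) * 0.1
w2 = rng.standard_normal((C, C // r)) * 0.1
out = residual_attention_block(x, lambda v: 0.5 * v, w1, w2, 1.0, 1.0)
print(out.shape)
```

Because both gates only rescale the transformed features, the skip path carries the input through unchanged, so the block's output keeps the input's shape and the attention weights can be inspected directly as channel scores and a voxel-wise map.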