TY  - JOUR
AU  - Sharma, Vishal
AU  - Rishu
AU  - Kukreja, Vinay
AU  - Dogra, Ayush
AU  - Goyal, Bhawna
PY  - 2025
TI  - Transforming Retinal Diagnostics: Advanced Detection of Diabetic Retinopathy Using Vision Transformers and Capsule Networks
JF  - Journal of Computer Science
VL  - 21
IS  - 2
DO  - 10.3844/jcssp.2025.304.321
UR  - https://thescipub.com/abstract/jcssp.2025.304.321
AB  - Diabetic retinopathy (DR), a severe complication of diabetes mellitus that affects the blood vessels of the retina, is one of the leading causes of blindness worldwide. Accurate diagnosis depends on early detection of DR. This study develops a hybrid model that combines a Vision Transformer and a Capsule Network (ViT-CapsNet) to detect and classify DR from retinal images at an early stage. The public EyePACS dataset is used. Preprocessing consists of resizing and data augmentation, which improve the quality and increase the diversity of the data. The Vision Transformer then extracts global features from the retinal fundus image, while the Capsule Network preserves the spatial relationships and hierarchies within the data and classifies each image into one of five classes: No DR, Mild DR, Moderate DR, Severe DR, and Proliferative DR. The ViT-CapsNet model achieves a precision, recall, and F1-score of 0.92, 0.91, and 0.91, respectively. It reaches an accuracy of 94%, compared with traditional methods such as CNN (88%), ResNet (90%), and EfficientNet (92%). The reported AUC-ROC scores for the classes No DR, Mild DR, Moderate DR, Severe DR, and Proliferative DR are 0.56, 0.48, 0.44, 0.45, and 0.51, respectively.