Retinal Disease Classification using OCT Images
Hybrid CNN architectures with attention mechanisms for multi-class retinal disease identification.
Overview
This research presents an approach to automated retinal disease classification using Optical Coherence Tomography (OCT) images. By incorporating the Convolutional Block Attention Module (CBAM) and skip connections into pre-trained CNNs, we reached up to 96.28% accuracy in distinguishing eight retinal classes (seven diseases plus healthy retina).
Published at the 3rd International Conference on Computing Advancements (ICCA 2024), this work addresses the critical need for automated diagnostic tools in ophthalmology, potentially preventing blindness through early disease detection.
Tools & Technologies
Python, TensorFlow, Keras, OpenCV, NumPy, Pandas, Matplotlib, CBAM, DenseNet121, Xception
Research Motivation
Millions of people worldwide suffer from retinal diseases that, if left untreated, can lead to preventable blindness. Traditional manual diagnosis methods are:
- Time-consuming and labor-intensive
- Subject to inter-observer variability
- Limited by the availability of specialized ophthalmologists
- Prone to human error in early-stage disease detection
Our automated approach using deep learning aims to overcome these limitations by providing fast, consistent, and accurate diagnoses.
Technical Approach
1. Hybrid Architecture Design
We developed a novel architecture combining three key components:
Pre-trained CNNs
Leveraged transfer learning with DenseNet121, ResNet50, VGG16, Xception, and EfficientNetB1
CBAM Module
Enhanced feature extraction by focusing on pathologically significant regions in OCT images
Skip Connections
Facilitated direct information flow between layers, preserving fine-grained features
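To make the attention and skip-connection components concrete, the following is a minimal NumPy sketch of CBAM applied to a single feature map, followed by an additive skip connection. It illustrates the general CBAM idea (channel attention, then spatial attention) and is not the paper's implementation: the random weights, the reduction ratio `r`, the 7x7 kernel, and the placement of the skip connection around the attention block are all assumptions chosen for illustration.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def channel_attention(feat, w1, w2):
    """feat: (H, W, C). Returns per-channel weights in (0, 1), shape (C,)."""
    avg_desc = feat.mean(axis=(0, 1))  # global average pooling -> (C,)
    max_desc = feat.max(axis=(0, 1))   # global max pooling -> (C,)

    def shared_mlp(v):
        # Same two-layer MLP for both descriptors: C -> C/r (ReLU) -> C
        return np.maximum(v @ w1, 0.0) @ w2

    return sigmoid(shared_mlp(avg_desc) + shared_mlp(max_desc))


def spatial_attention(feat, kernel):
    """feat: (H, W, C); kernel: (k, k, 2), k odd. Returns an (H, W) map in (0, 1)."""
    k = kernel.shape[0]
    pad = k // 2
    # Pool across channels, giving a 2-channel (mean, max) spatial descriptor.
    pooled = np.stack([feat.mean(axis=-1), feat.max(axis=-1)], axis=-1)
    padded = np.pad(pooled, ((pad, pad), (pad, pad), (0, 0)))
    # "Same" convolution: windows has shape (H, W, 2, k, k).
    windows = sliding_window_view(padded, (k, k), axis=(0, 1))
    return sigmoid(np.einsum("hwcij,ijc->hw", windows, kernel))


def cbam_with_skip(feat, w1, w2, kernel):
    refined = feat * channel_attention(feat, w1, w2)  # broadcast over H, W
    refined = refined * spatial_attention(refined, kernel)[..., None]
    return feat + refined  # additive skip connection (assumed placement)


# Hypothetical sizes and random weights, for illustration only.
rng = np.random.default_rng(0)
H, W, C, r = 8, 8, 16, 4
feat = rng.standard_normal((H, W, C))
w1 = rng.standard_normal((C, C // r)) * 0.1
w2 = rng.standard_normal((C // r, C)) * 0.1
kernel = rng.standard_normal((7, 7, 2)) * 0.1

out = cbam_with_skip(feat, w1, w2, kernel)
print(out.shape)
```

The output keeps the input shape, so a block like this can be placed after any convolutional feature extractor without changing the layers that follow.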
2. Dataset: OCT-C8
The study utilized the comprehensive OCT-C8 dataset containing:
- 8 Classes: 7 disease types + 1 healthy class
- Classes Covered:
  - Age-related Macular Degeneration (AMD)
  - Diabetic Macular Edema (DME)
  - Epiretinal Membrane (ERM)
  - Macular Hole (MH)
  - Retinal Artery Occlusion (RAO)
  - Retinal Vein Occlusion (RVO)
  - Vitreomacular Traction (VMT)
  - Normal retina (healthy class)
3. Model Architecture
# Simplified architecture overview
Input OCT Image → Pre-trained CNN Base → CBAM Module → Skip Connection →
Dense Layers → Softmax Classification → Disease Prediction
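The pipeline above could be assembled in Keras roughly as follows. This is a sketch under stated assumptions rather than the published model: the 224x224 input size, the reduction ratio, the 256-unit dense layer, the dropout rate, and the exact point where the skip connection rejoins the attended features are not given in this document, and `ChannelPool`, `cbam_block`, and `build_model` are hypothetical helper names.

```python
import tensorflow as tf
from tensorflow.keras import layers, Model


class ChannelPool(layers.Layer):
    """Concatenate channel-wise mean and max maps: (H, W, C) -> (H, W, 2)."""

    def call(self, x):
        avg = tf.reduce_mean(x, axis=-1, keepdims=True)
        mx = tf.reduce_max(x, axis=-1, keepdims=True)
        return tf.concat([avg, mx], axis=-1)


def cbam_block(x, reduction=8, kernel_size=7):
    channels = x.shape[-1]
    # Channel attention: one shared MLP applied to avg- and max-pooled vectors.
    mlp_hidden = layers.Dense(channels // reduction, activation="relu")
    mlp_out = layers.Dense(channels)
    avg = mlp_out(mlp_hidden(layers.GlobalAveragePooling2D()(x)))
    mx = mlp_out(mlp_hidden(layers.GlobalMaxPooling2D()(x)))
    ch = layers.Activation("sigmoid")(layers.Add()([avg, mx]))
    ch = layers.Reshape((1, 1, channels))(ch)
    x = layers.Multiply()([x, ch])
    # Spatial attention: conv over the pooled 2-channel map.
    sp = layers.Conv2D(1, kernel_size, padding="same",
                       activation="sigmoid")(ChannelPool()(x))
    return layers.Multiply()([x, sp])


def build_model(num_classes=8, input_shape=(224, 224, 3), weights="imagenet"):
    base = tf.keras.applications.DenseNet121(
        include_top=False, weights=weights, input_shape=input_shape)
    features = base.output                        # CNN base features
    attended = cbam_block(features)               # CBAM module
    merged = layers.Add()([features, attended])   # skip connection (assumed)
    x = layers.GlobalAveragePooling2D()(merged)
    x = layers.Dense(256, activation="relu")(x)   # dense head (assumed size)
    x = layers.Dropout(0.5)(x)
    outputs = layers.Dense(num_classes, activation="softmax")(x)
    return Model(base.input, outputs)


# weights=None avoids downloading ImageNet weights for this quick shape check;
# use the default weights="imagenet" for actual transfer learning.
model = build_model(weights=None)
print(model.output_shape)
```

Swapping `DenseNet121` for `tf.keras.applications.Xception` (with a matching input size and preprocessing) would give the Xception variant of the same layout.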
Results & Performance
- DenseNet-CBAM-Skip Accuracy: 96.28%
- Xception-CBAM-Skip Accuracy: 96.11%
- F1-Score (DenseNet): 96.31%
- Classes Identified: 8
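For readers who want to reproduce metrics of this kind, accuracy and F1 can both be derived from a confusion matrix. The sketch below assumes macro-averaged F1 (this page does not state which averaging was used), and the 3-class matrix is made-up toy data, not a result from this study.

```python
import numpy as np


def accuracy_and_macro_f1(conf):
    """conf[i, j] = number of samples with true class i predicted as class j."""
    conf = np.asarray(conf, dtype=float)
    tp = np.diag(conf)
    precision = tp / np.maximum(conf.sum(axis=0), 1e-12)  # per predicted class
    recall = tp / np.maximum(conf.sum(axis=1), 1e-12)     # per true class
    f1 = 2 * precision * recall / np.maximum(precision + recall, 1e-12)
    return tp.sum() / conf.sum(), f1.mean()


# Toy 3-class confusion matrix (hypothetical values).
toy_conf = [
    [9, 1, 0],
    [0, 10, 0],
    [1, 0, 9],
]
acc, macro_f1 = accuracy_and_macro_f1(toy_conf)
print(f"accuracy={acc:.4f}, macro F1={macro_f1:.4f}")
```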
Performance Comparison
Our hybrid models outperformed the baseline pre-trained models by roughly five percentage points in accuracy:
- DenseNet121 (baseline): 91.2% → DenseNet-CBAM-Skip: 96.28% (+5.08 percentage points)
- Xception (baseline): 90.8% → Xception-CBAM-Skip: 96.11% (+5.31 percentage points)
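The gains can also be read as a reduction in error rate. The short calculation below uses only the accuracies quoted in this comparison; the relative error reduction is a derived figure, not one reported in the paper, and `summarize` is a hypothetical helper name.

```python
# (baseline accuracy %, hybrid accuracy %) as quoted in the comparison
results = {
    "DenseNet121": (91.2, 96.28),
    "Xception": (90.8, 96.11),
}


def summarize(baseline, hybrid):
    gain_pp = hybrid - baseline                       # absolute gain, in points
    error_reduction = gain_pp / (100.0 - baseline) * 100.0  # relative, in %
    return round(gain_pp, 2), round(error_reduction, 1)


summary = {name: summarize(*accs) for name, accs in results.items()}
for name, (gain, reduction) in summary.items():
    print(f"{name}: +{gain} points, {reduction}% fewer errors")
# DenseNet121: +5.08 points, 57.7% fewer errors
# Xception: +5.31 points, 57.7% fewer errors
```

On these numbers, both hybrid models remove a little over half of the baseline's classification errors.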
Key Innovations
- Attention-Enhanced Feature Extraction: CBAM module enables the model to focus on disease-specific regions in OCT images
- Information Preservation: Skip connections prevent loss of fine-grained details crucial for accurate diagnosis
- Efficient Transfer Learning: Leveraged pre-trained models while adapting them specifically for retinal disease patterns
- Multi-Architecture Validation: Tested across multiple CNN architectures to ensure robustness
Clinical Impact
This research contributes to:
- Early Disease Detection: Automated screening can identify diseases in early stages
- Healthcare Accessibility: Reduces dependency on specialized ophthalmologists in remote areas
- Consistent Diagnoses: Reduces inter-observer variability by applying the same model to every scan
- Rapid Processing: Fast automated classification can support timely clinical decisions
Implementation Details
Model Training
- Optimizer: Adam with learning rate scheduling
- Loss Function: Categorical cross-entropy
- Data Augmentation: Rotation, flipping, zooming to improve generalization
- Validation Strategy: 5-fold cross-validation
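A training setup along these lines might look like the sketch below. The specific values are assumptions, since this page lists only the components: the augmentation ranges, the initial learning rate, the choice of `ReduceLROnPlateau` as the scheduler, and the stratified fold assignment are illustrative, and `compile_model` and `stratified_folds` are hypothetical helper names.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers

# Data augmentation: rotation, flipping, zooming (ranges are assumed).
augment = tf.keras.Sequential([
    layers.RandomRotation(0.05),
    layers.RandomFlip("horizontal"),
    layers.RandomZoom(0.1),
])

# One possible form of learning rate scheduling: halve the rate on plateaus.
lr_schedule = tf.keras.callbacks.ReduceLROnPlateau(
    monitor="val_loss", factor=0.5, patience=3, min_lr=1e-6)


def compile_model(model, lr=1e-4):
    """Adam optimizer with categorical cross-entropy, as listed above."""
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=lr),
        loss="categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model


def stratified_folds(labels, k=5, seed=0):
    """Yield (train_idx, val_idx) pairs with each class spread over k folds."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    fold_of = np.empty(len(labels), dtype=int)
    for cls in np.unique(labels):
        idx = rng.permutation(np.flatnonzero(labels == cls))
        fold_of[idx] = np.arange(len(idx)) % k
    for fold in range(k):
        yield np.flatnonzero(fold_of != fold), np.flatnonzero(fold_of == fold)


# Hypothetical label vector: 8 classes with 10 images each.
labels = np.repeat(np.arange(8), 10)
folds = list(stratified_folds(labels))
for i, (train_idx, val_idx) in enumerate(folds):
    print(f"fold {i}: {len(train_idx)} train / {len(val_idx)} val")
```

In each fold, a fresh model would be compiled with `compile_model` and fitted with `callbacks=[lr_schedule]`, applying `augment` to training batches only.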
Hardware Requirements
- GPU: NVIDIA Tesla V100 or equivalent
- RAM: Minimum 16GB
- Storage: ~10GB for dataset and models
Future Work
- Integration with clinical imaging systems for real-time diagnosis
- Extension to other imaging modalities (fundus photography, fluorescein angiography)
- Development of explainable AI features for clinical interpretation
- Mobile deployment for point-of-care diagnostics
Citation
If you use this work in your research, please cite:
@inproceedings{novely2024improving,
title={Improving Pre-Trained CNNs with CBAM and Skip Connections for
Multi-Class Retinal Diseases Classification using OCT Images},
author={Novely, Navia and Mahmud Shuvo, Shakil and Faruk, Md Farukuzzaman},
booktitle={Proceedings of the 3rd International Conference on Computing Advancements},
pages={946--953},
year={2024}
}