Retinal Disease Classification using OCT Images

Hybrid CNN architectures with attention mechanisms for multi-class retinal disease identification.


Overview

This research presents a novel approach to automated retinal disease classification using Optical Coherence Tomography (OCT) images. By incorporating the Convolutional Block Attention Module (CBAM) and skip connections into pre-trained CNNs, we reached 96.28% accuracy with our best model (DenseNet-CBAM-Skip) on an eight-class task: seven retinal diseases plus healthy retinas.

Published at the 3rd International Conference on Computing Advancements (ICCA 2024), this work addresses the critical need for automated diagnostic tools in ophthalmology, potentially preventing blindness through early disease detection.

Tools & Technologies

Python, TensorFlow, Keras, OpenCV, NumPy, Pandas, Matplotlib, CBAM, DenseNet121, Xception


Research Motivation

Millions of people worldwide suffer from retinal diseases that, if left untreated, can lead to preventable blindness. Traditional manual diagnosis methods are:

  • Time-consuming and labor-intensive
  • Subject to inter-observer variability
  • Limited by the availability of specialized ophthalmologists
  • Prone to human error in early-stage disease detection

Our automated approach using deep learning aims to overcome these limitations by providing fast, consistent, and accurate diagnoses.


Technical Approach

1. Hybrid Architecture Design

We developed a novel architecture combining three key components:

Pre-trained CNNs

Leveraged transfer learning with DenseNet121, ResNet50, VGG16, Xception, and EfficientNetB1

CBAM Module

Enhanced feature extraction by focusing on pathologically significant regions in OCT images

Skip Connections

Facilitated direct information flow between layers, preserving fine-grained features
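
To make the three components concrete, the block below writes out the standard CBAM formulation (channel attention followed by spatial attention) together with one plausible form of the skip connection. Here F is the feature map produced by the pre-trained CNN, σ is the sigmoid function, ⊗ is element-wise multiplication with broadcasting, and f^{7×7} is a convolution with a 7×7 kernel. The pooling in the first line runs over the spatial dimensions and the pooling in the third line runs over the channel axis. The last line is an assumption: this page does not state exactly where the skip connection is attached, so a residual addition around the attention module is shown as a sketch.

```latex
\begin{align}
M_c(F) &= \sigma\big(\mathrm{MLP}(\mathrm{AvgPool}(F)) + \mathrm{MLP}(\mathrm{MaxPool}(F))\big) \\
F' &= M_c(F) \otimes F \\
M_s(F') &= \sigma\big(f^{7\times 7}([\mathrm{AvgPool}(F');\ \mathrm{MaxPool}(F')])\big) \\
F'' &= M_s(F') \otimes F' \\
Y &= F + F'' \quad \text{(assumed form of the skip connection)}
\end{align}
```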

2. Dataset: OCT-C8

The study utilized the comprehensive OCT-C8 dataset containing:

  • 8 Classes: 7 disease types + 1 healthy class
  • Classes Covered:
    • Age-related Macular Degeneration (AMD)
    • Diabetic Macular Edema (DME)
    • Epiretinal Membrane (ERM)
    • Macular Hole (MH)
    • Retinal Artery Occlusion (RAO)
    • Retinal Vein Occlusion (RVO)
    • Vitreomacular Traction (VMT)
    • Normal (healthy retina)

3. Model Architecture

# Simplified architecture overview
Input OCT Image → Pre-trained CNN Base → CBAM Module → Skip Connection →
Dense Layers → Softmax Classification → Disease Prediction
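
The pipeline above could be assembled in Keras roughly as follows. This is a minimal sketch rather than the published implementation: the `CBAM` layer, the helper name `add_cbam_skip_head`, the reduction ratio of 8, the 256-unit dense layer, the dropout rate, and the residual-add form of the skip connection are all assumptions made for illustration.

```python
import tensorflow as tf
from tensorflow.keras import layers


class CBAM(layers.Layer):
    """Channel attention followed by spatial attention (channels-last inputs)."""

    def __init__(self, reduction=8, kernel_size=7, **kwargs):
        super().__init__(**kwargs)
        self.reduction = reduction      # assumed bottleneck ratio of the shared MLP
        self.kernel_size = kernel_size  # 7x7 kernel for the spatial mask

    def build(self, input_shape):
        channels = int(input_shape[-1])
        # Shared two-layer MLP applied to both pooled channel descriptors.
        self.mlp_hidden = layers.Dense(max(channels // self.reduction, 1),
                                       activation="relu")
        self.mlp_out = layers.Dense(channels)
        # One-filter convolution that turns the pooled maps into a spatial mask.
        self.spatial_conv = layers.Conv2D(1, self.kernel_size, padding="same",
                                          activation="sigmoid")
        super().build(input_shape)

    def call(self, x):
        # Channel attention: pool over height and width, then run the shared MLP.
        avg = tf.reduce_mean(x, axis=[1, 2])
        peak = tf.reduce_max(x, axis=[1, 2])
        channel_mask = tf.sigmoid(self.mlp_out(self.mlp_hidden(avg))
                                  + self.mlp_out(self.mlp_hidden(peak)))
        x = x * channel_mask[:, tf.newaxis, tf.newaxis, :]
        # Spatial attention: pool over channels, then convolve the two maps.
        pooled = tf.concat([tf.reduce_mean(x, axis=-1, keepdims=True),
                            tf.reduce_max(x, axis=-1, keepdims=True)], axis=-1)
        return x * self.spatial_conv(pooled)


def add_cbam_skip_head(backbone, num_classes=8, dropout=0.3):
    """Attach CBAM, a skip connection, and a softmax head to a CNN backbone."""
    features = backbone.output                    # (batch, H, W, C) feature maps
    attended = CBAM(name="cbam")(features)
    # Assumed skip connection: add the unrefined features back to the CBAM output.
    merged = layers.Add(name="skip_add")([features, attended])
    x = layers.GlobalAveragePooling2D()(merged)
    x = layers.Dense(256, activation="relu")(x)   # assumed head width
    x = layers.Dropout(dropout)(x)
    outputs = layers.Dense(num_classes, activation="softmax")(x)
    return tf.keras.Model(backbone.input, outputs)
```

For transfer learning, `backbone` could be `tf.keras.applications.DenseNet121(include_top=False, weights="imagenet", input_shape=(224, 224, 3))` or the equivalent `Xception` constructor. The 224×224 input size is likewise an assumption.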

Results & Performance

  • DenseNet-CBAM-Skip Accuracy: 96.28%
  • Xception-CBAM-Skip Accuracy: 96.11%
  • F1-Score (DenseNet): 96.31%
  • Classes Identified: 8
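
For readers who want to reproduce these two kinds of metric on their own predictions, the sketch below computes accuracy and a macro-averaged F1-score from integer class labels. This page does not say how the reported F1-score was averaged across the eight classes, so macro averaging is an assumption here, and the function names are illustrative.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted class matches the true class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def per_class_f1(y_true, y_pred, num_classes):
    """F1 = 2*TP / (2*TP + FP + FN), computed separately for each class."""
    scores = []
    for c in range(num_classes):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return scores


def macro_f1(y_true, y_pred, num_classes):
    """Unweighted mean of the per-class F1 scores (assumed averaging scheme)."""
    scores = per_class_f1(y_true, y_pred, num_classes)
    return sum(scores) / len(scores)
```

Because macro averaging weights every class equally, an F1-score slightly above the accuracy, as reported here for DenseNet, is possible.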

Performance Comparison

Our hybrid models outperformed their baseline pre-trained counterparts by about five percentage points in accuracy:

  • DenseNet121 (baseline): 91.2% → DenseNet-CBAM-Skip: 96.28% (+5.08 percentage points)
  • Xception (baseline): 90.8% → Xception-CBAM-Skip: 96.11% (+5.31 percentage points)
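
The gains listed above are absolute differences in accuracy. The short sketch below, using only the four accuracy figures from this page, also expresses each gain relative to the baseline accuracy and as a reduction in error rate. The function name `improvement` is illustrative.

```python
def improvement(baseline_pct, hybrid_pct):
    """Return (percentage-point gain, relative gain in %, error-rate reduction in %)."""
    points = hybrid_pct - baseline_pct
    relative = 100 * points / baseline_pct
    error_reduction = 100 * points / (100 - baseline_pct)
    return points, relative, error_reduction


# Accuracy figures taken from the comparison above: (baseline, hybrid).
reported = {
    "DenseNet121": (91.2, 96.28),
    "Xception": (90.8, 96.11),
}

for name, (baseline, hybrid) in reported.items():
    points, relative, error_reduction = improvement(baseline, hybrid)
    print(f"{name}: +{points:.2f} points, +{relative:.2f}% relative, "
          f"{error_reduction:.2f}% fewer errors")
```

On these figures, both hybrids remove a little over half of the baseline's misclassifications, which is a more informative reading than the raw five-point gain.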

Key Innovations

  1. Attention-Enhanced Feature Extraction: The CBAM module lets the model focus on disease-specific regions in OCT images
  2. Information Preservation: Skip connections help retain fine-grained details that are crucial for accurate diagnosis
  3. Efficient Transfer Learning: Pre-trained models are reused and adapted specifically for retinal disease patterns
  4. Multi-Architecture Validation: Tested across multiple CNN architectures to ensure robustness

Clinical Impact

This research contributes to:

  • Early Disease Detection: Automated screening can identify diseases in early stages
  • Healthcare Accessibility: Reduces dependency on specialized ophthalmologists in remote areas
  • Consistent Diagnoses: Reduces inter-observer variability, since the same image always yields the same prediction
  • Rapid Processing: Fast automated classification can support timely clinical decisions

Implementation Details

Model Training

  • Optimizer: Adam with learning rate scheduling
  • Loss Function: Categorical cross-entropy
  • Data Augmentation: Rotation, flipping, zooming to improve generalization
  • Validation Strategy: 5-fold cross-validation
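
The training setup listed above might be wired together as in the sketch below. Several details are assumptions, because this page does not give them: the exponential-decay schedule and its parameters, the initial learning rate, the augmentation ranges, and the use of stratified folds. All function names are illustrative.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers


def make_augmenter():
    """Rotation, flipping, and zooming; active only when called with training=True."""
    return tf.keras.Sequential([
        layers.RandomRotation(0.05),      # up to +/- 5% of a full turn (assumed)
        layers.RandomFlip("horizontal"),  # flip direction is an assumption
        layers.RandomZoom(0.1),           # up to +/- 10% zoom (assumed)
    ])


def make_optimizer(initial_lr=1e-3):
    """Adam driven by a learning-rate schedule (schedule type is an assumption)."""
    schedule = tf.keras.optimizers.schedules.ExponentialDecay(
        initial_learning_rate=initial_lr, decay_steps=1000, decay_rate=0.9)
    return tf.keras.optimizers.Adam(learning_rate=schedule)


def compile_for_training(model):
    """Compile with categorical cross-entropy, which expects one-hot labels."""
    model.compile(optimizer=make_optimizer(),
                  loss="categorical_crossentropy",
                  metrics=["accuracy"])
    return model


def stratified_folds(labels, k=5, seed=0):
    """Yield (train_idx, val_idx) pairs with each class spread evenly over k folds."""
    labels = np.asarray(labels)
    rng = np.random.default_rng(seed)
    fold_of = np.empty(len(labels), dtype=int)
    for cls in np.unique(labels):
        idx = rng.permutation(np.flatnonzero(labels == cls))
        fold_of[idx] = np.arange(len(idx)) % k  # deal samples round-robin
    for fold in range(k):
        yield np.flatnonzero(fold_of != fold), np.flatnonzero(fold_of == fold)
```

In a 5-fold run, a fresh model would be built and compiled for each `(train_idx, val_idx)` pair, with the augmenter applied to training batches only, and the five validation scores averaged.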

Hardware Requirements

  • GPU: NVIDIA Tesla V100 or equivalent
  • RAM: Minimum 16 GB
  • Storage: ~10 GB for dataset and models

Future Work

  • Integration with clinical imaging systems for real-time diagnosis
  • Extension to other imaging modalities (fundus photography, fluorescein angiography)
  • Development of explainable AI features for clinical interpretation
  • Mobile deployment for point-of-care diagnostics

Citation

If you use this work in your research, please cite:

@inproceedings{novely2024improving,
  title={Improving Pre-Trained CNNs with CBAM and Skip Connections for 
         Multi-Class Retinal Diseases Classification using OCT Images},
  author={Novely, Navia and Mahmud Shuvo, Shakil and Faruk, Md Farukuzzaman},
  booktitle={Proceedings of the 3rd International Conference on Computing Advancements},
  pages={946--953},
  year={2024}
}