Borderless Table Detection from Images
Deep learning system for detecting and extracting tables without visible borders using Table Transformer.
Overview
This project implements a borderless table detection system using Hugging Face’s Table Transformer model. The system is capable of detecting and extracting tables from images even when no visible borders are separating the rows and columns. It uses deep learning models to detect tables and their structure and visualizes the results by marking the tables, rows, and columns on the original image.
Key Features
Table Detection
Identifies and locates tables within documents, even without visible borders.
Structure Recognition
Detects internal table structure including rows, columns, and spanning cells.
Technical Approach
1. Image Preprocessing
- RGB format conversion for compatibility
- Image resizing to 50% for computational efficiency
- Feature extraction using
DetrFeatureExtractor
2. Model Architecture
The project leverages two specialized models from Hugging Face:
Table Detection Model
microsoft/table-transformer-detection
- ResNet backbone for local feature extraction
- Transformer encoder-decoder for global context
- Outputs bounding boxes around detected tables
Structure Recognition Model
microsoft/table-transformer-structure-recognition
- DETR-based architecture with multi-head attention
- Identifies rows, columns, and cell relationships
- Handles complex structures including merged cells
3. Inference Pipeline
graph LR
A[Input Image] --> B[Feature Extraction]
B --> C[Table Detection]
C --> D[Structure Recognition]
D --> E[Post-processing]
E --> F[Visualization]
Tools & Technologies
Results & Visualization
The system visualizes detection results by:
- Drawing bounding boxes around detected tables
- Highlighting individual rows and columns
- Using different colors to distinguish between structural elements
- Providing confidence scores for each detection
Key Advantages
Borderless Detection
Works on tables without visible borders or grid lines
State-of-the-Art Models
Leverages pre-trained transformer models for high accuracy
Quick Implementation
Ready-to-use solution without extensive custom training
Limitations & Future Work
- Computational Resources: Can be resource-intensive for large-scale deployment
- Domain Adaptation: May require fine-tuning for specialized table layouts
- Complex Structures: Performance can vary with highly irregular table formats
How to Run
- Clone the Repository:
git clone https://github.com/ShakilMahmudShuvo/Borderless-Tables-Detection.git - Install Dependencies:
pip install -r requirements.txt - Run the Notebook:
- Open
borderless_table_detection.ipynb - Execute cells sequentially
- Use the Inference section for custom images
- Open