Borderless Table Detection from Images

Overview

This project implements a borderless table detection system using Hugging Face’s Table Transformer model. The system is capable of detecting and extracting tables from images even when no visible borders are separating the rows and columns. It uses deep learning models to detect tables and their structure and visualizes the results by marking the tables, rows, and columns on the original image.

Key Features

Table Detection

Identifies and locates tables within documents, even without visible borders.

Structure Recognition

Detects internal table structure including rows, columns, and spanning cells.

Technical Approach

1. Image Preprocessing

RGB format conversion for compatibility
Image resizing to 50% for computational efficiency
Feature extraction using DetrFeatureExtractor

2. Model Architecture

The project leverages two specialized models from Hugging Face:

Table Detection Model

microsoft/table-transformer-detection

ResNet backbone for local feature extraction
Transformer encoder-decoder for global context
Outputs bounding boxes around detected tables

Structure Recognition Model

microsoft/table-transformer-structure-recognition

DETR-based architecture with multi-head attention
Identifies rows, columns, and cell relationships
Handles complex structures including merged cells

3. Inference Pipeline

graph LR
    A[Input Image] --> B[Feature Extraction]
    B --> C[Table Detection]
    C --> D[Structure Recognition]
    D --> E[Post-processing]
    E --> F[Visualization]

Tools & Technologies

Python PyTorch Hugging Face Transformers Pillow (PIL) OpenCV Jupyter Notebook

Results & Visualization

The system visualizes detection results by:

Drawing bounding boxes around detected tables
Highlighting individual rows and columns
Using different colors to distinguish between structural elements
Providing confidence scores for each detection

Key Advantages

Borderless Detection

Works on tables without visible borders or grid lines

State-of-the-Art Models

Leverages pre-trained transformer models for high accuracy

Quick Implementation

Ready-to-use solution without extensive custom training

Limitations & Future Work

Computational Resources: Can be resource-intensive for large-scale deployment
Domain Adaptation: May require fine-tuning for specialized table layouts
Complex Structures: Performance can vary with highly irregular table formats

How to Run

Clone the Repository:

git clone https://github.com/ShakilMahmudShuvo/Borderless-Tables-Detection.git

Install Dependencies:
```
pip install -r requirements.txt
```
Run the Notebook:
- Open borderless_table_detection.ipynb
- Execute cells sequentially
- Use the Inference section for custom images

View Source Code