Borderless Table Detection from Images

Deep learning system for detecting and extracting tables without visible borders using Table Transformer.


Overview

This project implements a borderless table detection system built on Microsoft's Table Transformer models, loaded through Hugging Face Transformers. It detects and extracts tables from images even when no visible borders separate the rows and columns. One model locates each table, a second recovers its structure, and the results are visualized by marking tables, rows, and columns on the original image.

Key Features

Table Detection

Identifies and locates tables within documents, even without visible borders.

Structure Recognition

Detects internal table structure including rows, columns, and spanning cells.

Technical Approach

1. Image Preprocessing

  • RGB format conversion for compatibility
  • Image resizing to 50% for computational efficiency
  • Feature extraction using DetrFeatureExtractor
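
The first two preprocessing steps can be sketched as below. This is a minimal illustration rather than the notebook's actual code: the names `half_size` and `preprocess` are hypothetical, and it assumes "50%" means each dimension is halved.

```python
def half_size(width, height, scale=0.5):
    """Return the scaled (width, height), never smaller than one pixel."""
    return max(1, int(width * scale)), max(1, int(height * scale))


def preprocess(path, scale=0.5):
    """Load an image, convert it to RGB, and downscale it."""
    # Pillow is imported here so half_size() stays usable without it.
    from PIL import Image

    # convert("RGB") drops alpha channels and palettes that the model
    # input pipeline does not expect.
    image = Image.open(path).convert("RGB")
    return image.resize(half_size(*image.size, scale=scale))
```

For a 1000x601 scan, `half_size(1000, 601)` gives `(500, 300)`. Downscaling cuts memory and inference time, but very small text may become harder to separate into rows, so the scale factor is worth tuning per dataset.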

2. Model Architecture

The project uses two pre-trained Table Transformer checkpoints hosted on Hugging Face:

Table Detection Model

microsoft/table-transformer-detection

  • ResNet backbone for local feature extraction
  • Transformer encoder-decoder for global context
  • Outputs bounding boxes around detected tables

Structure Recognition Model

microsoft/table-transformer-structure-recognition

  • DETR-based architecture with multi-head attention
  • Identifies rows, columns, and cell relationships
  • Handles complex structures including merged cells
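
Loading both checkpoints and running the detector might look like the sketch below. The function names and the 0.7 score threshold are assumptions for illustration, not taken from the notebook. Recent Transformers releases deprecate `DetrFeatureExtractor` in favor of `DetrImageProcessor`, so the import may need adjusting depending on the installed version.

```python
DETECTION_CHECKPOINT = "microsoft/table-transformer-detection"
STRUCTURE_CHECKPOINT = "microsoft/table-transformer-structure-recognition"


def load_models():
    """Return (feature_extractor, detection_model, structure_model)."""
    # Imported lazily because transformers and torch are heavy dependencies.
    from transformers import DetrFeatureExtractor, TableTransformerForObjectDetection

    feature_extractor = DetrFeatureExtractor()
    detector = TableTransformerForObjectDetection.from_pretrained(DETECTION_CHECKPOINT)
    structure = TableTransformerForObjectDetection.from_pretrained(STRUCTURE_CHECKPOINT)
    return feature_extractor, detector, structure


def detect(image, feature_extractor, model, threshold=0.7):
    """Run one model on a PIL image; returns scores, labels, and pixel boxes."""
    import torch

    encoding = feature_extractor(image, return_tensors="pt")
    with torch.no_grad():  # inference only, no gradients needed
        outputs = model(**encoding)

    # target_sizes expects (height, width); PIL's image.size is (width, height).
    width, height = image.size
    return feature_extractor.post_process_object_detection(
        outputs, threshold=threshold, target_sizes=[(height, width)]
    )[0]
```

Both checkpoints load through the same `TableTransformerForObjectDetection` class; they differ only in the labels they predict, so `detect` can be reused for the structure model on a cropped table.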

3. Inference Pipeline

graph LR
    A[Input Image] --> B[Feature Extraction]
    B --> C[Table Detection]
    C --> D[Structure Recognition]
    D --> E[Post-processing]
    E --> F[Visualization]
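
The post-processing stage can be illustrated with plain Python. DETR-style models emit boxes as normalized (center x, center y, width, height), which must be converted to pixel corners before drawing. The helper names and the 0.7 threshold below are hypothetical; in practice the feature extractor's own post-processing method can do the same conversion.

```python
def cxcywh_to_xyxy(box, image_width, image_height):
    """Convert a normalized (cx, cy, w, h) box to pixel (x0, y0, x1, y1)."""
    cx, cy, w, h = box
    return (
        (cx - w / 2) * image_width,
        (cy - h / 2) * image_height,
        (cx + w / 2) * image_width,
        (cy + h / 2) * image_height,
    )


def keep_confident(detections, threshold=0.7):
    """Drop detections whose score falls below the threshold."""
    return [d for d in detections if d["score"] >= threshold]


# A box centered in a 200x100 image, covering half of each dimension.
print(cxcywh_to_xyxy((0.5, 0.5, 0.5, 0.5), 200, 100))
# (50.0, 25.0, 150.0, 75.0)
```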

Tools & Technologies

  • Python
  • PyTorch
  • Hugging Face Transformers
  • Pillow (PIL)
  • OpenCV
  • Jupyter Notebook

Results & Visualization

The system visualizes detection results by:

  • Drawing bounding boxes around detected tables
  • Highlighting individual rows and columns
  • Using different colors to distinguish between structural elements
  • Providing confidence scores for each detection
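
A rough sketch of this kind of visualization with Pillow follows. The color assignments, the `caption` and `draw_detections` names, and the detection dictionary layout (`label`, `score`, `box`) are assumptions made for illustration; the notebook may draw its results differently.

```python
# One outline color per structural element; unknown labels fall back to orange.
COLORS = {"table": "red", "table row": "blue", "table column": "green"}


def caption(label, score):
    """Format the text drawn next to a box, e.g. 'table row: 0.99'."""
    return f"{label}: {score:.2f}"


def draw_detections(image, detections):
    """Return a copy of a PIL image with labeled boxes drawn on it.

    Each detection is a dict with 'label', 'score', and a pixel-space
    'box' given as (x0, y0, x1, y1).
    """
    from PIL import ImageDraw  # imported lazily; only needed when drawing

    annotated = image.copy()  # leave the caller's image untouched
    draw = ImageDraw.Draw(annotated)
    for det in detections:
        color = COLORS.get(det["label"], "orange")
        x0, y0, x1, y1 = det["box"]
        draw.rectangle((x0, y0, x1, y1), outline=color, width=2)
        draw.text((x0 + 3, y0 + 3), caption(det["label"], det["score"]), fill=color)
    return annotated
```

Drawing on a copy keeps the original image available for further crops, which is useful when the structure model is run on each detected table in turn.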

Key Advantages

Borderless Detection

Works on tables without visible borders or grid lines

State-of-the-Art Models

Leverages pre-trained transformer models for high accuracy

Quick Implementation

Ready-to-use solution without extensive custom training

Limitations & Future Work

  • Computational Resources: Can be resource-intensive for large-scale deployment
  • Domain Adaptation: May require fine-tuning for specialized table layouts
  • Complex Structures: Performance can vary with highly irregular table formats

How to Run

  1. Clone the Repository:
    git clone https://github.com/ShakilMahmudShuvo/Borderless-Tables-Detection.git
    
  2. Install Dependencies:
    pip install -r requirements.txt
    
  3. Run the Notebook:
    • Open borderless_table_detection.ipynb
    • Execute cells sequentially
    • Use the Inference section for custom images