CDeC-Net: Composite Deformable Cascade Network for Table Detection in Document Images
Authors: Madhav Agarwal, Ajoy Mondal, and C. V. Jawahar
Abstract
Localizing page elements or objects such as tables, figures, and equations is the primary
step in extracting information from document images. We propose a novel end-to-end trainable
deep network (CDeC-Net) for detecting tables in documents. The proposed network
consists of a multistage extension of Mask R-CNN with a dual backbone that uses deformable convolution
to detect tables of varying scale with high accuracy at higher IoU thresholds. We
empirically evaluate CDeC-Net on all the publicly available benchmark datasets: ICDAR-2013,
ICDAR-2017, ICDAR-2019, UNLV, Marmot, PubLayNet, and TableBank, with extensive experiments.
Our solution has three important properties: (i) a single trained model, CDeC-Net‡, performs
well across all the popular benchmark datasets; (ii) we report excellent performance across
multiple IoU thresholds, including higher ones; (iii) by following the same protocol as
recent papers for each benchmark, we consistently demonstrate superior quantitative
performance. Our code and models will be publicly released to enable reproducibility
of the results.
Keywords: Page object, table detection, Cascade Mask R-CNN, deformable convolution, single model.
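The abstract stresses accuracy at higher IoU thresholds. As a minimal sketch of what that means, the snippet below computes the intersection over union (IoU) of a predicted and a ground-truth table box and checks the same prediction against two thresholds. The box coordinates, the function names `iou` and `is_true_positive`, and the thresholds 0.6 and 0.9 are illustrative assumptions, not values or code taken from the paper or from any benchmark's evaluation script.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap region; zero when the boxes do not overlap.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def is_true_positive(pred, gt, threshold):
    """A detection counts as correct when its IoU with the ground truth meets the threshold."""
    return iou(pred, gt) >= threshold


# Hypothetical boxes: the prediction is slightly shifted and too tall.
gt_box = (100, 100, 500, 300)
pred_box = (110, 105, 500, 320)

print(round(iou(pred_box, gt_box), 3))
print(is_true_positive(pred_box, gt_box, 0.6))  # lenient threshold
print(is_true_positive(pred_box, gt_box, 0.9))  # strict threshold
```

The same prediction passes the lenient threshold and fails the strict one, which is why results reported at higher IoU thresholds reflect tighter localization.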
CDeC-Net: Composite Deformable Cascade Network
Fig. 1: Illustration of the proposed CDeC-Net, which is composed of a Cascade Mask R-CNN with a composite backbone that uses deformable convolution instead of conventional convolution.
Fig. 2: Illustration of the deformable convolution.
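To make the idea behind deformable convolution concrete, here is a small pure-Python sketch of a single output value of a 3x3 deformable convolution on a 2-D grid. Each kernel tap samples the input at its regular grid position plus a fractional offset, using bilinear interpolation. In a real network the offsets are predicted by a learned layer; here they are passed in by hand, and the helper names `bilinear` and `deform_conv_at` are hypothetical. This is an illustration of the general operation, not the CDeC-Net implementation.

```python
import math


def bilinear(img, y, x):
    """Sample a 2-D list at fractional (y, x); positions outside the image count as 0."""
    h, w = len(img), len(img[0])
    y0, x0 = math.floor(y), math.floor(x)
    value = 0.0
    for yy in (y0, y0 + 1):
        for xx in (x0, x0 + 1):
            if 0 <= yy < h and 0 <= xx < w:
                # Weight falls off linearly with distance to the neighbouring pixel.
                weight = (1 - abs(y - yy)) * (1 - abs(x - xx))
                value += weight * img[yy][xx]
    return value


def deform_conv_at(img, kernel, offsets, cy, cx):
    """One output value of a 3x3 deformable convolution centred at (cy, cx).

    offsets[i][j] is the (dy, dx) shift added to the regular grid
    position of kernel tap (i, j).
    """
    out = 0.0
    for i in range(3):
        for j in range(3):
            dy, dx = offsets[i][j]
            out += kernel[i][j] * bilinear(img, cy + (i - 1) + dy, cx + (j - 1) + dx)
    return out


# Toy 4x4 input whose values increase by 1 per column and by 4 per row.
img = [[r * 4 + c for c in range(4)] for r in range(4)]
ones = [[1.0] * 3 for _ in range(3)]

# Zero offsets reduce to a conventional 3x3 convolution.
zero = [[(0.0, 0.0)] * 3 for _ in range(3)]
regular = deform_conv_at(img, ones, zero, 1, 1)

# Shifting every tap half a pixel to the right samples between columns.
shift = [[(0.0, 0.5)] * 3 for _ in range(3)]
shifted = deform_conv_at(img, ones, shift, 1, 1)

print(regular, shifted)
```

With all offsets set to zero the operation is identical to a conventional convolution, so the offsets are what let the sampling grid adapt to the shape and scale of the content.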
Cite this paper as:
@article{agarwal2020cdec,
title={CDeC-Net: Composite Deformable Cascade Network for Table Detection in Document Images},
author={Agarwal, Madhav and Mondal, Ajoy and Jawahar, C V},
journal={arXiv},
year={2020}
}
Team
- Madhav Agarwal
- Dr. Ajoy Mondal
- Prof. C. V. Jawahar