Deep Unrestricted Document Image Rectification

FromPapers Read on AI

Start listening View podcast show

Deep Unrestricted Document Image Rectification

FromPapers Read on AI

ratings:

Length:

36 minutes

Released:

Jul 22, 2023

Format:

Podcast episode

Description

In recent years, tremendous efforts have been made on document image rectification, but existing advanced algorithms are limited to processing restricted document images, i.e., the input images must incorporate a complete document. Once the captured image merely involves a local text region, its rectification quality is degraded and unsatisfactory. Our previously proposed DocTr, a transformer-assisted network for document image rectification, also suffers from this limitation. In this work, we present DocTr++, a novel unified framework for document image rectification, without any restrictions on the input distorted images. Our major technical improvements can be concluded in three aspects. Firstly, we upgrade the original architecture by adopting a hierarchical encoder-decoder structure for multi-scale representation extraction and parsing. Secondly, we reformulate the pixel-wise mapping relationship between the unrestricted distorted document images and the distortion-free counterparts. The obtained data is used to train our DocTr++ for unrestricted document image rectification. Thirdly, we contribute a real-world test set and metrics applicable for evaluating the rectification quality. To our best knowledge, this is the first learning-based method for the rectification of unrestricted document images. Extensive experiments are conducted, and the results demonstrate the effectiveness and superiority of our method. We hope our DocTr++ will serve as a strong baseline for generic document image rectification, prompting the further advancement and application of learning-based algorithms.

2023: Hao Feng, Shaokai Liu, Jiajun Deng, Wen-gang Zhou, Houqiang Li

https://arxiv.org/pdf/2304.08796v1.pdf

Released:

Jul 22, 2023

Format:

Podcast episode

Titles in the series (100)

Keeping you up to date with the latest trends and best performing architectures in this fast evolving field in computer science. Selecting papers by comparative results, citations and influence we educate you on the latest research. Consider supporting us on Patreon.com/PapersRead for feedback and ideas.

Skip carousel

Related podcast episodes

Skip carousel

Discover this podcast and so much more

Deep Unrestricted Document Image Rectification

Deep Unrestricted Document Image Rectification

Description

Titles in the series (100)

More Episodes from Papers Read on AI

Related podcast episodes