The 16th International Conference on Document Analysis and Recognition (ICDAR 2021) will take place in Lausanne between the 5th and the 10th of September.
ARIADNEXT will take the opportunity to present its latest technical advances in the field of document analysis.
Identity Document Detection, Classification and Cropping [1]
One of the main steps of ID document analysis is to locate the ID document in the image sent by the user and to identify its type (French ID card, German passport, etc.).
This task remains challenging for two reasons. First, capture conditions vary widely: because users photograph their documents without any constraints, the resulting images may suffer from blur, uneven illumination, low resolution, etc. Second, a large range of document types must be supported in order to cover the identity documents in use.
This work proposes a modular, multi-stage framework based entirely on deep learning. It makes the classification pipeline more flexible and opens the way to incremental learning in the future.
Experiments on both academic and industrial datasets show that the proposed approach is faster than handcrafted solutions while maintaining good accuracy.
A more robust solution for text recognition [2]
Optical character recognition (OCR) systems extract textual information from document images. Modern systems, mostly based on recurrent neural networks, perform very well but are also very sensitive to variations in text localization caused by differing capture conditions.
In this work, we first demonstrate the sensitivity of recurrent networks to such variations. Data augmentation is then proposed to overcome this issue. Although training on augmented data improves results, this approach is costly in terms of training time and required storage.
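To make the idea of augmenting for localization variations concrete, here is a minimal sketch that generates randomly shifted copies of a text-line image. It is only an illustration: the function names `shift_image` and `augment_with_shifts`, the zero padding, and the uniform choice of shifts are assumptions, not the procedure used in the paper.

```python
import random


def shift_image(image, dx, dy, fill=0):
    """Translate a 2D image (list of rows) by dx columns and dy rows.

    Pixels shifted out of the frame are dropped; uncovered pixels get `fill`.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    shifted = [[fill] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            src_y, src_x = y - dy, x - dx
            if 0 <= src_y < height and 0 <= src_x < width:
                shifted[y][x] = image[src_y][src_x]
    return shifted


def augment_with_shifts(image, n_copies, max_shift, seed=None):
    """Return n_copies of `image`, each shifted by a random (dx, dy).

    Hypothetical helper: shifts are drawn uniformly in [-max_shift, max_shift].
    """
    rng = random.Random(seed)
    copies = []
    for _ in range(n_copies):
        dx = rng.randint(-max_shift, max_shift)
        dy = rng.randint(-max_shift, max_shift)
        copies.append(shift_image(image, dx, dy))
    return copies
```

Every original sample yields `n_copies` additional training images, which is where the extra training time and storage come from.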
Finally, a new fully convolutional neural network is proposed. In addition to being more resilient than state-of-the-art systems, this architecture is more compact and offers a lighter, more efficient alternative to recurrent networks.
Image similarity measure to detect forged documents [3]
Some of the verifications used to detect forged documents rely on image comparison methods (e.g., comparing invariant parts of the background).
This work explores neural architectures for measuring the similarity between two images. A suitable loss function makes it possible to compute a distance that is small for similar images (genuine document) and large for different images (forged document).
Two deep architectures were compared experimentally with traditional approaches based on handcrafted features, using a real-world dataset of patches extracted from identity documents. The results show that our approach outperforms the methods based on handcrafted features.
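The text does not say which loss function is used. As an illustration only, the sketch below uses the classic contrastive loss, a common choice for learning this kind of distance. The embedding vectors, the `margin` value and the function names are assumptions, not details from the paper.

```python
import math


def euclidean_distance(emb_a, emb_b):
    """Euclidean distance between two embedding vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(emb_a, emb_b)))


def contrastive_loss(emb_a, emb_b, is_similar, margin=1.0):
    """Contrastive loss for one pair of embeddings (assumed loss, see lead-in).

    Similar pairs are penalized by their squared distance, which pulls them
    together. Different pairs are penalized only while their distance is
    below `margin`, which pushes them apart.
    """
    distance = euclidean_distance(emb_a, emb_b)
    if is_similar:
        return distance ** 2
    return max(0.0, margin - distance) ** 2
```

Once a network has been trained with such a loss, a pair of patches could be classified by thresholding `euclidean_distance` on their embeddings: below the threshold suggests a genuine document, above it a forged one.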
1: Guillaume Chiron, Florian Arrestier and Ahmad Montaser Awal. Fast End-to-end Deep Learning Identity Document Detection, Classification and Cropping. International Conference on Document Analysis and Recognition (ICDAR 2021), 2021.
2: Ahmad Montaser Awal, Timothée Neitthoffer and Nabil Ghanmi. Data augmentation vs. PyraD-DCNN: a Fast, Light, and Shift Invariant FCNN for Text Recognition. 3rd Workshop on Machine Learning in ICDAR 2021.
3: Nabil Ghanmi, Cyrine Nabli and Ahmad Montaser Awal. CheckSim: a reference-based identity document verification by image similarity measure. 3rd International Workshop on Computational Document Forensics in ICDAR 2021.