In the READ team, we tackle several problems related to the segmentation, content analysis, and recognition
of document images. The challenge is to understand and exploit the information content of documents, and to index
them in forms suited to the target application. Our research themes include (but are not limited to):
- Document structure modeling
  - Application to invoice analysis: table detection and extraction
  - Use of graph representation and matching
- Document segmentation: line detection, baseline extraction, word separation, printed/handwritten separation
  - Application to form analysis, table detection and table extraction
  - Use of rule-based systems and case-based reasoning
- Document clustering
  - Application to document flow separation
  - Use of incremental, active, and semi-supervised learning
- Document learning
  - Use of deep learning, data augmentation, and intelligent annotation
  - Application to historical document analysis