Presentation

READ (REcognition of writing and Analysis of Documents), started in 1993, is a LORIA (UMR 7503) team.

Objective
In the READ team, we tackle several problems related to the segmentation, content analysis, and recognition of document images. The challenge is to understand and exploit the information content of documents, and to index them in forms suited to the target applications. Our research themes include (but are not limited to):
  • Document structure modeling
    • Application to invoice analysis: table detection and extraction
    • Use of graph representation and matching
  • Document segmentation: line detection, baseline extraction, word separation, printed/handwritten separation
    • Application to form analysis, table detection and table extraction
    • Use of rule-based systems, case-based reasoning
  • Document clustering
    • Application to document flow separation
    • Use of incremental and active learning, semi-supervised learning
  • Document learning
    • Use of deep learning, data augmentation, intelligent annotation
    • Application to historical document analysis