It is a system for form segmentation and cell classification. There are 8 classes, defined by the nature of the content (numerical or alphabetical), the orientation (vertical or horizontal), the style (upper case or lower case), and so on.
Cell Extraction
We use the Hough transform to determine the line
directions; the rows and columns are then found by following those lines
through the image. Only the contour points vote in the Hough transform.
As the figure below shows, the system can detect curved lines as well as
straight ones.
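
The voting step can be illustrated with a minimal sketch. It assumes the standard (rho, theta) line parameterization and a list of contour points already extracted from the image. The function names, the angular resolution `n_theta`, and the toy input are assumptions for illustration, not the system's actual code. The sketch covers only the straight-line voting; the line-following step that handles curved lines is not shown.

```python
import math

def hough_accumulate(contour_points, width, height, n_theta=180):
    """Vote in (rho, theta) space; only the supplied contour points vote."""
    diag = int(math.ceil(math.hypot(width, height)))
    # rho lies in [-diag, diag]; shift by diag to get a non-negative row index.
    acc = [[0] * n_theta for _ in range(2 * diag + 1)]
    thetas = [math.pi * t / n_theta for t in range(n_theta)]
    for x, y in contour_points:
        for t, theta in enumerate(thetas):
            rho = int(round(x * math.cos(theta) + y * math.sin(theta)))
            acc[rho + diag][t] += 1
    return acc, thetas, diag

def strongest_line(acc, thetas, diag):
    """Return (rho, theta, votes) for the accumulator cell with the most votes."""
    votes, r, t = max(
        ((v, r, t) for r, row in enumerate(acc) for t, v in enumerate(row)),
        key=lambda cell: cell[0],
    )
    return r - diag, thetas[t], votes

# Toy input (hypothetical): contour points of a horizontal rule at y = 5.
points = [(x, 5) for x in range(100)]
acc, thetas, diag = hough_accumulate(points, width=100, height=20)
rho, theta, votes = strongest_line(acc, thetas, diag)
```

For a horizontal rule the peak falls at theta close to pi/2, and its rho gives the row position of the rule in the image.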
The figure below shows the projection space for horizontal lines, which has the shape of butterfly wings.
Item Classification
Classification is carried out in three stages. The first separates the vertical items from the horizontal ones. The second identifies the subclasses within each category. The third makes the final decision for the numerical items, handling the problem of the zero, which is often cut.
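
The control flow of the three stages might be sketched as follows. Every callable passed in (`is_vertical`, `subclass_vertical`, `subclass_horizontal`, `decide_numerical`) and the label strings are hypothetical stand-ins; the text does not specify how each stage is implemented, nor whether the third stage applies to both orientations, so the sketch simply applies it to any item flagged as a numerical candidate.

```python
def classify_item(item, is_vertical, subclass_vertical, subclass_horizontal,
                  decide_numerical):
    """Three-stage cascade; all stage functions are hypothetical placeholders."""
    # Stage 1: separate vertical items from horizontal ones.
    orientation = "vertical" if is_vertical(item) else "horizontal"

    # Stage 2: identify the subclass within the chosen category.
    if orientation == "vertical":
        label = subclass_vertical(item)
    else:
        label = subclass_horizontal(item)

    # Stage 3: final decision for numerical candidates (e.g. a zero cut in two).
    if label == "numerical candidate":
        label = decide_numerical(item)
    return orientation, label

# Toy usage with dictionary items and trivial stage functions (illustrative only).
result = classify_item(
    {"tall": False, "digits": True},
    is_vertical=lambda it: it["tall"],
    subclass_vertical=lambda it: "upper case",
    subclass_horizontal=lambda it: "numerical candidate" if it["digits"] else "lower case",
    decide_numerical=lambda it: "numerical",
)
```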
We use a multi-layer network to classify the vertical
labels and the vertical labels in capital letters. The same type
of network is used for the horizontal labels and the numerals.
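
As an illustration of how such a network assigns a class, here is a minimal forward pass for a fully connected multi-layer network. The sigmoid activation, the layer sizes, the weights, and the two-feature input are all assumptions chosen for the example; the actual architecture and features are not described here.

```python
import math

def forward(features, layers):
    """Propagate features through fully connected sigmoid layers.

    layers is a list of (weights, biases); weights[j] is the weight row of unit j.
    """
    activation = list(features)
    for weights, biases in layers:
        activation = [
            1.0 / (1.0 + math.exp(-(sum(w * a for w, a in zip(row, activation)) + b)))
            for row, b in zip(weights, biases)
        ]
    return activation

def predict(features, layers, class_names):
    """Return the class whose output unit has the highest activation."""
    scores = forward(features, layers)
    best = max(range(len(scores)), key=scores.__getitem__)
    return class_names[best]

# Toy network (hypothetical weights): 2 inputs, 2 hidden units, 2 output units.
layers = [
    ([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0]),
    ([[2.0, -2.0], [-2.0, 2.0]], [0.0, 0.0]),
]
class_names = ["vertical label", "vertical label in capital letters"]
label = predict([3.0, -3.0], layers, class_names)
```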
Results
The colors correspond to the identified classes.
Note, for example, that all the numerical items are shown in red.
The table below gives the confusion matrix of the classes.
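
For readers unfamiliar with the term, a confusion matrix counts, for each true class (row), how many items were assigned to each predicted class (column). The sketch below shows one way to build it. The function names and the toy labels are illustrative assumptions and do not reproduce the system's results.

```python
def confusion_matrix(true_labels, predicted_labels, classes):
    """matrix[i][j] = number of items of true class i predicted as class j."""
    index = {name: i for i, name in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for truth, guess in zip(true_labels, predicted_labels):
        matrix[index[truth]][index[guess]] += 1
    return matrix

def per_class_recall(matrix):
    """Fraction of each true class that was correctly recognized."""
    return [row[i] / sum(row) if sum(row) else 0.0 for i, row in enumerate(matrix)]

# Toy labels (made up for the example, not measured data).
classes = ["numerical", "alphabetical"]
truth = ["numerical", "numerical", "numerical", "alphabetical"]
guess = ["numerical", "numerical", "alphabetical", "alphabetical"]
matrix = confusion_matrix(truth, guess, classes)
recall = per_class_recall(matrix)
```

Off-diagonal entries show which classes are confused with each other, which is where a case such as the cut zero would appear.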