This system was developed by Tunde Akindele. It is a page segmentation method for use in a document analysis system. The method cuts a document page image into polygonal blocks as well as classical rectangular blocks. The inter-column and inter-paragraph gaps are extracted as horizontal and vertical lines, and the points of intersection between these lines are treated as vertices of polygonal blocks. With the aid of the 4-connected chain code and an intersection table, simple isothetic polygonal blocks are constructed from these points of intersection. The straight line joining two points of intersection that correspond to two neighboring 1 entries in the intersection table is treated as a line segment forming one side of a polygonal block. The method tolerates skewed documents and is robust enough to produce polygonal blocks of any shape and any number of sides.

The horizontal and vertical white gaps in the page image are extracted as white rectangles by extracting and merging white segments in the image. To avoid picking up the white spaces inside character images, or those between characters, words and/or lines, white segments whose lengths or widths are less than a certain threshold are discarded.
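
The thresholding step described above can be illustrated with a minimal sketch in C. It handles only one image row and only the horizontal case; the merging of segments into white rectangles is not shown. The names `WhiteRun` and `extract_white_runs`, the pixel convention (0 for white, non-zero for black) and the caller-supplied output buffer are all assumptions made for this illustration, not details taken from the original implementation.

```c
#include <assert.h>
#include <stddef.h>

/* One horizontal white segment: columns [start, start + length). */
typedef struct {
    size_t start;
    size_t length;
} WhiteRun;

/*
 * Scan one row of a binary image (assumed convention: 0 = white,
 * non-zero = black) and record every white run that is at least
 * min_len pixels long. Shorter runs, such as gaps inside or between
 * characters, are discarded. Returns the number of runs written to
 * out, which never exceeds max_out.
 */
size_t extract_white_runs(const unsigned char *row, size_t width,
                          size_t min_len, WhiteRun *out, size_t max_out)
{
    size_t count = 0;
    size_t x = 0;

    assert(row != NULL && out != NULL);

    while (x < width && count < max_out) {
        if (row[x] != 0) {          /* black pixel: keep scanning */
            x++;
            continue;
        }
        size_t start = x;           /* first pixel of a white run */
        while (x < width && row[x] == 0) {
            x++;
        }
        if (x - start >= min_len) { /* keep only runs that are long enough */
            out[count].start = start;
            out[count].length = x - start;
            count++;
        }
    }
    return count;
}
```

Applying the same scan to image columns would give the vertical segments. The value of `min_len` would have to be tuned to the scanning resolution and the typical character and line spacing.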

Graph construction
Having extracted horizontal and vertical lines from the horizontal and
vertical white gaps in the page image, respectively, we proceed to
construct polygonal blocks from these lines. To facilitate this
construction, we use an intersection table, which we traverse with the
aid of 4-connected chain codes and a direction table.

Polygonal block extraction
Each vertex of a simple isothetic polygon is directly connected to the
vertex on its right or left and also to the vertex above or below it.
Therefore, to move from one vertex to another, one needs to move to the
right, to the left, up, or down. This has led to the use of the
4-connected chain code. A chain code is an integer from 0 to 3 that
indicates the direction in which to move from the current vertex to get
to the next one, as shown in the chain code figure. Thus a chain code of
0 indicates a movement to the right, 1 a movement up, 2 a movement to
the left, and 3 a movement down.
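
The four chain codes map naturally onto a pair of offset tables. The following is a minimal sketch in C, assuming table positions are given as (row, column) with the row index growing downward, so that "up" decreases the row. The names `Cell`, `chain_step` and `chain_opposite` are hypothetical and are introduced only for this illustration.

```c
#include <assert.h>

/* A position in a table, as (row, column). */
typedef struct {
    int row;
    int col;
} Cell;

/*
 * Offsets indexed by chain code: 0 = right, 1 = up, 2 = left, 3 = down.
 * Rows are assumed to grow downward, so moving up subtracts 1 from the row.
 */
static const int CHAIN_DROW[4] = { 0, -1,  0, 1 };
static const int CHAIN_DCOL[4] = { 1,  0, -1, 0 };

/* Return the cell reached by moving one step from `from` in direction `code`. */
Cell chain_step(Cell from, int code)
{
    Cell to;

    assert(code >= 0 && code <= 3);
    to.row = from.row + CHAIN_DROW[code];
    to.col = from.col + CHAIN_DCOL[code];
    return to;
}

/* Return the chain code of the opposite direction (0 <-> 2, 1 <-> 3). */
int chain_opposite(int code)
{
    assert(code >= 0 && code <= 3);
    return (code + 2) % 4;
}
```

With this encoding, the outline of a polygon can be stored compactly as a starting vertex followed by a sequence of chain codes, one per side.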

Polygonal block extraction
The search for blocks is done by walking through the intersection table
IT row by row. The idea is to start the search from each "1" entry in IT
and then walk through the table with the aid of the entries in the
direction table DT until the starting point is reached again, noting the
direction of movement at each change of direction. The search for a "1"
entry in any direction starts at a given point and continues in that
direction until a "1" entry is reached or the edge of the table is
reached.
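
The directional search for a "1" entry can be sketched in C as follows. This is an illustration only: it covers the single-direction search, not the full walk around a block, because the contents of DT are not specified here. The function name `find_next_one`, the row-major storage of IT as a flat `int` array, and the choice not to examine the starting cell itself are assumptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Offsets indexed by chain code: 0 = right, 1 = up, 2 = left, 3 = down. */
static const int DROW[4] = { 0, -1,  0, 1 };
static const int DCOL[4] = { 1,  0, -1, 0 };

/*
 * Starting from (row, col) in a rows x cols intersection table stored in
 * row-major order, move in direction `code` until an entry equal to 1 is
 * found or the edge of the table is passed. The starting cell itself is
 * not examined (an assumption of this sketch).
 *
 * On success, stores the position found in *out_row and *out_col and
 * returns true. Returns false if the edge is reached first.
 */
bool find_next_one(const int *it, int rows, int cols,
                   int row, int col, int code,
                   int *out_row, int *out_col)
{
    assert(it != NULL && out_row != NULL && out_col != NULL);
    assert(code >= 0 && code <= 3);

    int r = row + DROW[code];
    int c = col + DCOL[code];

    while (r >= 0 && r < rows && c >= 0 && c < cols) {
        if (it[r * cols + c] == 1) {
            *out_row = r;
            *out_col = c;
            return true;
        }
        r += DROW[code];
        c += DCOL[code];
    }
    return false;
}
```

A block tracer built on this helper would call it repeatedly, choosing the next chain code at each vertex, and stop when the position returned equals the starting entry.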

The method has been implemented in C on a Sun SPARCstation IPX under the SunView environment. It has been tested on a series of document pages taken from scientific journals and other magazines. The pages were scanned with a Hewlett-Packard ScanJet IIc scanner at a resolution of 300 dpi, producing binary images of 2550 x 3500 pixels. The method gives satisfactory results on all tested images and takes about 6 seconds per page. The figure below shows an example of a document page whose structure is mosaic-like.