Zakariyya Mughal

NAME

Image::Leptonica::Func::pageseg

VERSION

version 0.04

pageseg.c

   pageseg.c

      Top level page segmentation
          l_int32   pixGetRegionsBinary()

      Halftone region extraction
          PIX      *pixGenHalftoneMask()

      Textline extraction
          PIX      *pixGenTextlineMask()

      Textblock extraction
          PIX      *pixGenTextblockMask()

      Location of page foreground
          BOX      *pixFindPageForeground()

      Extraction of characters from image with only text
          l_int32   pixSplitIntoCharacters()
          BOXA     *pixSplitComponentWithProfile()

FUNCTIONS

pixFindPageForeground

BOX * pixFindPageForeground ( PIX *pixs, l_int32 threshold, l_int32 mindist, l_int32 erasedist, l_int32 pagenum, l_int32 showmorph, l_int32 display, const char *pdfdir )

  pixFindPageForeground()

      Input:  pixs (full resolution; any type or depth)
              threshold (for binarization; typically about 128)
              mindist (min distance of text from border to allow
                       cleaning near border; at 2x reduction, this
                       should be larger than 50; typically about 70)
              erasedist (when conditions are satisfied, erase anything
                         within this distance of the edge;
                         typically 30 at 2x reduction)
              pagenum (use for debugging when called repeatedly; labels
                       debug images that are assembled into pdfdir)
              showmorph (set to a negative integer to show steps in
                         generating masks; this is typically used
                         for debugging region extraction)
              display (set to 1 to display mask and selected region
                       for debugging a single page)
              pdfdir (subdirectory of /tmp where images showing the
                      result are placed when called repeatedly; use
                      null if no output requested)
      Return: box (region including foreground, with some pixel noise
                   removed), or null if not found

  Notes:
      (1) This doesn't simply crop to the fg.  It attempts to remove
          pixel noise and junk at the edge of the image before cropping.
          The input @threshold is used if pixs is not 1 bpp.
      (2) There are several debugging options, determined by the
          last 4 arguments.
      (3) If you want pdf output of results when called repeatedly,
          the pagenum arg labels the images written, which go into
          /tmp/<pdfdir>/<pagenum>.png.  In that case,
          you would clean out the /tmp/<pdfdir> directory before
          calling this function on each page:
              lept_rmdir(pdfdir);
              lept_mkdir(pdfdir);
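
To make the roles of the threshold and the edge distance concrete, here is a minimal standalone sketch in plain C. It is not Leptonica's implementation: the type FgRect and the function find_foreground_rect are hypothetical names introduced only for illustration, the sketch assumes an 8 bpp grayscale raster in which pixels darker than the threshold count as foreground, and it unconditionally ignores a fixed border margin instead of performing the conditional noise removal described in note (1).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical rectangle type for this sketch (not Leptonica's BOX). */
typedef struct { int x, y, w, h; } FgRect;

/*
 * Find the bounding box of foreground pixels in an 8 bpp grayscale raster,
 * ignoring a border of `margin` pixels on every side.  A pixel counts as
 * foreground when its value is below `threshold` (dark ink on a light page).
 * Returns 1 and fills *out if any foreground is found, 0 otherwise.
 */
int find_foreground_rect(const unsigned char *gray, int w, int h,
                         int threshold, int margin, FgRect *out)
{
    assert(gray != NULL && out != NULL && w > 0 && h > 0 && margin >= 0);

    int xmin = w, ymin = h, xmax = -1, ymax = -1;
    for (int y = margin; y < h - margin; y++) {
        for (int x = margin; x < w - margin; x++) {
            if (gray[(size_t)y * w + x] < threshold) {
                if (x < xmin) xmin = x;
                if (x > xmax) xmax = x;
                if (y < ymin) ymin = y;
                if (y > ymax) ymax = y;
            }
        }
    }
    if (xmax < 0)
        return 0;  /* no foreground inside the margin */

    out->x = xmin;
    out->y = ymin;
    out->w = xmax - xmin + 1;
    out->h = ymax - ymin + 1;
    return 1;
}
```

With a margin of 0 a single dark speck in a corner stretches the rectangle to that corner, while a nonzero margin leaves it out, which is the effect the erasedist argument is meant to achieve near the page border.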

pixGenHalftoneMask

PIX * pixGenHalftoneMask ( PIX *pixs, PIX **ppixtext, l_int32 *phtfound, l_int32 debug )

  pixGenHalftoneMask()

      Input:  pixs (1 bpp, assumed to be 150 to 200 ppi)
              &pixtext (<optional return> text part of pixs)
              &htfound (<optional return> 1 if the mask is not empty)
              debug (flag: 1 for debug output)
      Return: pixd (halftone mask), or null on error

pixGenTextblockMask

PIX * pixGenTextblockMask ( PIX *pixs, PIX *pixvws, l_int32 debug )

  pixGenTextblockMask()

      Input:  pixs (1 bpp, textline mask, assumed to be 150 to 200 ppi)
              pixvws (vertical white space mask)
              debug (flag: 1 for debug output)
      Return: pixd (textblock mask), or null on error

  Notes:
      (1) Both the input masks (textline and vertical white space) and
          the returned textblock mask are at the same resolution.
      (2) The result is somewhat noisy, in that small "blocks" of
          text may be included.  These can be removed by post-processing,
          using, e.g.,
             pixSelectBySize(pix, 60, 60, 4, L_SELECT_IF_EITHER,
                             L_SELECT_IF_GTE, NULL);
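
The selection rule in that call (keep a component if either its width or its height is at least 60) can be sketched on a plain array of bounding boxes. This is an illustrative standalone example, not Leptonica code: CompBox and keep_if_either_gte are hypothetical names, and the sketch filters boxes rather than connected components of an image.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bounding-box type for this sketch (not Leptonica's BOX). */
typedef struct { int x, y, w, h; } CompBox;

/*
 * Keep only the boxes whose width is >= minw OR whose height is >= minh.
 * The array is compacted in place, preserving order; the return value is
 * the number of boxes kept.
 */
size_t keep_if_either_gte(CompBox *boxes, size_t n, int minw, int minh)
{
    assert(boxes != NULL || n == 0);

    size_t kept = 0;
    for (size_t i = 0; i < n; i++) {
        if (boxes[i].w >= minw || boxes[i].h >= minh)
            boxes[kept++] = boxes[i];
    }
    return kept;
}
```

Called with minw = 60 and minh = 60, a wide but short box such as 200 x 12 survives, while a 59 x 59 box is dropped because neither dimension reaches the minimum.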

pixGenTextlineMask

PIX * pixGenTextlineMask ( PIX *pixs, PIX **ppixvws, l_int32 *ptlfound, l_int32 debug )

  pixGenTextlineMask()

      Input:  pixs (1 bpp, assumed to be 150 to 200 ppi)
              &pixvws (<return> vertical whitespace mask)
              &tlfound (<optional return> 1 if the mask is not empty)
              debug (flag: 1 for debug output)
      Return: pixd (textline mask), or null on error

  Notes:
      (1) The input pixs should be deskewed.
      (2) pixs should have no halftone pixels.
      (3) Both the input image and the returned textline mask
          are at the same resolution.

pixGetRegionsBinary

l_int32 pixGetRegionsBinary ( PIX *pixs, PIX **ppixhm, PIX **ppixtm, PIX **ppixtb, l_int32 debug )

  pixGetRegionsBinary()

      Input:  pixs (1 bpp, assumed to be 300 to 400 ppi)
              &pixhm (<optional return> halftone mask)
              &pixtm (<optional return> textline mask)
              &pixtb (<optional return> textblock mask)
              debug (flag: set to 1 for debug output)
      Return: 0 if OK, 1 on error

  Notes:
      (1) It is best to deskew the image before segmenting.
      (2) The debug flag enables a number of outputs.  These
          are included to show how to generate and save/display
          these results.

pixSplitComponentWithProfile

BOXA * pixSplitComponentWithProfile ( PIX *pixs, l_int32 delta, l_int32 mindel, PIX **ppixdebug )

  pixSplitComponentWithProfile()

      Input:  pixs (1 bpp, exactly one connected component)
              delta (distance used in extrema finding in a numa; typ. 10)
              mindel (minimum required difference between profile minimum
                      and profile values +2 and -2 away; typ. 7)
              &pixdebug (<optional return> debug image of splitting)
      Return: boxa (of c.c. after splitting), or null on error

  Notes:
      (1) This will split the most obvious cases of touching characters.
          The split points it is searching for are narrow and deep
          minima in the vertical pixel projection profile, after a
          large vertical closing has been applied to the component.
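
The idea of splitting at narrow, deep minima of a vertical projection profile can be sketched in standalone C. This is a simplified illustration under stated assumptions, not the library's algorithm: column_profile and find_split_columns are hypothetical names, the image is assumed to be stored one byte per pixel (nonzero meaning foreground), no vertical closing is applied, and the delta-based extrema search is replaced by a direct comparison against the profile values 2 columns to either side, using mindel as the required difference.

```c
#include <assert.h>
#include <stddef.h>

/* Count foreground pixels (nonzero bytes) in each column of a w x h raster. */
void column_profile(const unsigned char *bin, int w, int h, int *profile)
{
    assert(bin != NULL && profile != NULL && w > 0 && h > 0);

    for (int x = 0; x < w; x++) {
        profile[x] = 0;
        for (int y = 0; y < h; y++)
            profile[x] += bin[(size_t)y * w + x] ? 1 : 0;
    }
}

/*
 * Record each column x whose profile value is at least `mindel` below the
 * values at both x - 2 and x + 2.  Stores at most `maxsplits` columns in
 * `splits` and returns how many were stored.
 */
int find_split_columns(const int *profile, int w, int mindel,
                       int *splits, int maxsplits)
{
    assert(profile != NULL && splits != NULL && maxsplits >= 0);

    int n = 0;
    for (int x = 2; x + 2 < w && n < maxsplits; x++) {
        int v = profile[x];
        if (profile[x - 2] - v >= mindel && profile[x + 2] - v >= mindel) {
            splits[n++] = x;
            x += 2;  /* skip the next two columns so one valley is reported once */
        }
    }
    return n;
}
```

A shallow dip (for example, a profile value of 9 between neighbors of 10) is not reported when mindel is 7, which is why only the most obvious touching characters are split.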

pixSplitIntoCharacters

l_int32 pixSplitIntoCharacters ( PIX *pixs, l_int32 minw, l_int32 minh, BOXA **pboxa, PIXA **ppixa, PIX **ppixdebug )

  pixSplitIntoCharacters()

      Input:  pixs (1 bpp, contains only deskewed text)
              minw (minimum component width for initial filtering; typ. 4)
              minh (minimum component height for initial filtering; typ. 4)
              &boxa (<optional return> character bounding boxes)
              &pixa (<optional return> character images)
              &pixdebug (<optional return> showing splittings)

      Return: 0 if OK, 1 on error

  Notes:
      (1) This is a simple function that attempts to find split points
          based on vertical pixel profiles.
      (2) It should be given an image that has an arbitrary number
          of text characters.
      (3) The returned pixa includes the boxes from which the
          (possibly split) components are extracted.

AUTHOR

Zakariyya Mughal <zmughal@cpan.org>

COPYRIGHT AND LICENSE

This software is copyright (c) 2014 by Zakariyya Mughal.

This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.