OCR::Naive - convert images into text in an extremely naive fashion
The module implements a very simple and unsophisticated OCR by finding all known images in a larger image. The known images are mapped to text using the preexisting dictionary, and the text lines are returned. The interesting stuff here is the image ...
KARASIK/OCR-Naive-0.07 - 16 Feb 2009 13:17:04 UTC
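The approach described for OCR::Naive (find every known glyph image inside a larger image, then map each hit to text through a dictionary) can be illustrated with a minimal Python sketch. This is not the module's code or API: the names `matches_at`, `naive_ocr`, `DICTIONARY` and `IMAGE`, and the representation of bitmaps as lists of strings, are all assumptions made for illustration, and the sketch handles only one text line with exact pixel matches.

```python
def matches_at(image, glyph, top, left):
    """Return True if glyph matches image exactly with its top-left corner at (top, left)."""
    height, width = len(glyph), len(glyph[0])
    if top + height > len(image) or left + width > len(image[0]):
        return False
    return all(
        image[top + row][left:left + width] == glyph[row]
        for row in range(height)
    )


def naive_ocr(image, dictionary):
    """Scan a single text line left to right and emit the character of each glyph found."""
    text = []
    left = 0
    width = len(image[0])
    while left < width:
        for char, glyph in dictionary.items():
            if matches_at(image, glyph, 0, left):
                text.append(char)
                left += len(glyph[0])  # skip past the matched glyph
                break
        else:
            left += 1  # nothing matched here; move one pixel column right
    return "".join(text)


# Hypothetical 3x3 glyph bitmaps ('#' = ink, '.' = background).
DICTIONARY = {
    "I": ["###", ".#.", "###"],
    "L": ["#..", "#..", "###"],
}

# Hypothetical input image containing three glyphs separated by blank columns.
IMAGE = [
    "###.#...###",
    ".#..#....#.",
    "###.###.###",
]

print(naive_ocr(IMAGE, DICTIONARY))
```

Exact matching like this only works when the input uses the same rendering as the dictionary glyphs, which is why the approach is called naive.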
OCR::PerfectCR - Perfect OCR (if you have perfect input).
OCR::PerfectCR is a fast, highly accurate "optical" character recognition engine requiring minimal training. How does it manage this, despite being written in pure perl? By ignoring most of the problems. OCR::PerfectCR requires that your input is in ...
JMASTROS/OCR-PerfectCR-0.03 - 06 Dec 2005 14:53:15 UTC
PDF::OCR - DEPRECATED get ocr and images out of a pdf file
Lets you get text out of pages in pdf documents. The whole process does not change your original pdf in any way. Please note this only gets text out of images inside the pdf file; it does not check for genuine text inside the file, if any. For t...
LEOCHARRE/PDF-OCR-1.11 - 20 Apr 2009 13:01:05 UTC
PDF::OCR::Thorough - DEPRECATED extract text from pdf document resorting to ocr as needed
Unlike PDF::OCR, which assumes each page in the pdf document is a page scan, this script is more "thorough". How it works: 1) The original.pdf is copied to tmp.pdf 2) tmp.pdf is split into page1.pdf, page2.pdf, etc. 3) For each pageX.pdf, first we try r...
LEOCHARRE/PDF-OCR-1.11 - 20 Apr 2009 13:01:05 UTC
PDF::OCR::Thorough::Cached - DEPRECATED save ocr to text file for easy retrieval
This is just like PDF::OCR::Thorough, only the text is saved to a text file, so subsequent retrievals are snap quick. This inherits all the methods of PDF::OCR::Thorough. $PDF::OCR::Thorough::Cached::ABS_CACHE_DIR: Directory that will be the cache. The ...
LEOCHARRE/PDF-OCR-1.11 - 20 Apr 2009 13:01:05 UTC
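The caching idea behind PDF::OCR::Thorough::Cached (save the extracted text to a file so later retrievals skip the OCR step) might look roughly like the Python sketch below. It is an illustration only, not the module's implementation: `cached_text`, its parameters, and the scheme of naming cache files by a SHA-1 hash of the absolute path are assumptions, and `extract` stands in for whatever slow extraction routine is being cached.

```python
import hashlib
import os


def cached_text(pdf_path, extract, cache_dir):
    """Return the text for pdf_path, running extract() only on a cache miss."""
    # Assumed key scheme: hash of the absolute path (not taken from the module).
    key = hashlib.sha1(os.path.abspath(pdf_path).encode("utf-8")).hexdigest()
    cache_file = os.path.join(cache_dir, key + ".txt")

    if os.path.exists(cache_file):
        # Cache hit: read the saved text instead of repeating the slow step.
        with open(cache_file, encoding="utf-8") as fh:
            return fh.read()

    # Cache miss: do the slow extraction once and save the result.
    text = extract(pdf_path)
    os.makedirs(cache_dir, exist_ok=True)
    with open(cache_file, "w", encoding="utf-8") as fh:
        fh.write(text)
    return text
```

A key based only on the path does not notice when the pdf changes; a real cache would probably also consider the file's modification time or contents.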