NAME
Image::Leptonica::Func::dewarp1
VERSION
version 0.04
dewarp1.c
dewarp1.c
Basic operations and serialization

  Create/destroy dewarp
      L_DEWARP       *dewarpCreate()
      L_DEWARP       *dewarpCreateRef()
      void            dewarpDestroy()

  Create/destroy dewarpa
      L_DEWARPA      *dewarpaCreate()
      L_DEWARPA      *dewarpaCreateFromPixacomp()
      void            dewarpaDestroy()
      l_int32         dewarpaDestroyDewarp()

  Dewarpa insertion/extraction
      l_int32         dewarpaInsertDewarp()
      static l_int32  dewarpaExtendArraysToSize()
      L_DEWARP       *dewarpaGetDewarp()

  Setting parameters to control rendering from the model
      l_int32         dewarpaSetCurvatures()
      l_int32         dewarpaUseBothArrays()
      l_int32         dewarpaSetMaxDistance()

  Dewarp serialized I/O
      L_DEWARP       *dewarpRead()
      L_DEWARP       *dewarpReadStream()
      l_int32         dewarpWrite()
      l_int32         dewarpWriteStream()

  Dewarpa serialized I/O
      L_DEWARPA      *dewarpaRead()
      L_DEWARPA      *dewarpaReadStream()
      l_int32         dewarpaWrite()
      l_int32         dewarpaWriteStream()

Examples of usage
=================
See dewarpaCreateFromPixacomp() for an example of the basic
operations, starting from a set of 1 bpp images.

Basic functioning to dewarp a specific single page:

    // Make the Dewarpa for the pages
    L_DEWARPA *dewa = dewarpaCreate(1, 30, 1, 15, 50);
    dewarpaSetCurvatures(dewa, -1, 5, -1, -1, -1);
    dewarpaUseBothArrays(dewa, 1);  // try to use both disparity
                                    // arrays for this example

    // Do the page: start with a binarized image.
    // "binarize" is a placeholder for your own binarization step.
    PIX *pixb = "binarize"(pixs);
    // Initialize a Dewarp for this page (say, page 214)
    L_DEWARP *dew = dewarpCreate(pixb, 214);
    // Insert in Dewarpa and obtain parameters for building the model
    dewarpaInsertDewarp(dewa, dew);
    // Do the work
    dewarpBuildPageModel(dew, NULL);  // no debugging
    // Optionally set rendering parameters
    // Apply model to the input pixs
    PIX *pixd;
    dewarpaApplyDisparity(dewa, 214, pixs, 255, 0, 0, &pixd, NULL);
    pixDestroy(&pixb);

Basic functioning to dewarp many pages:

    // Make the Dewarpa for the set of pages
    L_DEWARPA *dewa = dewarpaCreate(10, 30, 1, 15, 50);
    // Optionally set rendering parameters
    dewarpaSetCurvatures(dewa, -1, 10, -1, -1, -1);
    dewarpaUseBothArrays(dewa, 0);  // just use the vertical disparity
                                    // array for this example

    // Do first page: start with a binarized image.
    // "binarize" is a placeholder for your own binarization step.
    PIX *pixb = "binarize"(pixs);
    // Initialize a Dewarp for this page (say, page 1)
    L_DEWARP *dew = dewarpCreate(pixb, 1);
    // Insert in Dewarpa and obtain parameters for building the model
    dewarpaInsertDewarp(dewa, dew);
    // Do the work
    dewarpBuildPageModel(dew, NULL);  // no debugging
    dewarpMinimize(dew);  // remove most heap storage
    pixDestroy(&pixb);

    // Do the other pages the same way
    ...

    // Apply models to each page; if the page model is invalid,
    // try to use a valid neighboring model.  Note that the call
    // to dewarpaInsertRefModels() is optional, because it is called
    // by dewarpaApplyDisparity() on the first page it acts on.
    dewarpaInsertRefModels(dewa, 0, 1);  // use debug flag to get more
                        // detailed information about the page models
    [For each page, where pixs is the fullres image to be dewarped] {
        L_DEWARP *dew = dewarpaGetDewarp(dewa, pageno);
        if (dew) {  // disparity model exists
            PIX *pixd;
            dewarpaApplyDisparity(dewa, pageno, pixs, 255,
                                  0, 0, &pixd, NULL);
            dewarpMinimize(dew);  // clean out the pix and fpix arrays
            // Squirrel pixd away somewhere ...
        }
    }

Basic functioning to dewarp a small set of pages, potentially
using models from nearby pages:

    // (1) Generate a set of binarized images in the vicinity of the
    // pages to be dewarped.  We will attempt to compute models
    // for pages from 'firstpage' to 'lastpage'.
    // Store the binarized images in a compressed array of
    // size 'n', where 'n' is the number of images to be stored,
    // and where the offset is the first page.
    PIXAC *pixac = pixacompCreateInitialized(n, firstpage, NULL,
                                             IFF_TIFF_G4);
    for (i = firstpage; i <= lastpage; i++) {
        // "binarize" is a placeholder for your own binarization step,
        // applied to the image pixs for page i.
        PIX *pixb = "binarize"(pixs);
        pixacompReplacePix(pixac, i, pixb, IFF_TIFF_G4);
        pixDestroy(&pixb);
    }

    // (2) Make the Dewarpa for the pages.  The second argument
    // (useboth = 1) asks for both disparity arrays in this example.
    L_DEWARPA *dewa =
          dewarpaCreateFromPixacomp(pixac, 1, 30, 15, 20);

    // (3) Finally, apply the models.  For page 'firstpage' with
    // image pixs:
    L_DEWARP *dew = dewarpaGetDewarp(dewa, firstpage);
    if (dew) {  // disparity model exists
        PIX *pixd;
        dewarpaApplyDisparity(dewa, firstpage, pixs, 255, 0, 0,
                              &pixd, NULL);
        dewarpMinimize(dew);
    }

Because in general some pages will not have enough text to build a
model, we fill in for those pages with a reference to the page
model to use.  Both the target page and the reference page must
have the same parity.  The reference can be either a partial model
(with only vertical disparity) or the full model of a nearby page.

Minimizing the data in a model by stripping out images,
numas, and full resolution disparity arrays:

    dewarpMinimize(dew);

This can be done at any time to save memory.  Serialization does
not use the data that is stripped.

You can apply any model (in a dew), stripped or not, to another image:

    // For all pages with invalid models, assign the nearest valid
    // page model with same parity.
    dewarpaInsertRefModels(dewa, 0, 0);
    // You can then apply to 'newpix' the page model that was assigned
    // to 'pageno', giving the result in pixd:
    PIX *pixd;
    dewarpaApplyDisparity(dewa, pageno, newpix, 255, 0, 0,
                          &pixd, NULL);

You can apply the disparity arrays to a deliberately undercropped
image.  Suppose that you undercrop by (left, right, top, bot), so
that the disparity arrays are aligned with their origin at (left, top).
Dewarp the undercropped image with:

    PIX *pixd;
    dewarpaApplyDisparity(dewa, pageno, undercropped_pix, 255,
                          left, top, &pixd, NULL);

Description of the approach to analyzing page image distortion
==============================================================
When a book page is scanned, there are several possible causes
for the text lines to appear to be curved:

 (1) A barrel (fish-eye) effect because the camera is at
     a finite distance from the page.  Take the normal from
     the camera to the page (the 'optic axis').  Lines on
     the page "below" this point will appear to curve upward
     (negative curvature); lines "above" this will curve downward.
 (2) Radial distortion from the camera lens.  Probably not
     a big factor.
 (3) Local curvature of the page into (or out of) the image
     plane (which is perpendicular to the optic axis).
     This has no effect if the page is flat.

In the following, the optic axis is in the z direction and is
perpendicular to the xy plane; the book is assumed to be aligned
so that y is approximately along the binding.
The goal is to compute the "disparity" field, D(x,y), which
is actually a vector composed of the horizontal and vertical
disparity fields H(x,y) and V(x,y).  Each of these is a local
function that gives the amount each point in the image is
required to move in order to rectify the horizontal and vertical
lines.  It would also be nice to "flatten" the page to compensate
for effect (3), foreshortening due to bending of the page into
the z direction, but that is more difficult.

Effects (1) and (2) can be directly compensated by calibrating
the scene, using a flat page with horizontal and vertical lines.
Then H(x,y) and V(x,y) can be found as two (non-parametric) arrays
of values.  Suppose this has been done.  Then the remaining
distortion is due to (3).

We consider the simple situation where the page bending is independent
of y, and is described by alpha(x), where alpha is the angle between
the normal to the page and the optic axis.  cos(alpha(x)) is the local
compression factor of the page image in the horizontal direction, at x.
Thus, if we know alpha(x), we can compute the disparity H(x) required
to flatten the image by simply integrating 1/cos(alpha), and we could
compute the remaining disparities, H(x,y) and V(x,y), from the
page content, as described below.  Unfortunately, we don't know
alpha.  What do we know?  If there are horizontal text lines
on the page, we can compute the vertical disparity, V(x,y), which
is the local translation required to make the text lines parallel
to the rasters.  If the margins are left and right aligned, we can
also estimate the horizontal disparity, H(x,y), required to have
uniform margins.  All that can be done from the image alone,
assuming we have text lines covering a sufficient part of the page.
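
A minimal sketch of the integration of 1/cos(alpha) described above, assuming the local compression factor cos(alpha(x)) were somehow known at every pixel column (in practice it is not, as just noted). The function name, the trapezoid rule, and the choice to report the disparity as "flattened position minus original position" are assumptions made for illustration; none of this is part of Leptonica.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not a Leptonica function).
 *   compress[i]: local horizontal compression cos(alpha) at column i,
 *                required to lie in (0, 1]
 *   hdisp[i]:    receives the integral from 0 to i of (1/compress - 1),
 *                i.e. how far column i moves when the page is flattened
 * Uses the trapezoid rule with unit pixel spacing.
 * Returns 0 if OK, 1 on error.
 */
int horizDisparityFromCompression(const double *compress, size_t n,
                                  double *hdisp)
{
    size_t i;
    double prev, cur;

    if (!compress || !hdisp || n == 0) return 1;
    for (i = 0; i < n; i++)
        if (compress[i] <= 0.0 || compress[i] > 1.0) return 1;

    hdisp[0] = 0.0;  /* the left edge is the fixed reference */
    prev = 1.0 / compress[0] - 1.0;
    for (i = 1; i < n; i++) {
        cur = 1.0 / compress[i] - 1.0;
        hdisp[i] = hdisp[i - 1] + 0.5 * (prev + cur);
        /* Each increment is non-negative, so the running sum never drops */
        assert(hdisp[i] >= hdisp[i - 1]);
        prev = cur;
    }
    return 0;
}
```

For a flat page (all factors equal to 1) every disparity is 0; for a page tilted uniformly so that the factor is 0.5, each column moves outward by its own distance from the left edge.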

What about alpha(x)?  The basic question relating to (3) is this:

    Is it possible, using the shape of the text lines alone,
    to compute both the vertical and horizontal disparity fields?

The underlying problem is to separate the line curvature effects due
to the camera view from those due to actual bending of the page.
I believe the proper way to do this is to make some measurements
based on the camera setup, which will depend mostly on the distance
of the camera from the page, and to a smaller extent on the location
of the optic axis with respect to the page.

Here is the procedure.  Photograph a page with a fine 2D line grid
several times, each with a different slope near the binding.
This can be done by placing the grid page on books that have
different shapes z(x) near the binding.  For each one you can
measure, near the binding:

 (1) ds/dy, the vertical rate of change of slope of the horizontal lines
 (2) the local horizontal compression of the vertical lines due
     to the page angle dz/dx.

As mentioned above, the local horizontal compression is simply
cos(dz/dx).  But the measurement you can make on an actual book
page is (1).  The difficulty is to generate (2) from (1).

Back to the procedure.  The function in (1), ds/dy, likely needs
to be measured at a few y locations, because the relation
between (1) and (2) may weakly depend on the y-location with
respect to the y-coordinate of the optic axis of the camera.
From these measurements you can determine, for the camera setup
that you have, the local horizontal compression, cos(dz/dx), as a
function of both the vertical location (y) and your measured vertical
derivative of the text line slope there, ds/dy.  Then with
appropriate smoothing of your measured values, you can set up a
horizontal disparity array to correct for the compression due
to dz/dx.

Now consider V(x,0) and V(x,h), the vertical disparity along
the top and bottom of the image.  With a little thought you
can convince yourself that the local foreshortening,
as a function of x, is proportional to the difference
between the slope of V(x,0) and V(x,h).  The horizontal
disparity can then be computed by integrating the local foreshortening
over x.  Integration of the slope of V(x,0) and V(x,h) gives
the vertical disparity itself.  We have to normalize to h, the
height of the page.  So the very simple result is that

    H(x) ~ (V(x,0) - V(x,h)) / h         [1]

which is easily computed.  There is a proportionality constant
that depends on the ratio of h to the distance to the camera.
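
As an illustration of relation [1], here is a small sketch that builds H(x) from the vertical disparity sampled along the top and bottom rows. The function name and the `scale` argument (standing in for the proportionality constant, which the text does not give) are assumptions, not Leptonica API. The sketch also subtracts the difference at x = 0 so that H(0) = 0; this reduces to [1] exactly when the top and bottom disparities agree at the left edge.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not a Leptonica function).
 *   vtop[x]: vertical disparity V(x,0) along the top row
 *   vbot[x]: vertical disparity V(x,h) along the bottom row
 *   w:       number of columns
 *   h:       page height in pixels (> 0)
 *   scale:   caller-supplied proportionality constant (assumed known)
 *   hdisp:   receives H(x), normalized so that H(0) = 0
 * Returns 0 if OK, 1 on error.
 */
int horizDisparityFromVert(const double *vtop, const double *vbot, size_t w,
                           double h, double scale, double *hdisp)
{
    size_t x;
    double d0;

    if (!vtop || !vbot || !hdisp || w == 0 || h <= 0.0) return 1;

    d0 = vtop[0] - vbot[0];  /* offset removed so that H(0) = 0 */
    for (x = 0; x < w; x++)
        hdisp[x] = scale * ((vtop[x] - vbot[x]) - d0) / h;
    return 0;
}
```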

Can we actually believe this for the case where the bending
is independent of y?  I believe the answer is yes,
as long as you first remove the apparent distortion due
to the camera being at a finite distance.

If you know the intersection of the optical axis with the page
and the distance to the camera, and if the page is perpendicular
to the optic axis, you can compute the horizontal and vertical
disparities due to (1) and (2) and remove them.  The resulting
distortion should be entirely due to bending (3), for which
the relation

    Hx(x) dx = C * ((Vx(x,0) - Vx(x,h)) / h) dx         [2]

holds for each point in x (Hx and Vx are partial derivatives w/rt x).
Integrating over x, and using H(0) = 0, we get the result [1].

I believe this result holds differentially for each value of y, so
that in the case where the bending is not independent of y,
the expression (V(x,0) - V(x,h)) / h goes over to Vy(x,y).  Then

    H(x,y) = Integral(0,x) (Vyx(x,y) dx)         [3]

where Vyx() is the partial derivative of V w/rt both x and y.

It would be nice if there were a simple mathematical relation between
the horizontal and vertical disparities for the situation
where the paper bends without stretching or kinking.
I had hoped to get a relation between H and V, such as
Hx(x,y) ~ Vy(x,y), which would imply that H and V are real
and imaginary parts of a complex potential, each of which
satisfies the Laplace equation.  But then the gradients of the
two potentials would be normal, and that does not appear to be the case.
Thus, the questions of proving the relations above (for small bending),
or finding a simpler relation between H and V than those equations,
remain open.  So far, we have only used [1] for the horizontal
disparity H(x).

In the version of the code that follows, we first use text lines
to find V(x,y).  Then, we try to compute H(x,y) that will align
the text vertically on the left and right margins.  This is not
always possible; sometimes the right margin is not right justified.
By default, we don't require the horizontal disparity to have a
valid page model for dewarping a page, but this requirement can
be forced using dewarpaUseFullModel().

As described above, one can add a y-independent component of
the horizontal disparity H(x) to counter the foreshortening
effect due to the bending of the page near the binding.
This requires widening the image on the side near the binding,
and we do not provide this option here.  However, we do provide
a function that will generate this disparity field:

    fpixExtraHorizDisparity()

Here is the basic outline for building the disparity arrays.

 (1) Find lines going approximately through the center of the
     text in each text line.  Accept only lines that are
     close in length to the longest line.
 (2) Use these lines to generate a regular and highly subsampled
     vertical disparity field V(x,y).
 (3) Interpolate this to generate a full resolution vertical
     disparity field.
 (4) For lines that are sufficiently long, determine if the lines
     are left and right-justified, and if so, construct a highly
     subsampled horizontal disparity field H(x,y) that will bring
     them into alignment.
 (5) Interpolate this to generate a full resolution horizontal
     disparity field.
 (6) Apply the vertical dewarping, followed by the horizontal dewarping.

Step (1) is clearly described by the code in pixGetTextlineCenters().

Steps (2) and (3) follow directly from the data in step (1),
and constitute the bulk of the work done in dewarpBuildPageModel().
Virtually all the noise in the data is smoothed out by doing
least-squares quadratic fits, first horizontally to the data
points representing the text line centers, and then vertically.
The trick is to sample these lines on a regular grid.
First each horizontal line is sampled at equally spaced
intervals horizontally.  We thus get a set of points,
one in each line, that are vertically aligned, and
the data we represent is the vertical distance of each point
from the min or max value on the curve, depending on the
sign of the curvature component.  Each of these vertically
aligned sets of points constitutes a sampled vertical disparity,
and we do a LS quartic fit to each of them, followed by
vertical sampling at regular intervals.  We now have a subsampled
grid of points, all equally spaced, giving at each point the local
vertical disparity.  Finally, the full resolution vertical disparity
is formed by interpolation.  All the least-squares fits do a
great job of smoothing everything out, as can be observed by
the contour maps that are generated for the vertical disparity field.
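
To make the horizontal fitting step concrete, here is a self-contained sketch of a least-squares quadratic fit y = a*x^2 + b*x + c to a set of text line center points. It solves the 3x3 normal equations with Cramer's rule. The function name and signature are assumptions chosen for this example, and it is not a copy of the Leptonica implementation; it only shows the kind of fit the text refers to.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not a Leptonica function).
 * Least-squares fit of y = a*x^2 + b*x + c to n points (n >= 3).
 * The coefficient a is the "curvature" of the fitted line.
 * Returns 0 if OK, 1 on error (bad input or singular system).
 */
int fitQuadraticLS(const double *x, const double *y, size_t n,
                   double *pa, double *pb, double *pc)
{
    size_t i;
    double sx = 0, sx2 = 0, sx3 = 0, sx4 = 0, sy = 0, sxy = 0, sx2y = 0;
    double dn, det;

    if (!x || !y || !pa || !pb || !pc || n < 3) return 1;

    /* Accumulate the moments that appear in the normal equations */
    for (i = 0; i < n; i++) {
        double xi = x[i], x2 = xi * xi;
        sx += xi;  sx2 += x2;  sx3 += x2 * xi;  sx4 += x2 * x2;
        sy += y[i];  sxy += xi * y[i];  sx2y += x2 * y[i];
    }
    dn = (double)n;

    /* Normal equations:
     *   a*sx4 + b*sx3 + c*sx2 = sx2y
     *   a*sx3 + b*sx2 + c*sx  = sxy
     *   a*sx2 + b*sx  + c*n   = sy                         */
    det = sx4 * (sx2 * dn - sx * sx)
        - sx3 * (sx3 * dn - sx * sx2)
        + sx2 * (sx3 * sx - sx2 * sx2);
    if (det == 0.0) return 1;

    *pa = (sx2y * (sx2 * dn - sx * sx)
         - sx3 * (sxy * dn - sx * sy)
         + sx2 * (sxy * sx - sx2 * sy)) / det;
    *pb = (sx4 * (sxy * dn - sx * sy)
         - sx2y * (sx3 * dn - sx * sx2)
         + sx2 * (sx3 * sy - sx2 * sxy)) / det;
    *pc = (sx4 * (sx2 * sy - sx * sxy)
         - sx3 * (sx3 * sy - sxy * sx2)
         + sx2y * (sx3 * sx - sx2 * sx2)) / det;
    return 0;
}
```

Cramer's rule is used here only because it keeps the example short; for pixel coordinates in the thousands, a fit on centered or scaled x values would be numerically safer.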
FUNCTIONS
dewarpCreate
L_DEWARP * dewarpCreate ( PIX *pixs, l_int32 pageno )
dewarpCreate()
Input: pixs (1 bpp)
       pageno (page number)
Return: dew (or null on error)
Notes:
    (1) The input pixs is either full resolution or 2x reduced.
    (2) The page number is typically 0-based.  If scanned from a book,
        the even pages are usually on the left.  Disparity arrays
        built for even pages should only be applied to even pages.
dewarpCreateRef
L_DEWARP * dewarpCreateRef ( l_int32 pageno, l_int32 refpage )
dewarpCreateRef()
Input: pageno (this page number)
       refpage (page number of dewarp disparity arrays to be used)
Return: dew (or null on error)
Notes:
    (1) This specifies which dewarp struct should be used for the
        given page.  It is placed in dewarpa for pages for which
        no model can be built.
    (2) This page and the reference page have the same parity and
        the reference page is the closest page with a disparity model
        to this page.
dewarpDestroy
void dewarpDestroy ( L_DEWARP **pdew )
dewarpDestroy()
Input: &dew (<will be set to null before returning>)
Return: void
dewarpRead
L_DEWARP * dewarpRead ( const char *filename )
dewarpRead()
Input: filename
Return: dew, or null on error
dewarpReadStream
L_DEWARP * dewarpReadStream ( FILE *fp )
dewarpReadStream()
Input: stream
Return: dew, or null on error
Notes:
    (1) The dewarp struct is stored in minimized format, with only
        subsampled disparity arrays.
    (2) The sampling and extra horizontal disparity parameters are
        stored here.  During generation of the dewarp struct, they
        are passed in from the dewarpa.  In readback, it is assumed
        that they are (a) the same for each page and (b) the same
        as the values used to create the dewarpa.
dewarpWrite
l_int32 dewarpWrite ( const char *filename, L_DEWARP *dew )
dewarpWrite()
Input: filename
       dew
Return: 0 if OK, 1 on error
dewarpWriteStream
l_int32 dewarpWriteStream ( FILE *fp, L_DEWARP *dew )
dewarpWriteStream()
Input: stream (opened for "wb")
       dew
Return: 0 if OK, 1 on error
Notes:
    (1) This should not be written if there is no sampled
        vertical disparity array, which means that no model has
        been built for this page.
dewarpaCreate
L_DEWARPA * dewarpaCreate ( l_int32 nptrs, l_int32 sampling, l_int32 redfactor, l_int32 minlines, l_int32 maxdist )
dewarpaCreate()
Input: nptrs (number of dewarp page ptrs; typically the number of pages)
       sampling (use 0 for default value; the minimum allowed is 8)
       redfactor (of input images: 1 is full resolution; 2 is 2x reduced)
       minlines (minimum number of lines to accept; use 0 for default)
       maxdist (for locating reference disparity; use -1 for default)
Return: dewa (or null on error)
Notes:
    (1) The sampling, minlines and maxdist parameters will be
        applied to all images.
    (2) The sampling factor is used for generating the disparity arrays
        from the input image.  For 2x reduced input, use a sampling
        factor that is half the sampling you want on the full resolution
        images.
    (3) Use @redfactor = 1 for full resolution; 2 for 2x reduction.
        All input images must be at one of these two resolutions.
    (4) @minlines is the minimum number of nearly full-length lines
        required to generate a vertical disparity array.  The default
        number is 15.  Use a smaller number to accept a questionable
        array, but not smaller than 4.
    (5) When a model can't be built for a page, the dewarpa looks up to
        @maxdist in either direction for a valid model with the
        same page parity.  Use -1 for the default value of @maxdist;
        use 0 to avoid using a ref model.
    (6) The ptr array is expanded as necessary to accommodate page images.
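
The lookup in note (5) can be pictured with the following sketch, which searches outward from a page for the nearest page of the same parity that has a valid model. The function name, the `hasmodel` flag array, and the tie-break (the lower page wins when two candidates are equally close) are assumptions made for illustration; the source does not specify how ties are resolved, and this is not Leptonica code.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not a Leptonica function).
 *   hasmodel[i]: nonzero if page i has a valid model of its own
 *   npages:      number of entries in hasmodel
 *   pageno:      page that needs a reference model
 *   maxdist:     largest page distance to search (0 disables the search)
 * Returns the reference page number, or -1 if none is found.
 */
int findRefPage(const int *hasmodel, int npages, int pageno, int maxdist)
{
    int dist;

    if (!hasmodel || pageno < 0 || pageno >= npages || maxdist < 0)
        return -1;

    /* Stepping by 2 visits only pages with the same parity as pageno */
    for (dist = 2; dist <= maxdist; dist += 2) {
        int lo = pageno - dist;
        int hi = pageno + dist;
        if (lo >= 0 && hasmodel[lo]) return lo;      /* assumed tie-break */
        if (hi < npages && hasmodel[hi]) return hi;
    }
    return -1;
}
```

With models on pages 0, 3 and 6 of a 7-page set, page 2 would be assigned page 0, page 4 would be assigned page 6, and page 5 would be assigned page 3.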
dewarpaCreateFromPixacomp
L_DEWARPA * dewarpaCreateFromPixacomp ( PIXAC *pixac, l_int32 useboth, l_int32 sampling, l_int32 minlines, l_int32 maxdist )
dewarpaCreateFromPixacomp()
Input: pixac (pixacomp of G4, 1 bpp images; with 1x1x1 placeholders)
       useboth (0 for vert disparity; 1 for both vert and horiz)
       sampling (use -1 or 0 for default value; otherwise minimum of 5)
       minlines (minimum number of lines to accept; e.g., 10)
       maxdist (for locating reference disparity; use -1 for default)
Return: dewa (or null on error)
Notes:
    (1) The returned dewa has disparity arrays calculated and
        is ready for serialization or for use in dewarping.
    (2) The sampling, minlines and maxdist parameters are
        applied to all images.  See notes in dewarpaCreate() for details.
    (3) The pixac is full.  Placeholders, if any, are w=h=d=1 images,
        and the real input images are 1 bpp at full resolution.
        They are assumed to be cropped to the actual page regions,
        and may be arbitrarily sparse in the array.
    (4) The output dewarpa is indexed by the page number.
        The offset in the pixac gives the mapping between the
        array index in the pixac and the page number.
    (5) This adds the ref page models.
    (6) This can be used to make models for any desired set of pages.
        The direct models are only made for pages with images in
        the pixacomp; the ref models are made for pages of the
        same parity within @maxdist of the nearest direct model.
dewarpaDestroy
void dewarpaDestroy ( L_DEWARPA **pdewa )
dewarpaDestroy()
Input: &dewa (<will be set to null before returning>)
Return: void
dewarpaDestroyDewarp
l_int32 dewarpaDestroyDewarp ( L_DEWARPA *dewa, l_int32 pageno )
dewarpaDestroyDewarp()
Input: dewa
       pageno (of dew to be destroyed)
Return: 0 if OK, 1 on error
dewarpaGetDewarp
L_DEWARP * dewarpaGetDewarp ( L_DEWARPA *dewa, l_int32 index )
dewarpaGetDewarp()
Input: dewa (populated with dewarp structs for pages)
       index (into dewa: this is the pageno)
Return: dew (handle; still owned by dewa), or null on error
dewarpaInsertDewarp
l_int32 dewarpaInsertDewarp ( L_DEWARPA *dewa, L_DEWARP *dew )
dewarpaInsertDewarp()
Input: dewarpa
       dewarp (to be added)
Return: 0 if OK, 1 on error
Notes:
    (1) This inserts the dewarp into the array, which now owns it.
        It also keeps track of the largest page number stored.
        It must be done before the disparity model is built.
    (2) Note that this differs from the usual method of filling out
        arrays in leptonica, where the arrays are compact and
        new elements are typically added to the end.  Here,
        the dewarp can be added anywhere, even beyond the initial
        allocation.
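
The insertion behaviour in notes (1) and (2) resembles a sparse, page-indexed pointer array that grows on demand. The sketch below shows one way such a container could work; the `PageArray` type, its function names, and the doubling growth policy are all assumptions made for this example and do not reflect the actual L_DEWARPA layout.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical page-indexed container (not the Leptonica struct) */
typedef struct {
    void **slot;     /* slot[pageno] is the item for that page, or NULL */
    int    nalloc;   /* number of allocated slots */
    int    maxpage;  /* largest page number stored, or -1 if empty */
} PageArray;

/* Allocates nalloc empty slots.  Returns 0 if OK, 1 on error. */
int pageArrayInit(PageArray *pa, int nalloc)
{
    if (!pa || nalloc <= 0) return 1;
    pa->slot = calloc((size_t)nalloc, sizeof(void *));
    if (!pa->slot) return 1;
    pa->nalloc = nalloc;
    pa->maxpage = -1;
    return 0;
}

/* Stores item at index pageno, growing the array if pageno lies
 * beyond the current allocation.  Returns 0 if OK, 1 on error. */
int pageArrayInsert(PageArray *pa, int pageno, void *item)
{
    int i;

    if (!pa || !pa->slot || pageno < 0 || !item) return 1;

    if (pageno >= pa->nalloc) {
        int newsize = 2 * pa->nalloc;      /* assumed growth policy */
        void **p;
        if (newsize <= pageno) newsize = pageno + 1;
        p = realloc(pa->slot, (size_t)newsize * sizeof(void *));
        if (!p) return 1;
        for (i = pa->nalloc; i < newsize; i++)
            p[i] = NULL;                   /* new slots start empty */
        pa->slot = p;
        pa->nalloc = newsize;
    }
    assert(pageno < pa->nalloc);

    pa->slot[pageno] = item;
    if (pageno > pa->maxpage) pa->maxpage = pageno;
    return 0;
}
```

In this sketch, inserting at page 214 into an array created with two slots grows the allocation to at least 215 slots and records 214 as the largest page stored. A real container that owns its items would also need to free any item it replaces.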
dewarpaRead
L_DEWARPA * dewarpaRead ( const char *filename )
dewarpaRead()
Input: filename
Return: dewa, or null on error
dewarpaReadStream
L_DEWARPA * dewarpaReadStream ( FILE *fp )
dewarpaReadStream()
Input: stream
Return: dewa, or null on error
Notes:
    (1) The serialized dewarpa contains a Numa that gives the
        (increasing) page number of the dewarp structs that are
        contained.
    (2) Reference pages are added in after readback.
dewarpaSetCurvatures
l_int32 dewarpaSetCurvatures ( L_DEWARPA *dewa, l_int32 max_linecurv, l_int32 min_diff_linecurv, l_int32 max_diff_linecurv, l_int32 max_edgecurv, l_int32 max_diff_edgecurv )
dewarpaSetCurvatures()
Input: dewa
       max_linecurv (-1 for default)
       min_diff_linecurv (-1 for default; 0 to accept all models)
       max_diff_linecurv (-1 for default)
       max_edgecurv (-1 for default)
       max_diff_edgecurv (-1 for default)
Return: 0 if OK, 1 on error
Notes:
    (1) Approximating the line by a quadratic, the coefficient
        of the quadratic term is the curvature, and distance
        units are in pixels (of course).  The curvature is very
        small, so we multiply by 10^6 and express the constraints
        on the model curvatures in micro-units.
    (2) This sets five curvature thresholds:
        * the maximum absolute value of the vertical disparity
          line curvatures
        * the minimum absolute value of the largest difference in
          vertical disparity line curvatures (Use a value of 0
          to accept all models.)
        * the maximum absolute value of the largest difference in
          vertical disparity line curvatures
        * the maximum absolute value of the left and right edge
          curvature for the horizontal disparity
        * the maximum absolute value of the difference between
          left and right edge curvature for the horizontal disparity
        all in micro-units, for dewarping to take place.
        Use -1 for default values.
    (3) An image with a line curvature less than about 0.00001
        has fairly straight textlines.  This is 10 micro-units.
    (4) For example, if @max_linecurv == 100, this would prevent dewarping
        if any of the lines has a curvature exceeding 100 micro-units.
        A model having maximum line curvature larger than about 150
        micro-units should probably not be used.
    (5) A model having a left or right edge curvature larger than
        about 100 micro-units should probably not be used.
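
The micro-unit convention in notes (1), (3) and (4) can be illustrated with a short sketch. It converts a quadratic coefficient to micro-units and checks a set of line curvatures against a maximum, in the spirit of the @max_linecurv test. Both function names are hypothetical and the rounding rule is an assumption; this is not the check that Leptonica itself performs.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: converts a quadratic coefficient (pixel units)
 * to micro-units by multiplying by 10^6, rounding to the nearest
 * integer (rounding mode is an assumption of this sketch). */
int curvatureToMicroUnits(double coeff)
{
    double scaled = coeff * 1.0e6;
    return (int)(scaled >= 0.0 ? scaled + 0.5 : scaled - 0.5);
}

/* Hypothetical helper: returns 1 if the absolute curvature of every
 * line is at most max_linecurv micro-units, 0 otherwise (including
 * on bad input). */
int lineCurvaturesAcceptable(const double *coeffs, size_t n,
                             int max_linecurv)
{
    size_t i;

    if (!coeffs || n == 0 || max_linecurv < 0) return 0;
    for (i = 0; i < n; i++) {
        int c = curvatureToMicroUnits(coeffs[i]);
        if (c < 0) c = -c;          /* thresholds apply to |curvature| */
        if (c > max_linecurv) return 0;
    }
    return 1;
}
```

With this convention a coefficient of 0.00001 is 10 micro-units, matching note (3), and a page with line coefficients 0.00002 and 0.00012 would be rejected at a threshold of 100.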
dewarpaSetMaxDistance
l_int32 dewarpaSetMaxDistance ( L_DEWARPA *dewa, l_int32 maxdist )
dewarpaSetMaxDistance()
Input: dewa
       maxdist (for using ref models)
Return: 0 if OK, 1 on error
Notes:
    (1) This sets the maxdist field.
dewarpaUseBothArrays
l_int32 dewarpaUseBothArrays ( L_DEWARPA *dewa, l_int32 useboth )
dewarpaUseBothArrays()
Input: dewa
       useboth (0 for false, 1 for true)
Return: 0 if OK, 1 on error
Notes:
    (1) This sets the useboth field.  If set, this will attempt
        to apply both vertical and horizontal disparity arrays.
        Note that a model with only a vertical disparity array will
        always be valid.
dewarpaWrite
l_int32 dewarpaWrite ( const char *filename, L_DEWARPA *dewa )
dewarpaWrite()
Input: filename
       dewa
Return: 0 if OK, 1 on error
dewarpaWriteStream
l_int32 dewarpaWriteStream ( FILE *fp, L_DEWARPA *dewa )
dewarpaWriteStream()
Input: stream (opened for "wb")
       dewa
Return: 0 if OK, 1 on error
AUTHOR
Zakariyya Mughal <zmughal@cpan.org>
COPYRIGHT AND LICENSE
This software is copyright (c) 2014 by Zakariyya Mughal.
This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.