NAME

PDF::Parse - Library for parsing a PDF file

SYNOPSIS

  use PDF::Parse;

  my $pdf = PDF::Parse->new;
  $pdf = PDF::Parse->new($filename);

  my $result = $pdf->TargetFile($filename);

  print "is a pdf file\n" if ( $pdf->IsaPDF ) ;
  print "Has ",$pdf->Pages," Pages \n";
  print "Use a PDF Version  ",$pdf->Version ," \n";

  print "filename with title",$pdf->GetInfo("Title"),"\n";
  print "and with subject ",$pdf->GetInfo("Subject"),"\n";
  print "was written by ",$pdf->GetInfo("Author"),"\n";
  print "in date ",$pdf->GetInfo("CreationDate"),"\n";
  print "using ",$pdf->GetInfo("Creator"),"\n";
  print "and converted with ",$pdf->GetInfo("Producer"),"\n";
  print "The last modification occurred ",$pdf->GetInfo("ModDate"),"\n";
  print "The associated keywords are ",$pdf->GetInfo("Keywords"),"\n";

  my ($startx, $starty, $endx, $endy) = $pdf->PageSize;
  my $rotation = $pdf->PageRotation ;

DESCRIPTION

The main purpose of the PDF library is to provide classes and functions for reading and manipulating PDF files with Perl. PDF stands for Portable Document Format, a format proposed by Adobe. For more details about PDF, refer to:

http://www.adobe.com/

For detailed documentation, see the PDF library.

The library is at the very beginning of its development. The main idea is to provide some "basic" modules for accessing the information contained in a PDF file. Although it is still at an early stage, the three small scripts provided with the library (is_pdf, pdf_version, and pdf_pages) show that it is usable.

The is_pdf script tests a list of files to separate the PDF files from the non-PDF ones. It does not rely on the .pdf extension; it uses the information contained in the files themselves.

pdf_version returns the PDF level used for writing a file.

pdf_pages gives the number of pages of a PDF file.
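The kind of content-based check that is_pdf and pdf_version perform can be sketched without the library. The following is only an illustration, not the library's own code: the function name pdf_header_version and the 1024-byte search window are assumptions made for this example. It relies on the fact that a PDF file announces itself with a "%PDF-x.y" header.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of PDF::Parse): return the version found in
# a "%PDF-x.y" header, or undef if the file has no such header.
sub pdf_header_version {
    my ($file) = @_;
    open my $fh, '<:raw', $file or return;
    # Assumption: the header sits within the first 1024 bytes of the file.
    my $read = read $fh, my $head, 1024;
    close $fh;
    return unless defined $read;
    return $head =~ /%PDF-(\d+\.\d+)/ ? $1 : undef;
}

# Classify every file named on the command line by content, not extension.
for my $file (@ARGV) {
    my $version = pdf_header_version($file);
    print defined $version
        ? "$file: PDF version $version\n"
        : "$file: not a PDF file\n";
}
```

A file named report.txt that starts with a valid header is reported as a PDF, while an empty or unreadable file.pdf is not.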

Constructor

new ( [ filename ] )

This is the constructor of a new PDF object. If the filename is omitted, it returns an empty PDF descriptor (which can be filled later with $pdf->TargetFile). Otherwise, it acts as the TargetFile method.
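As a rough illustration of the two-step form (empty descriptor first, file attached afterwards), the sketch below builds a one-line report per file. The helper pdf_summary is hypothetical and not part of the library; it only calls the IsaPDF, Version and Pages methods shown in the synopsis. The exact return formats of Version and Pages are assumed here, so the report format is a guess.

```perl
use strict;
use warnings;

# Hypothetical helper: describe any object that offers the IsaPDF,
# Version and Pages methods listed in the synopsis.
sub pdf_summary {
    my ($pdf, $name) = @_;
    return "$name: not a PDF file" unless $pdf->IsaPDF;
    return sprintf "%s: PDF version %s, %d page(s)",
        $name, $pdf->Version, $pdf->Pages;
}

# The eval guard lets the sketch load even where PDF::Parse is not installed.
if (@ARGV && eval { require PDF::Parse; 1 }) {
    for my $file (@ARGV) {
        my $pdf = PDF::Parse->new;    # empty descriptor
        $pdf->TargetFile($file);      # attach the file, check the header
        print pdf_summary($pdf, $file), "\n";
    }
}
```

Per the description above, passing the filename straight to the constructor, as in PDF::Parse->new($file), should be equivalent to the two calls inside the loop.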

Methods

The available methods are:

TargetFile ( filename )

This method links the filename to the PDF descriptor and checks the header.

Version

Returns the PDF version used for writing the object file.

Pages

Returns the number of pages of the object file. As a side effect, the PDF object contains part of the Catalog structure after the call (more specifically, part of the Root Page).

PageSize

Returns the size of the page of the object file. As a side effect, the PDF object contains part of the Catalog structure after the call (more specifically, part of the Root Page).

Note: At this development level, you cannot obtain the size of a single page. Only the size of the root page is available. Generally, all pages have the same size, because it is usually inherited from the root page, but this may not hold if, for example, you merge two different documents together.
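The four values returned by PageSize can be turned into a width and a height as sketched below. The helpers box_dimensions and points_to_mm are hypothetical, and the sketch assumes the coordinates are expressed in default PDF units of 1/72 inch, which this documentation does not state.

```perl
use strict;
use warnings;

# Hypothetical helper: width and height from (startx, starty, endx, endy).
sub box_dimensions {
    my ($startx, $starty, $endx, $endy) = @_;
    return (abs($endx - $startx), abs($endy - $starty));
}

# Assumption: one unit is 1/72 inch, and one inch is 25.4 mm.
sub points_to_mm { return $_[0] * 25.4 / 72 }

# Example values for a US Letter page; with the library one would pass
# the list returned by $pdf->PageSize instead.
my ($width, $height) = box_dimensions(0, 0, 612, 792);
printf "%.1f x %.1f mm\n", points_to_mm($width), points_to_mm($height);
# prints "215.9 x 279.4 mm"
```

Keep in mind that, given the limitation noted above, the result describes the root page and not necessarily every page of a merged document.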

PageRotation

Returns the rotation of the document with the PDF conventions:

 0 ==>   0 degrees (default)
 1 ==>  90 degrees
 2 ==> 180 degrees
 3 ==> 270 degrees

Note: It suffers from the same limitations as the PageSize method.
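The table above maps directly to a multiplication by 90. The sketch below shows one way to use the code, assuming PageRotation returns exactly one of the four integers listed; rotation_degrees and displayed_size are hypothetical helpers, not library functions.

```perl
use strict;
use warnings;

# Hypothetical helper: translate the documented 0..3 code into degrees.
sub rotation_degrees {
    my ($code) = @_;
    die "unexpected rotation code: $code\n" unless $code =~ /\A[0-3]\z/;
    return $code * 90;
}

# Hypothetical helper: width and height as displayed, swapped when the
# page is rotated by a quarter turn (90 or 270 degrees).
sub displayed_size {
    my ($width, $height, $code) = @_;
    return rotation_degrees($code) % 180
        ? ($height, $width)
        : ($width, $height);
}

my ($w, $h) = displayed_size(612, 792, 1);
print "rotated by ", rotation_degrees(1), " degrees: $w x $h\n";
# prints "rotated by 90 degrees: 792 x 612"
```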

Variables

There are two variables that can be accessed:

$PDF::VERSION

Contains the version of the library installed.

$PDF::Verbose

This variable is false by default. Set it to a true value if you want more verbose output messages from the library.

Copyright

  Copyright 1998, Antonio Rosella antro@technologist.com

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

Availability

The latest version of this library is likely to be available from:

http://www.geocities.com/CapeCanaveral/Hangar/4794/
