NAME
Spreadsheet::ParseExcel - Get information from Excel file
SYNOPSIS
use strict;
use Spreadsheet::ParseExcel;
my $oExcel = new Spreadsheet::ParseExcel;
#1.1 Normal Excel97
my $oBook = $oExcel->Parse('Excel/Test97.xls');
my($iR, $iC, $oWkS, $oWkC);
print "FILE :", $oBook->{File} , "\n";
print "COUNT :", $oBook->{SheetCount} , "\n";
print "AUTHOR:", $oBook->{Author} , "\n";
for(my $iSheet=0; $iSheet < $oBook->{SheetCount} ; $iSheet++) {
$oWkS = $oBook->{Worksheet}[$iSheet];
print "--------- SHEET:", $oWkS->{Name}, "\n";
for(my $iR = $oWkS->{MinRow} ;
defined $oWkS->{MaxRow} && $iR <= $oWkS->{MaxRow} ; $iR++) {
for(my $iC = $oWkS->{MinCol} ;
defined $oWkS->{MaxCol} && $iC <= $oWkS->{MaxCol} ; $iC++) {
$oWkC = $oWkS->{Cells}[$iR][$iC];
print "( $iR , $iC ) =>", $oWkC->Value, "\n" if($oWkC); # Formatted Value
print "( $iR , $iC ) =>", $oWkC->{Val}, "\n" if($oWkC); # Original Value
}
}
}
new interface
use strict;
use Spreadsheet::ParseExcel;
my $oBook =
Spreadsheet::ParseExcel::Workbook->Parse('Excel/Test97.xls');
my($iR, $iC, $oWkS, $oWkC);
foreach my $oWkS (@{$oBook->{Worksheet}}) {
print "--------- SHEET:", $oWkS->{Name}, "\n";
for(my $iR = $oWkS->{MinRow} ;
defined $oWkS->{MaxRow} && $iR <= $oWkS->{MaxRow} ; $iR++) {
for(my $iC = $oWkS->{MinCol} ;
defined $oWkS->{MaxCol} && $iC <= $oWkS->{MaxCol} ; $iC++) {
$oWkC = $oWkS->{Cells}[$iR][$iC];
print "( $iR , $iC ) =>", $oWkC->Value, "\n" if($oWkC);
}
}
}
DESCRIPTION
Spreadsheet::ParseExcel makes you to get information from Excel95, Excel97, Excel2000 file.
Functions
- new
-
$oExcel = new Spreadsheet::ParseExcel( [ CellHandler => \&subCellHandler, NotSetCell => undef | 1, ]);
Constructor.
- CellHandler (experimental)
-
specify callback function when a cell is detected.
subCellHandler gets arguments like below:
sub subCellHandler ($oBook, $iSheet, $iRow, $iCol, $oCell);
CAUTION : The atributes of Workbook may not be complete. This function will be called almost order by rows and columns. Take care almost, not perfectly.
- NotSetCell (experimental)
-
specify set or not cell values to Workbook object.
- Parse
-
$oWorkbook = $oParse->Parse($sFileName [, $oFmt]);
return "Workbook" object. if error occurs, returns undef.
- $sFileName
-
name of the file to parse
From 0.12 (with OLE::Storage_Lite v.0.06), scalar reference of file contents (ex. \$sBuff) or IO::Handle object (inclucdng IO::File etc.) are also available.
- $oFmt
-
"Formatter Class" to format the value of cells.
- ColorIdxToRGB
-
$sRGB = $oParse->ColorIdxToRGB($iColorIdx);
ColorIdxToRGB returns RGB string corresponding to specified color index. RGB string has 6 charcters, representing RGB hex value. (ex. red = 'FF0000')
Workbook
Spreadsheet::ParseExcel::Workbook
Workbook class has these methods :
- Parse
-
(class method) : same as Spreadsheet::ParseExcel
- Worksheet
-
$oWorksheet = $oBook->Worksheet($sName);
Worksheet returns a Worksheet object with $sName or undef. If there is no worksheet with $sName and $sName contains only digits, it returns a Worksheet object at that position.
Workbook class has these properties :
- File
-
Name of the file
- Author
-
Author of the file
- Flag1904
-
If this flag is on, date of the file count from 1904.
- Version
-
Version of the file
- SheetCount
-
Numbers of "Worksheet" s in that Workbook
- Worksheet[SheetNo]
-
Array of "Worksheet"s class
- PrintArea[SheetNo]
-
Array of PrintArea array refs.
Each PrintArea is : [ StartRow, StartColumn, EndRow, EndColumn]
- PrintTitle[SheetNo]
-
Array of PrintTitle hash refs.
Each PrintTitle is : { Row => [StartRow, EndRow], Column => [StartColumn, EndColumn]}
Worksheet
Spreadsheet::ParseExcel::Worksheet
Worksheet class has these methods:
- Cell ( ROW, COL )
-
Return the Cell iobject at row ROW and column COL if it is defined. Otherwise return undef.
- RowRange ()
-
Return a two-element list (MIN, MAX) containing the minimum and maximum of defined rows in the worksheet If there is no row defined MAX is smaller than MIN.
- ColRange ()
-
Return a two-element list (MIN, MAX) containing the minimum and maximum of defined columns in the worksheet If there is no row defined MAX is smaller than MIN.
Worksheet class has these properties:
- Name
-
Name of that Worksheet
- DefRowHeight
-
Default height of rows
- DefColWidth
-
Default width of columns
- RowHeight[Row]
-
Array of row height
- ColWidth[Col]
-
Array of column width (undef means DefColWidth)
- Cells[Row][Col]
-
Array of "Cell"s infomation in the worksheet
- Landscape
-
Print in horizontal(0) or vertical (1).
- Scale
-
Print scale.
- FitWidth
-
Number of pages with fit in width.
- FitHeight
-
Number of pages with fit in height.
- PageFit
-
Print with fit (or not).
- PaperSize
-
Papar size. The value is like below:
Letter 1, LetterSmall 2, Tabloid 3 , Ledger 4, Legal 5, Statement 6 , Executive 7, A3 8, A4 9 , A4Small 10, A5 11, B4 12 , B5 13, Folio 14, Quarto 15 , 10x14 16, 11x17 17, Note 18 , Envelope9 19, Envelope10 20, Envelope11 21 , Envelope12 22, Envelope14 23, Csheet 24 , Dsheet 25, Esheet 26, EnvelopeDL 27 , EnvelopeC5 28, EnvelopeC3 29, EnvelopeC4 30 , EnvelopeC6 31, EnvelopeC65 32, EnvelopeB4 33 , EnvelopeB5 34, EnvelopeB6 35, EnvelopeItaly 36 , EnvelopeMonarch 37, EnvelopePersonal 38, FanfoldUS 39 , FanfoldStdGerman 40, FanfoldLegalGerman 41, User 256
- PageStart
-
Start page number.
- UsePage
-
Use own start page number (or not).
-
Mergins for left, right, top, bottom, header and footer.
- HCenter
-
Print in horizontal center (or not)
- VCenter
-
Print in vertical center (or not)
- Header
-
Content of print header. Please refer Excel Help.
-
Content of print footer. Please refer Excel Help.
- PrintGrid
-
Print with Gridlines (or not)
- PrintHeaders
-
Print with headings (or not)
- NoColor
-
Print in black-white (or not).
- Draft
-
Print in draft mode (or not).
- Notes
-
Print with notes (or not).
- LeftToRight
-
Print left to right(0) or top to down(1).
- HPageBreak
-
Array ref of horizontal page breaks.
- VPageBreak
-
Array ref of vertical page breaks.
- MergedArea
-
Array ref of merged areas. Each merged area is : [ StartRow, StartColumn, EndRow, EndColumn]
Cell
Spreadsheet::ParseExcel::Cell
Cell class has these properties:
- Value
-
Method Formatted value of that cell
- Val
-
Original Value of that cell
- Type
-
Kind of that cell ('Text', 'Numeric', 'Date')
- Code
-
Character code of that cell (undef, 'ucs2', '_native_') undef tells that cell seems to be ascii. '_native_' tells that cell seems to be 'sjis' or something like that.
- Format
-
"Format" for that cell.
- Merged
-
That cells is merged (or not).
- Rich
-
Array ref of font informations about each characters.
Each entry has : [ Start Position, Font Object]
For more information please refer sample/dmpExR.pl
Format
Spreadsheet::ParseExcel::Format
Format class has these properties:
- Font
-
"Font" object for that Format.
- AlignH
-
Horizontal Alignment.
0: (standard), 1: left, 2: center, 3: right, 4: fill , 5: justify, 7:equal_space
Notice: 6 may be merge but it seems not to work.
- AlignV
-
Vertical Alignment.
0: top, 1: vcenter, 2: bottom, 3: vjustify, 4: vequal_space
- Indent
-
Number of indent
- Wrap
-
Wrap (or not).
- Shrink
-
Display in shrinking (or not)
- Rotate
-
In Excel97, 2000 : degrees of string rotation. In Excel95 or earlier : 0: No rotation, 1: Top down, 2: 90 degrees anti-clockwise, 3: 90 clockwise
- JustLast
-
JustLast (or not). I have never seen this attribute.
- ReadDir
-
Direction for read.
- BdrStyle
-
Array ref of boder styles : [Left, Right, Top, Bottom]
- BdrColor
-
Array ref of boder color indexes : [Left, Right, Top, Bottom]
- BdrDiag
-
Array ref of diag boder kind, style and color index : [Kind, Style, Color] Kind : 0: None, 1: Right-Down, 2:Right-Up, 3:Both
- Fill
-
Array ref of fill pattern and color indexes : [Pattern, Front Color, Back Color]
- Lock
-
Locked (or not).
- Hidden
-
Hiddedn (or not).
- Style
-
Style format (or Cell format)
Font
Spreadsheet::ParseExcel::Font
Format class has these properties:
- Name
-
Name of that font.
- Bold
-
Bold (or not).
- Italic
-
Italic (or not).
- Height
-
Size (height) of that font.
- Underline
-
Underline (or not).
- UnderlineStyle
-
0: None, 1: Single, 2: Double, 0x21: Single(Account), 0x22: Double(Account)
- Color
-
Color index for that font.
- Strikeout
-
Strikeout (or not).
- Super
-
0: None, 1: Upper, 2: Lower
Formatter class
Spreadsheet::ParseExcel::Fmt*
Formatter class will convert cell data.
Spreadsheet::ParseExcel includes 2 formatter classes: FmtDefault and FmtJapanese. You can create your own FmtClass as you like.
Formatter class(Spreadsheet::ParseExcel::Fmt*) should provide these functions:
- ChkType($oSelf, $iNumeric, $iFmtIdx)
-
tells type of the cell that has specified value.
- TextFmt($oSelf, $sText, $sCode)
-
converts original text into applicatable for Value.
- ValFmt($oSelf, $oCell, $oBook)
-
converts original value into applicatable for Value.
- FmtString($oSelf, $oCell, $oBook)
-
get format string for the $oCell.
KNOWN PROBLEM
This module can not get the values of fomulas in Excel files made with Spreadsheet::WriteExcel. Normaly (ie. By Excel application), formula has the result with it. But Spreadsheet::WriteExcel writes formula with no result. If you set your Excel application "Auto Calculation" off. (maybe [Tool]-[Option]-[Calculation] or something) You will see the same result.
AUTHOR
Kawai Takanori (Hippo2000) kwitknr@cpan.org
http://member.nifty.ne.jp/hippo2000/ (Japanese)
http://member.nifty.ne.jp/hippo2000/index_e.htm (English)
SEE ALSO
XLHTML, OLE::Storage, Spreadsheet::WriteExcel, OLE::Storage_Lite
This module is based on herbert within OLE::Storage and XLHTML.
TODO
- Spreadsheet::ParseExcel : Password protected data, Formulas support, HyperLink support, Named Range support
- Spreadsheet::ParseExcel::SaveParser : Catch up Spreadsheet::WriteExce feature, Create new Excel fle
COPYRIGHT
Copyright (c) 2000-2002 Kawai Takanori All rights reserved.
You may distribute under the terms of either the GNU General Public License or the Artistic License, as specified in the Perl README file.
ACKNOWLEDGEMENTS
First of all, I would like to acknowledge valuable program and modules : XHTML, OLE::Storage and Spreadsheet::WriteExcel.
In no particular order: Yamaji Haruna, Simamoto Takesi, Noguchi Harumi, Ikezawa Kazuhiro, Suwazono Shugo, Hirofumi Morisada, Michael Edwards, Kim Namusk and many many people + Kawai Mikako.
1 POD Error
The following errors were encountered while parsing the POD:
- Around line 1031:
Unknown directive: =cmmt