OpenOffice::Parse::SXC - Perl extension for parsing OpenOffice SXC files
    use OpenOffice::Parse::SXC qw( parse_sxc );

    # Non-OO way:
    my @rows = parse_sxc( "file.sxc" );
    for( @rows ) {
      print join(",", @$_ ),"\n";       # each row is a list reference
    }

    # OO way:
    package MyDataHandler;

    # Set up a handler object
    sub new {
      my $type = shift;
      my $self = {};
      bless $self, $type;
      return $self;
    }

    sub row {
      my $self = shift;
      my $SXC = shift;
      my $row_data = shift;
      # Simple csv values printed...
      print $self->{worksheet},": ",join(",", @$row_data ),"\n";
    }

    sub worksheet {
      my $self = shift;
      my $SXC = shift;
      my $worksheet = shift;
      $self->{worksheet} = $worksheet;
    }

    sub workbook {
      my $self = shift;
      my $SXC = shift;
      my $workbook = shift || "unknown_workbook";
    }

    1;

    package main;

    my $SXC = OpenOffice::Parse::SXC->new( OPTIONS );
    $SXC->set_data_handler( MyDataHandler->new );
    $SXC->parse_file( "file.sxc" );
OpenOffice::Parse::SXC parses an SXC file (OpenOffice spreadsheet) and passes data back through a callback object that you register with the SXC object.
The major benefit of being able to read directly from an OpenOffice spreadsheet is that it allows SXC files to be directly used as a development tool.
The data returned contains no formatting or formula information, only the text displayed in the spreadsheet.
This module requires XML::Parser and the compression utility unzip to be installed.
The data that this module provides is exactly what you would see in the OpenOffice application, which may differ from what you entered. For example, this module returns the result of a function, not the function itself. If you enter 19.95 into a cell and format that cell as a currency type, you would see $19.95 (for example), and that is what you would get when using this module to parse the spreadsheet.
None by default.
Parses an SXC file returning a list of lists containing the cell data.
Quotes a string in "CSV format". The transformation converts each double quote to two double quotes, then wraps the entire string in double quotes. All newlines are removed!
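The quoting rule described above can be illustrated with a small stand-alone sketch. The function name quote_for_csv is hypothetical and this is not the module's own implementation; the handling of undefined values and carriage returns is an assumption made for the example.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the quoting rule described above;
# this is NOT the module's own code.
sub quote_for_csv {
    my ($text) = @_;
    $text = '' unless defined $text;   # assumption: undef is treated as empty
    $text =~ s/[\r\n]//g;              # drop newlines (and carriage returns, by assumption)
    $text =~ s/"/""/g;                 # each double quote becomes two
    return qq{"$text"};                # wrap the whole string in double quotes
}

print quote_for_csv(qq{say "hi"\nthere}), "\n";   # prints "say ""hi""there"
```

Because newlines are dropped rather than preserved, a multi-line cell collapses onto one line, which keeps every record on a single line of output.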
Prints out a Dumper'ed version of the entire SXC XML tree. Used for debugging.
Create a new SXC object.
Parse file FILENAME. This method calls parse_file().
Parse the data in filehandle SXC_FILEHANDLE.
Returns the name of the current worksheet. This is only useful to the DATA HANDLER object (i.e., during processing).
Gets an option.
Sets one or more options.
Sets the DATA HANDLER. See the synopsis, and the DATA HANDLER section for details.
Gets the DATA HANDLER.
The following options can be used (in new() or set_options()):
An SXC 'workbook' consists of multiple 'worksheets' (internally referred to as tables). You can specify which worksheets you would like to process; if this option is not used, ALL of them are processed.
If NOT specified, the trailing empty cells in each row will be spliced out.
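As a rough illustration of what splicing out trailing empty cells means, here is a minimal sketch. The helper trim_trailing_empty is hypothetical, and treating both undef and the empty string as "empty" is an assumption; the module may define emptiness differently.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of the module.
# Assumption: a cell is "empty" if it is undef or the empty string.
sub trim_trailing_empty {
    my @cells = @_;
    # Drop cells from the end until a non-empty one is reached.
    pop @cells while @cells && (!defined $cells[-1] || $cells[-1] eq '');
    return @cells;
}

my @row = ('id', 'name', '', 'note', '', '');
print join('|', trim_trailing_empty(@row)), "\n";   # prints "id|name||note"
```

Only the cells at the end of the row are removed; the empty cell between 'name' and 'note' stays, so column positions are unaffected.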
The DATA HANDLER is what the SXC object calls upon to do work while it parses an SXC file. It expects the DATA HANDLER object to implement the following methods:
Handle row data
Called each time a new worksheet is encountered. Note: there is no callback for when a worksheet ends.
Called each time a new workbook is encountered. (This helps when the same SXC object is used to process multiple files.) As with worksheet(), there is no callback for the end of a workbook.
Each method gets the SXC object as the first argument, and the data as the second argument: worksheet gets the name of the worksheet, workbook gets the filename of the SXC file, and row receives a list reference to all the cells in that row.
The interesting callback is the row() function, and often it's the only function of any interest. If you want to avoid creating a class and just want to implement a row() callback, you can do something like this:
    sub Whatever::row {
      my($self, $SXC, $row_data) = @_;
      print join(",", map { csv_quote( $_ ) } @$row_data ),"\n";
    }
    sub Whatever::worksheet {}
    sub Whatever::workbook {}

    $SXC->set_data_handler( bless {}, "Whatever" );
    $SXC->parse_file( ... );
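When the rows need to be kept rather than printed, a handler can accumulate them per worksheet. The sketch below is only an illustration: the RowCollector class and the drive() routine are hypothetical. drive() is a stand-in that imitates the callback order and arguments described above so the example runs without an SXC file; a real run would pass the handler to set_data_handler() and call parse_file() instead.

```perl
use strict;
use warnings;

package RowCollector;   # hypothetical handler class

sub new {
    my $class = shift;
    return bless { current => undef, rows => {} }, $class;
}

# worksheet(): remember which sheet the following rows belong to.
sub worksheet {
    my ($self, $sxc, $name) = @_;
    $self->{current} = $name;
}

# workbook(): nothing to do in this sketch.
sub workbook { }

# row(): store a copy of the cells under the current worksheet name.
sub row {
    my ($self, $sxc, $row_data) = @_;
    push @{ $self->{rows}{ $self->{current} } }, [@$row_data];
}

package main;

# Stand-in driver (an assumption for this example): it calls the handler
# methods the way the text above describes, with a placeholder in place of
# the SXC object.
sub drive {
    my ($handler) = @_;
    my $sxc = undef;
    $handler->workbook($sxc, 'file.sxc');
    $handler->worksheet($sxc, 'Sheet1');
    $handler->row($sxc, ['a', 'b']);
    $handler->row($sxc, ['c', 'd']);
    $handler->worksheet($sxc, 'Sheet2');
    $handler->row($sxc, ['e']);
    return $handler;
}

my $collector = drive(RowCollector->new);
print scalar @{ $collector->{rows}{Sheet1} }, " rows in Sheet1\n";   # prints "2 rows in Sheet1"
```

Copying the cells with [@$row_data] is a defensive choice in case the same list reference is reused between calls; whether the module actually reuses it is not stated here.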
Desmond Lee <deslee@shaw.ca>
sxc2csv.