NAME

Data::Table::Excel - Convert between Data::Table objects and Excel (xls/xlsx) files.

SYNOPSIS

  News: The package now includes "Perl Data::Table Cookbook" (PDF), which may serve as a better learning material.
  To download the free Cookbook, visit https://sites.google.com/site/easydatabase/

  use Data::Table::Excel qw (tables2xls xls2tables tables2xlsx xlsx2tables excelFileToTable is_xlsx xls2xlsx xlsx2xls);

  # read in two CSV tables and generate an Excel .xls binary file with two spreadsheets
  my $t_category = Data::Table::fromFile("Category.csv");
  my $t_product = Data::Table::fromFile("Product.csv");
  # the workbook will contain two sheets, named Category and Product
  # parameters: output file name, an array of tables to write, and their corresponding names
  tables2xls("NorthWind.xls", [$t_category, $t_product], ["Category","Product"]);

  # read in NorthWind.xls file as two Data::Table objects
  my ($tableObjects, $tableNames, $column_headers)=xls2tables("NorthWind.xls");
  for (my $i=0; $i<@$tableNames; $i++) {
    print "*** ". $tableNames->[$i], " ***\n";
    print $tableObjects->[$i]->csv($column_headers[$i]);
  }

  Outputs:
  *** Category ***
  CategoryID,CategoryName,Description
  1,Beverages,"Soft drinks, coffees, teas, beers, and ales"
  2,Condiments,"Sweet and savory sauces, relishes, spreads, and seasonings"
  3,Confections,"Desserts, candies, and sweet breads"
  ...
  
  *** Product ***
  ProductID,ProductName,CategoryID,UnitPrice,UnitsInStock,Discontinued
  1,Chai,1,18,39,FALSE
  2,Chang,1,19,17,FALSE
  3,Aniseed Syrup,2,10,13,FALSE
  ...

  # to deal with Excel 2007 format (.xlsx), use xlsx2tables instead.
  # since no table name is supplied, they will be named Sheet1 and Sheet2.
  # here we also provide custom colors for each sheet, color array is for [OddRow, EvenRow, HeaderRow]

  tables2xlsx("NorthWind.xlsx", [$t_category, $t_product], undef, [['silver','white','black'], [45,'white',37]]);
  # read in NorthWind.xlsx file as two Data::Table objects
  my ($tableObjects, $tableNames)=xlsx2tables("NorthWind.xlsx");
  # note: Spreadsheet::ParseXLSX module is used to parse .xlsx file.

  ($tableObjects, $tableNames, $column_headers)=excelFileToTable("NorthWind.xlsx");
  # excelFileToTable will automatically detect the Excel format for the input file

  # To convert Excel files between the two formats, use
  xlsx2xls("NorthWind.xlsx", "NorthWind.xls");
  xls2xlsx("NorthWind.xls", "NorthWind.xlsx");

ABSTRACT

This perl package provide utility methods to convert between an Excel file and Data::Table objects. It then enables you to take advantage of the Data::Table methods to further manipulate the data and/or export it into other formats such as CSV/TSV/HTML, etc.

DESCRIPTION

    To read and write Excel .xls (2003 and prior) format, we use Spreadsheet::WriteExcel and Spreadsheet::ParseExcel; to read and write Excel .xlsx (2007 format), we use Spreadsheet::ParseXLSX and Excel::Writer::XLSX. If this module gives incorrect results, please check if the corresponding Perl modules are updated. (We switch to Spreadsheet::ParseXLSX from Spreadsheet::XLSX from version 0.5)

    xls2tables ($fileName, $sheetNames, $sheetIndices)

    xlsx2tables ($fileName, $sheetNames, $sheetIndices)

    excelFileToTable ($fileName, $sheetNames, $sheetIndices, $excelFormat)

    xls2tables is for reading Excel .xls files (binary, 2003 and prior), xlsx2table is for reading .xlsx file (2007, compressed XML format). excelFileToTable can automatically detect Excel format if format is not specified.

    $fileName is the input Excel file. $sheetNames is a string or a reference to an array of sheet names. $sheetIndices is a int or a reference to an array of sheet indices. $excelFormat in excelFileToTable has to be either "2003" or "2007". Auto-detected if not specified. If neither $sheetNames or $sheetIndices is provides, all sheets are converted into table objects, one table per sheet. If $sheetNames is provided, only sheets found in the @$sheetNames array is converted. If $sheetIndices is provided, only sheets match the index in the @$sheetIndices array is converted (notice the first spreadsheet has an index of 1).

    The method returns an array ($tableObjects, $tableNames, $columnHeaders). $tableObjects is a reference to an array of Data::Table objects. $tableNames is a reference to an array of sheet names, corresponding to $tableObjects. $columnHeaders is a reference to an array of booleans, indicating whether each table has original column header If a table does not have a column header, columns are named Col1, Col2, etc.

      # print each of spreadsheet into an HTML table on the web
      ($tableObjects, $tableNames, $columnHeaders)=xls2tables("Tables.xls");
      foreach my $t (@$tableObjects) {
        print "<h1>", shift @$tableNames, "</h1><br>";
        print $t->html;
      }
    
      ($tableObjects, $tableNames, $columnHeaders)=xlsx2tables("Tables.xlsx", undef, [1]);

    This will only read the first sheet. By providing sheet names or sheet indicies, you save time if you are not interested in all the sheets.

    tables2xls ($fileName, $tables, $names, $colors, $portrait, $columnHeaders)

    tables2xlsx ($fileName, $tables, $names, $colors, $portrait, $columnHeaders)

    tables2excel ($fileName, $tables, $names, $colors, $portrait, $excelFormat, $columnHeaders)

    table2xls is for writing Excel .xls files (binary, 2003 and prior), xlsx2table is for writing .xlsx file (2007, compressed XML format). tables2excel will export to 2007 format, if $excelFormat is not specified.

    $fileName is used to name the output Excel file. $tables is a reference to an array of Data::Table objects to be write into the file, one sheet per table. $names is a reference to an array of names used to name Spreadsheets, if not provided, it uses "Sheet1", "Sheet2", etc. $colors is a reference to an array of reference to a color array. Each color array has to contains three elements, defining Excel color index for odd rows, even rows and header row. Acceptable color index (or name) is defined by the docs\palette.html file in the CPAN Spreadsheet::WriteExcel package.

    $portrait is a reference to an array of orientation flag (0 or 1), 1 is for Portrait (the default), where each row represents a table row. In landscape (0) mode, each row represents a column. (Similar to Data::Table::html and Data::Table::html2).

    $columnHeaders is a reference to an array of boolean, indicating whether to export column headers for each table. By default, column headers are exported.

    The arrays pointed by $names, $colors, $portraits and $columnHeader should be the same length as that of $tables. these customization values are applied to each table objects sequentially. If a value is missing for a table, the method will use the setting from the previous table.

      tables2xls("TwoTables.xls", [$t_A, $t_B], ["Table_A","Table_B"], [["white","silver","gray"], undef], [1, 0], [1, 1]);

    This will produce two spreadsheets named Table_A and Table_B for table $t_A and $t_B, respectively. The first table is colored in a black-white style, the second is colored by the default style. The first table is the default portrait oritentation, the second is in the transposed orientation.

    is_xlsx($fileName)

    Returns boolean whether the given file is 2007 format. It does not rely on file name, but reads the first two bytes of the file. .xlsx is in Zip format, therefore the first two bytes are "PK".

    xlsx2xls($fromFileName, $toFileName)

    xls2xlsx($fromFileName, $toFileName)

    Converts an Excel file from one format to another. If $toFileName is not specified, $toFileName will be the same as $fromFileName, except with extension sets to .xlsx or .xls.

AUTHOR

Copyright 2008, Yingyao Zhou. All rights reserved.

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

Please send bug reports and comments to: easydatabase at gmail dot com. When sending bug reports, please provide the version of Data::Table::Excel.pm, the version of Perl.

SEE ALSO

  Data::Table.

1 POD Error

The following errors were encountered while parsing the POD:

Around line 297:

You can't have =items (as at line 302) unless the first thing after the =over is an =item