The Perl Toolchain Summit needs more sponsors. If your company depends on Perl, please support this very important event.

NAME

Excel::PowerPivot::Utils - utilities for scripting Power Pivot models within Excel workbooks

SYNOPSIS

  use Excel::PowerPivot::Utils;
  my $ppu = Excel::PowerPivot::Utils->new(UTF8 => 1);

  # operations on the whole model ...
  print $ppu->whole_model_as_YAML;
  $ppu->inject_whole_model({QUERIES       => ...,
                            RELATIONSHIPS => ...
                            MEASURES      => ...});

  # .. or specific operations on queries, relationships or measures
  print $ppu->queries_as_YAML;
  $ppu->inject_queries ([{Name => 'New_table', Formula => $M_formula_for_new_table}]);
  
  print $ppu->relationships_as_YAML;
  $ppu->inject_relationships([ {ForeignKey  => 'Album.ArtistId',
                                PrimaryKey  => 'Artist.ArtistId',
                                Active      => 1},
                               ...
                             ]);
  
  print $ppu->measures_as_YAML;
  $ppu->inject_measures([{Name              => 'Invoice Lines Total Amount',
                          AssociatedTable   => 'InvoiceLine',
                          Description       => 'sum of quantities multiplied by unit price',
                          FormatInformation => [qw/Currency  USD 2/],
                          Formula           => 'SUMX(InvoiceLine, InvoiceLine[UnitPrice] * InvoiceLine[Quantity]'
                         },
                         ...
                        ]);

DESCRIPTION

This module uses OLE automation to interact with an Excel Power Pivot model. It can be used for example for

  • documenting existing models

  • scripting series of updates or inserts on measures or queries as batch operations -- useful for propagating similar changes to a series of models.

  • use a version control system on textual exports of the model

Obviously, this module only works on a Windows platform with a local installation of Microsoft Office 2016 or greater.

The exposed interface hides details about the interaction with the Excel object model as documented in https://learn.microsoft.com/en-us/office/vba/excel/concepts/about-the-powerpivot-model-object-in-excel; nevertheless, some knowledge of that model and of the Win32::OLE module is recommended to fully understand what is going on.

CONSTRUCTOR

  my $ppu = Excel::PowerPivot::Utils->new(%options);

Creates a new instance. Options are :

workbook

A Win32::OLE object representing an Excel workbook. If none is supplied, it will connect to the currently running Excel instance and take the active workbook as default.

log

A logger object equipped with debug, info and warning methods. If none is supplied, a simple logger is automatically created from Log::Dispatch.

UTF8

A boolean flag for setting the Win32::OLE codepoint option to UTF8, so that strings are properly encoded/decoded between Perl and the OLE server. It is highly recommended to systematically set this option to true, since this module is mostly used together with YAML, which uses UTF8 encoding.

This option will automatically trigger Win32::OLE->Option(CP => CP_UTF8) at object construction time, and will set it back to the previous value at object destruction time. Beware however that this is a change in global state, so if your program performs other operations through Win32::OLE during the lifetime of the Excel::PowerPivot::Utils object, string handling might be affected. This is the reason why we require this option to be set explicitly instead of being enabled automatically by default.

METHODS

Utilities for Power Query

queries

Returns information about queries in the Excel workbook. This is a list of hashrefs containing

Name

the name of the query

Formula

the M formula (Power Query language)

Description

optional description text

queries_as_YAML

Content of the "queries" method as a YAML string, nicely formatted so that it is easily readable by humans.

inject_queries

  $ppu->inject_queries($queries, %options);

Takes an arrayref or query specifications. Each specification must be a hashref with keys Name, Formula and optionally Description. For names corresponding to queries already in the workbook, this is an update operation; other queries in the list are added to the workbook.

Options are :

delete_others

If true, queries not mentioned in the list are deleted from the workbook. False by default.

handle_connections

If true, queries are automatically associated with workbook connections. This is equivalent to manually checking "Add this data to the Data Model" in the "Close and Load To" dialog of Power Query. True by default. When adding or deleting queries in the model, Excel recomputes the whole model, so this operation may be quite slow.

fast_combine

Activates the FastCombine property -- see https://learn.microsoft.com/en-us/office/vba/api/excel.queries.fastcombine.

delete_connection_for_query

Deletes the OLEDB connection associated with the given query name.

Utilities for relationships

relationships

Returns information about relationships in the Excel model. Due to the inner constraints of Power Pivot, all relationships are many-to-one. The returned structure is a list of hashrefs containing :

ForeignKey

A single string of form $table.$column, describing the "many" side of the relationship.

PrimaryKey

A single string of form $table.$column, describing the "one" side of the relationship.

Active

A boolean stating if the relationship is active or not.

relationships_as_YAML

Content of the "relationships" method as a YAML string, nicely formatted so that it is easily readable by humans.

inject_relationships

  $ppu->inject_relationships($relationships, %options);

Takes an arrayref or relationship specifications. Each specification must be a hashref with keys ForeignKey, PrimaryKey and Active. For pairs (foreign key, primary key) corresponding to relationships already in the model, this is an update operation on the Active property; otherwise the relationships are added to the model.

Options are :

delete_others

If true, relationships not mentioned in the list are deleted from the model. False by default.

Utilities for DAX measures

measures

Returns information about measures in the Excel model. This is a list of hashrefs containing

Name

the name of the measure

AssociatedTable

the name of the table to which this measure is associated

Formula

the DAX formula

Description

optional description text

FormatInformation

an arrayref describing the format for displaying that measure. Formats are documented in https://learn.microsoft.com/en-us/office/vba/api/excel.model. The first member of the array is the name of the ModelFormat object, followed by the values of its properties. The properties for each format are listed in the table below :

  ModelFormat          Property 1           Property 2
  ===========          ==========           ==========
  Currency             Symbol               DecimalPlaces
  Date                 FormatString
  DecimalNumber        UseThousandSeparator DecimalPlaces
  General
  PercentageNumber     UseThousandSeparator DecimalPlaces
  ScientificNumber     DecimalPlaces
  WholeNumber          UseThousandSeparator

measures_as_YAML

Content of the "measures" method as a YAML string, nicely formatted so that it is easily readable by humans.

inject_measures

  $ppu->inject_measures($measures, %options);

Takes an arrayref or measure specifications. Each specification must be a hashref with keys Name, AssociatedTable, Formula and optionally Description and FormatInformation. Values for those keys are strings, except for FormatInformation which takes an arrayref according to the table above.

For names corresponding to measures already in the model, this is an update operation; other measures in the list are added to the model.

Options are :

delete_others

If true, measures not mentioned in the list are deleted from the model. False by default.

Methods on the whole model

whole_model_as_YAML

Returns a single YAML string containing descriptions for queries, relationships and measures.

inject_whole_model

  my $model_to_inject = YAML::Load($whole_model_as_YAML);
  $ppu->inject_whole_model($model_to_inject, %options);

Takes as input a hashref with keys QUERIES, RELATIONSHIPS and MEASURES, and calls methods "inject_queries", "inject_relationships" and "inject_measures" on the corresponding subtrees.

NOTES

  • Unfortunately this module cannot add DAX computed columns to a model table ... because there is no available method for this task in the VBA interface for Excel.

  • For the same reason, there is no support either for manipulating hierarchies (i.e groups of columns in a dimension table).

  • In principle the OLE mechanism allows one to open a connection to an Excel workbook through

      my $workbook = Win32::OLE->GetObject($pathname);

    However, launching Power Query or Power Pivot operations on such connections does not work well -- I experienced several crashes or file corruptions. So the recommended way is to connect to a running Excel instance, and use that connection to open the workbook. The t/02_chinook.t file in this distribution contains a full example; here is the excerpt doing the connection :

      my $xl  = Win32::OLE->GetActiveObject("Excel.Application")
        or skip "can't connect to an active Excel instance";
      my $workbook = $xl->Workbooks->Open($fullpath_xl_file)
        or skip "cannot open OLE connection to Excel file $fullpath_xl_file";

FULL EXAMPLE

File t/02_chinook.t this distribution is a full example dealing with the Chinook database, an open source dataset. A diagram of the relational schema can be seen at https://schemaspy.org/sample/relationships.html.

The test script performs the following operations :

  1. download the sqlite database

  2. generate an Excel file with an Excel table for each database table (through the companion module Excel::ValueWriter::XLSX).

  3. inject the model :

    a)

    Power Queries to connect the Excel tables to the Power Pivot model. Here is an example of the YAML description :

      #======================================================================
      - Name        : Album
      #======================================================================
        Description : 
        Formula     : |-
          let
              Album_Table = Excel.CurrentWorkbook(){[Name="Album"]}[Content],
              #"Modified type" = Table.TransformColumnTypes(Album_Table,{
                  {"AlbumId", Int64.Type},
                  {"Title", type text},
                  {"ArtistId", Int64.Type}})
          in
              #"Modified type"
    b)

    Power Pivot relationships between tables loaded into the model. Here is an example of the YAML description :

      #======================================================================
      - ForeignKey  : Album.ArtistId
        PrimaryKey  : Artist.ArtistId
        Active      : 1
      #======================================================================
    c)

    Power Pivot measures.Since this is just for demonstration purposes, only 3 measures are defined. Here is the YAML description :

      #======================================================================
      - Name              : Invoice Lines Total Amount
      #======================================================================
        AssociatedTable   : InvoiceLine
        Description       : 
        FormatInformation : [Currency, USD, 2]
        Formula           : |-
          SUMX(InvoiceLine, InvoiceLine[UnitPrice] * InvoiceLine[Quantity])
      
      #======================================================================
      - Name              : Invoice Total Amount
      #======================================================================
        AssociatedTable   : Invoice
        Description       : 
        FormatInformation : [Currency, USD, 2]
        Formula           : |-
          SUM(Invoice[Total])
    
    
      #======================================================================
      - Name              : Invoice Lines Percentage Sales
      #======================================================================
        AssociatedTable   : InvoiceLine
        Description       : 
        FormatInformation : [PercentageNumber, 1, 0]
        Formula           : |-
          DIVIDE([Invoice Lines Total Amount], [Invoice Total Amount])
    d)

    Once the Power Pivot model is in place, we can start building pivot tables based on the DAX measures. For this task the Perl module has no added value, it is done through standard OLE automation :

      # create a Pivot Table (percentage of sales per genre, for each customer country)
      my $pcache = $workbook->PivotCaches->Create(xlExternal,
                                                  $workbook->Connections("ThisWorkbookDataModel"));
      my $ptable = $pcache->CreatePivotTable("ComputedPivot!R5C1",
                                             'Sales_by_genre_and_country');
      $ptable->CubeFields("[Measures].[Invoice Lines Percentage Sales]")->{Orientation} = xlDataField;
      $ptable->CubeFields("[Genre].[Name]")                             ->{Orientation} = xlColumnField;
      $ptable->CubeFields("[Customer].[Country]")                       ->{Orientation} = xlRowField;
    e)

    Then it is possible to write Excel formulas that extract values from the pivot cache. So for example here is the formula that retrieves the percentage of sales for the "Classical" genre in Austria :

      = CUBEVALUE("ThisWorkbookDataModel", 
                 "[Measures].[Invoice Lines Percentage Sales]",
                 "[Genre].[Name].[Classical]",
                 "[Customer].[Country].[Austria]")

AUTHOR

Laurent Dami, <dami at cpan.org>

COPYRIGHT AND LICENSE

Copyright 2023 by Laurent Dami.

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.