NAME

WWW::Analytics::MultiTouch - Multi-touch web analytics, using Google Analytics

SYNOPSIS

    use WWW::Analytics::MultiTouch;

    # Simple, all-in-one approach
    WWW::Analytics::MultiTouch->process(id => $analytics_id,
                                        start_date => '2010-01-01',
                                        end_date => '2010-02-01',
                                        filename => 'report.xls');

    # Or step by step
    my $mt = WWW::Analytics::MultiTouch->new(id => $analytics_id);
    $mt->get_data(start_date => '2010-01-01',
                  end_date => '2010-02-01');

    $mt->summarise(window_length => 45);
    $mt->report(filename => 'report-45day.xls');
    
    $mt->summarise(window_length => 30);
    $mt->report(filename => 'report-30day.xls');

DESCRIPTION

This module provides reporting for multi-touch web analytics, as described at http://www.multitouchanalytics.com.

Unlike typical last-session attribution web analytics, multi-touch gives insight into all of the various marketing channels to which a visitor is exposed before finally making the decision to buy.

Multi-touch analytics uses a JavaScript library to send information from a web user's browser to Google Analytics for raw data collection. This module then uses the Google Analytics API to collate the data and summarises it in a spreadsheet, showing (for example):

  • Summary of marketing channels and number of transactions to which each channel had some contribution (sum of transactions > total transactions)

  • Summary of channels and fair attribution of transactions (sum of transactions = total transactions)

  • First touch, last touch, fifty-fifty first/last touch, and even attribution of transactions.

  • Overlap analysis

  • Transaction/touch distribution

  • List of each transaction and the contributing channels

GOOGLE ACCOUNT AUTHORISATION

In order to give permission for the multitouch reporting to access your data, you must follow the authorisation process. On first use, a URL will be displayed. You must click on this URL or cut and paste it into a browser, log in as the Google user that has access to the Google Analytics profile that you wish to analyse, grant permission, and paste the resulting authorisation code into the console. After this, the authorisation tokens will be stored and there should be no need to repeat the process.

In case you need to change user or profile or re-authenticate, see the information on the auth_file option.

BASIC USAGE

process

    WWW::Analytics::MultiTouch->process(%options)

The process() function integrates all of the steps required to generate a report into one, i.e. it creates a WWW::Analytics::MultiTouch object, fetches data from the Google Analytics API, summarises the data and generates a report.

Options available are all of the options for new, get_data, summarise and report. The only mandatory option is id; start_date, end_date and filename are typically given as well.

Fetching the data from Google is usually the most time-consuming part of the process. The process() function is suitable if only one set of parameters is to be used for the reports. To generate multiple reports using, for example, different attribution windows, it is more efficient to use the full API to fetch the data once and then run all the needed reports.

METHODS

new

  my $mt = WWW::Analytics::MultiTouch->new(%options)

Creates a new WWW::Analytics::MultiTouch object.

Options are:

  • id

    This is the Google Analytics reporting ID. This parameter is mandatory. This is NOT the ID that you use in the JavaScript code! You can find the reporting ID in the URL when you log into the Google Analytics console; it is the number following the letter 'p' in the URL, e.g.

      https://www.google.com/analytics/web/#dashboard/default/a111111w222222p123456/

    In this example, the ID is 123456.

  • auth_file

    This is the file in which authentication keys received from Google are kept for subsequent use. The default filename is derived from the configuration file (look for a file in the same directory as the configuration file ending in '.auth'). You may specify an alternative filename if you wish.

    The auth_file will be created on initial usage when authorisation keys are received from Google. If you need to change the Google username, or re-authorise the software for any other reason, delete the auth_file or specify an auth_file of a different name that does not exist. Then the initial authorisation process will be repeated and a new auth_file will be created.

  • event_category

    The name of the event category used in Google Analytics to store multi-touch data. Defaults to 'multitouch' and only needs to be changed if the equivalent variable in the associated javascript library has been customised.

  • fieldsep, recsep

    Field and record separators for stored multi-touch data. These default to '!' and '*' respectively and only need to be changed if the equivalent variables in the associated JavaScript library have been customised.

  • patsep

    The pattern separator for turning source, medium and subcategory information into a "channel" identifier. See the channel_pattern option under summarise for more information. Defaults to '-'.

  • channel_map

    This is a hashref that maps each channel name (as extracted by applying channel_pattern) to a more friendly name. For example, if channel_pattern is 'med-subcat', then direct traffic appears as '(none)-(none)', organic traffic as 'organic-(none)', etc. An appropriate channel_map might be:

        channel_map => {
                          '(none)-(none)' => 'Direct',
                          'organic-(none)' => 'Organic'
                       }

  • date_format, time_format

    The format to be used for printing dates and times, respectively, using strftime patterns. See "strftime Patterns" in DateTime for details. Defaults are '%d %b %Y' (e.g. 1 Jan 2010) and '%Y-%m-%d %H:%M:%S' (e.g. 2010-01-01 01:00:00).

  • ga_timezone, report_timezone

    Timezone used by Google Analytics, and timezone to be used in the reports, respectively. May be specified either as an Olson DB time zone name ("America/Chicago", "UTC") or an offset string ("+0600"). Default is UTC for both.

  • revenue_scale

    Scaling factor for revenue amounts. Useful if, for example, you wish to display revenue in thousands of dollars instead of dollars.

  • debug

    Enable debug output.
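
As a rough illustration of how patsep, channel_pattern (see summarise) and channel_map fit together, the following standalone sketch derives a channel name from a single touch. It does not call this module: the helper channel_name and the sample touches are hypothetical, and the module's internal implementation may differ.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of WWW::Analytics::MultiTouch.
# Builds a channel identifier from a touch according to a pattern such as
# 'med-subcat', then maps it to a friendly name if the map defines one.
sub channel_name {
    my ($touch, $pattern, $patsep, $channel_map) = @_;
    my @fields = split /\Q$patsep\E/, $pattern;        # e.g. ('med', 'subcat')
    my $raw    = join $patsep, map { $touch->{$_} } @fields;
    return exists $channel_map->{$raw} ? $channel_map->{$raw} : $raw;
}

my %channel_map = (
    '(none)-(none)'  => 'Direct',
    'organic-(none)' => 'Organic',
);

# Sample touches, invented for illustration only.
my $direct = { source => '(direct)', med => '(none)', subcat => '(none)' };
my $paid   = { source => 'google',   med => 'cpc',    subcat => 'brand'  };

print channel_name($direct, 'med-subcat', '-', \%channel_map), "\n";
print channel_name($paid,   'med-subcat', '-', \%channel_map), "\n";
```

With these inputs the first touch maps to the friendly name 'Direct', while the second has no entry in the map and keeps its extracted name 'cpc-brand'.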

get_data

  $mt->get_data(%options)

Get data via the Google Analytics API.

Options are:

  • start_date, end_date

    Start and end dates respectively. The total interval includes both start and end dates. Date format is YYYY-MM-DD or YYYYMMDD. These dates are interpreted in the report timezone.

summarise

  $mt->summarise(%options)

Summarise data.

Options are:

  • window_length

    The analysis window length, in days. Only touches this many days prior to any given order will be included in the analysis.

  • single_order_model

    If set, any touch is counted only once, toward the next order only; subsequent repeat orders do not include touches prior to the initial order.

  • channel_pattern

    Each "channel" is derived from the Google source (source), Google medium (med) and a subcategory (subcat) field that can be set in the javascript calls, joined using the pattern separator patsep (defined in new, default '-').

    For example, the source might be 'yahoo' or 'google' and the medium 'organic' or 'cpc'. To see a report on channels yahoo-organic, google-organic, google-cpc etc, the channel pattern would be 'source-med'. To see the report just at the search engine level, channel pattern would be 'source', and to see the report just at the medium level, the channel pattern would be 'med'.

    Arbitrary ordering is permissible, e.g. med-subcat-source.

    The default channel pattern is 'source-med-subcat'.

  • channel

    A hashref containing channel-specific options. This is a mapping from channel name (the friendly name, given in the channel_map) to option hash.

    Currently the only option is 'requires_first_touch' (boolean). If set, a transaction will only be attributed to the channel if it received the first touch in the analysis window. This is mainly used to correct for over-attribution to the direct channel. Example:

      {
        Direct => { requires_first_touch => 1 }
      }

  • adjustments

    A hashref containing transactions and revenue corrections that may be applied for a given day. This allows, for example, compensation for lost data for short periods of time. A typical form of the adjustments hash is:

      {
        '2010-09-01' => { transactions => 1.4, revenue => 1.3 }
      }

    which would apply correction factors of 1.4 and 1.3 for transaction counts and revenue respectively for any transactions occurring on the date 2010-09-01.
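
To make the effect of the adjustments option concrete, here is a small standalone sketch of applying per-day correction factors. It assumes that the factors simply multiply the transaction count and the revenue; the helper adjusted is hypothetical and is not the module's own code.

```perl
use strict;
use warnings;

# Adjustments keyed by date. The keys are quoted strings so that Perl
# does not try to evaluate 2010-09-01 as an arithmetic expression.
my %adjustments = (
    '2010-09-01' => { transactions => 1.4, revenue => 1.3 },
);

# Hypothetical helper: returns the weighted (count, revenue) for a single
# transaction, with both factors defaulting to 1 on unadjusted days.
sub adjusted {
    my ($date, $revenue) = @_;
    my $adj = $adjustments{$date} || {};
    my $t = defined $adj->{transactions} ? $adj->{transactions} : 1;
    my $r = defined $adj->{revenue}      ? $adj->{revenue}      : 1;
    return ($t, $revenue * $r);
}

my ($count, $rev) = adjusted('2010-09-01', 100);
printf "%.1f transactions, %.2f revenue\n", $count, $rev;
```

Under that assumption, a single transaction worth 100 on 2010-09-01 would count as 1.4 transactions with revenue 130, and a transaction on any other day would be left unchanged.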

report

  $mt->report(%options)

Generate reports.

Report Type Options

  • all_touches_report

    If set, the generated report includes the all-touches report; enabled by default. The all-touches report shows, for each channel, the total number of transactions and the total revenue amount in which that channel played a role. Since multiple channels may have contributed to each transaction, the total of all transactions across all channels will exceed the actual number of transactions.

  • even_touches_report

    If set, the generated report includes the even-touches report; enabled by default. The even-touches report shows, for each channel, a number of transactions and revenue amount evenly distributed between the participating channels. For example, if Channel A has 3 touches and Channel B 2 touches, half of the revenue/transactions will be allocated to Channel A and half to Channel B. Since each individual transaction is evenly distributed across the contributing channels, the total of all transactions (revenue) across all channels will equal the actual number of transactions (revenue).

  • distributed_touches_report

    If set, the generated report includes the distributed-touches report; enabled by default. The distributed-touches report shows, for each channel, a number of transactions and revenue amount in proportion to the number of touches for that channel. Since each individual transaction is distributed across the contributing channels, the total of all transactions (revenue) across all channels will equal the actual number of transactions (revenue).

  • first_touch_report

    If set, the generated report includes the first-touch report; enabled by default. The first-touch report allocates transactions and revenue to the channel that received the first touch within the analysis window.

  • last_touch_report

    If set, the generated report includes the last-touch report; enabled by default. The last-touch report allocates transactions and revenue to the channel that received the last touch prior to the transaction.

  • fifty_fifty_report

    If set, the generated report includes the fifty-fifty report; enabled by default. The fifty-fifty report allocates transactions and revenue equally between first touch and last touch contributors.

  • transactions_report

    If set, the generated report includes the transactions report; enabled by default. The transactions report lists each transaction and the channels that contributed to it.

  • touchlist_report

    If set, the generated report includes the touchlist report; enabled by default. The touchlist report lists the touches for each transaction in chronological order. Note that this can be a very large amount of data compared to other reports.

  • transaction_distribution_report

    If set, the generated report includes the transaction distribution report; enabled by default. The transaction distribution report shows the number of transactions that had one touch, two touches, etc, both by channel and as a total.

  • channel_overlap_report

    If set, the generated report includes the channel overlap report; enabled by default. The channel overlap report shows the number of transactions that were touched by 1 channel, 2 channels, etc, and the number of transactions by channel combination.
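
The difference between the attribution models described above is easiest to see with numbers. The following standalone sketch works through a single transaction with three touches from channel A and two from channel B, as in the even-touches example. The touch order, revenue figure and variable names are invented for illustration, and the code is not taken from the module.

```perl
use strict;
use warnings;

# Touches for one transaction, oldest first (channel names only).
my @touches = qw(A B A A B);
my $revenue = 100;

my %count;
$count{$_}++ for @touches;
my @channels = sort keys %count;

my (%all, %even, %dist, %first, %last, %fifty);
for my $ch (@channels) {
    $all{$ch}  = $revenue;                            # all touches: full credit to each channel
    $even{$ch} = $revenue / @channels;                # even: equal share per participating channel
    $dist{$ch} = $revenue * $count{$ch} / @touches;   # distributed: in proportion to touches
}
$first{ $touches[0] }   = $revenue;                   # first touch takes everything
$last{ $touches[-1] }   = $revenue;                   # last touch takes everything
$fifty{ $touches[0] }  += $revenue / 2;               # half to the first touch
$fifty{ $touches[-1] } += $revenue / 2;               # half to the last touch

for my $ch (@channels) {
    printf "%s: all=%d even=%d distributed=%d first=%d last=%d fifty-fifty=%d\n",
        $ch, $all{$ch}, $even{$ch}, $dist{$ch},
        $first{$ch} || 0, $last{$ch} || 0, $fifty{$ch} || 0;
}
```

Here the all-touches figures sum to 200, which exceeds the actual revenue of 100, whereas every other model sums to exactly 100.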

Report Output Options

  • filename

    Name of file in which to save reports. If not specified, output is sent to STDOUT. The filename extension, if given, is used to determine the file format, which can be xls, csv or txt.

  • format

    May be set to xls, csv or txt to specify Excel, CSV and Text format output respectively. The filename extension takes precedence over this parameter.

Report Formatting Options

  • title

    Title to insert into reports.

  • column_heading_format

    The cell format (see "CELL FORMATS" in WWW::Analytics::MultiTouch) to use for column headings.

  • column_formats

    An array of one or more cell formats (see "CELL FORMATS" in WWW::Analytics::MultiTouch) to use in a round-robin manner across the columns of the data.

  • header_layout, footer_layout

    Page headers and footers. See "HEADERS AND FOOTERS" in WWW::Analytics::MultiTouch for details.

  • strict_integer_values

    If set, transactions and revenue will be reported in integer formats. Where a reasonable number of transactions are being counted, the fractional part of the transaction count in a distributed transactions report is rarely of consequence, and for some the concept of a fractional transaction attribution can be a distraction from the key messages of these reports, so this option helps to keep it simple.

  • heading_map

    A mapping from default report headings to custom report headings. For example,

      heading_map => { 
                       Transactions => 'Distributed Transactions',
                       'Revenue' => 'Distributed Revenue (US$)'
                     }

For every type of report (all_touches_report, first_touch_report, transactions_report, etc), report-specific formatting options can be given in a hashref with the corresponding name, e.g. 'all_touches', 'first_touch', 'transactions'. For example,

  column_heading_format => { 
                             bold => 1,
                             color => 'white',
                             bg_color => 'gray',
                             right => 'white',
                           },
  column_formats => [ 
                      { bg_color => '#D0D0D0', },
                      { bg_color => '#E8E8E8', },
                    ],

  all_touches => {
    column_heading_format => { 
                               color => 'blue'
                             }
                 }
      

The report-specific options are merged with the top level options and then used.

all_touches_report

even_touches_report

distributed_touches_report

first_touch_report

last_touch_report

fifty_fifty_report

touchlist_report

transaction_distribution_report

channel_overlap_report

These implement the individual reports, taking options similar to those described under report above.

DEVELOPER METHODS

As well as the user API methods listed above, there are also a number of methods that have been exposed as part of the API for developer purposes, e.g. for developing subclasses to override specific functionality, or for integrating into systems other than Google Analytics.

set_data

    $mt->set_data(start_date => 20100101,
                  end_date => 20100130,
                  transactions => \%transactions,
    );

Instead of invoking get_data to retrieve data from Google Analytics, it is also possible to set data directly - e.g. data collected through another mechanism, or data from Google Analytics that has been saved to file. set_data allows you to directly specify the data for subsequent analysis.

Parameters 'start_date' and 'end_date' are used for reporting and should have the form YYYYMMDD.

Parameter 'transactions' is a hash of transaction ID to [date (YMD), list of touches] (see split_events for description).

split_events

    @events = $mt->split_events($cookie_value)

Splits event label (i.e. 'multitouch' cookie value) into a list comprising order and touch arrayrefs. A touch has format [ source, medium, subcat, time ] and an order has format [ '__ORD', transactionID, revenue, time ].
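
The shape of the returned list can be illustrated with a standalone sketch. It assumes, without confirmation from this documentation, that records in the label are joined by recsep ('*') and fields within a record by fieldsep ('!'). The function split_label, the sample label and the use of epoch seconds for times are all hypothetical; the real split_events may parse the value differently.

```perl
use strict;
use warnings;

# Assumed layout: records joined by recsep, fields joined by fieldsep.
# Returns a list of arrayrefs, one per record.
sub split_label {
    my ($label, $fieldsep, $recsep) = @_;
    return map { [ split /\Q$fieldsep\E/, $_ ] } split /\Q$recsep\E/, $label;
}

# Invented sample value: one touch followed by one order.
my $label  = 'google!cpc!brand!1262304000*__ORD!T1001!99.50!1262390400';
my @events = split_label($label, '!', '*');

for my $event (@events) {
    if ($event->[0] eq '__ORD') {
        # Order: [ '__ORD', transactionID, revenue, time ]
        my (undef, $id, $revenue, $time) = @$event;
        print "order $id revenue $revenue at $time\n";
    }
    else {
        # Touch: [ source, medium, subcat, time ]
        my ($source, $medium, $subcat, $time) = @$event;
        print "touch $source/$medium/$subcat at $time\n";
    }
}
```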

condition_entry

    ($conditioned_key, $conditioned_touches) = $self->condition_entry($key, \@touches)

condition_entry is called by get_data for each data entry retrieved from Google Analytics. Subclasses may override it when special data conditioning is required. $key is the event label (transaction ID) and @touches is the list of touches, each touch being an array as described under split_events. Conditioning might include removal of duplicates, or normalisation of transaction IDs.
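
As an example of the kind of conditioning a subclass might perform, the standalone function below trims and upper-cases the transaction ID and drops exact duplicate touches. It is only a sketch: the function name condition and the particular normalisation rules are hypothetical, and in practice the logic would live in an overridden condition_entry method.

```perl
use strict;
use warnings;

# Hypothetical conditioning logic with the same inputs and outputs as
# condition_entry: takes ($key, \@touches), returns ($key, \@touches).
sub condition {
    my ($key, $touches) = @_;

    # Normalise the transaction ID: strip surrounding whitespace, upper-case.
    (my $clean_key = $key) =~ s/^\s+|\s+$//g;
    $clean_key = uc $clean_key;

    # Remove touches that are exact duplicates of an earlier touch.
    my %seen;
    my @unique = grep { !$seen{ join "\0", @$_ }++ } @$touches;

    return ($clean_key, \@unique);
}

my @touches = (
    [ 'google', 'cpc',     'brand',  1262304000 ],
    [ 'google', 'cpc',     'brand',  1262304000 ],   # duplicate of the first
    [ 'yahoo',  'organic', '(none)', 1262307600 ],
);

my ($key, $conditioned) = condition(' t1001 ', \@touches);
printf "%s has %d touches\n", $key, scalar @$conditioned;
```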

parse_config

    $opts = WWW::Analytics::MultiTouch->parse_config($opts, $config_file)

Parses $config_file and merges options with $opts.

RELATED INFORMATION

See http://www.multitouchanalytics.com for further details.

AUTHOR

Jon Schutz, <jon at jschutz.net>

BUGS

Please report any bugs or feature requests to bug-www-analytics-multitouch at rt.cpan.org, or through the web interface at http://rt.cpan.org/NoAuth/ReportBug.html?Queue=WWW-Analytics-MultiTouch. I will be notified, and then you'll automatically be notified of progress on your bug as I make changes.

SUPPORT

You can find documentation for this module with the perldoc command.

    perldoc WWW::Analytics::MultiTouch

COPYRIGHT & LICENSE

 Copyright 2010 YourAmigo Ltd.

 Permission is hereby granted, free of charge, to any person obtaining a copy
 of this software and associated documentation files (the "Software"), to deal
 in the Software without restriction, including without limitation the rights
 to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 copies of the Software, and to permit persons to whom the Software is
 furnished to do so, subject to the following conditions:

 The above copyright notice and this permission notice shall be included in
 all copies or substantial portions of the Software.

 THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
 AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
 OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
 THE SOFTWARE.