Name

Data::Edit::Conversion - Perform a restartable series of steps in parallel.

Synopsis

Launch the conversion of several files, each represented by a project, in parallel processes. The state of each project is saved after each step of the conversion, so a subsequent conversion can be restarted at a later step; this speeds up development by bypassing the initial processing steps unless they are really needed. The data and stepTimes of each project are transferred back from the project's sub process to the main calling process so that the main process can process the results further.

use warnings FATAL=>qw(all);
use strict;
use Test::More tests=>90;
use File::Touch;
use Data::Table::Text qw(:all);                                               # Supplies makePath, clearFolder, writeFile, fpe and readFile
use Data::Edit::Conversion;

my $N = 8;                                                                    # Number of test files == projects per launch

makePath(my $inDir = q(in)); clearFolder($inDir, 20);                         # Create and clear folders

my $tAge = File::Touch->new(mtime=>int time - 100);                           # Age file
   $tAge->touch(writeFile(fpe($inDir, $_, q(xml)), <<END)) for 1..$N;         # Create and age $N test files
$_
END

my $convert = sub {my ($p) = @_; $p->data = $p->data =~ s(\s) ()gsr x 2};     # Convert one project

my $l = Data::Edit::Conversion::new                                           # Convert $N projects in parallel
 (projects => Data::Edit::Conversion::loadProjectsFromFolder($inDir,qw(xml)),
  convert  =>
   [[load  => sub {my ($p) = @_; $p->data = readFile($p->source)}],           # Load a project
    [c1    => $convert],
    [c2    => $convert],
    [c3    => $convert],
   ],
  maximumNumberOfProcesses => $N,
 );

my $verify = sub                                                              # Verify launch results
 {my (@stepsExecuted) = @_;                                                   # Steps that should have been executed
  ok $l->projectData($_) eq $_ x 8 for 1..$N;                                 # Check result of each conversion
  is_deeply [sort keys %{$l->projectSteps($_)}], [@stepsExecuted] for 1..$N;  # Check expected steps have been executed
 };

$l->launch;           &$verify(qw(c1 c2 c3 load));                            # Full run
$l->restart(q(load)); &$verify(qw(c1 c2 c3 load));                            # Restart the launch at various points
$l->restart(q(c1));   &$verify(qw(c1 c2 c3));
$l->restart(q(c2));   &$verify(qw(c2 c3));
$l->restart(q(c3));   &$verify(qw(c3));

File::Touch->new(mtime=>int time + 100)->touch(qq($inDir/1.xml));             # Renew source file to force all the steps to be redone despite requesting a restart
$l->restart(q(c2), "After touch");
ok $l->projectData($_) eq $_ x 8 for 1..$N;
is_deeply [sort keys %{$l->projectSteps(1)}], [qw(c1 c2 c3 load)];
is_deeply [sort keys %{$l->projectSteps(2)}], [qw(c2 c3)];
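
The last test above relies on a renewed source file forcing every step to be redone. How the module makes that decision is not shown in this document. The following is a minimal sketch only, assuming the decision compares the modification time of the source file with that of a saved step file; the helper saveIsStale and the file names are hypothetical and use core Perl alone.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical helper: true if the save file is missing or older than the source
sub saveIsStale {
  my ($source, $save) = @_;
  return 1 unless -e $save;                       # No saved state: must rerun
  my $sourceTime = (stat $source)[9];             # mtime of the source file
  my $saveTime   = (stat $save)[9];               # mtime of the saved state
  return $sourceTime > $saveTime ? 1 : 0;
}

my $dir    = tempdir(CLEANUP => 1);
my $source = "$dir/1.xml";                        # Assumed source file name
my $save   = "$dir/1.c2.data";                    # Assumed save file name

for my $file ($source, $save) {                   # Create both files
  open(my $fh, '>', $file) or die "Cannot write $file: $!";
  print {$fh} "1\n";
  close $fh;
}

my $now = time;
utime $now, $now - 100, $source;                  # Source older than save: save can be reused
utime $now, $now,       $save;
my $staleBefore = saveIsStale($source, $save);

utime $now, $now + 100, $source;                  # Source renewed: all steps must be redone
my $staleAfter  = saveIsStale($source, $save);

print "before=$staleBefore after=$staleAfter\n";  # before=0 after=1
```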

Description

The following sections describe the methods in each functional area of this module. For an alphabetic listing of all methods by name see Index.

Methods

Specify and run the restartable conversion of zero or more files in parallel

new(@)

Create a conversion specification for zero or more files represented by projects.

   Parameter    Description
1  @attributes  Launch attributes describing the launch

This is a static method and so should be invoked as:

Data::Edit::Conversion::new

launch($$$)

Launch the conversion of several files, each represented by a project, in parallel.

   Parameter  Description
1  $launch    Launch specification
2  $title     Optional title
3  $restart   Optional name of latest step to restart at.
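
To illustrate the idea behind a parallel launch, here is a minimal sketch in core Perl. It is not the module's implementation: it assumes one forked process per project, a cap on concurrent processes playing the role of maximumNumberOfProcesses, and results passed back to the parent through Storable files. The names $maxParallel, %running and reapOne are hypothetical.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use POSIX ();
use Storable qw(store retrieve);

my $dir         = tempdir(CLEANUP => 1);          # Area for results passed back to the parent
my @projects    = (1..8);                         # Stand-ins for project names
my $maxParallel = 3;                              # Assumed limit on concurrent processes
my %running;                                      # pid => project name

sub reapOne {                                     # Wait for any one child to finish
  my $pid = wait;
  die "No child to wait for" if $pid == -1;
  die "Project $running{$pid} failed" if $?;
  delete $running{$pid};
}

for my $project (@projects) {
  reapOne() while keys(%running) >= $maxParallel; # Throttle to the limit
  my $pid = fork;
  die "Cannot fork: $!" unless defined $pid;
  if ($pid == 0) {                                # Child: convert, save the result, exit
    store({data => $project x 8}, "$dir/$project.data");
    POSIX::_exit(0);                              # Skip END blocks so the parent keeps $dir
  }
  $running{$pid} = $project;                      # Parent: track the child
}
reapOne() while keys %running;                    # Wait for the remaining children

my %results = map {$_ => retrieve("$dir/$_.data")->{data}} @projects;
print "$_ => $results{$_}\n" for sort keys %results;
```

Passing results through files keeps the sketch short; a pipe per child would avoid the temporary files at the cost of more bookkeeping.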

restart($$$)

Launch the conversion of several files represented by projects in parallel, starting at the specified step. The data saved by the previous step is restored if it exists; otherwise the conversion is run from the latest prior step for which saved data is available, or right from the start if there is none.

   Parameter  Description
1  $launch    Launch specification
2  $restart   Step to restart at
3  $title     Optional title
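
The fallback rule described for restart can be sketched as follows. This is an illustration under stated assumptions rather than the module's code: each step's output is checkpointed with Storable in a file named after the step, and the runner runFrom, the %stepNumber lookup and the file naming are all hypothetical.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use Storable qw(store retrieve);

my $saveDir = tempdir(CLEANUP => 1);              # Stand-in for the save folder
my @steps   =                                     # [step name => sub] pairs, in order
 ([load => sub {"1\n"}],
  [c1   => sub {$_[0] =~ s(\s) ()gsr x 2}],
  [c2   => sub {$_[0] =~ s(\s) ()gsr x 2}],
 );
my %stepNumber = map {$steps[$_][0] => $_} 0..$#steps;

# Hypothetical runner: returns the final data and the names of the steps executed
sub runFrom {
  my ($restart) = @_;
  my $start = defined $restart ? $stepNumber{$restart} : 0;
  my $data;
  while ($start > 0) {                            # Look for saved state from the prior step
    my $file = "$saveDir/$steps[$start-1][0].data";
    if (-e $file) {$data = ${retrieve($file)}; last}
    $start--;                                     # Not saved: fall back one step
  }
  my @executed;
  for my $i ($start..$#steps) {
    my ($name, $sub) = @{$steps[$i]};
    $data = $sub->($data);
    store(\$data, "$saveDir/$name.data");         # Checkpoint after each step
    push @executed, $name;
  }
  return ($data, \@executed);
}

my ($full,    $fullSteps)    = runFrom();         # Full run executes load c1 c2
my ($partial, $partialSteps) = runFrom('c2');     # Restart executes c2 only
print "$full: @$fullSteps\n$partial: @$partialSteps\n";
```

If the checkpoint for c1 were missing, the loop would step back to the checkpoint for load, and failing that would run everything from the start.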

Launch Attributes

Use these attributes to configure a launch.

convert :lvalue

I [[step name => sub]...] A list of steps, each paired with the sub that processes that step. At the end of each step the data held in data is saved to allow for a later restart at the next step.

maximumNumberOfProcesses :lvalue

I Maximum number of processes to run in parallel

out :lvalue

I Optional file output area. This area will be cleared at the start of each launch.

outFileLimit :lvalue

I Limit on the number of files to be cleared from the out folder at the start of each launch.

projects :lvalue

I A reference to a hash of Data::Edit::Conversion::Project definitions. This can be most easily created by using loadProjectsFromFolder.

save :lvalue

I Temporary files will be stored in this folder

stepNumberByName :lvalue

O Get the number of a step from its name

stepsByNumber :lvalue

O Array of steps to be performed. The subs in this array call the user supplied subs after appropriate set up and then do the required tear down after the execution of each step.

loadProjectsFromFolder($@)

Create a project for each file in and below the specified folder that has one of the specified extensions, and return the projects created.

   Parameter    Description
1  $dir         Folder to search
2  @extensions  List of file extensions to search for

This is a static method and so should be invoked as:

Data::Edit::Conversion::loadProjectsFromFolder
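
For readers who want to see the kind of search involved, here is a rough core Perl sketch. It assumes that a project is named after its file name without the extension and it returns a plain hash of name to source file, not the Data::Edit::Conversion::Project objects the real method creates. The function findProjects is hypothetical.

```perl
use strict;
use warnings;
use File::Basename qw(fileparse);
use File::Find;
use File::Path qw(make_path);
use File::Temp qw(tempdir);

# Hypothetical stand-in: map project name => source file for matching extensions
sub findProjects {
  my ($dir, @extensions) = @_;
  my %wanted = map {$_ => 1} @extensions;
  my %projects;
  find(sub {
    return unless -f $_;                          # Files only
    my ($name, undef, $suffix) = fileparse($_, qr/\.[^.]*/);
    $suffix =~ s/\A\.//;                          # ".xml" becomes "xml"
    $projects{$name} = $File::Find::name if $wanted{$suffix};
  }, $dir);
  return \%projects;
}

my $dir = tempdir(CLEANUP => 1);
make_path("$dir/sub");                            # Search covers sub folders too
for my $file ("$dir/1.xml", "$dir/sub/2.xml", "$dir/3.txt") {
  open(my $fh, '>', $file) or die "Cannot write $file: $!";
  close $fh;
}

my $projects = findProjects($dir, qw(xml));       # Finds 1.xml and sub/2.xml only
print "$_ => $projects->{$_}\n" for sort keys %$projects;
```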

projectData($$)

Get data for a project after a launch has completed

   Parameter     Description
1  $launch       Launch specification
2  $projectName  Project

projectSteps($$)

Get the step times for a project after a launch has completed: a hash giving the execution time in seconds of each step that was run. If a step name is not present in this hash then the step was not run.

   Parameter     Description
1  $launch       Launch specification
2  $projectName  Project
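
The shape of such a hash can be sketched in a few lines. This is an assumption about how the times might be gathered, not the module's code: the runner timeSteps is hypothetical and uses Time::HiRes for sub second resolution.

```perl
use strict;
use warnings;
use Time::HiRes qw(time);

my @steps =                                       # [step name => sub] pairs, in order
 ([load => sub {"1\n"}],
  [c1   => sub {$_[0] =~ s(\s) ()gsr x 2}],
  [c2   => sub {$_[0] =~ s(\s) ()gsr x 2}],
 );

# Hypothetical runner: execute the steps from index $first and return their times
sub timeSteps {
  my ($first, $data) = @_;
  my %stepTimes;
  for my $step (@steps[$first..$#steps]) {
    my ($name, $sub) = @$step;
    my $start = time();                           # High resolution start time
    $data = $sub->($data);
    $stepTimes{$name} = time() - $start;          # Elapsed seconds for this step
  }
  return \%stepTimes;
}

my $full    = timeSteps(0);                       # All steps run
my $partial = timeSteps(2, "11");                 # Only c2 runs, given the output of c1
print join(' ', sort keys %$full),    "\n";       # c1 c2 load
print join(' ', sort keys %$partial), "\n";       # c2
```

The absence of load and c1 from the second hash is what the tests in the synopsis check after a restart.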

Project

A project is one input file to be converted in one or more restartable steps.

new()

Create a project to describe the conversion of a source file containing xml representing documentation into one or more Dita topics.

This is a static method and so should be invoked as:

Data::Edit::Conversion::new

name :lvalue

I Name of project.

number :lvalue

I Number of the project.

source :lvalue

I Input file containing the source xml.

data :lvalue

O Per project data being converted

stepTimes :lvalue

O Hash of steps processed during a launch

title :lvalue

I Title of the project.

Private Methods

defaultMaximumNumberOfProcesses()

Default maximum number of processes to use during the conversion

defaultOutFileLimit()

Default maximum number of files to clear at a time.

stepSaveFile($$$)

Save file for a project and a step

   Parameter     Description
1  $launch       Launch specification
2  $projectName  Project
3  $step         Step name

deleteProject($$$)

Delete results before executing a particular step

   Parameter     Description
1  $launch       Launch specification
2  $projectName  Project
3  $step         Step

saveProject($$$)

Save project at a particular step

   Parameter     Description
1  $launch       Launch specification
2  $projectName  Project
3  $step         Step

loadProject($$$)

Load a project at a particular step

   Parameter     Description
1  $launch       Launch specification
2  $projectName  Project
3  $stepNumber   Step to reload

launchProject($$$)

Convert a single project in a separate process

   Parameter     Description
1  $launch       Launch specification
2  $projectName  Project to be processed
3  $restart      Optional latest step to restart at

Index

1 convert

2 data

3 defaultMaximumNumberOfProcesses

4 defaultOutFileLimit

5 deleteProject

6 launch

7 launchProject

8 loadProject

9 loadProjectsFromFolder

10 maximumNumberOfProcesses

11 name

12 new

13 number

14 out

15 outFileLimit

16 projectData

17 projects

18 projectSteps

19 restart

20 save

21 saveProject

22 source

23 stepNumberByName

24 stepSaveFile

25 stepsByNumber

26 stepTimes

27 title

Installation

This module is written in 100% Pure Perl and, thus, it is easy to read, comprehend, use, modify and install via cpan:

sudo cpan install Data::Edit::Conversion

Author

philiprbrenan@gmail.com

http://www.appaapps.com

Copyright

Copyright (c) 2016-2018 Philip R Brenan.

This module is free software. It may be used, redistributed and/or modified under the same terms as Perl itself.