File::CleanupTask - Delete/Backup files on a task-based configuration


Version 0.09


    use File::CleanupTask;

    my $cleanup = File::CleanupTask->new({
        conf => "/path/to/tasks_file.tasks",
        taskname => "TASK_LABEL_IN_TASKFILE",
    });

    $cleanup->run();


Once run() is called, the cleanup operation 'TASK_LABEL_IN_TASKFILE' specified in tasks_file.tasks is performed.


A .tasks file is a text file in which one or more cleanup tasks are specified. Each task has a label and a list of options specified as shown in the following example:

    [TASK_LABEL_IN_TASKFILE]
    path                    = '/home/savio/results/'
    backup_path             = '/home/savio/old_results/'
    backup_gzip             = 1
    max_days                = 3
    recursive               = 1
    prune_empty_directories = 1
    keep_if_linked_in       = '/home/savio/results/'

    path = 'C:\\this\\is\\a\\windows\\path'

In this case, [TASK_LABEL_IN_TASKFILE] is the name of the cleanup task to be executed.

The following options can be specified under a task label:


The path to the directory containing the files to be deleted or backed up. Note that in MS Windows the backslashes in a path name must be escaped, and apostrophes are strictly required around the path name (see the example above).


If specified, files are moved to the given directory instead of being deleted. If backup_path doesn't exist, it will be created. Symlinks are not backed up. The files are backed up at the top level of backup_path in a .gz (or .tgz, depending on backup_gzip) archive, which preserves the pathnames of the archived files.


If set to "1", will gzip the files saved in backup_path. The resulting archive will preserve the pathname of the original file, and will be relative to 'path'.

For example, given the following configuration:

   path = /path/to/cleanup/
   backup_path = /path/to/backup/
   backup_gzip = 1

If /path/to/cleanup/my/target/file.txt is encountered, and it's old, it will be backed up in /path/to/backup/file.txt.gz. Uncompressing file.txt.gz using /path/to/backup as current working directory will result in:



The maximum number of days for which files in the cleanup directory are kept. If a file is older than the specified number of days, it is queued for deletion.

For example, max_days = 3 will delete files older than 3 days from the cleanup directory.

max_days defaults to 0 if it isn't specified, meaning that all the files are to be deleted.
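As a rough illustration of the max_days rule, the sketch below checks whether a file was last modified more than a given number of days ago. The helper name is_older_than is hypothetical, and measuring age from the modification time is an assumption; the module's actual check may differ.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of File::CleanupTask.
# Returns 1 if $path was last modified more than $max_days days before $now.
# Assumption: age is measured from the file's modification time (mtime).
sub is_older_than {
    my ($path, $max_days, $now) = @_;
    $now = time() unless defined $now;

    my $mtime = (stat $path)[9];
    return 0 unless defined $mtime;    # missing file: nothing to clean up

    my $age_in_days = ($now - $mtime) / 86_400;
    return $age_in_days > $max_days ? 1 : 0;
}
```

With max_days = 3, a file modified five days ago would be reported as old, while one modified yesterday would be kept.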


If set to 0, only files directly within "path" can be deleted or backed up. If set to 1, files located at any level within "path" can be deleted or backed up.


If set to 1, empty directories will be deleted regardless of their age.


A pathname to a directory that may contain symlinks. If specified, it prevents the deletion of files and directories within path that are symlinked from this directory, regardless of their age.

This option will be ignored in MS Windows or in other operating systems that don't support symlinks.


A regular expression that defines a pattern to look for. Any pathname matching this pattern will not be erased, regardless of its age. The regular expression applies to the full pathname of the file or directory.


If set to 1, immediate subfolders of path will be deleted only if all the files in them are deleted.


If specified, will apply any potential delete or backup action to the files that match the pattern. Any other file will be left untouched.

If set to 1, the symlinks inside 'path' will be deleted only if their target will be deleted. This option is disabled by default, which means that the targets of symlinks within the path are not examined during deletion or backup: the symlinks are simply treated as regular files.

This option will be ignored in MS Windows or in other operating systems that don't support symlinks.



Create and configure a new File::CleanupTask object.

The object must be initialised as follows:

    my $cleanup = File::CleanupTask->new({
        conf => "/path/to/tasks_file.tasks",
        taskname => 'TASK_LABEL_IN_TASKFILE',
    });


Processes the arguments specified on the command line, creates a new File::CleanupTask object, and then calls run.

Options include dryrun, verbose, task and conf.

dryrun: just build and show the plan, nothing will be executed or deleted.
verbose: produce more verbose output.
task: optional, will result in the execution of the specified task.
conf: the path to the .tasks configuration file.
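A minimal sketch of how these four options could be parsed with the core Getopt::Long module is shown below. The helper name parse_cleanup_args, the exact option spellings, and treating the configuration file as mandatory are all assumptions made for illustration; the sketch uses conf for the configuration file, following the option summary.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Illustrative only: parse dryrun, verbose, task and conf from an argument
# list. The option spellings accepted by the module are an assumption.
sub parse_cleanup_args {
    my (@args) = @_;
    my %opt = (dryrun => 0, verbose => 0);

    GetOptionsFromArray(
        \@args,
        'dryrun'  => \$opt{dryrun},
        'verbose' => \$opt{verbose},
        'task=s'  => \$opt{task},
        'conf=s'  => \$opt{conf},
    ) or die "Invalid command-line options\n";

    # Assumption: a configuration file is always required.
    die "Missing required --conf option\n" unless defined $opt{conf};
    return \%opt;
}
```

The returned hash reference could then supply the conf and taskname values when constructing the object.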


Perform the cleanup


Run a single cleanup task given its configuration and name. The name is used as a label for possible output and is an optional parameter of this method.

This scans all files and directories in path in a depth-first fashion. When a file or directory is encountered, a target action is chosen based on its state (file or directory, symlinked, old, empty directory...).
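To make the traversal concrete, here is a small sketch that lists every entry under a directory in depth-first order using the core File::Find module. Whether the module itself relies on File::Find, and whether it visits a directory before or after its contents, is not confirmed by this document; list_depth_first is a hypothetical helper.

```perl
use strict;
use warnings;
use File::Find qw(find);

# Illustrative only: collect every entry under $root in depth-first order,
# visiting each directory before its contents.
sub list_depth_first {
    my ($root) = @_;
    my @visited;
    find(
        {
            no_chdir => 1,    # do not change directory while walking
            wanted   => sub { push @visited, $File::Find::name },
        },
        $root,
    );
    return @visited;
}
```

Each visited pathname could then be inspected (age, symlink status, emptiness) to decide which action to plan for it.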

verbose, dryrun

Accessors that will tell you if running in dryrun or verbose mode.


Builds a delete_once_empty list of pathnames, each of which should be deleted only if all its files are also deleted.


Builds a never_delete list of pathnames that shouldn't be deleted under any condition.


Adds a path to the given never_delete list.


Checks if the given path is contained in the delete_once_empty list.


Adds a path to the given delete_once_empty list.


Checks if the given path is contained in the never_delete list.


Checks the given path and returns its absolute representation.
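One possible way to obtain an absolute representation is the core Cwd module, as sketched below. The helper name canonical_path and the choice to return undef for a non-existent path are assumptions; the module's own method may behave differently.

```perl
use strict;
use warnings;
use Cwd qw(abs_path);

# Hypothetical helper: return the absolute, symlink-resolved form of $path,
# or undef if the path does not exist.
sub canonical_path {
    my ($path) = @_;
    return unless defined $path && -e $path;
    return abs_path($path);
}
```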


Plans the actions to be executed on the files in the target path according to:

 - options in the configuration
 - the target files
 - the never_delete

All files in the never_delete list can't be deleted.


Given a path to a file and the task configuration options, augment the plan with actions to take on that file.

Returns the array containing one or more actions performed.

These actions are meant to be performed in reverse sequence on the given file. An empty array_ref is returned if no action is to be performed on the given file.

A returned action can be one of: delete, backup.

Resulting actions are decided according to one or more of the following:

 - options in the configuration
 - the target files
 - the never_delete

This method works under the assumption that the specified file or directory exists and the user has full permissions on it.


Adds the given action to the plan.


Returns 1 if the given folder is empty.
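A minimal sketch of such a check, assuming that "empty" means the folder has no entries other than "." and "..". The helper name is_folder_empty and the choice to return 0 for an unreadable folder are illustrative assumptions.

```perl
use strict;
use warnings;

# Hypothetical helper: return 1 if $dir contains no entries other than
# "." and "..", and 0 otherwise (including when $dir cannot be opened).
sub is_folder_empty {
    my ($dir) = @_;
    opendir(my $dh, $dir) or return 0;
    my @entries = grep { $_ ne '.' && $_ ne '..' } readdir $dh;
    closedir $dh;
    return @entries ? 0 : 1;
}
```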


Executes a plan based on the given task options. The blacklist is passed to make sure once again that no unwanted files or directories are deleted.


Takes into account symlinks in the current plan.

The refinement is done in the following way:

1) Go through the plan, and look for symlink targets.

2) Mark each symlink with the action of its target if the target is in the cleanup directory: keep the symlink if its target is kept, delete it otherwise (broken symlink, symlink pointing outside the cleanup directory, target being backed up...). While deciding this, build a hashref of { symlink_parent (canonical) => symlink_path (non_canonical) }.

3) Add the symlink to the plan in the correct position. To do this, build another 'refined' plan:

 - go through the pathnames in the plan (visiting parents first), popping each item.
 - if the parent of a marked symlink is found, do the following:
    * mark it as 'delete' if the symlink is going to be deleted, or as 'nothing' if the symlink is not going to be deleted.
    * push the parent in the refined plan.
    * push the symlink in the refined plan.

4) Fix the plan to have consistent state (bubble up states between pairs of directories)

Return the refined plan.

Get the parent path of a given path. This method only accesses the disk if the f_path is found to have no parent directory (i.e., just the relative file name has been specified). In this case, we check that the current working directory contains the given file. If yes, we return the current working directory as the parent of the specified file. If not, we return undef.
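The behaviour described above can be sketched with core modules as follows. The helper name parent_path is hypothetical, and the use of File::Basename is an assumption about one reasonable way to split the path.

```perl
use strict;
use warnings;
use Cwd qw(getcwd);
use File::Basename qw(dirname);
use File::Spec;

# Hypothetical helper following the description above.
sub parent_path {
    my ($f_path) = @_;
    my $parent = dirname($f_path);

    # dirname() returns "." when only a bare file name was given;
    # any other result is returned without touching the disk.
    return $parent unless $parent eq File::Spec->curdir;

    # Bare file name: the parent is the current working directory,
    # but only if the file really is there.
    my $cwd       = getcwd();
    my $candidate = File::Spec->catfile($cwd, $f_path);
    return -e $candidate ? $cwd : undef;
}
```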

Given a path to a symlink and a hash reference, keep the symlink target as a key of the hash reference (canonical path), and the path to the symlink (non canonical) as the corresponding value. Because multiple symlinks can point to the same target, the value of this hashref is an arrayref of symlinks paths.
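The data structure described here (one canonical target mapped to a list of symlink paths) can be sketched as below. The helper name register_symlink and the fallback to the raw link text when the target cannot be resolved are assumptions, not the module's documented behaviour.

```perl
use strict;
use warnings;
use Cwd qw(abs_path);

# Hypothetical helper: record $symlink in %$targets under the canonical path
# of its target. Returns 1 on success, 0 if $symlink is not a symlink.
sub register_symlink {
    my ($symlink, $targets) = @_;
    return 0 unless -l $symlink;

    # abs_path() resolves the link; it may return undef for a broken link,
    # so fall back to the raw link text in that case (an assumption).
    my $target = abs_path($symlink);
    $target = readlink $symlink unless defined $target;

    # Several symlinks may share a target, hence an array reference per key.
    push @{ $targets->{$target} }, $symlink;
    return 1;
}
```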

Returns true on success, or false if a path to something else than a symlink is passed to this method.


Refine a pattern passed from the configuration.

Currently applies the following transformation:

 - Remove any "/" in case the user has specified a pattern in the form of /pattern/.
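A minimal sketch of this kind of refinement is shown below, assuming that only one leading and one trailing "/" are stripped and that patterns not written as /pattern/ are left untouched. The helper name refine_pattern is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: strip one leading and one trailing "/" when the
# pattern was written as /pattern/; leave any other pattern unchanged.
sub refine_pattern {
    my ($pattern) = @_;
    $pattern =~ s{\A/(.*)/\z}{$1}s;
    return $pattern;
}
```

The refined string can then be compiled with qr// before being matched against full pathnames.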


Savio Dimatteo, <savio at>


Please report any bugs or feature requests to bug-file-cleanuptask at, or through the web interface at I will be notified, and then you'll automatically be notified of progress on your bug as I make changes.


You can find documentation for this module with the perldoc command.

    perldoc File::CleanupTask

You can also look for information at:


Thanks Alex for devising the original format of a .tasks file and offering me the opportunity to publish this work on CPAN.

Thanks Mike for your feedback about canonical paths detection.

Thanks David for reviewing the code.

Thanks for helping me choose the name of this module.


Copyright 2012 Savio Dimatteo.

This program is free software; you can redistribute it and/or modify it under the terms of either: the GNU General Public License as published by the Free Software Foundation; or the Artistic License.

See for more information.