Name
File::Replace - Perl extension for replacing files by renaming a temp file over the original
Synopsis
This module provides three interfaces:
use File::Replace 'replace2';
my ($infh,$outfh) = replace2($filename);
while (<$infh>) {
    # write whatever you like to $outfh here
    print $outfh "X: $_";
}
close $infh;   # closing both handles will
close $outfh;  # trigger the replace
Or the more magical single filehandle, in which print, printf, and syswrite go to the output file; binmode to both; fileno only reports open/closed status; and the other I/O functions go to the input file:
use File::Replace 'replace';
my $fh = replace($filename);
while (<$fh>) {
    # can read _and_ write from/to $fh
    print $fh "Y: $_";
}
close $fh;
Or the object-oriented interface:
use File::Replace;
my $repl = File::Replace->new($filename);
my $infh = $repl->in_fh;
while (<$infh>) {
    print {$repl->out_fh} "Z: $_";
}
$repl->finish;
Description
This module implements and hides the following pattern for you:
Open a temporary file for output
While reading from the original file, write output to the temporary file
rename the temporary file over the original file
In many cases, in particular on many UNIX filesystems, the rename operation is atomic*. This means that in such cases, the original filename will always exist, and will always point to either the new or the old version of the file, so a user attempting to open and read the file will always be able to do so, and never see an unfinished version of the file while it is being written.
* Warning: Unfortunately, whether or not a rename will actually be atomic in your specific circumstances is not always an easy question to answer, as it depends on exact details of the operating system and file system. Consult your system's documentation and search the Internet for "atomic rename" for more details. This module's job is to perform the rename, and it can make no guarantees as to whether it will be atomic or not.
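As a rough illustration of the three steps above, here is a minimal sketch that uses only core Perl (File::Temp, File::Basename, and the built-in rename). The helper name rewrite_via_rename is hypothetical and is not part of File::Replace, and the sketch does not claim to mirror how the module is implemented; in particular it leaves out the permission handling and the options described later in this document.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use File::Basename qw(dirname);

# Hypothetical helper (not part of File::Replace): pass every line of
# $filename through $transform, then rename the result over the original.
sub rewrite_via_rename {
    my ($filename, $transform) = @_;
    open my $in, '<', $filename or die "open $filename: $!";
    # Create the temporary file in the same directory as the original, so
    # that rename() does not have to cross a filesystem boundary.
    my ($out, $tmpname) = tempfile('.rewrite_XXXXXXXX',
        DIR => dirname($filename), SUFFIX => '.tmp', UNLINK => 0);
    my $ok = eval {
        while (my $line = <$in>) {
            print {$out} $transform->($line) or die "print: $!";
        }
        close $out or die "close $tmpname: $!";
        close $in  or die "close $filename: $!";
        rename $tmpname, $filename or die "rename: $!";
        1;
    };
    if (!$ok) {
        my $err = $@;
        unlink $tmpname;    # best-effort cleanup of the temporary file
        die $err;
    }
    return 1;
}
```

A call such as rewrite_via_rename($filename, sub { "X: $_[0]" }) would prefix every line. Whether the final rename is atomic still depends on the operating system and file system, as noted in the warning above.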
Version
This documentation describes version 0.06 of this module.
Constructors and Overview
The functions File::Replace->new(), replace(), and replace2() take exactly the same arguments and differ only in their return values: replace and replace2 wrap the functionality of File::Replace inside tied filehandles. Note that replace() and replace2() are normal functions and not methods, so don't attempt to call them as such. If you don't want to import them, you can always call them as, for example, File::Replace::replace().
File::Replace->new( $filename );
File::Replace->new( $filename, $layers );
File::Replace->new( $filename, option => 'value', ... );
File::Replace->new( $filename, $layers, option => 'value', ... );
# replace(...) and replace2(...) take the same arguments
The constructors will open the input file and the temporary output file (the latter via File::Temp), and will die in case of errors. The options are described in "Options". It is strongly recommended that you enable warnings (use warnings;), as then this module will issue warnings which may be of interest to you.
File::Replace->new
use File::Replace;
my $replace_object = File::Replace->new($filename, ...);
Returns a new File::Replace object. The central methods provided are ->in_fh and ->out_fh, which return the input filehandle to read from and the output filehandle to write to, respectively, and ->finish, which causes the files to be closed and the replace operation to be performed. There is also ->cancel, which just discards the temporary output file without touching the input file. Additional helper methods are mentioned below.
finish will die on errors, while cancel will only return a false value on errors. This module will try to clean up after itself (remove temporary files) as best it can, even when things go wrong.
Please don't re-open the in_fh and out_fh handles, as this may lead to confusion.
The method ->is_open will return a false value if the replace operation has been finished or canceled, or a true value if it is still active. The method ->filename returns the filename passed to the constructor. The method ->options in list context returns the options this object has set (including defaults) as a list of key/value pairs; in scalar context it returns a hashref of these options.
replace
use File::Replace 'replace';
my $magic_handle = replace($filename, ...);
Returns a single, "magical" tied filehandle. The operations print, printf, and syswrite are passed through to the output filehandle, binmode operates on both the input and output handle, and fileno only reports -1 if the File::Replace object is still active or undef if the replace operation has been finished or canceled. All other I/O functions, such as <$handle>, readline, sysread, seek, tell, eof, etc. are passed through to the input handle. You can still access these operations on the output handle via e.g. eof( tied(*$handle)->out_fh ) or tied(*$handle)->out_fh->tell(). The replace operation (finish) is performed when you close the handle, which means that close may die instead of just returning a false value.
Re-opening the handle causes a new underlying File::Replace object to be created. You should explicitly close the filehandle first so that the previous replace operation is performed (or cancel that operation). The "mode" argument (or filename in the case of a two-argument open) may not contain a read/write indicator (<, >, etc.), only PerlIO layers.
You can access the underlying File::Replace object via tied(*$handle)->replace. You can also access the original, untied filehandles via tied(*$handle)->in_fh and tied(*$handle)->out_fh, but please don't close or re-open these handles as this may lead to confusion.
replace2
use File::Replace 'replace2';
my ($input_handle, $output_handle) = replace2($filename, ...);
# or, in scalar context:
my $output_handle = replace2($filename, ...);
In list context, returns a two-element list of two tied filehandles, the first being the input filehandle, and the second the output filehandle, and the replace operation (finish) is performed when both handles are closed. In scalar context, it returns only the output filehandle, and the replace operation is performed when this handle is closed. This means that close may die instead of just returning a false value.
You cannot re-open these tied filehandles.
You can access the underlying File::Replace object via tied(*$handle)->replace on both the input and output handle. You can also access the original, untied filehandles via tied(*$handle)->in_fh and tied(*$handle)->out_fh, but please don't close or re-open these handles as this may lead to confusion.
Options
Filename
A filename. The temporary output file will be created in the same directory as this file; its name will be based on the original filename, but prefixed with a dot (.) and suffixed with a random string and an extension of .tmp. If the input file does not exist (ENOENT), then the behavior will depend on the "create" option.
layers
This option can either be specified as the second argument to the constructors, or as the layers => '...' option in the options hash, but not both. It is a list of PerlIO layers such as ":utf8", ":raw:crlf", or ":encoding(UTF-16)". Note that the default layers differ based on operating system, see "open" in perlfunc.
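To make the effect of a layer string concrete, the following sketch writes the same text through two different layer lists and reports the resulting file sizes. It uses plain open rather than File::Replace, so it only illustrates what PerlIO layers do to the bytes on disk; the helper name bytes_written and the sample text are made up for this example.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical helper: write $text to $path through the given PerlIO
# layers (using plain open, not File::Replace) and return the file size.
sub bytes_written {
    my ($path, $layers, $text) = @_;
    open my $fh, ">$layers", $path or die "open $path: $!";
    print {$fh} $text;
    close $fh or die "close $path: $!";
    return -s $path;
}

my $dir  = tempdir(CLEANUP => 1);
my $text = "caf\x{e9}\n";    # five characters, one of them non-ASCII

# The same characters occupy a different number of bytes per encoding.
for my $layers (':raw:encoding(UTF-8)', ':raw:encoding(UTF-16LE)') {
    printf "%-26s %d bytes\n", $layers,
        bytes_written("$dir/out.txt", $layers, $text);
}
```

The same layer strings are what you would pass as the second constructor argument or as the layers option.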
create
This option configures the behavior of the module when the input file does not exist (ENOENT). There are three modes, which you specify as one of the following strings. If you need more precise control of the input file, see the "in_fh" option - note that create is ignored when you use that option.
"later" (default when create is omitted) - Instead of the input file, /dev/null or its equivalent is opened. This means that while the output file is being written, the input file name will not exist, and will only come into existence when the rename operation is performed.
"now" - If the input file does not exist, it is immediately created and opened. There is currently a potential race condition: if the file is created by another process before this module can create it, then the behavior is undefined - the file may be emptied of its contents, or you may be able to read its contents. This behavior may be fixed and specified in a future version. The race condition is discussed some more in "Concurrency and File Locking".
Currently, this option is implemented by opening the file with a mode of +>, meaning that it is created (clobbered) and opened in read-write mode. However, that should be considered an implementation detail that is subject to change. Do not attempt to take advantage of the read-write mode by writing to the input file - that contradicts the purpose of this module anyway. Instead, the input file will exist and remain empty until the replace operation.
"off" (or "no") - Attempting to open a nonexistent input file will cause the constructor to die.
The above values were introduced in version 0.06. Using any value other than those listed above will trigger a mandatory deprecation warning. For backwards compatibility, if you specify any other value, then a true value will be the equivalent of now, and a false value the equivalent of later. The deprecation warning will become a fatal error in a future version, to allow new values to be added in the future.
The devnull option has been deprecated as of version 0.06. Its functionality has been merged into the create option. If you use it, then the module will operate in a compatibility mode, but also issue a mandatory deprecation warning, informing you what create setting to use instead. The devnull option will be entirely removed in a future version.
in_fh
This option allows you to pass an existing input filehandle to this module, instead of having the constructors open the input file for you. Use this option if you need more precise control over how the input file is opened, e.g. if you want to use sysopen to open it. The handle must be open, which will be checked by calling fileno on the handle. The module makes no attempt to check that the filename you pass to the module matches the filehandle. The module will attempt to stat the handle to get its permissions, except when you have specified the "perms" option or disabled the "chmod" option. The "create" option is ignored when you use this option.
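The following sketch shows one way to prepare such a handle with sysopen, together with the kind of fileno and stat checks mentioned above. Both helper names (open_input and handle_perms) are hypothetical, and the checks are only an approximation written for this example, not a copy of what the module does internally.

```perl
use strict;
use warnings;
use Fcntl qw(:DEFAULT :mode);

# Hypothetical helper: open the input file read-only via sysopen.
sub open_input {
    my ($filename) = @_;
    sysopen(my $fh, $filename, O_RDONLY) or die "sysopen $filename: $!";
    return $fh;
}

# Hypothetical helper: confirm via fileno that the handle is open, then
# stat the handle and return only its permission bits.
sub handle_perms {
    my ($fh) = @_;
    defined fileno($fh) or die "handle is not open";
    my @st = stat($fh) or die "stat: $!";
    return S_IMODE($st[2]);
}

# With File::Replace installed, the handle would then be passed along as:
#   my $repl = File::Replace->new($filename, in_fh => open_input($filename));
```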
perms
perms => 0640 # ok
perms => oct("640") # ok
perms => "0640" # WRONG!
Normally, just before the rename is performed, File::Replace will chmod the temporary file to those permissions that the original file had when it was opened, or, if the original file did not yet exist, default permissions based on the current umask. Setting this option to an octal value (a number, not a string!) will override those permissions. See also "chmod", which can be used to disable the chmod operation.
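The reason a string such as "0640" is wrong is that Perl only treats a leading zero as octal in numeric literals, not when converting a string to a number. The short sketch below, which does not involve File::Replace at all, prints what each of the three spellings above actually evaluates to; the helper name as_octal is made up for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: format a number the way file modes are usually shown.
sub as_octal { return sprintf "%04o", $_[0] }

print "0640       -> ", as_octal(0640),       "\n";   # octal literal
print "oct(\"640\") -> ", as_octal(oct("640")), "\n";   # string parsed as octal
print "\"0640\"     -> ", as_octal("0640"),     "\n";   # string read as decimal 640
```

The first two lines print 0640, while the string form prints 1200, which is a different set of permissions than intended.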
chmod
This option is enabled by default, unless you set $File::Replace::DISABLE_CHMOD to a true value. When you disable this option, the chmod operation that is normally performed just before the rename will not be attempted. This is mostly intended for systems where you know the chmod will fail. See also "perms", which allows you to define what permissions will be used.
Note that the temporary files created with File::Temp will have 0600 permissions if left unchanged (except of course on systems that don't support this kind of restrictive permissions).
autocancel
If the File::Replace object is destroyed (e.g. when it goes out of scope), and the replace operation has not been performed yet, normally it will cancel the replace operation and issue a warning. Enabling this option makes that implicit canceling explicit, silencing the warning.
This option cannot be used together with autofinish.
autofinish
When set, causes the finish operation to be attempted when the object is destroyed (e.g. when it goes out of scope).
However, using this option is actually not recommended unless you know what you are doing. This is because the replace operation will also be attempted when your script is dying (via die), in which case the output file may be incomplete, and you may not want the original file to be replaced. A second reason is that the replace operation may be attempted during global destruction, and it is not a good idea to rely on this always going well. In general it is better to finish the replace operation explicitly.
This option cannot be used together with autocancel.
debug
If set to a true value, this option enables some debug output for new, finish, and cancel. You may also set this to a filehandle, and debug output will be sent there.
Notes and Caveats
Concurrency and File Locking
This module is very well suited for situations where a file has one writer and one or more readers.
Among other things, this is reflected in the case of a nonexistent file, where the "create" settings now and later (the default) are currently implemented as a two-step process, meaning there is the potential of the input file being created in the short period of time between the first and second open attempts, which this module currently will not notice.
Having multiple writers is possible, but care must be taken to ensure proper coordination of the writers!
For example, a simple flock of the input file is not enough: if there are multiple processes, remember that each process will replace the original input file by a new and different file! One possible solution would be a separate lock file that does not change and is only used for flocking. There are other possible methods, but that is currently beyond the scope of this documentation.
(For the sake of completeness, note that you cannot flock the tied handles, only the underlying filehandles.)
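The separate lock file idea could look roughly like the sketch below, which uses only core Perl (flock and the Fcntl constants). The helper name with_lock and the choice of lock file name are assumptions made for this example; this is one possible approach, not something the module provides, and every writer would have to cooperate by using the same lock file.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);

# Hypothetical helper: run $code while holding an exclusive lock on a
# separate lock file, which itself is never renamed or replaced.
sub with_lock {
    my ($lockfile, $code) = @_;
    open my $lock, '>>', $lockfile or die "open $lockfile: $!";
    flock($lock, LOCK_EX) or die "flock $lockfile: $!";
    my $ok  = eval { $code->(); 1 };
    my $err = $@;
    close $lock;    # closing the handle releases the lock
    die $err unless $ok;
    return 1;
}

# With File::Replace installed, each writer might then do something like:
#   with_lock("$filename.lock", sub {
#       my $repl = File::Replace->new($filename);
#       ...    # read from $repl->in_fh, write to $repl->out_fh
#       $repl->finish;
#   });
```

Because the lock is held on a file that is never replaced, all writers contend for the same lock, which is not the case when locking the input file itself.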
Author, Copyright, and License
Copyright (c) 2017 Hauke Daempfling (haukex@zero-g.net) at the Leibniz Institute of Freshwater Ecology and Inland Fisheries (IGB), Berlin, Germany, http://www.igb-berlin.de/
This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with this program. If not, see http://www.gnu.org/licenses/.