NAME

File::BOM::Utils - Check, Add and Remove BOMs

Synopsis

This is scripts/synopsis.pl:

        #!/usr/bin/env perl

        use strict;
        use warnings;

        use File::BOM::Utils;
        use File::Spec;

        # -------------------

        my($bommer)    = File::BOM::Utils -> new;
        my($file_name) = File::Spec -> catfile('data', 'bom-UTF-8.xml');

        $bommer -> action('test');
        $bommer -> input_file($file_name);

        my($report) = $bommer -> file_report;

        print "BOM report for $file_name: \n";
        print join("\n", map{"$_: $$report{$_}"} sort keys %$report), "\n";

Try 'bommer.pl -h'. It is installed automatically when the module is installed.

Description

File::BOM::Utils provides a means of testing, adding and removing BOMs (Byte-Order-Marks) within files.

It also provides two hashes accessible from outside the module, which convert in both directions between BOM names and values. These hashes are called %bom2name and %name2bom.

See also bommer.pl, which is installed automatically when the module is installed.
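As a rough illustration of what a pair of two-way lookup tables can look like, here is a minimal sketch. It uses its own local hashes with the same names, and it assumes BOM values are stored as raw byte strings; the module's real %bom2name and %name2bom may store values differently (bom_report() documents the value as an integer).

```perl
use strict;
use warnings;

# Local stand-ins for illustration only. The byte sequences are the standard
# Unicode BOMs; storing them as byte strings is an assumption of this sketch.
my %name2bom = (
    'UTF-32-BE' => "\x00\x00\xFE\xFF",
    'UTF-32-LE' => "\xFF\xFE\x00\x00",
    'UTF-16-BE' => "\xFE\xFF",
    'UTF-16-LE' => "\xFF\xFE",
    'UTF-8'     => "\xEF\xBB\xBF",
);

# Invert the table so that a BOM's bytes map back to its name.
my %bom2name = reverse %name2bom;

for my $name (sort keys %name2bom) {
    my $hex = uc unpack('H*', $name2bom{$name});
    printf "%-9s => %s (%d bytes)\n", $name, $hex, length $name2bom{$name};
}
```

Inverting with reverse only works because every BOM value is distinct.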

Distributions

This module is available as a Unix-style distro (*.tgz).

See http://savage.net.au/Perl-modules/html/installing-a-module.html for help on unpacking and installing distros.

Installation

Install File::BOM::Utils as you would any Perl module:

Run:

        cpanm File::BOM::Utils

or run:

        sudo cpan File::BOM::Utils

or unpack the distro, and then either:

        perl Build.PL
        ./Build
        ./Build test
        sudo ./Build install

or:

        perl Makefile.PL
        make (or dmake or nmake)
        make test
        make install

Constructor and Initialization

new() is called as my($parser) = File::BOM::Utils -> new(k1 => v1, k2 => v2, ...).

It returns a new object of type File::BOM::Utils.

Key-value pairs accepted in the parameter list (see corresponding methods for details [e.g. "action([$string])"]):

o action => $string

Specify the action wanted:

o add

Add the BOM named with the bom_name option to input_file. Write the result to output_file.

o remove

Remove any BOM found from the input_file. Write the result to output_file.

The output is created even if the input file has no BOM, in order to not violate the Principle of Least Surprise.

o test

Print the BOM status of input_file.

The methods "bom_report([%opt])" and "file_report([%opt])" return hashrefs if you wish to avoid printed output.

Default: ''.

A value for this option is mandatory.

Note: As syntactic sugar, you may specify just the 1st letter of the action. That is why the reporting action is called test and not report: report would share its 1st letter with remove.

o bom_name => $string

Specify which BOM to add to input_file.

This option is mandatory if the action is add.

Values (always upper-case):

o UTF-32-BE
o UTF-32-LE
o UTF-16-BE
o UTF-16-LE
o UTF-8

Default: ''.

Note: These names are taken from the test data for XML::Tiny.

o input_file => $string

Specify the name of the input file. It is read in :raw mode.

A value for this option is mandatory.

Default: ''.

o output_file => $string

Specify the name of the output file for when the action is add or remove. It is written in :raw mode.

And yes, it can be the same as the input file, but does not default to the input file. That would be dangerous.

This option is mandatory if the action is add or remove.

Default: ''.

Methods

action([$string])

Here, the [] indicate an optional parameter.

Gets or sets the action name, as a string.

If you supplied an abbreviated (1st letter only) version of the action, the return value is the full name of the action.

action is a parameter to "new([%opt])".
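The expansion of a 1st-letter abbreviation into a full action name could be sketched as follows. The helper expand_action() is hypothetical and is not part of the module's API; it only shows why the three action names need distinct initials.

```perl
use strict;
use warnings;

# Hypothetical helper: map 'a', 'r' or 't' (or a full name) to the full
# action name, and return undef for anything else.
sub expand_action {
    my ($action) = @_;
    return undef unless defined $action && length $action;

    for my $name (qw(add remove test)) {
        return $name if $action eq $name || $action eq substr($name, 0, 1);
    }

    return undef;
}

print expand_action('t'), "\n";        # test
print expand_action('remove'), "\n";   # remove
```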

add([%opt])

Here, the [] indicate an optional parameter.

Adds a named BOM to the input file, and writes the result to the output file.

Returns 0.

%opt may contain these (key => value) pairs:

o bom_name => $string

The name of the BOM.

The names are listed above, under "Constructor and Initialization".

o input_file => $string
o output_file => $string
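The effect of adding a BOM can be sketched with core Perl alone. The helpers read_raw(), write_raw() and add_bom() below are hypothetical names for this illustration; the module itself is documented as using File::Slurper rather than open().

```perl
use strict;
use warnings;
use File::Spec;
use File::Temp qw(tempdir);

# Hypothetical helper: slurp a file in :raw mode.
sub read_raw {
    my ($path) = @_;
    open(my $fh, '<:raw', $path) or die "Cannot read $path: $!";
    local $/;
    my $data = <$fh>;
    close $fh;
    return defined $data ? $data : '';
}

# Hypothetical helper: write a string to a file in :raw mode.
sub write_raw {
    my ($path, $data) = @_;
    open(my $fh, '>:raw', $path) or die "Cannot write $path: $!";
    print {$fh} $data;
    close($fh) or die "Cannot close $path: $!";
    return;
}

# Read everything before writing, so the output may be the same file as the input.
sub add_bom {
    my ($bom, $input_file, $output_file) = @_;
    write_raw($output_file, $bom . read_raw($input_file));
    return 0;
}

my $dir = tempdir(CLEANUP => 1);
my $in  = File::Spec->catfile($dir, 'plain.xml');
my $out = File::Spec->catfile($dir, 'with-bom.xml');

write_raw($in, "<doc/>\n");
add_bom("\xEF\xBB\xBF", $in, $out);   # the UTF-8 BOM

printf "%d -> %d bytes\n", -s $in, -s $out;
```

Nothing in this sketch checks whether the input already starts with a BOM, so calling add_bom() twice on the same data stacks two BOMs.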

bom_name([$string])

Here, the [] indicate an optional parameter.

Gets or sets the name of the BOM to add to the input file as that file is copied to the output file.

The names are listed above, under "Constructor and Initialization".

bom_name is a parameter to "new([%opt])".

bom_report([%opt])

Here, the [] indicate an optional parameter.

Returns a hashref of statistics about the named BOM.

%opt may contain these (key => value) pairs:

o bom_name => $string

The hashref returned has these (key => value) pairs:

o length => $integer

The number of bytes in the BOM.

o name => $string

The name of the BOM.

The names are listed above, under "Constructor and Initialization".

o value => $integer

The value of the named BOM.

bom_values()

Returns an array of BOM values, sorted from longest to shortest.
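One plausible reason for a longest-to-shortest order is that the UTF-32-LE BOM begins with the same two bytes as the UTF-16-LE BOM, so a detector has to try the longer value first. The sketch below assumes BOM values are byte strings; leading_bom() is a hypothetical helper, not a method of this module.

```perl
use strict;
use warnings;

# Assumed representation: each BOM value as a byte string.
my @bom_values = (
    "\xEF\xBB\xBF", "\xFE\xFF", "\xFF\xFE",
    "\x00\x00\xFE\xFF", "\xFF\xFE\x00\x00",
);

# Longest first; ties are broken by byte order to keep the result deterministic.
my @sorted = sort { length($b) <=> length($a) || $a cmp $b } @bom_values;

# Return the first candidate which prefixes $data, or '' if none does.
sub leading_bom {
    my ($data, @candidates) = @_;
    for my $bom (@candidates) {
        return $bom if substr($data, 0, length $bom) eq $bom;
    }
    return '';
}

my $utf32le = "\xFF\xFE\x00\x00" . "A\x00\x00\x00";

print length(leading_bom($utf32le, @sorted)), "\n";           # 4
print length(leading_bom($utf32le, reverse @sorted)), "\n";   # 2, misread as UTF-16-LE
```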

data()

Returns a reference to a string holding the contents of the input file, or returns a reference to the empty string.

file_report([%opt])

Here, the [] indicate an optional parameter.

Returns a hashref of statistics about the input file.

%opt may contain these (key => value) pairs:

o input_file => $string

The hashref returned has these (key => value) pairs:

o length => $name ? $length : 0

This is the length of the BOM in bytes.

o message => $name ? "BOM name $name found" : 'No BOM found'
o name => $name || ''

The name of the BOM.

The names are listed above, under "Constructor and Initialization".

o value => $value || 0

This is the value of the BOM.
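A hashref of the shape described above could be built like this. The function report_for() is a hypothetical stand-in which works on data already in memory, and computing value by reading the BOM's bytes as one integer is an assumption of this sketch, not a statement about how the module does it.

```perl
use strict;
use warnings;

# Assumed table of BOM names and byte strings, longest first.
my @boms = (
    ['UTF-32-BE', "\x00\x00\xFE\xFF"],
    ['UTF-32-LE', "\xFF\xFE\x00\x00"],
    ['UTF-8',     "\xEF\xBB\xBF"],
    ['UTF-16-BE', "\xFE\xFF"],
    ['UTF-16-LE', "\xFF\xFE"],
);

# Hypothetical stand-in for file_report().
sub report_for {
    my ($data) = @_;
    my ($name, $value, $length) = ('', 0, 0);

    for my $entry (@boms) {
        my ($candidate, $bytes) = @$entry;
        next unless substr($data, 0, length $bytes) eq $bytes;

        ($name, $length) = ($candidate, length $bytes);
        $value           = hex unpack('H*', $bytes);   # the bytes read as one integer
        last;
    }

    return {
        length  => $length,
        message => $name ? "BOM name $name found" : 'No BOM found',
        name    => $name,
        value   => $value,
    };
}

my $report = report_for("\xEF\xBB\xBF<doc/>");

print join("\n", map {"$_: $$report{$_}"} sort keys %$report), "\n";
```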

input_file([$string])

Here, the [] indicate an optional parameter.

Gets or sets the name of the input file.

input_file is a parameter to "new([%opt])".

new([%opt])

Here, the [] indicate an optional parameter.

Returns an object of type File::BOM::Utils.

%opt may contain these (key => value) pairs:

o action => $string

The action wanted.

The actions are listed above, under "Constructor and Initialization".

o bom_name => $string

The name of the BOM.

The names are listed above, under "Constructor and Initialization".

o input_file => $string
o output_file => $string

output_file([$string])

Here, the [] indicate an optional parameter.

Gets or sets the name of the output file.

And yes, it can be the same as the input file, but does not default to the input file. That would be dangerous.

output_file is a parameter to "new([%opt])".

remove([%opt])

Here, the [] indicate an optional parameter.

Removes any BOM from the input file, and writes the result to the output file.

%opt may contain these (key => value) pairs:

o input_file => $string
o output_file => $string
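The removal step itself can be sketched on an in-memory string. strip_bom() is a hypothetical helper and the BOM list is an assumed byte-string representation; data without a BOM passes through unchanged, which matches the documented behaviour of creating the output even when the input has no BOM.

```perl
use strict;
use warnings;

# Assumed BOM byte strings, longest first so UTF-32-LE is tried before UTF-16-LE.
my @boms = (
    "\x00\x00\xFE\xFF", "\xFF\xFE\x00\x00",
    "\xEF\xBB\xBF", "\xFE\xFF", "\xFF\xFE",
);

# Hypothetical helper: return a copy of $data without its leading BOM, if any.
sub strip_bom {
    my ($data) = @_;
    for my $bom (@boms) {
        return substr($data, length $bom) if substr($data, 0, length $bom) eq $bom;
    }
    return $data;   # no BOM found: nothing to strip
}

print strip_bom("\xEF\xBB\xBF<doc/>"), "\n";   # <doc/>
print strip_bom("<doc/>"), "\n";               # <doc/>
```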

run([%opt])

Here, the [] indicate an optional parameter.

This is the only method users would normally call, but you can call any of the 3 methods mentioned next directly.

%opt is passed to "add([%opt])", "remove([%opt])" and "test([%opt])".

Returns 0.

%opt may contain these (key => value) pairs:

o action => $string

The action wanted.

The actions are listed above, under "Constructor and Initialization".

o bom_name => $string

The name of the BOM.

The names are listed above, under "Constructor and Initialization".

o input_file => $string
o output_file => $string

test([%opt])

Here, the [] indicate an optional parameter.

Prints to STDOUT various statistics pertaining to the input file.

%opt may contain these (key => value) pairs:

o input_file => $string

FAQ

How does this module read and write files?

It uses File::Slurper's read_binary() and write_binary().

What are the hashes accessible from outside the module?

They are called %bom2name and %name2bom.

The BOM names used are listed under "Constructor and Initialization".

Which program is installed when the module is installed?

It is called bommer.pl. Run it with the -h option, to display help.

How is the parameter %opt, which may be passed to many methods, handled?

The keys in %opt are used to find values which are passed to the methods named after the keys.

For instance, if you call:

        my($bommer) = File::BOM::Utils -> new(action => 'add');

        $bommer -> run(action => 'test');

Then the code calls action('test'), which sets the 'current' value of action to test.

This means that if you later call action(), the value returned is whatever was the most recent value provided (to any method) in $opt{action}. Similarly for the other parameters to "new([%opt])".

Note: As syntactic sugar, you may specify just the 1st letter of the action. That is why the reporting action is called test and not report: report would share its 1st letter with remove.
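The %opt handling described in this answer can be sketched with a small stand-alone class. My::Sketch and its _apply() method are hypothetical and exist only for this illustration; the module's own implementation may differ in detail.

```perl
use strict;
use warnings;

my @options = qw(action bom_name input_file output_file);

package My::Sketch;   # hypothetical class, standing in for File::BOM::Utils

sub new {
    my ($class, %opt) = @_;
    my %self = map { ($_ => '') } @options;   # every option defaults to ''
    my $self = bless \%self, $class;
    return $self->_apply(%opt);
}

# Generate one get/set accessor per option name.
for my $name (@options) {
    no strict 'refs';
    *{__PACKAGE__ . "::$name"} = sub {
        my ($self, @value) = @_;
        $self->{$name} = $value[0] if @value;
        return $self->{$name};
    };
}

# Pass each recognised value in %opt to the method named after its key.
sub _apply {
    my ($self, %opt) = @_;
    for my $key (grep { exists $opt{$_} } @options) {
        $self->$key($opt{$key});
    }
    return $self;
}

sub run {
    my ($self, %opt) = @_;
    $self->_apply(%opt);
    return 0;
}

package main;

my $sketch = My::Sketch->new(action => 'add');

$sketch->run(action => 'test');

print $sketch->action, "\n";   # test
```

Restricting the dispatch to a fixed list of option names keeps an unexpected key in %opt from calling an unrelated method.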

What happens if I add the same BOM twice?

The program will do as you order it to do. Hopefully, you remove one or both of the BOMs immediately after testing the output.

See Also

String::BOM.

PPI::Token::BOM.

File::BOM.

XML::Tiny, whose test data I've adopted.

File::Slurper.

Machine-Readable Change Log

The file Changes was converted into Changelog.ini by Module::Metadata::Changes.

Version Numbers

Version numbers < 1.00 represent development versions. From 1.00 up, they are production versions.

Repository

https://github.com/ronsavage/File-BOM-Utils

Support

Email the author, or log a bug on RT:

https://rt.cpan.org/Public/Dist/Display.html?Name=File::BOM::Utils.

Author

File::BOM::Utils was written by Ron Savage <ron@savage.net.au> in 2015.

Marpa's homepage: http://savage.net.au/Marpa.html.

My homepage: http://savage.net.au/.

Copyright

Australian copyright (c) 2015, Ron Savage.

        All Programs of mine are 'OSI Certified Open Source Software';
        you can redistribute them and/or modify them under the terms of
        The Artistic License 2.0, a copy of which is available at:
        http://opensource.org/licenses/alphabetical.