NAME

App::Fetchware::Util - Miscellaneous functions for App::Fetchware.

VERSION

version 1.008

SYNOPSIS

    use App::Fetchware::Util ':UTIL';


    # Logging subroutines.
    msg 'message to print to STDOUT';

    vmsg 'message to print to STDOUT';


    # Run external command subroutine.
    run_prog($program, @args);


    # Download subroutines.
    my $dir_list = download_dirlist($ftp_or_http_url);

    my $dir_list = ftp_download_dirlist($ftp_url);

    my $dir_list = http_download_dirlist($http_url);


    my $filename = download_file($url);

    my $filename = download_ftp_url($url);

    my $filename = download_http_url($url);

    my $filename = download_file_url($url);


    # Miscellaneous subroutines.
    just_filename();

    do_nothing();


    # Temporary directory subroutines.
    my $temp_dir = create_tempdir();

    my $original_cwd = original_cwd();

    cleanup_tempdir();

DESCRIPTION

App::Fetchware::Util holds miscellaneous utilities that fetchware needs for various purposes, such as logging and controlling executed processes based on -q or -v switches (msg(), vmsg(), run_prog()), subroutines for downloading directory listings (*_dirlist()) or files (download_*()) using ftp, http, or local files (file://), do_nothing() for extensions to fetchware, and subroutines for managing a temporary directory.

LOGGING SUBROUTINES

These subroutines log messages generated by fetchware by printing them to STDOUT. They do not currently support logging to a file directly, but you could redirect fetchware's standard output to a file using your shell if you want to:

    fetchware <some fetchware command> any arguments > fetchware.log
    fetchware upgrade-all > fetchware.log

Standards for using msg() and vmsg()

msg() should be used to describe the main events that happen, while vmsg() should be used to describe what all of the main subroutine calls do.

For example, cmd_uninstall() has a msg() at the beginning and at the end, and so do the main App::Fetchware subroutines that it uses such as start(), download(), unarchive(), end() and so on. They both use vmsg() to add more detailed messages about the particular, even "internal," things they do.

msg() and vmsg() are also used without parens due to their appropriate prototypes. This makes them stand out from regular old subroutine calls more.

msg()

    msg 'message to print to STDOUT' ;
    msg('message to print to STDOUT');

msg() simply takes a list of scalars, and it prints them to STDOUT according to any verbose (-v), or quiet (-q) options that the user may have provided to fetchware.

msg() will still print its arguments if the user provided a -v (verbose) argument, but it will not print its arguments if the user provided a -q (quiet) command line option.

vmsg()

    vmsg 'message to print to STDOUT' ;
    vmsg('message to print to STDOUT');

vmsg() simply takes a list of scalars, and it prints them to STDOUT according to any verbose (-v), or quiet (-q) options that the user may have provided to fetchware.

vmsg() will only print its arguments if the user provided a -v (verbose) argument, and it will not print its arguments if the user provided a -q (quiet) command line option.
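The rules above can be sketched as follows. This is only an illustration, not fetchware's implementation: my_msg(), my_vmsg(), and the $quiet and $verbose variables are hypothetical stand-ins for msg(), vmsg(), and the counters set by -q and -v.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for the -q and -v counters (not fetchware's own variables).
my ($quiet, $verbose) = (0, 0);

# Prints unless -q was given. The (@) prototype mirrors the list-taking style of msg().
sub my_msg (@) {
    return if $quiet > 0;
    print @_, "\n";
    return 1;
}

# Prints only when -v was given and -q was not.
sub my_vmsg (@) {
    return if $quiet > 0 or $verbose <= 0;
    print @_, "\n";
    return 1;
}

# Called without parens, as the documentation recommends for msg() and vmsg().
my_msg  'Downloading the archive.';
my_vmsg 'Only shown when -v was given.';
```

With both counters at 0, as in the sketch, only the first message is printed.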

EXTERNAL COMMAND SUBROUTINES

run_prog() should be the only function you use to execute external commands when you "extend your Fetchwarefile", or "write a fetchware extension", because run_prog() properly checks if the user specified the quiet switch (-q), and disables external commands from printing to STDOUT if it has been enabled.

run_prog()

    run_prog($program, @args);

    # Or let run_prog() deal with splitting the $command into multiple pieces.
    run_prog($command);

run_prog() uses system to execute the program for you. Only the secure way of avoiding the shell is used, so you cannot use any shell redirection or any shell builtins.

If the user ran fetchware with -v (verbose), then run_prog() changes none of its behavior; it still just executes the program. However, if the user runs the program with -q (quiet) specified, then the command is run using a piped open to capture the output of the program. This captured output is then ignored, because the user asked to never be bothered with the output. This piped open uses the safer, shell-avoiding syntax on systems with fork; on systems without fork, such as Windows, the older, less safe syntax is used. Backticks are avoided, because they always use the shell.

When called with only one argument, run_prog() will split that one argument into multiple pieces using Text::ParseWords's quotewords() subroutine, which properly deals with quotes just like the shell does. quotewords() is always used, even if you provide an already split-up list of arguments to run_prog().
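The techniques described above can be sketched in isolation. This is not run_prog()'s actual code: the command string is made up, and $^X (the running perl) is used only as a harmless program to execute. Note that list-form piped open is not available on every platform or perl version.

```perl
use strict;
use warnings;
use Text::ParseWords qw(quotewords);

# Split one command string on whitespace while honoring quotes.
# The second argument (keep = 0) strips the quote characters themselves.
my $command = q{tar --create --file "my archive.tar" src};
my ($program, @args) = quotewords('\s+', 0, $command);
print "program: $program\n";
print "arg: [$_]\n" for @args;

# List-form system() bypasses the shell entirely.
my @cmd = ($^X, '-e', 'print qq{hello from child\n}');
system(@cmd) == 0 or die "Failed to run [@cmd]: status $?\n";

# Quiet-mode sketch: list-form piped open also avoids the shell.
# The output is read and then simply dropped.
open(my $out, '-|', @cmd) or die "Failed to start [@cmd]: $!\n";
my @ignored = <$out>;
close $out or die "[@cmd] failed: status $?\n";
```

The quoted "my archive.tar" stays together as one argument, which is the main reason to prefer quotewords() over a plain split on whitespace.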

Executing external commands without using run_prog()

Subify the -q checking code, and paste it below, and tell users to use that if they want to use something else, and document the $fetchware::quiet variable for other users too.

msg(), vmsg(), and run_prog() determine if -v and if -q were specified by checking the values of the global variables listed below:

  • $fetchware::quiet - is 0 if -q was not specified.

  • $fetchware::verbose - is 0 if -v was not specified.

Both of these variables work the same way. If they are 0, then -q or -v was not specified. And if they are defined and greater than (>) 0, then -q or -v was specified on the command line. You should test for greater than 0, not == 1, because Fetchware takes advantage of a cool feature in Getopt::Long allowing the user to specify -v and -q more than once. This causes either $fetchware::quiet or $fetchware::verbose to be greater than one, which would cause a direct == 1 test to fail even though the user is now asking for more verbose messages. Internally, Fetchware only supports one verbosity level.
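A small sketch shows why the > 0 test matters. The option specification and the command line here are assumptions made for illustration; they are not copied from bin/fetchware, and the sketch uses its own lexical variables rather than $fetchware::quiet and $fetchware::verbose.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

my ($quiet, $verbose) = (0, 0);

# Hypothetical command line where the user gave -v twice.
my @argv = qw(-v -v install);

# The trailing "+" makes each occurrence of the option increment the counter.
GetOptionsFromArray(\@argv,
    'v|verbose+' => \$verbose,
    'q|quiet+'   => \$quiet,
) or die "Could not parse options\n";

print "verbose counter: $verbose\n";
print "== 1 test says: ", ($verbose == 1 ? 'verbose' : 'not verbose'), "\n";
print "> 0 test says:  ", ($verbose > 0  ? 'verbose' : 'not verbose'), "\n";
```

Because the counter ends up above 1, only the > 0 test reports that verbose output was requested.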

DOWNLOAD SUBROUTINES

App::Fetchware::Util's download_*() and *_dirlist() subroutines allow you to download FTP, HTTP, or local file (file://) directory listings or files respectively.

download_dirlist()

    my $dir_list = download_dirlist($url);

    my $dir_list = download_dirlist(PATH => $path);

Can be called with either a $url or a PATH parameter. When called with a $url parameter, the specified $url is downloaded using no_mirror_download_dirlist(), and returned if successful. If it fails, then each mirror the user specified is also tried until there are no more mirrors, and then an exception is thrown.

If you specify a PATH parameter instead of a $url parameter, then that path is appended to each mirror, and the resultant url is downloaded using no_mirror_download_dirlist().
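The fallback behavior can be sketched roughly as below. try_with_mirrors() and the fake fetcher are hypothetical and exist only for this illustration; the real subroutine takes its mirrors from the user's configuration and builds mirror URLs itself.

```perl
use strict;
use warnings;

# Hypothetical helper: try the original URL, then each mirror in order.
# $fetch is any coderef that returns a result or dies on failure.
sub try_with_mirrors {
    my ($fetch, $url, @mirrors) = @_;
    my @errors;
    for my $candidate ($url, @mirrors) {
        my $result = eval { $fetch->($candidate) };
        return $result if defined $result;
        push @errors, "[$candidate]: " . ($@ || "no result\n");
    }
    # Nothing worked, so throw an exception listing every failure.
    die "Every download attempt failed:\n", @errors;
}

# A fake fetcher standing in for a real download; the primary site "fails".
my $fake_fetch = sub {
    my ($candidate) = @_;
    die "connection refused\n" if $candidate =~ /primary/;
    return "listing from $candidate";
};

my $dir_list = try_with_mirrors($fake_fetch,
    'http://primary.example/pub/', 'http://mirror.example/pub/');
print "$dir_list\n";
```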

no_mirror_download_dirlist()

    my $dir_list = no_mirror_download_dirlist($ftp_or_http_url);

Downloads an ftp or http url and assumes that it will be downloading a directory listing instead of an actual file. To download an actual file, use download_file(). It returns the directory listing that it obtained from the ftp or http server. The ftp output will be an arrayref of ls -l-like lines, while the http output will be a scalar of the HTML directory listing provided by the http server.

ftp_download_dirlist()

    my $dir_list = ftp_download_dirlist($ftp_url);

Uses Net::FTP's dir() method to obtain a long directory listing. lookup() needs it in long format, so that the timestamp algorithm has access to each file's timestamp.

Returns an array ref of the directory listing.

http_download_dirlist()

    my $dir_list = http_download_dirlist($http_url);

Uses HTTP::Tiny to download a HTML directory listing from a HTTP Web server.

Returns a scalar containing the HTML-laden directory listing.

If an even number of other options are specified (a faux hash), then those options are forwarded on to HTTP::Tiny's new() method. See HTTP::Tiny for details about what these options are. For example, you could use this to add a Referrer header to your request if a download site annoyingly checks referrers.

file_download_dirlist()

    my $file_listing = file_download_dirlist($local_lookup_url);

Globs the provided $local_lookup_url, and builds a directory listing of all files in the provided directory. Then list_file_dirlist() returns a list of all of the files in the current directory.

download_file()

    my $filename = download_file($url);

    my $filename = download_file(PATH => $path);

Can be called with either a $url or a PATH parameter. When called with a $url parameter, the specified $url is downloaded using no_mirror_download_file(), and returned if successful. If it fails, then each mirror the user specified is also tried until there are no more mirrors, and then an exception is thrown.

If you specify a PATH parameter instead of a $url parameter, then that path is appended to each mirror, and the resultant url is downloaded using no_mirror_download_file().

no_mirror_download_file()

    my $filename = no_mirror_download_file($url);

Downloads one $url and assumes it is a file that will be downloaded instead of a file listing that will be returned. no_mirror_download_file() returns the file name of the file it downloads.

Like its name says, it does not try any configured mirrors at all. This subroutine should not be used; instead, download_file() should be used, because you should respect your user's desired mirrors.

download_ftp_url()

    my $filename = download_ftp_url($url);

Uses Net::FTP to download the specified FTP URL using binary mode.

download_http_url()

    my $filename = download_http_url($url);

Uses HTTP::Tiny to download the specified HTTP URL.

Supports adding extra arguments to HTTP::Tiny's new() constructor. These arguments are not checked for correctness; instead, they are simply forwarded to HTTP::Tiny, which does not check them for correctness either. HTTP::Tiny simply loops over its internal listing of what its arguments should be, and then accesses the arguments if they exist.

This was really only implemented to allow App::FetchwareX::HTMLPageSync to change its user agent string to avoid being blocked, or to avoid alarming Web developers into thinking they're being screen scraped by some obnoxious bot. HTMLPageSync is wimpy and harmless, and only downloads one page.

You would add an argument like this: download_http_url($http_url, agent => 'Firefox');

See HTTP::Tiny's documentation for what these options are.
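To see what forwarding options to HTTP::Tiny's new() amounts to, here is a sketch that only constructs the client and makes no request. build_client() is a hypothetical helper, and the agent and header values are examples. Note that the HTTP header itself is spelled Referer.

```perl
use strict;
use warnings;
use HTTP::Tiny;

# Hypothetical helper: pass a faux hash of options straight to HTTP::Tiny->new().
sub build_client {
    my %http_tiny_options = @_;
    return HTTP::Tiny->new(%http_tiny_options);
}

my $http = build_client(
    agent           => 'Firefox',
    default_headers => { Referer => 'http://example.com/downloads/' },
);

# The accessors report what the constructor stored.
print "agent:   ", $http->agent, "\n";
print "referer: ", $http->default_headers->{Referer}, "\n";
```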

download_file_url()

    my $filename = download_file_url($url);

Uses File::Copy to copy ("download") the local file to the current working directory.
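A minimal sketch of such a local "download" is shown below, assuming a Unix-style file:// URL. copy_file_url() is a hypothetical helper that copies into an explicit destination directory instead of the current working directory, so that the example cleans up after itself.

```perl
use strict;
use warnings;
use File::Basename qw(basename);
use File::Copy qw(copy);
use File::Spec;
use File::Temp qw(tempdir);

# Hypothetical helper: strip the file:// scheme and copy the file into $dest_dir.
# Returns just the file name, like the download_*() subroutines do.
sub copy_file_url {
    my ($url, $dest_dir) = @_;
    (my $path = $url) =~ s{^file://}{};
    my $target = File::Spec->catfile($dest_dir, basename($path));
    copy($path, $target) or die "Failed to copy [$path] to [$target]: $!\n";
    return basename($path);
}

# Usage with throwaway directories and a pretend archive.
my $src_dir  = tempdir(CLEANUP => 1);
my $dest_dir = tempdir(CLEANUP => 1);
my $src      = File::Spec->catfile($src_dir, 'example-1.0.tar.gz');
open(my $out, '>', $src) or die "Cannot create [$src]: $!\n";
print {$out} "pretend archive\n";
close $out or die "Cannot close [$src]: $!\n";

my $filename = copy_file_url("file://$src", $dest_dir);
print "copied $filename\n";
```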

TEMPDIR SUBROUTINES

These subroutines manage the creation of a temporary directory for you. They also implement the original_cwd() getter subroutine that returns the current working directory fetchware was at before create_tempdir() chdir()'d to the temporary directory you specify. File::Temp's tempdir() is used, and cleanup_tempdir() manages the fetchware.sem fetchware semaphore file.

create_tempdir()

    my $temp_dir = create_tempdir();

Creates a temporary directory, chmod 700's it, and chdir()'s into it.

Accepts the fake hash argument KeepTempDir => 1, which tells create_tempdir() to not delete the temporary directory when the program exits.

Also, accepts TempDir => '/tmp' to specify what temporary directory to use. The default without this argument is to use tempdir()'s default, which is whatever File::Spec's tmpdir() says to use.

The NoChown => 1 option causes create_tempdir() to not chown to config('user').
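The core steps (create, chmod 700, chdir, remember where you came from) look roughly like this when written directly against File::Temp. The template name is an assumption, and the sketch leaves out the semaphore file and the chown step.

```perl
use strict;
use warnings;
use Cwd qw(cwd);
use File::Temp qw(tempdir);

# Remember where we started, as original_cwd() does for fetchware.
my $original_cwd = cwd();

# TMPDIR => 1 places the directory under File::Spec's tmpdir().
# CLEANUP => 1 deletes it at program exit; omit it to keep the directory.
my $temp_dir = tempdir('fetchware-sketch-XXXXXXXXXX', TMPDIR => 1, CLEANUP => 1);

chmod(0700, $temp_dir) or die "Failed to chmod [$temp_dir]: $!\n";
chdir($temp_dir)       or die "Failed to chdir to [$temp_dir]: $!\n";
print "working in ", cwd(), "\n";

# Go back before exiting, so the directory is not in use when it is removed.
chdir($original_cwd) or die "Failed to chdir to [$original_cwd]: $!\n";
```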

Locking Fetchware's temp directories with a semaphore file.

In order to support fetchware clean, create_tempdir() creates a semaphore file. The file is used by fetchware clean (via bin/fetchware's cmd_clean()) to determine if another fetchware process is currently using this temporary directory. If the file is not currently locked with flock, then the entire directory is deleted using File::Path's remove_tree() function. If the file is there and locked, then the directory is skipped by cmd_clean(). Note: you can call fetchware clean with the -f or --force option to force fetchware to delete all fetchware temporary directories, even out from under any currently running fetchware process!

cleanup_tempdir() is responsible for unlocking the semaphore file that create_tempdir() creates. However, the coolest part of using flock is that if fetchware is killed in any manner, whether or not its END block or File::Temp's END block runs, the OS will still unlock the file, so no edge cases need handling, because the OS will handle them for us!
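The locking idea can be sketched as below. is_in_use() is a hypothetical probe, not cmd_clean()'s code. Whether a lock held by the same process blocks the probe depends on the platform's flock implementation, so in practice the probe would run in a different process from the one holding the lock.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);
use File::Spec;
use File::Temp qw(tempdir);

# Hypothetical probe: true if someone else holds a lock on the semaphore file.
sub is_in_use {
    my ($sem_path) = @_;
    open(my $probe, '>>', $sem_path) or die "Cannot open [$sem_path]: $!\n";
    # LOCK_NB makes flock return immediately instead of waiting.
    my $got_lock = flock($probe, LOCK_EX | LOCK_NB);
    close $probe;    # closing releases any lock the probe just took
    return !$got_lock;
}

my $dir      = tempdir(CLEANUP => 1);
my $sem_path = File::Spec->catfile($dir, 'fetchware.sem');

# The "owner" creates and locks the semaphore file.
open(my $fh_sem, '>', $sem_path) or die "Cannot create [$sem_path]: $!\n";
flock($fh_sem, LOCK_EX | LOCK_NB) or die "Cannot lock [$sem_path]: $!\n";
print "while locked: ", (is_in_use($sem_path) ? 'skip' : 'delete'), "\n";

# Closing the handle unlocks the file, just as process death would.
close $fh_sem;
print "after close:  ", (is_in_use($sem_path) ? 'skip' : 'delete'), "\n";
```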

original_cwd()

    my $original_cwd = original_cwd();

original_cwd() simply returns the value of fetchware's $original_cwd that is saved inside each create_tempdir() call. A new call to create_tempdir() will reset this value. Note: App::Fetchware's start() also calls create_tempdir(), so another call to start() will also reset original_cwd().

cleanup_tempdir()

    cleanup_tempdir();

Cleans up any temporary files or directories that anything in this process used File::Temp to create. You cannot clean up just one directory or another; instead, you must use this sparingly or in an END block, although File::Temp takes care of that for you unless you asked it not to.

It also closes $fh_sem, which is the filehandle of the 'fetchware.sem' file create_tempdir() opens and locks. By closing it in cleanup_tempdir(), we're unlocking it. According to MJD's "File Locking Tips and Traps," it's better to just close the file than to use flock to unlock it.

SECURITY SUBROUTINES

This section describes utility subroutines that can be used for checking the security of files on the file system to see if fetchware should open and use them.

safe_open()

    my $fh = safe_open($file_to_check, <<EOE);
    App-Fetchware-Extension???: Failed to open file [$file_to_check]! Because of
    OS error [$!].
    EOE

    # To open for writing instead of reading 
    my $fh = safe_open($file_to_check, <<EOE, MODE => '>');
    App-Fetchware-Extension???: Failed to open file [$file_to_check]! Because of
    OS error [$!].
    EOE

safe_open() takes $file_to_check and does a bunch of file checks on that file to determine if it's safe to open and use the contents of that file in your program. Instead of returning true or false, it returns a file handle of the file you want to check that has already been opened for you. This is done to prevent race conditions between the time safe_open() checks the file's safety and the time the caller actually opens the file.

safe_open() also takes an optional second argument that specifies a caller specific error message that replaces the generic default one.

Fetchware occasionally needs to write files especially in fetchware's new() command; therefore safe_open() also takes the fake hash argument MODE => '>', which opens the file in a mode specified by the caller. '>' is for writing for example. See perldoc -f open for a list of possible modes.

In fetchware, this subroutine is used to check that every file fetchware opens is safe to open. It is based on is_safe() and is_very_safe() from the Perl Cookbook by Tom Christiansen and Nathan Torkington.

What this subroutine checks:

  • It opens the file you give to it as an argument, and all subsequent operations are done on the opened filehandle to prevent race conditions.

  • Then it checks that the owner of the specified file is either the superuser or the user who ran fetchware.

  • It checks that the mode, as returned by File::stat's overridden stat, is not writable by group or other. Fancy MAC permissions such as Linux's extfs's extensions and fancy Windows permissions are not currently checked.

  • Then safe_open() stat's each and every parent directory that is in this file's full path, and runs the same checks that are run above on each parent directory.

  • _PC_CHOWN_RESTRICTED is not tested; instead, what is_very_safe() does is simply always done, because even with a _PC_CHOWN_RESTRICTED test, /home, for example, could be 777. This is Unix after all, and root can do anything, including screw up permissions on system directories.

If you actually are some sort of security expert, please feel free to double-check if the list of stuff to check for is complete, and perhaps even the Perl implementation to see if the subroutine really does check if safe_open($file_to_check) is actually safe.
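A much-simplified sketch of the first three checks is shown below. looks_safe() is hypothetical: it uses the effective user id as "the user who ran fetchware" and does not walk parent directories, so it is weaker than what the list above describes.

```perl
use strict;
use warnings;
use Fcntl qw(:mode);
use File::Temp qw(tempfile);

# Hypothetical, simplified check. Returns an open filehandle if the file
# looks safe, or nothing if it does not.
sub looks_safe {
    my ($path) = @_;
    # Open first, then stat the handle, so the checks apply to the very
    # file we will read from (no window for it to be swapped out).
    open(my $fh, '<', $path) or die "Failed to open [$path]: $!\n";
    my @st = stat($fh) or die "Failed to stat [$path]: $!\n";
    my ($mode, $uid) = @st[2, 4];

    # Owner must be the superuser (uid 0) or the effective user.
    return if $uid != 0 and $uid != $>;
    # The file must not be writable by group or other.
    return if $mode & (S_IWGRP | S_IWOTH);
    return $fh;
}

# Usage with a throwaway file.
my ($tmp_fh, $tmp_path) = tempfile(UNLINK => 1);
close $tmp_fh;

chmod(0644, $tmp_path) or die "chmod failed: $!\n";
print "0644: ", (looks_safe($tmp_path) ? 'safe' : 'unsafe'), "\n";

chmod(0666, $tmp_path) or die "chmod failed: $!\n";
print "0666: ", (looks_safe($tmp_path) ? 'safe' : 'unsafe'), "\n";
```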

WARNING

According to perlport's chmod() documentation, on Win32 perl's Unixish file permissions aren't supported; only "owner" is:

"Only good for changing "owner" read-write access, "group", and "other" bits are meaningless. (Win32)"

I'm not completely sure whether this means that under Win32 only owner perms mean something, or whether chmod()ing group or other bits just doesn't do anything while testing if group and other are rwx still works. This needs testing.

And remember this only applies to Win32, and fetchware has not yet been properly ported or tested under Win32.

drop_privs()

    my $output = drop_privs(sub {
        my $write_pipe = shift;
        # Do stuff as $regular_user
        ...
        # Use write_dropprivs_pipe to share variables back to parent.
        write_dropprivs_pipe($write_pipe, $var1, $var2, ...);

        }, $regular_user
    );

    # Back in the parent, use read_dropprivs_pipe() to read in whatever
    # variables the child shared with us.
    my ($var1, $var2, ...) = read_dropprivs_pipe($output);

Forks and drops privs to $regular_user, and then executes whatever is in the first argument, which should be a code reference. Throws an exception on any problems with the fork.

It only allows you to specify what the lower privileged user does. The parent process's behavior cannot be changed. All the parent does is:

  • Create a pipe to allow the child to communicate any information back to the parent.

  • Read any data the child may write to that pipe.

  • After the child has died, collect the child's exit status.

  • And return the output the child wrote on the pipe as a scalar reference.

Whatever the child writes is returned. drop_privs() does not use Storable or JSON or XML or anything. It is up to you to specify how the data is to be represented and used. However, read_dropprivs_pipe() and write_dropprivs_pipe() are provided. They provide a simple way to store multiple variables that can have any character in them including newline. See their documentation for details.
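The parent's side of this arrangement can be sketched as follows for a Unix-like system with fork(). run_in_child() is hypothetical and deliberately omits the privilege drop itself, so it only illustrates the pipe, the fork, and the scalar reference that comes back.

```perl
use strict;
use warnings;

# Hypothetical sketch of the fork-and-pipe pattern, without dropping privileges.
sub run_in_child {
    my ($child_code) = @_;
    pipe(my $read, my $write) or die "pipe failed: $!\n";

    my $pid = fork();
    die "fork failed: $!\n" unless defined $pid;

    if ($pid == 0) {
        # Child: a real implementation would drop privileges here.
        close $read;
        $child_code->($write);
        close $write;
        exit 0;
    }

    # Parent: read everything the child wrote, then collect its exit status.
    close $write;
    my $output = do { local $/; <$read> };
    close $read;
    waitpid($pid, 0);
    die "Child exited with status [$?]\n" if $? != 0;

    return \$output;
}

my $output = run_in_child(sub {
    my $write_pipe = shift;
    print {$write_pipe} "hello from $$";
});
print "child said: $$output\n";
```

As in drop_privs(), the returned value is plain data; how it is structured and parsed is up to the caller.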

SECURITY NOTICE

The output returned by drop_privs() is whatever the child wants it to be. If somehow the child got hacked, the $output could be something that could cause the parent (which has root perms!) to execute some code, or otherwise do something that could cause the child to gain root access. So be sure to check how you use drop_privs()'s return value, and definitely don't just string eval it. Structure it so the return value can only be used as data for variables, and that those variables are never executed by root.

drop_privs() handles being on non-Unix platforms for you. On a platform that is not Unix and does not have Unix's fork() and exec() security model, drop_privs() simply executes the provided code reference without dropping privileges.

USABILITY NOTICE

drop_privs()'s implementation depends on start() creating a tempdir and chdir()ing to it. Furthermore, drop_privs() sometimes creates a tempdir of its own, and it does not do a chdir back to another directory, so drop_privs() depends on end() to chdir back to original_cwd(). Therefore, do not use drop_privs() without also using start() and end() to manage a temporary directory for drop_privs().

drop_privs() also supports a SkipTempDirCreation => 1 option that turns off drop_privs() creating a temporary directory to give the child a writable temporary directory. This option is only used by cmd_new(), and probably only really needs to be used there. Also, note that you must provide this option after the $child_code coderef and the $regular_user argument, like so: my $output = drop_privs($child_code, $regular_user, SkipTempDirCreation => 1);

drop_privs() PIPE PARSING UTILITIES

drop_privs() uses a pipe for IPC between the child and the parent. This section contains utilities that help users of drop_privs() parse the input and output they send from the child back to the parent.

Use write_dropprivs_pipe() to send data back to the parent, that later you'll read with read_dropprivs_pipe() back in the parent.

write_dropprivs_pipe()

    write_dropprivs_pipe($write_pipe, $variable1, $variable2, $variable3);

Simply uses the caller provided $write_pipe file handle to write the rest of its args to that file handle separated by a magic number.

This magic number is just generated uniquely each time App::Fetchware::Util is compiled. This number replaces using newline to separate each of the variables that write_dropprivs_pipe() writes. This way you can include newline, and in fact anything that does not contain the magic number, which is obviously suitably unlikely.

read_dropprivs_pipe()

    my ($variable1, $variable2, $variable3) = read_dropprivs_pipe($output);

read_dropprivs_pipe() opens the scalar $output, and returns a list of $output's parsed-out variables split on the $MAGIC_NUMBER, which is randomly generated each time you run Fetchware to make it unlikely that your data ever actually contains it.
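The separator idea can be sketched like this. pack_vars(), unpack_vars(), and the way $MAGIC is generated are assumptions made for the illustration; the module's own separator and exact wire format may differ.

```perl
use strict;
use warnings;

# Hypothetical separator: 20 random digits, regenerated on every run.
my $MAGIC = join '', map { int rand 10 } 1 .. 20;

# Write all variables to the handle, separated by the magic string.
sub pack_vars {
    my ($fh, @vars) = @_;
    print {$fh} join($MAGIC, @vars);
    return;
}

# Split the scalar behind $output_ref back into the original variables.
# The -1 limit keeps trailing empty strings that split would otherwise drop.
sub unpack_vars {
    my ($output_ref) = @_;
    return split /\Q$MAGIC\E/, $$output_ref, -1;
}

# Round trip through an in-memory filehandle standing in for the pipe.
my $buffer = '';
open(my $write_pipe, '>', \$buffer) or die "Cannot open in-memory handle: $!\n";
pack_vars($write_pipe, "line one\nline two", '/usr/local/bin', '');
close $write_pipe;

my @back = unpack_vars(\$buffer);
print scalar(@back), " variables recovered\n";
```

Because the separator is not a newline, values containing newlines survive the round trip unchanged.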

MISCELLANEOUS UTILITY SUBROUTINES

This is just a catch-all category for everything else in App::Fetchware::Util.

do_nothing()

    do_nothing();

do_nothing() does nothing but return. It simply returns doing nothing. It is meant to be used by App::Fetchware "subclasses" that "override" App::Fetchware's API subroutines to make those API subroutines do nothing.

ERRORS

As with the rest of App::Fetchware, App::Fetchware::Util does not return any error codes; instead, all errors are die()'d if it's App::Fetchware::Util's error, or croak()'d if it's the caller's fault. These exceptions are simple strings, and are listed in the "DIAGNOSTICS" section below.

BUGS

App::Fetchware::Util's temporary directory creation utilities, create_tempdir(), original_cwd(), and cleanup_tempdir(), only keep track of one tempdir at a time. If you create another tempdir with create_tempdir() it will override the value of original_cwd(), which may mess up other functions that call create_tempdir(), original_cwd(), and cleanup_tempdir(). Therefore, be careful when you call these functions, and do not use them inside a fetchware extension if you reuse App::Fetchware's start() and end(), because App::Fetchware's start() and end() use these functions, so your use of them will conflict. If you still need to create a tempdir just call File::Temp's tempdir() directly.

AUTHOR

David Yingling <deeelwy@gmail.com>

COPYRIGHT AND LICENSE

This software is copyright (c) 2013 by David Yingling.

This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.