NAME

CGI::Uploader - Manage CGI uploads using SQL database

Synopsis

 my $u = CGI::Uploader->new(
     spec => {
         # Upload one image named from the form field 'img'
         # and create one thumbnail for it.
         img => [
             { name => 'img_thumb_1', w => 100, h => 100 },
         ],
     },

     updir_url  => 'http://localhost/uploads',
     updir_path => '/home/user/www/uploads',

     dbh   => $dbh,
     query => $q, # defaults to CGI->new(),
 );

 # ... now do something with $u

Description

This module is designed to help with the task of managing files uploaded through a CGI application. The files are stored on the file system, and the file attributes stored in a SQL database.

Introduction and Recipes

The CGI::Uploader::Cookbook provides a slightly more in-depth introduction and recipes for a basic BREAD (Browse, Read, Edit, Add, Delete) web application.

Constructor

new()

 my $u = CGI::Uploader->new(
     spec => {
         img_1 => [
             # The first image has 2 different sized thumbnails
             # that need to be created.
             { name => 'img_1_thumb_1', w => 100, h => 100 },
             { name => 'img_1_thumb_2', w => 50,  h => 50  },
         ],

         # No thumbnails
         img_2 => [],

         # Downsize the large image to these maximum dimensions if it's larger
         img_3 => {
             downsize => { w => 430 },
             thumbs   => [
                 { name => 'img_3_thumb_1', w => 200 },
             ],
         },

         # Advanced:
         # Define a spec for all images matching a regular expression
         # and generate the thumbnail names based on each field name
         qr/^photos_/ => {
             thumbs => [
                 {
                     # Braces stop Perl from reading this as $n_thumb
                     name => sub { my $n = shift; "${n}_thumb" },
                     w    => 200,
                 },
             ],
         },
     },

     updir_url  => 'http://localhost/uploads',
     updir_path => '/home/user/www/uploads',

     dbh   => $dbh,
     query => $q, # defaults to CGI->new(),

     up_table => 'uploads',        # defaults to "uploads"
     up_seq   => 'upload_id_seq',  # Required for Postgres
 );

spec [required]

The specification is described in the examples above. The keys correspond to form field names for upload fields. Keys may also be given as regular expressions, which will cause keys to be added for any file upload field that matches.

The values are array references or hash references. The simplest case is an empty array reference, which means no thumbnails will be created.

Each element in the array is a hash reference with the following keys: 'name', 'w', 'h'. These correspond to the name, max width, and max height of the thumbnail.

name can be a simple scalar, or code ref. This code ref will be passed one argument, the file name, and should return a string to use for the thumbnail. This is useful in combination with specifying the keys as regular expressions.
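
As a minimal sketch of the calling convention described above, the following shows how a name given as either a scalar or a code ref might be resolved. The helper resolve_thumb_name and the field name photos_beach are hypothetical and are not part of CGI::Uploader.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of CGI::Uploader: return the name as-is
# when it is a plain scalar, or call it with one argument when it is a
# code ref.
sub resolve_thumb_name {
    my ($name, $arg) = @_;
    return ref($name) eq 'CODE' ? $name->($arg) : $name;
}

my $fixed   = resolve_thumb_name('img_1_thumb_1', 'img_1');
my $derived = resolve_thumb_name(
    sub { my $n = shift; "${n}_thumb" },
    'photos_beach',
);

print "$fixed\n";    # img_1_thumb_1
print "$derived\n";  # photos_beach_thumb
```

Note the braces in "${n}_thumb". Without them Perl would look for a variable named $n_thumb.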

Also notice there is an option to 'downsize' the large image if needed. For the downsize and thumbnail size specifications, only one dimension needs to be provided, if that's all you care about.

updir_url [required]

URL to upload storage directory. Should not include a trailing slash.

updir_path [required]

File system path to upload storage directory. Should not include a trailing slash.

dbh [required]

DBI database handle. Required.

query

A CGI.pm-compatible object, used for the param and upload functions. Defaults to CGI->new() if omitted.

up_table

Name of the SQL table where uploads are stored. See example syntax above or one of the creation scripts included in the distribution. Defaults to "uploads" if omitted.

up_table_map

A hash reference which defines a mapping between the column names used in your SQL table, and those that CGI::Uploader uses. The keys are the CGI::Uploader default names. Values are the names that are actually used in your table.

This is not required. It simply allows you to use custom column names.

  upload_id       => 'upload_id',
  mime_type       => 'mime_type',
  extension       => 'extension',
  width           => 'width',
  height          => 'height',   
  thumbnail_of_id => 'thumbnail_of_id',

You may also define additional column names with a value of 'undef'. This feature is only useful if you override the extract_meta() method or pass in $shared_meta to store_uploads(). Values for these additional columns will then be stored by store_meta() and retrieved with fk_meta().
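
As an illustration, a map with custom column names and one additional column might look like the sketch below. The custom column names (id, content_type, parent_id) are made up for this example. The additional column uploaded_user_id follows the description above and is declared with a value of undef.

```perl
use strict;
use warnings;

# Keys are the CGI::Uploader default names; values are the column names
# used in your own table. The custom names here are hypothetical.
my %up_table_map = (
    upload_id       => 'id',
    mime_type       => 'content_type',
    extension       => 'extension',
    width           => 'width',
    height          => 'height',
    thumbnail_of_id => 'parent_id',

    # Additional column, declared with a value of undef
    uploaded_user_id => undef,
);

# Pass it to the constructor as:  up_table_map => \%up_table_map
```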

up_seq

For Postgres only, the name of a sequence used to generate the upload_ids. Defaults to upload_id_seq if omitted.

file_scheme

 file_scheme => 'md5',

file_scheme controls how files are stored on the file system. The default is simple, which stores all the files in the same directory with names like 123.jpg. Depending on your environment, this may be sufficient to store 10,000 or more files.

As an alternative, you can specify md5, which will create three levels of directories based on the first three letters of the ID's md5 sum. The result may look like this:

 2/0/2/123.jpg

This should scale well to millions of files. If you want even more control, consider overriding the build_loc() method.
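
The md5 layout can be sketched as follows. This is an illustration of the scheme as described above, not the module's actual build_loc() implementation. The function name md5_loc is hypothetical, and the extension is assumed to include its leading dot.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# Hypothetical helper: build a relative path with three directory levels
# taken from the first three hex characters of the ID's MD5 sum.
sub md5_loc {
    my ($id, $ext) = @_;
    my @dirs = split //, substr(md5_hex($id), 0, 3);
    return join '/', @dirs, "$id$ext";
}

print md5_loc(123, '.jpg'), "\n";   # 2/0/2/123.jpg
```

Because each level is a single hex character, each directory holds at most 16 subdirectories, which keeps directory listings small as the number of uploads grows.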

Basic Methods

These basic methods are all you need to know to make effective use of this module.

store_uploads()

  my $entity = $u->store_uploads($form_data);

Stores uploaded files based on the definition given in spec.

Specifically, it does the following:

 - creates any needed thumbnails

 - stores all the files on the file system

 - inserts upload details into the database, including upload_id, mime_type and extension. The columns 'width' and 'height' will be populated if that meta data is available.

As input, a hash reference of form data is expected. The simplest way to get this is like this:

 use CGI;
 my $q = CGI->new;
 my $form_data = $q->Vars;

However, I recommend that you validate your data with a module such as Data::FormValidator, and use a hash reference of validated data, instead of directly using the CGI form data.

CGI::Uploader is designed to handle uploads that are included as a part of an add/edit form for an entity stored in a database. So, $form_data is expected to contain additional fields for this entity as well as the file upload fields.

For this reason, the store_uploads method returns a hash reference of the valid data with some transformations. File upload fields will be removed from the hash, and corresponding "_id" fields will be added.

So for a file upload field named 'img_field', the 'img_field' key will be removed from the hash and 'img_field_id' will be added, with the appropriate upload ID as the value.
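
The shape of that transformation can be sketched in plain Perl. The helper entity_from_form below is hypothetical and only mimics the returned hash; the real method also stores the files and meta data.

```perl
use strict;
use warnings;

# Hypothetical helper: copy the form data, drop each upload field and
# add a matching "_id" key holding the upload ID.
sub entity_from_form {
    my ($form_data, $upload_ids) = @_;
    my %entity = %$form_data;
    for my $field (keys %$upload_ids) {
        delete $entity{$field};
        $entity{"${field}_id"} = $upload_ids->{$field};
    }
    return \%entity;
}

my $entity = entity_from_form(
    { title => 'My photo', img_field => 'photo.jpg' },  # form data
    { img_field => 523 },                               # assumed upload ID
);

# $entity now holds: title => 'My photo', img_field_id => 523
```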

store_uploads takes an optional second argument as well:

  my $entity = $u->store_uploads($form_data,$shared_meta);

This is a hash reference of additional meta data that you want to store for all of the uploads you are storing. For example, you may wish to store an "uploaded_user_id".

The keys should be column names that exist in your uploads table. The values should be appropriate data for the column. Only the key names defined by the up_table_map in new() will be used. Other values in the hash will be ignored.

delete_checked_uploads()

 my @fk_col_names = $u->delete_checked_uploads;

This method deletes all uploads and any associated thumbnails based on form input. Both files and meta data are removed.

It looks through all the field names defined in spec. For an upload named img_1, a field named img_1_delete is checked to see if it has a true value.

A list of the field names is returned, each with '_id' appended, such as:

 img_1_id

The expectation is that you have foreign keys with these names defined in another table. Having the names in this format allows you to easily set these fields to NULL in a database update:

 $entity->{$_} = undef for @fk_col_names;

fk_meta()

 my $href = $u->fk_meta($table,\%where,@prefixes);

Returns a hash reference of information about the file, useful for passing to a templating system. Here's an example of what the contents of $href might look like:

 {
     file_1_id     => 523,
     file_1_url    => 'http://localhost/images/uploads/523.pdf',
 }

If the files happen to be images and have their width and height defined in the database row, template variables will be made for these as well.

Here's an example syntax of calling the function:

 my $href = $u->fk_meta('news',{ item_id => 23 },qw/file_1/);

This fetches the file information from the upload table using the row where news.item_id = 23 AND news.file_1_id = uploads.upload_id.

The %where hash mentioned here is a SQL::Abstract where clause. The complete SQL that is used to fetch the data will be built like this:

 SELECT upload_id as id,width,height,extension 
    FROM uploads, $table 
    WHERE (upload_id = ${prefix}_id AND (%where_clause_expanded here));

Class Methods

These are some handy class methods that you can use without the need to first create an object using new().

upload()

 # As a class method
 ($tmp_filename,$uploaded_mt,$file_name) = 
        CGI::Uploader->upload('file_field',$q);

 # As an object method
 ($tmp_filename,$uploaded_mt,$file_name) = 
        $u->upload('file_field');

The function is responsible for actually uploading the file.

It can be called as a class method or an object method. As a class method, it's necessary to provide a query object as the second argument. As an object method, the query object given to the constructor is used.

Input:

 - file field name

Output:

 - temporary file name
 - uploaded MIME type
 - name of the uploaded file (the value of the file form field)

Currently CGI.pm, CGI::Simple and Apache::Request are supported.

gen_thumb()

 ($thumb_tmp_filename)  = CGI::Uploader->gen_thumb(
    filename => $orig_filename,
    w => $width,
    h => $height,
    );

This function creates a copy of the given image file and resizes the copy to the provided width and height.

gen_thumb can be called as an object or class method. As a class method, there is no need to call new() before calling this method.

Input:

 - filename: filename of source image
 - w: max width of thumbnail
 - h: max height of thumbnail

One or both of w or h is required.

Output:

 - filename of generated tmp file for the thumbnail

Upload Methods

These are high-level methods to manage the file and meta data parts of an upload, as well as its thumbnails. If you are doing something more complex or customized you may want to call or override one of the methods below.

store_upload()

 my %entity_upload_extra = $u->store_upload(
    file_field    => $file_field,
    src_file      => $tmp_filename,
    uploaded_mt   => $uploaded_mt,
    file_name     => $file_name,
    shared_meta   => $shared_meta,  # optional
    id_to_update  => $id_to_update, # optional
 );

Does all the processing for a single upload, after it has been uploaded to a temp file already.

It returns a hash of key/value pairs as described in "store_uploads()".

create_store_thumbs()

 my %thumb_ids = $u->create_store_thumbs(
                file_field      => $file_field,
                meta            => $meta_href,
                src_file        => $tmp_filename,
                thumbnail_of_id => $thumbnail_of_id,
    );

This method is responsible for creating and storing any needed thumbnails.

Input:

 - file_field: file field name
 - meta: a hash ref of meta data, as extract_meta would produce
 - src_file: path to temporary file of the file upload
 - thumbnail_of_id: ID of upload that thumbnails will be made from

delete_upload()

  $u->delete_upload($upload_id);

This method is used to delete the meta data and file associated with an upload. Usually it's more convenient to use delete_checked_uploads than to call this method directly.

This method does not delete thumbnails for this upload.

delete_thumbs()

 $self->delete_thumbs($id);

Deletes the thumbnails for a given file ID, from the file system and the database.

Meta-data Methods

extract_meta()

 $self->extract_meta($file_field)

This method extracts and returns the meta data about a file.

Input: A file field name.

Returns: a hash reference of meta data, following this example:

 {
         mime_type => 'image/gif',
         extension => '.gif',
         bytes     => 60234,
 
         # only for images
         width     => 50,
         height    => 50,
 }

store_meta()

 my $id = $self->store_meta($file_field,$meta);  

This function is used to store the meta data of a file upload.

Input:

 - file field name
 - A hashref of key/value pairs to be stored. Only the key names defined by the up_table_map in new() will be used. Other values in the hash will be ignored.
 - Optionally, an upload ID can be passed, causing an 'Update' to happen instead of an 'Insert'

Output:

 - The id of the file stored. The id is generated by store_meta().

delete_meta()

 my $dbi_rv = $self->delete_meta($id);

Deletes the meta data for a file and returns the DBI return value for this operation.

transform_meta()

 my %meta_to_display = $u->transform_meta(
     meta   => $meta_from_db,
     prefix => 'my_field',
     prevent_browser_caching => 0,
     fields => [qw/id url width height/],
 );

Prepares meta data from the database for display.

Input:

 - meta: A hashref, as might be returned from "SELECT * FROM uploads"

 - prefix: the resulting hashref keys will be prefixed with this,
   adding an underscore as well.

 - prevent_browser_caching: If set to true, a random query string
   will be added, preventing browsers from caching the image. This is very
   useful when displaying an image on an 'update' page. Defaults to true.

 - fields: An arrayref of fields to format. The values here must be
   keys in the up_table_map. Two field names are special. 'id' is
   used to denote the upload_id. 'url' combines several fields into
   a URL to link to the upload.

Output:

 - A formatted hash.

See "fk_meta()" for example output.

File Methods

store_file()

 $self->store_file($file_field,$tmp_file,$id,$ext);

Stores an upload file or dies if there is an error.

Input:

 - file field name
 - path to tmp file for uploaded image
 - file id, as generated by store_meta()
 - file extension, as discovered by extract_meta

Output: none

delete_file()

 $self->delete_file($id);

Called from within delete_upload, this routine deletes the actual file. Don't delete the meta data first, as you may need it to build the path name of the file to delete.

Utility Methods

build_loc()

 my $up_loc = $self->build_loc($id,$ext);

Builds a path to access a single upload, relative to updir_path. This is used for both file-system and URL access. Also see the file_scheme option to new(), which affects its behavior.

upload_field_names()

 # As a class method
 (@file_field_names) = CGI::Uploader->upload_field_names($q);

 # As an object method
 (@file_field_names) = $u->upload_field_names();

Returns the names of all form fields which contain file uploads. Empty file upload fields may be excluded.

This can be useful for auto-generating a spec.

Input: - A query object is required as input only when called as a class method.

Output: - an array of the file upload field names.

spec_names()

 $spec_names = $u->spec_names('file_field');

With no arguments, returns an array of all the upload names defined in the spec, including any thumbnail names.

A file field name from the spec can also be provided as a single argument. The method then returns that name as well as the names of any related thumbnails.

Author

Mark Stosberg <mark@summersault.com>

Thanks

A special thanks to David Manura for his detailed and persistent feedback in the early days, when the documentation was wild and rough.

Barbie, for the first patch.

License

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.