NAME

HTTP::Promise::Entity - HTTP Entity Class

SYNOPSIS

    use HTTP::Promise::Entity;
    my $this = HTTP::Promise::Entity->new || die( HTTP::Promise::Entity->error, "\n" );

VERSION

    v0.2.1

DESCRIPTION

This class represents an HTTP entity, which is an object containing a headers object and a body object. It is agnostic to the type of HTTP message (request or response) it is associated with and can be used recursively, such as to represent a part in a multipart HTTP message. Its purpose is to provide an API to access and manipulate an HTTP message entity.

Here is how it relates to the other classes.

    +-------------------------+    +--------------------------+    
    |                         |    |                          |    
    | HTTP::Promise::Request  |    | HTTP::Promise::Response  |    
    |                         |    |                          |    
    +------------|------------+    +-------------|------------+    
                 |                               |                 
                 |                               |                 
                 |                               |                 
                 |  +------------------------+   |                 
                 |  |                        |   |                 
                 +--- HTTP::Promise::Message |---+                 
                    |                        |                     
                    +------------|-----------+                     
                                 |                                 
                                 |                                 
                    +------------|-----------+                     
                    |                        |                     
                    | HTTP::Promise::Entity  |                     
                    |                        |                     
                    +------------|-----------+                     
                                 |                                 
                                 |                                 
                    +------------|-----------+                     
                    |                        |                     
                    | HTTP::Promise::Body    |                     
                    |                        |                     
                    +------------------------+                     

CONSTRUCTOR

new

This instantiates a new HTTP::Promise::Entity object and returns it. It takes the following options, which can also be set or retrieved with their related method.

  • compression_min

    Integer. Size threshold, in bytes, beyond which the associated body can be compressed. This defaults to 204800 (200 KB). Set it to 0 to disable it.

  • effective_type

    String. The effective mime-type. This defaults to undef.

  • epilogue

    An array reference of strings to be added after all the parts in a multipart message. Each array reference entry is treated as one line. This defaults to undef.

  • ext_vary

    Boolean. Setting this to a true value will have "decode_body" and "encode_body" change the entity body file extension to reflect the encoding or decoding applied.

    See "ext_vary" for an example.

  • headers

    This is an HTTP::Promise::Headers object. This defaults to undef.

  • is_encoded

    Boolean. This is a flag used to determine whether the related entity body is encoded or not. This defaults to undef.

    See also "content_encoding" in HTTP::Promise::Headers.

  • output_dir

    This is the path to the directory used when extracting the body to files on the filesystem. This defaults to undef.

  • preamble

    An array reference of strings to be added after the headers and before the first part in a multipart message. Each array reference entry is treated as one line. This defaults to undef.

METHODS

add_part

Provided with an HTTP::Promise::Entity object, this will add it to the stack of parts for this entity.

It returns the part added, or upon error, sets an error and returns undef.

as_form_data

If the entity is of type multipart/form-data, this will transform all of its parts into an HTTP::Promise::Body::Form::Data object.

It returns the new HTTP::Promise::Body::Form::Data object upon success, or 0 if there was nothing to be done, for example if the entity is not multipart/form-data. Upon error, it sets an error and returns undef.

Note that this is memory-efficient: even though it breaks down the parts into an HTTP::Promise::Body::Form::Data object, original entity bodies that were stored on file remain on file. Each key of the HTTP::Promise::Body::Form::Data object is a field name and its value is an HTTP::Promise::Body::Form::Field object. Thus you could access data such as:

    my $form = $ent->as_form_data;
    my $name = $form->{fullname}->value;
    if( $form->{picture}->file )
    {
        say "Picture is stored on file.";
    }
    elsif( $form->{picture}->value->length )
    {
        say "Picture is in memory.";
    }
    else
    {
        say "There is no data.";
    }

    say "Content-Type for this form-data is: ", $form->{picture}->headers->content_type;

as_string

This returns a scalar object containing a string representation of the message entity.

It takes an optional string parameter containing an end of line separator, which defaults to \015\012.

Internally, this calls "print".

If an error occurred, it sets an error and returns undef.

Be mindful that, because this returns a scalar object, the entire HTTP message entity is loaded into memory, which, depending on the content size, can take a lot of memory.

You may want to check the body size first, using $ent->body->length for example, if this is not a multipart entity.

attach

Provided with a list of parameters, this creates a new part entity and adds it to the stack of entity parts.

This will transform the current entity into a multipart, if necessary, by calling "make_multipart".

Since it calls "build" internally to build the message entity, see "build" for the list of supported parameters.

It returns the newly added part object upon success, or upon error, sets an error and returns undef.

body

Sets or gets this entity body object.

body_as_array

This returns an array object containing the body lines, with each line terminated by an end-of-line sequence. The end-of-line sequence is an optional parameter and defaults to \015\012.

Upon error, sets an error and returns undef.

body_as_string

This returns a scalar object containing a string representation of the message body.

build

    my $ent = HTTP::Promise::Entity->build(
        encoding => 'gzip',
        type     => 'text/plain',
        data     => 'Hello world',
    );
    my $ent2 = HTTP::Promise::Entity->build(
        encoding => 'guess',
        type     => 'text/plain',
        data     => '/some/where/file.txt',
    );

This takes a hash or hash reference of parameters and builds a new HTTP::Promise::Entity.

It returns the newly created entity object upon success, or upon error, sets an error and returns undef.

Supported arguments are:

  • boundary

    The part boundary to be used if the entity is of type multipart.

  • data

    The entity body content. If this is provided, the entity body will be an HTTP::Promise::Body::Scalar object.

  • debug

    An integer representing the level of debugging output. Defaults to 0.

  • disposition

    A string representing the Content-Disposition, such as form-data. This defaults to inline.

  • encoding

    String. A comma-separated list of content encodings, in the order in which you want the entity body to be encoded.

    For example: gzip, base64 or brotli

    See HTTP::Promise::Stream for a list of supported encodings.

    If encoding is guess, this will call "suggest_encoding" to find a suitable encoding, if any at all.

  • filename

    The filename attribute value of a Content-Disposition header value, if any.

    If the filename provided contains 8-bit characters, such as non-ASCII Unicode characters, this will be detected and the filename will be encoded according to rfc2231.

    See also "content_disposition" in HTTP::Promise::Headers and HTTP::Promise::Headers::ContentDisposition.

  • path

    The filepath to the content to be used as the entity body. This is useful if the body size is big and you do not want to load it in memory.

  • type

    String. The entity mime-type. This defaults to text/plain

    If the type is set to multipart/form-data, multipart/mixed, or any other multipart type, this will automatically create a boundary, which is a UUID generated with the XS module Data::UUID.
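
To make the rfc2231 encoding mentioned under the filename argument more concrete, here is a minimal sketch that does not use this module at all. The helper name filename_param is hypothetical, and the exact output produced by HTTP::Promise::Headers::ContentDisposition may differ; the sketch only shows the general shape of a plain versus an extended filename parameter.

```perl
use strict;
use warnings;
use Encode qw( encode );

# Hypothetical helper, not part of HTTP::Promise: returns a "filename"
# parameter suitable for a Content-Disposition header value.
sub filename_param
{
    my $name = shift;
    # Pure ASCII: a plain quoted-string is enough.
    if( $name !~ /[^\x00-\x7F]/ )
    {
        ( my $quoted = $name ) =~ s/(["\\])/\\$1/g;
        return( qq{filename="$quoted"} );
    }
    # Otherwise, percent-encode the UTF-8 bytes and use the extended
    # "filename*" form with an explicit charset and an empty language.
    my $bytes = encode( 'UTF-8', $name );
    $bytes =~ s/([^A-Za-z0-9!#\$&+\-.^_`|~])/sprintf( '%%%02X', ord( $1 ) )/ge;
    return( "filename*=UTF-8''$bytes" );
}

print( filename_param( 'report.txt' ), "\n" );
print( filename_param( "r\x{e9}sum\x{e9}.txt" ), "\n" );
```

The first call yields a regular quoted filename, whereas the second one yields the extended form with the accented characters percent-encoded.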

compression_min

Integer. This is the body size threshold in bytes beyond which this will make the encoding of the entity body possible. You can set this to zero to deactivate it.

content_charset

This will try to guess the character set of the body and returns a string containing the character encoding found, if any, or upon error, sets an error and returns undef. If nothing was found, it returns an empty string.

It takes an optional hash or hash reference of options.

Supported options are:

  • content

    A string or scalar reference of some or all of the body data to be checked. If this is not provided, 4 KB of data will be read from the body to guess the character encoding.

decode_body

This takes a comma-separated list of encodings or an array reference of encodings, and an optional hash or hash reference of options, and decodes the entity body.

It returns the body object upon success, and upon error, sets an error and returns undef.

Supported options are:

  • raise_error

    Boolean. When set to true, this will cause this method to die upon error.

  • replace

    Boolean. If true, this will replace the body content with the decoded version. Defaults to true.

What this method does is instantiate a new HTTP::Promise::Stream object for each encoding and pass it the data, either as a scalar reference if the body is in memory or as a file, until all decodings have been applied.

When deflate is one of the encodings, it will try to use IO::Uncompress::Inflate to decompress the data. However, some servers encode data with deflate but omit the zlib headers, which makes IO::Uncompress::Inflate fail. This is detected and trapped, and rawdeflate is used as a fallback.
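
The deflate fallback described above can be sketched with the core IO::Compress and IO::Uncompress modules alone. This is only an illustration of the idea, not the code used by HTTP::Promise::Stream; the helper name decode_deflate is hypothetical. The Transparent option is set to 0 so that data without a zlib header makes inflate fail instead of being passed through unchanged.

```perl
use strict;
use warnings;
use IO::Compress::Deflate qw( deflate $DeflateError );
use IO::Compress::RawDeflate qw( rawdeflate $RawDeflateError );
use IO::Uncompress::Inflate qw( inflate );
use IO::Uncompress::RawInflate qw( rawinflate $RawInflateError );

# Hypothetical helper: decodes a "deflate" payload with or without zlib headers.
sub decode_deflate
{
    my $input  = shift;
    my $output = '';
    # First attempt: zlib-wrapped data (rfc1950)
    return( $output ) if( inflate( \$input => \$output, Transparent => 0 ) );
    # Fallback: raw deflate stream without zlib headers (rfc1951)
    $output = '';
    rawinflate( \$input => \$output ) ||
        die( "Unable to decode deflate data: $RawInflateError\n" );
    return( $output );
}

my $text = 'Hello world';
my( $zlib, $raw );
deflate( \$text => \$zlib ) || die( "deflate failed: $DeflateError\n" );
rawdeflate( \$text => \$raw ) || die( "rawdeflate failed: $RawDeflateError\n" );
print( decode_deflate( $zlib ), "\n" );
print( decode_deflate( $raw ), "\n" );
```

Both calls are expected to give back the original text, the second one through the fallback.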

dump

This dumps the entity data into a string and returns it. It will encode the body if not yet encoded and will escape control and space characters, and show in hexadecimal representation the body content, so that even binary data is safe to dump.

It takes some optional arguments, which are:

  • maxlength

    Max body length to include in the dump.

  • no_content

    The string to use when there is no content, i.e. when the body is empty.

dump_skeleton

This method is more for debugging, or to get a peek at the entity structure. This takes a filehandle to print the result to.

This returns the current entity object on success, and upon error, sets an error and returns undef.

effective_type

This sets or gets the effective mime-type. In assignment mode, this simply stores whatever mime-type you provide, and in retrieval mode, this retrieves the value previously set, or by default the value returned from "mime_type".

encode_body

This encodes the entity body according to the encodings provided either as a comma-separated string or an array reference of encodings.

The way it does this is to instantiate a new HTTP::Promise::Stream object for each encoding and pass it the latest entity body.

The resulting encoded body replaces the original one.

It returns the entity body upon success, and upon error, sets an error and returns undef.

epilogue

Sets or gets an array of epilogue lines. An epilogue consists of lines of text added after the last part of a multipart message.

This returns an array object.

ext_vary

Boolean. Setting this to a true value will have "decode_body" and "encode_body" change the entity body file extension to reflect the encoding or decoding applied.

For example, if the entity body is stored in a text file /tmp/DDAB03F0-F530-11EC-8067-D968FDB3E034.txt, applying "encode_body" with gzip will create a new body text file such as /tmp/DE13000E-F530-11EC-8067-D968FDB3E034.txt.gz

guess_character_encoding

This will try to guess the entity body character encoding.

It returns the encoding found as a string, if any; otherwise it returns an empty string (not undef). Upon error, it sets an error and returns undef.

This method tries to guess variations of Unicode character sets, such as UTF-16BE, UTF-16LE, and utf-8-strict.

It takes some optional parameters:

  • content

    A string or scalar reference of content data to perform the guessing against.

    If this is not provided, this method will read up to 4096 bytes of data from the body to perform the guessing.

See also "content_charset"
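
As an illustration only, the kind of guessing described for "guess_character_encoding" could be approximated with the core Encode module as follows. The helper name guess_unicode_encoding is hypothetical and the actual method may rely on different checks; this sketch merely looks for a UTF-16 byte order mark and then tries a strict UTF-8 decoding of up to 4096 bytes.

```perl
use strict;
use warnings;
use Encode qw( decode FB_CROAK LEAVE_SRC );

# Hypothetical helper: returns an encoding name, or an empty string if
# nothing could be guessed.
sub guess_unicode_encoding
{
    my $bytes  = shift;
    my $sample = substr( $bytes, 0, 4096 );
    # A byte order mark gives away UTF-16
    return( 'UTF-16BE' ) if( $sample =~ /\A\xFE\xFF/ );
    return( 'UTF-16LE' ) if( $sample =~ /\A\xFF\xFE/ );
    # Pure ASCII data says nothing about the encoding
    return( '' ) if( $sample !~ /[\x80-\xFF]/ );
    # If the sample was truncated, drop a possibly incomplete trailing sequence
    $sample =~ s/[\xC0-\xFF][\x80-\xBF]{0,2}\z// if( length( $bytes ) > 4096 );
    # 'UTF-8' is the strict variant in Encode; FB_CROAK makes invalid data die
    my $ok = eval{ decode( 'UTF-8', $sample, FB_CROAK | LEAVE_SRC ); 1 };
    return( $ok ? 'utf-8-strict' : '' );
}

print( guess_unicode_encoding( "caf\xC3\xA9" ), "\n" );
```

Note that, like the documented method, the sketch returns an empty string rather than undef when nothing was found.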

header

Sets or gets the value returned by calling "header" in HTTP::Promise::Headers.

This is just a shortcut.

headers

Sets or gets the entity headers object.

header_as_string

Returns the entity headers as a string.

http_message

Sets or gets the HTTP message object.

io_encoding

This tries hard to find out the character set of the entity body, to be used with "open" in perlfunc or "binmode" in perlfunc.

It returns a string, possibly empty if nothing could be guessed, and upon error, sets an error and returns undef.

It takes the following optional parameters:

  • alt_charset

    Alternative character set to be used if no other could be found or if none of them worked.

  • body

    The entity body object to use.

  • charset

    A string containing the charset you think is used. This method will perform checks against it.

  • charset_strict

    Boolean. If true, this will enable the guessing in more strict mode (using the FB_CROAK flag on Encode)

  • content

    A string or a scalar reference of content data to perform the guessing against.

  • default_charset

    The default charset to use when nothing else was found.

is_binary

This checks whether the data provided, or by default this entity body, is binary data or not.

It returns true (1) if it is, and false (0) otherwise. It returns false if the data is empty.

This performs checks similar to those perl does (see "-T" in perlfunc).

It sets an error and returns undef upon error.

You can optionally provide some data either as a string or as a scalar reference.

See also "is_text"

For example:

    my $bool = $ent->is_binary;
    my $bool = $ent->is_binary( $string_of_data );
    my $bool = $ent->is_binary( \$string_of_data );
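
For readers unfamiliar with the -T and -B file tests, the heuristic documented in perlfunc can be roughly sketched as below. This is an approximation written for illustration, not the code of "is_binary"; the helper name looks_binary, the 512 bytes sample size and the exact set of characters deemed odd are all assumptions.

```perl
use strict;
use warnings;
use Encode qw( decode FB_CROAK LEAVE_SRC );

# Hypothetical helper approximating the -B file test on a string of bytes.
sub looks_binary
{
    my $data = shift;
    # Empty data is not considered binary
    return(0) if( !defined( $data ) || !length( $data ) );
    my $sample = substr( $data, 0, 512 );
    # Any zero byte means binary
    return(1) if( index( $sample, "\0" ) != -1 );
    # Valid UTF-8 that includes non-ASCII characters is text
    if( $sample =~ /[\x80-\xFF]/ &&
        eval{ decode( 'UTF-8', $sample, FB_CROAK | LEAVE_SRC ); 1 } )
    {
        return(0);
    }
    # Otherwise, count odd characters: control codes and bytes with the high bit set
    my $odd = () = $sample =~ /[^\t\n\r\f\x08\x1B\x20-\x7E]/g;
    # More than a third of odd characters means binary
    return( ( $odd / length( $sample ) ) > ( 1 / 3 ) ? 1 : 0 );
}

print( looks_binary( "Hello world\n" ) ? "binary\n" : "text\n" );
```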

is_body_in_memory

Returns true if the entity body is an HTTP::Promise::Body::Scalar object, false otherwise.

is_body_on_file

Returns true if the entity body is an HTTP::Promise::Body::File object, false otherwise.

is_decoded

Boolean. Sets or gets the decoded status of the entity body.

is_encoded

Boolean. Sets or gets the encoded status of the entity body.

is_multipart

Returns true if this entity is a multipart message, false otherwise.

is_text

This checks whether the data provided, or by default this entity body, is text data or not.

It returns true (1) if it is, and false (0) otherwise. It returns true if the data is empty.

It sets an error and returns undef upon error.

You can optionally provide some data either as a string or as a scalar reference.

See also "is_binary"

For example:

    my $bool = $ent->is_text;
    my $bool = $ent->is_text( $string_of_data );
    my $bool = $ent->is_text( \$string_of_data );

make_boundary

Returns a uniquely generated multipart boundary created using Data::UUID.

make_multipart

This transforms the current entity into the first part of a multipart/form-data HTTP message.

For HTTP request, multipart/form-data is the only valid Content-Type for sending multiple data. rfc7578 in section 4.3 states: "[RFC2388] suggested that multiple files for a single form field be transmitted using a nested "multipart/mixed" part. This usage is deprecated."

See also this Stackoverflow discussion and this one too

Of course, technically, nothing prevents an HTTP message (request or response) from being a multipart/mixed or something else.

This method takes a multipart subtype, such as form-data or mixed, and creates a multipart entity of which the current entity becomes the first part. If no multipart subtype is specified, this defaults to form-data.

It also takes an optional hash or hash reference of parameters.

Valid parameters are:

  • force

    Boolean. Forces the creation of a multipart even when the current entity is already a multipart.

    This would have the effect of having the current entity become an embedded multipart into a new multipart entity.

It returns the current entity object, modified, upon success, or upon error, sets an error and returns undef.

make_singlepart

This transforms the current entity into a simple, i.e. non-multipart, message entity.

It returns false, but not undef, if this contains more than one part. It returns the current object upon success or if this is already a simple entity message. Upon error, it sets an error and returns undef.

mime_type

Returns this entity mime-type by calling "mime_type" in HTTP::Promise::Headers and passing it whatever arguments were provided.

name

The name of this entity, used for multipart/form-data as defined in rfc7578.

new_body

This is a convenience constructor to instantiate a new entity body. Its first argument is the body type, one of file, form, scalar or string.

The constructor of the corresponding class is passed whatever other arguments are provided to this method (except, of course, the initial argument).

For example:

    my $body = $ent->new_body( file => '/some/where/file.txt' );
    my $body = $ent->new_body( string => 'Hello world!' );
    my $body = $ent->new_body( string => \$scalar );
    # Same, but using 'scalar', which can be used interchangeably with 'string'
    my $body = $ent->new_body( scalar => \$scalar );

It returns the newly instantiated object upon success, or upon error, sets an error and returns undef.

open

This calls open on the entity body object, if any, passing it whatever arguments were provided.

It returns the resulting filehandle object, or upon error, sets an error and returns undef.

output_dir

Sets or gets the path to the directory used to store extracted files, when applicable.

parts

Sets or gets the array object of entity part objects.

preamble

Sets or gets the array object of preamble lines. A preamble consists of the lines of text that precede the first part in a multipart message. Normally, this is never used in HTTP parlance.
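
To visualise where the preamble and the epilogue sit relative to the parts, here is a small sketch that assembles a multipart body by hand, following the general layout of rfc2046. It does not use this module; the boundary value and the part contents are made up for the example.

```perl
use strict;
use warnings;

my $eol      = "\015\012";
# Hypothetical boundary; a real one would be uniquely generated
my $boundary = 'example-boundary';
my @preamble = ( 'This is a multipart message.' );
my @epilogue = ( 'End of message.' );
my @parts    = (
    qq{Content-Disposition: form-data; name="a"${eol}${eol}one},
    qq{Content-Disposition: form-data; name="b"${eol}${eol}two},
);

my $out = '';
# Preamble lines come before the first boundary
$out .= $_ . $eol for( @preamble );
# Each part is introduced by a boundary line
foreach my $part ( @parts )
{
    $out .= '--' . $boundary . $eol . $part . $eol;
}
# The closing boundary ends with two extra dashes
$out .= '--' . $boundary . '--' . $eol;
# Epilogue lines come after the closing boundary
$out .= $_ . $eol for( @epilogue );
print( $out );
```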

print

Provided with a filehandle, or an HTTP::Promise::IO object, and a hash or hash reference of options, this will print the current entity with all its parts, if any.

What this does internally is:

1. Call "print_start_line"
2. Call "print_header"
3. Call "print_body"

The only supported option is eol, which is the string to be used as a new line terminator. It is printed out right after the headers. This defaults to \015\012, which is \r\n.

It returns the current entity object upon success, or upon error, sets an error and returns undef.
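
The order of the three steps above can be pictured with plain Perl writing to an in-memory filehandle. This is only a sketch of the resulting layout under the default eol, with made-up header values; it does not call "print" or any other method of this class.

```perl
use strict;
use warnings;

my $eol    = "\015\012";
my $buffer = '';
open( my $fh, '>', \$buffer ) || die( "Unable to open in-memory filehandle: $!\n" );
binmode( $fh );
# 1. The start line
print {$fh} ( 'HTTP/1.0 200 OK', $eol );
# 2. The headers, followed by eol alone to mark the end of the headers
print {$fh} ( 'Content-Type: text/plain', $eol, 'Content-Length: 11', $eol );
print {$fh} ( $eol );
# 3. The body
print {$fh} ( 'Hello world' );
close( $fh );
print( $buffer, "\n" );
```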

print_body

Provided with a filehandle, or an HTTP::Promise::IO object, and a hash or hash reference of options, this will print the current entity body. This is possibly a no-op if there is no entity body.

If the entity is a multipart message, this will call "print" on all its entity parts.

It returns the current entity object upon success, or upon error, sets an error and returns undef.

print_bodyhandle

Provided with a filehandle, or an HTTP::Promise::IO object, and a hash or hash reference of options, this will print the current entity body.

This will first encode the body by calling "encode_body" if encodings are set and the entity body is not yet marked as being encoded with "is_encoded".

Supported options are:

  • binmode

    The character encoding to use for PerlIO when calling open.

It returns the current entity object upon success, or upon error, sets an error and returns undef.

print_header

This calls "print" in HTTP::Promise::Headers, passing it whatever arguments were provided, and returns whatever value that method call returns. This is basically a convenient shortcut.

print_start_line

Provided with a filehandle and a hash or hash reference of options, this will print the message start line, if any.

A message start line in HTTP parlance is the first line of a request or response, so something like:

    GET / HTTP/1.0

or for a response:

    HTTP/1.0 200 OK

It returns the current entity object upon success, or upon error, sets an error and returns undef.

purge

This calls purge on the body object, if any, and also calls it on every part.

It returns the current entity object upon success, or upon error, sets an error and returns undef.

save_file

Provided with an optional filepath, this will save the body to it unless this is an HTTP multipart message.

If no explicit filepath is provided, this will try to guess one from the Content-Disposition header value, possibly stripping it of any dangerous characters and making it a complete path using "output_dir".

If no suitable filename could be found, ultimately, this will use a generated one using "new_tempfile" in Module::Generic inherited by this class.

The file extension will be guessed from the entity body mime-type by checking the Content-Type header or by looking directly at the entity body data using HTTP::Promise::MIME that uses the XS module File::MMagic::XS to perform the job.

If the entity body is encoded, it will decode it before saving it to the resulting filepath.

It returns the file object upon success, or upon error, sets an error and returns undef.

stringify

This is an alias for "as_string"

stringify_body

This is an alias for "body_as_string"

stringify_header

This is an alias for "as_string" in HTTP::Promise::Headers

suggest_encoding

Based on the entity body mime-type, this will guess what encoding is appropriate.

It does not suggest any encoding for image, audio or video files, which are usually already compressed, nor when the body size is below the threshold set with "compression_min".

This returns the encoding as a string upon success, an empty string if no suitable encoding could be found, or upon error, sets an error and returns undef.
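
The decision described above boils down to two checks, which the following sketch spells out. The function name suggest_encoding_sketch is hypothetical, the choice of gzip as the suggested encoding is an assumption made for the example, and the special case of a threshold set to zero is deliberately left out.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a suggested encoding, or an empty string
# if no suitable encoding could be found.
sub suggest_encoding_sketch
{
    my( $mime_type, $body_size, $compression_min ) = @_;
    # Image, audio and video files are usually already compressed
    return( '' ) if( $mime_type =~ m{\A(?:image|audio|video)/}i );
    # Bodies below the threshold are not worth compressing
    return( '' ) if( $body_size < $compression_min );
    # Assumption: gzip is picked here for illustration only
    return( 'gzip' );
}

# 204800 is the documented default for compression_min
print( suggest_encoding_sketch( 'text/html', 500_000, 204_800 ), "\n" );
```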

textual_type

Returns true if this entity mime-type starts with text, such as text/plain or text/html, or starts with message, such as message/http.
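
In other words, the check amounts to a simple pattern match on the mime-type, which could be sketched as follows. The function name is_textual_type is hypothetical and the case-insensitive match is an assumption.

```perl
use strict;
use warnings;

# Hypothetical helper: true if the mime-type is of type text or message.
sub is_textual_type
{
    my $type = shift;
    return(0) if( !defined( $type ) );
    return( $type =~ m{\A(?:text|message)/}i ? 1 : 0 );
}

foreach my $type ( qw( text/plain text/html message/http image/png ) )
{
    printf( "%-14s %s\n", $type, is_textual_type( $type ) ? 'textual' : 'not textual' );
}
```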

AUTHOR

Jacques Deguest <jack@deguest.jp>

SEE ALSO

rfc2616 section 3.7.2 Multipart Types
rfc2046 section 5.1.1 Common Syntax
rfc2388 multipart/form-data
rfc2045

Mozilla documentation on Content-Disposition and international filename and other Mozilla documentation

Wikipedia

On Unicode

HTTP::Promise, HTTP::Promise::Request, HTTP::Promise::Response, HTTP::Promise::Message, HTTP::Promise::Entity, HTTP::Promise::Headers, HTTP::Promise::Body, HTTP::Promise::Body::Form, HTTP::Promise::Body::Form::Data, HTTP::Promise::Body::Form::Field, HTTP::Promise::Status, HTTP::Promise::MIME, HTTP::Promise::Parser, HTTP::Promise::IO, HTTP::Promise::Stream, HTTP::Promise::Exception

COPYRIGHT & LICENSE

Copyright(c) 2022 DEGUEST Pte. Ltd.

All rights reserved. This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.