The Perl Toolchain Summit needs more sponsors. If your company depends on Perl, please support this very important event.

NAME

Protocol::HTTP::RequestParser - HTTP request parser

SYNOPSIS

    use Protocol::HTTP::RequestParser;

    my $parser = Protocol::HTTP::RequestParser->new;
    my $buffer =
        "GET / HTTP/1.0\r\n".
        "Host: crazypanda.ru\r\n".
        "Langs: Perl, c++\r\n".
        "\r\n";

    my ($req, $state, $pos, $err) = $parser->parse($buffer);
    
    if ($err) {
        die "http error: $err";
    }
    
    if ($state < Protocol::HTTP::Message::STATE_DONE) {
        # wait for more data
    }
    
    process($req);

DESCRIPTION

This class represents client HTTP request parser. Parser is incremental so that you don't need to pass the whole http packet at once.

Parser is an FSM so it's really fast.

METHODS

new()

Constructs new request parser instance.

parse($buffer)

    my ($request, $state, $position, $error) = $parser->parse($buffer);

Parses (possibly partial) HTTP request.

The first value returned is a Protocol::HTTP::Request object. Regardless of whether parsing the request is completed or not yet, this object will always be returned. Properties of this object will be partially or fully (depending on the state of parsing) filled with values.

The second value returned is a state of parsing. State may be

Protocol::HTTP::Message::STATE_HEADERS

This is initial state and parsing process won't leave this state until all headers arrive. After leaving this state properties uri(), method() (or code() and message() in case of parsing response), http_version() and headers() are fully completed.

The next state after this may be either STATE_BODY, STATE_CHUNK or STATE_DONE depending on the headers received

Protocol::HTTP::Message::STATE_BODY

Parser wants more data for message body (for messages without http chunks). During this state property body gets filled. You don't have to wait until all the body arrives to process it. It is okay to read whatever is there, process it, clear and wait for next data part.

    my $data_part = $message->body;
    # process or write $data_part somewhere
    $message->body(""); # if you don't do this, next time you'll get the previous data part plus the one just arrived
    
Protocol::HTTP::Message::STATE_CHUNK

Parser is waiting for chunk header (for messages with http chunks).

Protocol::HTTP::Message::STATE_CHUNK_BODY

Parser wants more data for message chunk body (for messages with http chunks). Parser acts exactly like in STATE_BODY case, continuously collecting body property.

Protocol::HTTP::Message::STATE_CHUNK_TRAILER

Parser is waiting for chunk trailer (for messages with http chunks).

Protocol::HTTP::Message::STATE_DONE

Parser has finished parsing current message

Protocol::HTTP::Message::STATE_ERROR

Parser encountered an http protocol error. In this case the message object is still valid and its properties are left as they were at the moment the error occured. So you can still inspect what this message might look like (for example, if the error was in headers, uri() would be ok).

Next value returned is position in $buffer at which parsing process stopped.

In case of error, position will be the character that caused that error.

In case of STATE_DONE, position will be the next character after the end of the message. Everything that is left after this position should probably be passed to parse() again (http pipelining).

Otherwise (no errors and not yet done), position will always be equal to the length of $buffer.

The last, 4th value is optional and is only returned if there was an error during parsing process. It is an XS::STL::ErrorCode object which represents Perl API for convenient C++ std::error_code subsystem. Possible errors are described in Protocol::HTTP::Error

parse_shift($buffer)

    my ($request, $state, $err) = $parser->parse_shift($buffer);
    

Parses HTTP request (same as parse()) and after that deletes from $buffer everything that have been consumed during parsing.

The effect is similar to

    my ($request, $state, $position, $error) = $parser->parse($buffer);
    substr($buffer, 0, $position, '');
    

and thus $buffer can't be a read-only value, for example

    $parser->parse_shift("constant string"); # WRONG! will die with "modification of read-only value ..."

The meaning and the behaviour of all other parameters are the same as in parse()

reset()

Resets internal parser state, so it is ready to parse new requests.

Parser automatically resets itself after each successfully parsed message, so you only need to call this method if you plan to re-use parser after errors, or you decided to stop parsing not yet fully parsed message and begin parsing another one.

NOTE

Internally (in C++ API) it is also a zero-copy parser, however as it is not convenient and not efficient for Perl to use vectorized strings, one single copying occurs on XS->Perl border when you get body as a single string.

SEE ALSO

Protocol::HTTP

Protocol::HTTP::Message

Protocol::HTTP::Request