Protocol::HTTP::RequestParser - HTTP request parser
    use Protocol::HTTP::RequestParser;

    my $parser = Protocol::HTTP::RequestParser->new;

    my $buffer =
        "GET / HTTP/1.0\r\n".
        "Host: crazypanda.ru\r\n".
        "Langs: Perl, c++\r\n".
        "\r\n";

    my ($req, $state, $pos, $err) = $parser->parse($buffer);
    if ($err) {
        die "http error: $err";
    }

    if ($state < Protocol::HTTP::Message::STATE_DONE) {
        # wait for more data
    }

    process($req);
This class is a parser for client HTTP requests. The parser is incremental, so you don't need to pass the whole HTTP message at once.
The parser is implemented as a finite-state machine (FSM), which makes it fast.
Constructs a new request parser instance.
my ($request, $state, $position, $error) = $parser->parse($buffer);
Parses a (possibly partial) HTTP request.
The first value returned is a Protocol::HTTP::Request object. This object is always returned, whether or not parsing of the request has completed. Its properties are partially or fully filled in, depending on the parsing state.
The second value returned is the parsing state. The possible states are described below.
This is the initial state, and the parser won't leave it until all headers have arrived. After leaving this state, the properties uri(), method() (or code() and message() when parsing a response), http_version() and headers() are fully populated.
The next state may be STATE_BODY, STATE_CHUNK or STATE_DONE, depending on the headers received.
In STATE_BODY the parser wants more data for the message body (for messages without HTTP chunks). During this state the body property gets filled. You don't have to wait until the whole body arrives to process it. It is okay to read whatever is there, process it, clear it and wait for the next part of the data.
    my $data_part = $message->body;
    # process or write $data_part somewhere
    $message->body(""); # if you don't do this, next time you'll get the previous data part plus the one just arrived
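The read-process-clear cycle described above can be sketched as a loop over incoming data. This is only an illustration under stated assumptions: the `@reads` array stands in for real network reads, and the request line, host name and `$collected` variable are made up for the example. Only `parse()`, `body()` and `STATE_DONE` come from this documentation.

```perl
use strict;
use warnings;
use Protocol::HTTP::RequestParser;

my $parser = Protocol::HTTP::RequestParser->new;

# Hypothetical network reads: the request arrives in two pieces,
# split in the middle of the body.
my @reads = (
    "POST /upload HTTP/1.1\r\n" .
    "Host: example.com\r\n" .
    "Content-Length: 11\r\n" .
    "\r\n" .
    "hello ",
    "world",
);

my $collected = '';   # stand-in for a file or socket the body is written to
my $done      = 0;

for my $data (@reads) {
    my ($req, $state, $pos, $err) = $parser->parse($data);
    die "http error: $err" if $err;

    # Take whatever part of the body has arrived so far ...
    my $part = $req->body;
    if (defined $part && length $part) {
        $collected .= $part;
        # ... and clear it, so the next call returns only new data.
        $req->body("");
    }

    $done = 1 if $state == Protocol::HTTP::Message::STATE_DONE();
}
```

Clearing the body after each read keeps memory use bounded for large uploads, at the cost of the application having to assemble or forward the parts itself.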
The parser is waiting for a chunk header (for messages with HTTP chunks).
The parser wants more data for a chunk body (for messages with HTTP chunks). It acts exactly as in the STATE_BODY case, continuously filling the body property.
The parser is waiting for a chunk trailer (for messages with HTTP chunks).
The parser has finished parsing the current message (STATE_DONE).
The parser encountered an HTTP protocol error. In this case the message object is still valid and its properties are left as they were at the moment the error occurred, so you can still inspect what the message might look like (for example, if the error was in the headers, uri() would still be usable).
The next value returned is the position in $buffer at which parsing stopped.
In case of an error, the position points to the character that caused the error.
In case of STATE_DONE, the position points to the character immediately after the end of the message. Everything after this position should normally be passed to parse() again (HTTP pipelining).
Otherwise (no errors and not yet done), the position is always equal to the length of $buffer.
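The position rules above suggest a simple loop for pipelined requests. The following is a minimal sketch, assuming both requests arrive in one read; the request paths, host name and the `@positions` array are invented for illustration, and the loop relies only on the return values of `parse()` as documented here.

```perl
use strict;
use warnings;
use Protocol::HTTP::RequestParser;

my $parser = Protocol::HTTP::RequestParser->new;

# Two complete requests delivered in a single read (HTTP pipelining).
my $buffer =
    "GET /first HTTP/1.1\r\nHost: example.com\r\n\r\n" .
    "GET /second HTTP/1.1\r\nHost: example.com\r\n\r\n";

my @positions;   # where parsing stopped for each completed request

while (length $buffer) {
    my ($req, $state, $pos, $err) = $parser->parse($buffer);
    die "http error at position $pos: $err" if $err;

    # Not done yet: the whole buffer was consumed, so wait for more input.
    last if $state != Protocol::HTTP::Message::STATE_DONE();

    push @positions, $pos;

    # Keep only the bytes after the finished message and parse them next.
    $buffer = substr($buffer, $pos);
}
```

In a real server the `last` branch would return to the event loop and call `parse()` again once more data has been read.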
The last, fourth value is optional and is only returned if an error occurred during parsing. It is an XS::STL::ErrorCode object, which is the Perl API for the C++ std::error_code subsystem. Possible errors are described in Protocol::HTTP::Error.
my ($request, $state, $err) = $parser->parse_shift($buffer);
Parses an HTTP request (same as parse()) and then deletes from $buffer everything that has been consumed during parsing.
The effect is similar to
    my ($request, $state, $position, $error) = $parser->parse($buffer);
    substr($buffer, 0, $position, '');
and thus $buffer can't be a read-only value, for example
$parser->parse_shift("constant string"); # WRONG! will die with "modification of read-only value ..."
The meaning and behaviour of all other parameters and return values are the same as in parse().
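As a rough illustration of the points above, the sketch below copies the incoming data into an ordinary (writable) variable before calling `parse_shift()`, then looks at what is left in the buffer. The request text and the `$leftover` variable are assumptions made up for the example; the expectation that only the unparsed tail remains follows from the description of the position in the STATE_DONE case.

```perl
use strict;
use warnings;
use Protocol::HTTP::RequestParser;

my $parser = Protocol::HTTP::RequestParser->new;

# A writable buffer: one complete request followed by the start of the next.
my $buffer =
    "GET /first HTTP/1.1\r\nHost: example.com\r\n\r\n" .
    "GET /sec";

# parse_shift() modifies $buffer in place, so a string literal
# must never be passed here directly.
my ($req, $state, $err) = $parser->parse_shift($buffer);
die "http error: $err" if $err;

# Once the first request is done, the consumed bytes are gone and
# $buffer should hold only the beginning of the second request.
my $leftover = $buffer;
```

Keeping the leftover bytes in the same variable makes it easy to append the next read to it and call `parse_shift()` again.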
Resets the internal parser state, so it is ready to parse new requests.
The parser automatically resets itself after each successfully parsed message, so you only need to call this method if you plan to reuse the parser after an error, or if you decide to abandon a message that has not been fully parsed and start parsing another one.
Internally (in the C++ API) it is also a zero-copy parser. However, since vectorized strings are neither convenient nor efficient to use from Perl, a single copy occurs on the XS->Perl border when you get the body as a single string.
Protocol::HTTP
Protocol::HTTP::Message
Protocol::HTTP::Request