Jean-Damien Durand
and 1 contributors

NAME

MarpaX::RFC::RFC3987 - Internationalized Resource Identifier (IRI): Generic Syntax - Marpa Parser

VERSION

version 0.001

SYNOPSIS

    use MarpaX::RFC::RFC3987;
    use Try::Tiny;
    use Data::Dumper;

    print Dumper(MarpaX::RFC::RFC3987->new('http://www.perl.org'));

    try {
      print STDERR "\nThe following is an expected failure:\n";
      MarpaX::RFC::RFC3987->new('http://invalid##');
    } catch {
      print STDERR "$_\n";
      return;
    };

DESCRIPTION

This module parses an IRI reference as per RFC3987. It is intended as a data validation module using a strict grammar with good error reporting.

IRI DESCRIPTION

An IRI shares the principles of a URI. The overall structure of a URI, quoted here from RFC 3986, helps in understanding the methods described below:

         foo://example.com:8042/over/there?name=ferret#nose
         \_/   \______________/\_________/ \_________/ \__/
          |           |            |            |        |
       scheme     authority       path        query   fragment
          |   _____________________|__
         / \ /                        \
         urn:example:animal:ferret:nose
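
As a rough illustration of the five components in the diagram, the sketch below splits both example strings with the component-splitting regular expression given in RFC 3986, Appendix B (rewritten here with non-capturing groups). This is not how this module works: the module uses a Marpa grammar and validates its input, while the regular expression only splits and validates nothing. The helper name split_uri is hypothetical.

```perl
use strict;
use warnings;

# Splitting regex from RFC 3986, Appendix B, with non-capturing groups
# around the delimiters. It separates components but validates nothing.
sub split_uri {
    my ($uri) = @_;
    my %part;
    @part{qw(scheme authority path query fragment)} =
        $uri =~ m{^(?:([^:/?#]+):)?(?://([^/?#]*))?([^?#]*)(?:\?([^#]*))?(?:\#(.*))?};
    return \%part;
}

my $full = split_uri('foo://example.com:8042/over/there?name=ferret#nose');
my $urn  = split_uri('urn:example:animal:ferret:nose');

for my $parts ($full, $urn) {
    for my $name (qw(scheme authority path query fragment)) {
        # Components absent from the input come back undefined.
        my $value = defined $parts->{$name} ? $parts->{$name} : '(undef)';
        printf "%-9s = %s\n", $name, $value;
    }
    print "\n";
}
```

Absent components come back undefined rather than empty, which mirrors the Str|Undef return types of the accessors documented under OBJECT METHODS.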

The grammar parses both absolute and relative IRIs; the corresponding start rule is named an IRI reference.

An absolute IRI has the following structure:

         IRI = scheme ":" ihier-part [ "?" iquery ] [ "#" ifragment ]

while a relative IRI is split into:

         irelative-ref  = irelative-part [ "?" iquery ] [ "#" ifragment ]

Back to the overall structure, the authority is:

         iauthority   = [ iuserinfo "@" ] ihost [ ":" port ]

where the host can be an IP-literal with Zone information, an IPv4 address or a registered name:

         ihost = IP-literal / IPv4address / ireg-name

The Zone Identifier is an extension to the original URI specification (RFC 3986) and is defined in RFC 6874. It has been applied to the IRI grammar here, although the current IRI specification states only that it does not support Zone Identifiers. An address carrying a zone is an IPv6addrz:

         IP-literal = "[" ( IPv6address / IPv6addrz / IPvFuture  ) "]"

         ZoneID = 1*( iunreserved / pct-encoded )

         IPv6addrz = IPv6address "%25" ZoneID
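
A minimal sketch of how a zone identifier might be inspected, assuming the module is installed. The link-local address, interface name and port are made-up illustration values, chosen to fit the IPv6addrz rule above. No expected output is shown, because the exact strings returned by each accessor are not spelled out in this documentation.

```perl
use strict;
use warnings;
use MarpaX::RFC::RFC3987;

# Hypothetical IRI: the zone "eth0" follows the "%25" delimiter
# required by the IPv6addrz rule.
my $iri = MarpaX::RFC::RFC3987->new('http://[fe80::1%25eth0]:8080/index.html');

for my $method (qw(ihost ip_literal zoneid port)) {
    my $value = $iri->$method;
    printf "%-10s = %s\n", $method, defined $value ? $value : '(undef)';
}
```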

CLASS METHODS

MarpaX::RFC::RFC3987->new(@options --> InstanceOf['MarpaX::RFC::RFC3987'])

Instantiate a new object. Usage is either MarpaX::RFC::RFC3987->new(value => $iri) or MarpaX::RFC::RFC3987->new($iri). This method croaks if the $iri parameter cannot be coerced to a string or is not a valid IRI. The variable $self is used below to refer to this object instance.
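
The two calling conventions and the failure mode can be sketched as follows, assuming the module and Try::Tiny are installed. The variable names are illustrative only, and the invalid input is the one used in the SYNOPSIS.

```perl
use strict;
use warnings;
use MarpaX::RFC::RFC3987;
use Try::Tiny;

# Both forms described above: a single positional argument,
# or the named "value" option.
my $by_position = MarpaX::RFC::RFC3987->new('http://www.perl.org');
my $by_name     = MarpaX::RFC::RFC3987->new(value => 'http://www.perl.org');

print $by_position->value, "\n";
print $by_name->value, "\n";

# new() croaks on invalid input, so wrap it when the input is untrusted.
my $error;
try {
    MarpaX::RFC::RFC3987->new('http://invalid##');
} catch {
    $error = $_;
};    # Try::Tiny needs this trailing semicolon

print STDERR "Rejected: $error\n" if defined $error;
```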

MarpaX::RFC::RFC3987->grammar( --> InstanceOf['Marpa::R2::Scanless::G'])

A Marpa::R2::Scanless::G instance, hosting the computed grammar. This is a class variable, so it can also be called on $self.

MarpaX::RFC::RFC3987->bnf( --> Str)

The BNF grammar used to parse an IRI. This is a class variable, so it can also be called on $self.
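
A short sketch of reading both class variables, assuming the module is installed. It relies only on the return types documented above and calls the methods on the class and on an instance.

```perl
use strict;
use warnings;
use MarpaX::RFC::RFC3987;

# Called on the class: no IRI needs to be parsed first.
my $bnf     = MarpaX::RFC::RFC3987->bnf;
my $grammar = MarpaX::RFC::RFC3987->grammar;

printf "BNF source is %d characters long\n", length $bnf;
printf "Grammar object is a %s\n", ref $grammar;

# Called on an instance, as the documentation above allows.
my $self = MarpaX::RFC::RFC3987->new('http://www.perl.org');
print "Instance sees a BNF string too\n" if length $self->bnf;
```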

OBJECT METHODS

$self->value( --> Str)

The value given as input to new().

$self->scheme( --> Str|Undef)

The IRI scheme. Can be undefined.

$self->iauthority( --> Str|Undef)

The IRI authority. Can be undefined.

$self->ipath( --> Str)

The IRI path. Note that an IRI always has a path, although it can be empty.

$self->iquery( --> Str|Undef)

The IRI query. Can be undefined.

$self->ifragment( --> Str|Undef)

The IRI fragment. Can be undefined.

$self->ihier_part( --> Str|Undef)

The IRI hier part. Can be undefined.

$self->iuserinfo( --> Str|Undef)

The IRI userinfo. Can be undefined.

$self->ihost( --> Str|Undef)

The IRI host. Can be undefined.

$self->port( --> Str|Undef)

The IRI port. Can be undefined.

$self->irelative_part( --> Str|Undef)

The IRI relative part. Can be undefined.

$self->ip_literal( --> Str|Undef)

The IRI IP literal. Can be undefined.

$self->zoneid( --> Str|Undef)

The zone ID of the IRI's IP literal. Can be undefined.

$self->ipv4address( --> Str|Undef)

The IRI IP Version 4 address. Can be undefined.

$self->ireg_name( --> Str|Undef)

The IRI registered name. Can be undefined.

$self->is_absolute( --> Bool)

Returns a true value if the IRI is absolute, false otherwise.
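
The accessors above can be exercised together with a small loop such as the following sketch, which assumes the module is installed. The two inputs are illustrative: one is the absolute example from the IRI DESCRIPTION section, the other a relative reference. Undefined components are printed as (undef); no expected output is listed, since the exact strings depend on the grammar.

```perl
use strict;
use warnings;
use MarpaX::RFC::RFC3987;

# Every Str or Str|Undef accessor documented in OBJECT METHODS.
my @accessors = qw(
    scheme iauthority ipath iquery ifragment ihier_part iuserinfo
    ihost port irelative_part ip_literal zoneid ipv4address ireg_name
);

my %report;
for my $input ('foo://example.com:8042/over/there?name=ferret#nose',
               '../over/there?name=ferret#nose') {
    my $iri = MarpaX::RFC::RFC3987->new($input);
    printf "%s (%s)\n", $iri->value,
        $iri->is_absolute ? 'absolute' : 'not absolute';
    for my $method (@accessors) {
        my $value = $iri->$method;
        $report{$input}{$method} = $value;
        printf "  %-15s = %s\n", $method, defined $value ? $value : '(undef)';
    }
}
```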

SEE ALSO

Marpa::R2

IRI

Uniform Resource Identifier (URI): Generic Syntax

Internationalized Resource Identifier (IRI): Generic Syntax

Formats for IPv6 Scope Zone Identifiers in Literal Address Formats

SUPPORT

Bugs / Feature Requests

Please report any bugs or feature requests through the issue tracker at https://rt.cpan.org/Public/Dist/Display.html?Name=MarpaX-RFC-RFC3987. You will be notified automatically of any progress on your issue.

Source Code

This is open source software. The code repository is available for public review and contribution under the terms of the license.

https://github.com/jddurand/marpax-rfc-rfc3987

  git clone git://github.com/jddurand/marpax-rfc-rfc3987.git

AUTHOR

Jean-Damien Durand <jeandamiendurand@free.fr>

COPYRIGHT AND LICENSE

This software is copyright (c) 2015 by Jean-Damien Durand.

This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.