The Perl Toolchain Summit needs more sponsors. If your company depends on Perl, please support this very important event.

NAME

Perl6::Overview -- a brief introduction and overview of Perl 6

DESCRIPTION

This introduction is aimed at the beginning Perl 6 programmer. For the moment, it is assumed that such programmers are coming from a background of Perl 5. However, this document tries to be simple and general enough for anyone with some programming experience to pick up Perl 6. Those who want more information about the changes from Perl 5 should look elsewhere.

What is Perl 6?

Perl 6, like its predecessors, is a multi-purpose dynamic language combining ease of use and powerful programming features.

Running a Perl 6 program

You will need to install Pugs, which can be found at http://pugscode.org/.

Having done so, you can run your Perl 6 program from the command line as follows:

    pugs myprogram.pl

To run one-liners from the command-line, use the -e flag:

    pugs -e 'say "Hello, world!"'

You can also start up pugs without any arguments, then type Perl 6 commands at its command line. To exit, type ctrl-D or :q.

Basic syntax overview

A Perl 6 program consists of one or more statements. These statements are simply written in a plain text file, one after another. There is no need to have a main() function or anything of that kind.

Perl 6 statements end in a semi-colon:

   say "Hello, world";

Comments start with a hash symbol and run to the end of the line

   # This is a comment

Whitespace is irrelevant:

   say 
       "Hello, world"
       ;

... except inside quoted strings:

   # this would print with a linebreak in the middle
   say "Hello
   world";

Double quotes or single quotes may be used around literal strings:

   say "Hello, world";
   say 'Hello, world';

However, only double quotes "interpolate" variables and special characters such as newlines (\n):

   my $name = 'Johnny';
   print "Hello, $name\n";   # prints: Hello, Johnny  (followed by a newline)
   print 'Hello, $name\n';   # prints: Hello, $name\n (no newline)

Numbers don't need quotes around them:

   say 42;

Most of the time, you can use parentheses for functions' arguments or omit them according to your personal taste.

   say("Hello, world");
   say "Hello, world";

Perl variable types

Perl has three main variable types: scalars, arrays, and hashes. Variables are declared with my.

Scalars

A scalar represents a single value:

    my $animal = "camel";
    my $answer = 42;

Scalar variables start with dollar signs. Scalar values can be strings, integers or floating point numbers, and Perl will automatically convert between them as required. There is no need to pre-declare your variable types (though you can if you want -- see XXX).

Scalar values can be used in various ways:

    say $animal;
    say "The animal is $animal";
    say "The square of $answer is ", $answer * $answer;

There is a "magic" scalar with the name $_, and it is referred to as the "topic". It's used as the default argument to a number of functions in Perl, and it's set implicitly by certain constructs (so-called "topicalizing" constructs).

    say;          # prints contents of $_ by default
Arrays

Array variables start with an at sign, and they represent lists of values:

    my @animals = ("camel", "llama", "owl");
    my @numbers = (23, 42, 69);
    my @mixed   = ("camel", 42, 1.23);

Arrays are zero-indexed. Here's how you get at elements in an array:

    say @animals[0];              # prints "camel"
    say @animals[1];              # prints "llama"

The numeric index of the last element of an array can by found with @array.end:

    say @animals[@animals.end];   # prints "owl"

However, negative indices count backwards from the end of the list, so that could also have been written:

    say @animals[-1];             # prints "owl"

To find the number of elements in an array, use the elems method:

    say @mixed.elems;       # prints 3

To get multiple values from an array:

    @animals[0,1];                  # gives ("camel", "llama");
    @animals[0..2];                 # gives ("camel", "llama", "owl");
    @animals[1..@animals.end];      # gives all except the first element

This is called an "array slice".

You can do various useful things to lists:

    my @sorted    = @animals.sort;
    my @backwards = @numbers.reverse;

There are a couple of special arrays too, such as @*ARGS (the command line arguments to your script) and @_ (the arguments passed to a subroutine, if formal parameters are not declared). These are documented in XXX.

Hashes

Hash variables start with a percent sign, and represent sets of key/value pairs:

    my %fruit_color = ("apple" => "red", "banana" => "yellow");

You can use whitespace and the => operator to lay them out more nicely:

    my %fruit_color = (
        apple  => "red",
        banana => "yellow",
    );

To get at hash elements:

    %fruit_color{"apple"};           # gives "red"

You can get at lists of keys and values with keys() and values().

    my @fruits = %fruit_colors.keys;    # ("apple", "banana")
    my @colors = %fruit_colors.values;  # ("red", "yellow")

Hashes have no particular internal order, though you can sort the keys and loop through them.

Just like special scalars and arrays, there are also special hashes. The most well known of these is %*ENV which contains environment variables. Read all about it (and other special variables) in XXX.

Scalars, arrays and hashes are documented more fully in perldata.

More complex data types can be constructed using references, which allow you to build lists and hashes within lists and hashes.

A reference is a scalar value and can refer to any other Perl data type. So by storing a reference as the value of an array or hash element, you can easily create lists and hashes within lists and hashes. The following example shows a 2 level hash of hash structure using anonymous hash references.

    my $variables = {
        scalar  =>  {
                     description => "single item",
                     sigil => '$',
                    },
        array   =>  {
                     description => "ordered list of items",
                     sigil => '@',
                    },
        hash    =>  {
                     description => "key/value pairs",
                     sigil => '%',
                    },
    };

    print "Scalars begin with a $variables{'scalar'}{'sigil'}\n";

Exhaustive information on the topic of references can be found in perlreftut, perllol, perlref and perldsc.

Variable scoping

[XXX Please check and correct this... just my best attempt --aufrank]

Variables in Perl 6 are put into one of several namespaces based on where and how they are declared. The basic scopes available are global, package, and lexical.

    use GLOBAL <$FOO>;  # globally-scoped $FOO
    $*FOO;              # same as above
    our $bar;           # package-scoped $bar
    my $baz;            # lexically-scoped $baz

Globally-scoped variables can be used anywhere within the current package, and can be used by any packages that use the current package. Package-scoped variables can be used anywhere in the current package, but cannot be accessed from outside the package. Lexically-scoped variables can only be used within the scope of the nearest enclosing braces.

    our $foo;
    sub bar {
        $foo++;         # OK, package-scoped $foo

        my $baz;        # lexically-scoped $baz
    }
    $baz++;             # incorrect!  $baz is not in the package scope

By default, variables declared in the root of the package are package scoped. By default, variables declared within a block are lexically scoped. This includes variables declared in classes, objects, methods, subroutines, anonymous closures, rules, and conditional and looping constructs.

There are also a number of specialized scopes available. Many of these scopes are declared and accessed through the use of secondary sigils, or 'twigils'.

    $foo               # ordinary scoping
    $.foo              # object attribute accessor
    $^foo              # self-declared formal parameter
    $*foo              # global variable
    $+foo              # environmental variable
    $?foo              # compiler hint variable
    $=foo              # pod variable
    $!foo              # explicitly private attribute (mapped to $foo though)

Most variables with twigils are implicitly declared or assumed to be declared in some other scope, and don't need a "my" or "our". Attribute variables are declared with "has", though, and environment variables are declared somewhere in the dynamic scope with the "env" declarator.

By default, a subroutine or method hides variables declared within its lexical scope (with the exception of $_ ). Variables declared within a subroutine or method cannot usually be accessed when the subroutine or method is called. To change this, declare the variable with env.

    [XXX Probably should include an example declaring and accessing an ENV
    variable]

Variables within certain namespaces can be accessed through that namespace's symbol table. Available symbol tables (or 'pseudo-packages') include:

    MY           # lexical variables
                 # declared with 'my'
    OUR          # package variables
                 # declared with 'our'
    GLOBAL       # global scoped variables
                 # declared with 'use GLOBAL' or $* twigil
    ENV          # environmental variables
                 # declared with 'env' or $+
    OUTER        # the immediately surrounding 'MY' scope
    CALLER       # the lexical scope of the caller
    SUPER        # ???
    COMPILING    # the compile-time scope of a variable, often used in macros
                 # declared with $?

Conditional and looping constructs

Perl has most of the usual conditional and looping constructs. The are usually written as construct-name condition { ... }

The conditions can be any Perl expression. See the list of operators in the next section for information on comparison and boolean logic operators, which are commonly used in conditional statements.

if
    if condition {
        ...
    } elsif ( other condition ) {
        ...
    } else {
        ...
    }

There's also a negated version of it:

    unless condition {
        ...
    }

This is provided as a more readable version of if not condition.

... is the "yada yada yada" operator. It is used as a place holder, which prints out a warning if executed.

Note that the braces are required in Perl, even if you've only got one line in the block. However, there is a clever way of making your one-line conditional blocks more English like:

    # the traditional way
    if $zippy {
        print "Yow!";
    }

    # the Perlish post-condition way
    print "Yow!" if $zippy;
    print "We have no bananas" unless $bananas;
while
    while condition {
        ...
    }

There's also a negated version, for the same reason we have unless:

    until condition {
        ...
    }

You can also use while in a post-condition:

    print "LA LA LA\n" while 1;          # loops forever
loop

The loop functions exactly like the C's <for> It is rarely needed in Perl since Perl provides the more friendly list scanning for.

    loop (my $i=0; $i <= $max; $i++) {
        ...
    }

Can be expressed like:

    my $i=0;
    while($i <= $max; ) {
        ...;
        $i++;
    }

If you want to create a neverending loop use loop without any arguments

    loop {
        ... 
    }
for
    for @array {
        print "This element is $_\n";
    }

    # you don't have to use the default $_ either...
    for %hash.keys -> $key {
        print "The value of $key is %hash{$key}\n";
    }

Built-in operators and functions

Perl comes with a wide selection of builtin functions. Some of the ones we've already seen include "print", "sort" and "reverse". A list of them is given at the start of perlfunc and you can easily read about any given function by using "perldoc -f functionname".

Perl operators are documented in full in perlop, but here are a few of the most common ones:

Arithmetic + addition - subtraction * multiplication / division ** exponentiation % modulo

Numeric comparison == equality != inequality < less than > greater than <= less than or equal => greater than or equal

String comparison eq equality ne inequality lt less than gt greater than le less than or equal ge greater than or equal

(Why do we have separate numeric and string comparisons? Because we don't have special variable types, and Perl needs to know whether to sort numerically (where 99 is less than 100) or alphabetically (where 100 comes before 99).

Smart comparison ~~ smart match (see smartmatch) !~~ negated smart match

Boolean logic && and || or ! not

("and", "or" and "not" aren't just in the above table as descriptions of the operators -- they're also supported as operators in their own right. They're more readable than the C-style operators, but have different precedence to "&&" and friends. Check perlop for more detail.)

Miscellaneous = assignment ~ string concatenation x string multiplication .. range operator (creates a list of items)

Many of the operators can be combined with an "=" as follows: $a += 1; # means $a = $a + 1; $a -= 1; # means $a = $a - 1; $a ~= " "; # means $a = $a ~ " ";

Files and I/O

Rules

Perl's rules are called regular expressions in other languges. Perl's regular expression support is both broad and deep, and is the subject of lengthy documentation in XXX, and elsewhere. However, in short:

Simple matching
    if /foo/       { ... }  # true if $_ contains "foo"
    if $a ~~ /foo/ { ... }  # true if $a contains "foo"

The // matching operator is documented in XXX. It operates on $_ by default, or can be bound to another variable using the ~~ smart match operator (also documented in XXX).

Simple substitution
    s/foo/bar/;               # replaces foo with bar in $_
    $a .= s/foo/bar/;         # replaces foo with bar in $a
    $a .= s/foo/bar/g;        # replaces ALL INSTANCES of foo with bar in $a

The s/// substitution operator is documented in XXX.

More complex regular expressions

You don't just have to match on fixed strings. In fact, you can match on just about anything you could dream of by using more complex regular expressions. These are documented at great length in XXX, but for the meantime, here's a quick cheat sheet:

    .                   a single character
    \N                  non-newline character
    \s                  a whitespace character (space, tab, newline)
    \S                  non-whitespace character
    \d                  a digit (0-9)
    \D                  a non-digit
    \w                  a word character (a-z, A-Z, 0-9, _)
    \W                  a non-word character
    <[aeiou]>           matches a single character in the given set
    <-[aeiou]>          matches a single character outside the given set
    [foo|bar|baz]       matches any of the alternatives specified
    (foo|bar|baz)       matches and captures any of the alternatives specified
    <foo>               matches the rule foo
    ^                   start of string
    $                   end of string

Quantifiers can be used to specify how many of the previous thing you want to match on, where "thing" means either a literal character, one of the metacharacters listed above, or a group of characters or metacharacters in parentheses.

    *                   zero or more of the previous thing
    +                   one or more of the previous thing
    ?                   zero or one of the previous thing
    **{3}               matches exactly 3 of the previous thing
    **{3 .. 6}          matches between 3 and 6 of the previous thing

Some brief examples:

    /^\d+/              string starts with one or more digits
    /^$/                nothing in the string (start and end are adjacent)
    /[\d\s]{3}/         string contains three digits, each followed by a
                        whitespace character (eg "3 4 5 ")
    /[a.]+/             matches a string in which every odd-numbered letter 
                        is a (eg "abacadaf")

    # This loop reads from STDIN, and prints non-blank lines:
    for (=<>) {
        next if /^$/;
        print;
    }
Parentheses for capturing

Additionaly to grouping, parentheses serve a second purpose. They can be used to capture the results of parts of the regexp match for later use. The results end up in $0, $1 and so on. Note that those variables are numbered from 0 not 1;

    # a cheap and nasty way to break an email address up into parts

    if ($email ~~ /(<-[@]>+)@(.+)/) {
        print "Username is $0\n";
        print "Hostname is $1\n";
    }
Other regexp features

Perl rules also support named rules, grammars, backreferences, lookaheads, and all kinds of other complex details. Read all about them in XXX.

Writing subroutines

Object oriented Perl 6

Using third-party modules

AUTHOR

Kirrily "Skud" Robert <skud@cpan.org> Shmarya <shmarya.rubenstein@gmail.com> Pawel Murias <13pawel@gazeta.pl>