A calculation like "the factorial of a number" may be used several times in a large program. Subroutines allow this kind of functionality to be abstracted into a unit. It's a benefit for code reuse and maintainability. Even though PASM is just an assembly language for a virtual processor, it has a number of features to support high-level subroutine calls. PIR offers a smoother interface to those features.
PIR provides several different sets of syntax for subroutine calls. This is a language designed to implement other languages, and every language does subroutine calls a little differently. What's needed is a set of building blocks and tools, not a single prepackaged solution.
As we mentioned in the previous chapter, Parrot defines a set of calling conventions for externally visible subroutines. In these calls, the caller is responsible for preserving its own registers, and arguments and return values are passed in a predefined set of Parrot registers. The calling conventions use the Continuation Passing Style to pass control to subroutines and back again.
The fact that the Parrot calling conventions are clearly defined also makes it possible to provide some higher-level syntax for them. Manually setting up all the registers for each subroutine call isn't just tedious, it's also prone to bugs introduced by typos. PIR's simplest subroutine call syntax looks much like a high-level language. This example calls the subroutine _fact with two arguments and assigns the results to $I0 and $I1:
($I0, $I1) = _fact(count, product)
This simple statement hides a great deal of complexity. It generates a subroutine object and stores it in P0. It assigns the arguments to the appropriate registers, assigning any extra arguments to the overflow array in P3. It also sets up the other registers to mark whether this is a prototyped call and how many arguments it passes of each type. It calls the subroutine stored in P0, saving and restoring the top half of all register frames around the call. And finally, it assigns the result of the call to the given temporary register variables (for a single result you can drop the parentheses). If the one line above were written out in basic PIR it would be something like:
newsub P0, .Sub, _fact
I5 = count
I6 = product
I0 = 1
I1 = 2
I2 = 0
I3 = 0
I4 = 0
savetop
invokecc
restoretop
$I0 = I5
$I1 = I6
The PIR code actually generates an invokecc opcode internally. It not only invokes the subroutine in P0, but also generates a new return continuation in P1. The called subroutine invokes this continuation to return control to the caller.
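The role of the return continuation can be modelled outside Parrot. The following is a minimal Python sketch under stated assumptions, not Parrot's implementation: closures stand in for continuations, and the helper names `invokecc`, `fact`, and `return_continuation` are chosen for illustration only. The caller creates a fresh return continuation and hands it to the callee, which gives control back by invoking it.

```python
# Conceptual model only: Python closures stand in for Parrot continuations.
# The names invokecc, fact and return_continuation are illustrative, not Parrot APIs.

def fact(count, product, return_continuation):
    """Callee: computes count! * product, then 'returns' by invoking the continuation."""
    while count > 1:
        product *= count
        count -= 1
    return_continuation(product)


def invokecc(sub, *args):
    """Caller side: create a fresh return continuation, then invoke the sub."""
    results = []

    def return_continuation(*values):
        # Control comes back to the caller here, carrying the return values.
        results.extend(values)

    sub(*args, return_continuation)
    return results


print(invokecc(fact, 5, 1))
```

In this model the callee never uses a native `return` to deliver its value; everything it hands back travels through the continuation it was given, which is the essence of the Continuation Passing Style mentioned earlier.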
The single line subroutine call is incredibly convenient, but it isn't always flexible enough. So PIR also has a more verbose call syntax that is still more convenient than manual calls. This example pulls the subroutine _fact out of the global symbol table and calls it:
find_global $P1, "_fact"
.begin_call
    .arg count
    .arg product
    .call $P1
    .result $I0
.end_call
The whole chunk of code from .begin_call to .end_call acts as a single unit. The .begin_call directive can be marked as prototyped or unprototyped, which corresponds to the flag I0 in the calling conventions. The .arg directive sets up arguments to the call. The .call directive saves top register frames, calls the subroutine, and restores the top registers. The .result directive retrieves return values from the call.
In addition to syntax for subroutine calls, PIR provides syntax for subroutine definitions. The .param directive pulls parameters out of the registers and creates local named variables for them:
.param int c
The .begin_return and .end_return directives act as a unit much like the .begin_call and .end_call directives:
.begin_return
    .return p
.end_return
The .return directive sets up return values in the appropriate registers. After all the registers are set up the unit invokes the return continuation in P1 to return control to the caller.
Here's a complete code example that reimplements the factorial code from the previous section as an independent subroutine. The subroutine _fact is a separate compilation unit, assembled and processed after the _main function. Parrot resolves global symbols like the _fact label between different units.
# factorial.pir
.sub _main
    .local int count
    .local int product
    count = 5
    product = 1

    $I0 = _fact(count, product)

    print $I0
    print "\n"
    end
.end

.sub _fact
    .param int c
    .param int p

loop:
    if c <= 1 goto fin
    p = c * p
    dec c
    branch loop
fin:
    .begin_return
    .return p
    .end_return
.end
This example defines two local named variables, count and product, and assigns them the values 5 and 1. It calls the _fact subroutine, passing the two variables as arguments. In the call, the two arguments are assigned to consecutive integer registers, because they're stored in typed integer variables. The _fact subroutine uses .param and the return directives for retrieving parameters and returning results. The final printed result is 120.
You may want to generate a PASM source file for the above example to look at the details of how the PIR code translates to PASM:
$ parrot -o- factorial.pir
The example above could have been written using simple labels instead of separate compilation units:
.sub _main
    $I1 = 5         # counter
    call fact       # same as bsr fact
    print $I0
    print "\n"
    $I1 = 6         # counter
    call fact
    print $I0
    print "\n"
    end

fact:
    $I0 = 1         # product
L1:
    $I0 = $I0 * $I1
    dec $I1
    if $I1 > 0 goto L1
    ret
.end
The unit of code from the fact label definition to ret is a reusable routine. There are several problems with this simple approach. First, the caller has to know to pass the argument to fact in $I1 and to get the result from $I0. Second, neither the caller nor the function itself preserves any registers. This is fine for the example above, because very few registers are used. But if this same bit of code were buried deeply in a math routine package, you would have a high risk of clobbering the caller's register values.
Another disadvantage of this approach is that _main and fact share the same compilation unit, so they're parsed and processed as one piece of code. When Parrot does register allocation, it calculates the data flow graph (DFG) of all symbols, looks at their usage, calculates the interference between all possible combinations of symbols, and then assigns a Parrot register to each symbol. (The operation to calculate the DFG has a quadratic cost or better; it depends on n_lines * n_symbols.) This process is less efficient for large compilation units than it is for several small ones, so it's better to keep the code modular. The optimizer will decide whether register usage is light enough to merit combining two compilation units, or even inlining the entire function.
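To make the idea of interference concrete, here is a small Python sketch. It is not Parrot's allocator: the live ranges are hypothetical, and greedy colouring is a stand-in technique chosen for brevity. Two symbols interfere when their live ranges overlap, and each symbol receives the lowest-numbered register that no interfering symbol already holds.

```python
# Illustrative sketch of interference-based register assignment.
# Live ranges are (first_line, last_line) pairs; all data here is hypothetical.

def build_interference(live_ranges):
    """Two symbols interfere when their live ranges overlap."""
    names = list(live_ranges)
    graph = {name: set() for name in names}
    # Every pair of symbols is compared, so the work grows quadratically.
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            a_start, a_end = live_ranges[a]
            b_start, b_end = live_ranges[b]
            if a_start <= b_end and b_start <= a_end:
                graph[a].add(b)
                graph[b].add(a)
    return graph


def assign_registers(graph):
    """Greedy colouring: give each symbol the lowest register no neighbour uses."""
    registers = {}
    for symbol in graph:
        taken = {registers[n] for n in graph[symbol] if n in registers}
        register = 0
        while register in taken:
            register += 1
        registers[symbol] = register
    return registers


live_ranges = {"count": (1, 6), "product": (2, 8), "tmp": (7, 9)}
graph = build_interference(live_ranges)
print(assign_registers(graph))
```

Because every pair of symbols is compared, the cost of this sketch grows with the square of the symbol count, which gives some intuition for why several small compilation units are cheaper to process than one large one.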
PIR code can include pure PASM compilation units. These are wrapped in the .emit and .eom directives instead of .sub and .end. The .emit directive doesn't take a name; it only acts as a container for the PASM code. These primitive compilation units can be useful for grouping PASM functions or function wrappers. Subroutine entry labels inside .emit blocks have to be global labels:
.emit
_substr:
    ...
    ret
_grep:
    ...
    ret
.eom
PIR provides syntax to simplify writing methods and method calls for object-oriented programming. These calls follow the Parrot calling conventions as well. First we want to discuss namespaces in Parrot.
Namespaces provide a mechanism by which names can be reused. This may not sound like much, but in large, complicated systems, or systems with many included libraries, name collisions can become a big hassle very quickly. Each namespace gets its own area for function names and global variables. This way, you can have multiple functions named create or new or convert, for instance, without having to use Multi-Method Dispatch (MMD), which we will describe later.
Namespaces are specified with the .namespace [] directive. The brackets are themselves not optional, but the keys inside them are. Here are some examples:
.namespace [ ]                  # The root namespace
.namespace [ "Foo" ]            # The namespace "Foo"
.namespace [ "Foo" ; "Bar" ]    # Namespace Foo::Bar
Using semicolons, namespaces can be nested to any arbitrary depth. Namespaces are special types of PMC, so we can access them and manipulate them just like other data objects. We can get the PMC for the root namespace using the get_root_namespace opcode:
$P0 = get_root_namespace
The current namespace, which might be different from the root namespace, can be retrieved with the get_namespace opcode:
$P0 = get_namespace             # get current namespace
$P0 = get_namespace [ "Foo" ]   # get PMC for namespace "Foo"
Once we have a namespace PMC, we can call functions in it, or retrieve global variables from it using the following functions:
$P1 = get_global $S0            # Get global in current namespace
$P1 = get_global [ "Foo" ], $S0 # Get global in namespace "Foo"
$P1 = get_global $P0, $S0       # Get global in $P0 namespace PMC
In the examples above, of course, $S0 contains the string name of the global variable or function from the namespace to find.
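As a rough mental model, nested namespaces behave like a tree of symbol tables. The Python sketch below is an illustration under that assumption, not how Parrot implements namespace PMCs; the `Namespace` class and the `get_global` helper are hypothetical and merely borrow the opcode's name. It shows how the same name, `create`, can live in three namespaces without conflict.

```python
# Conceptual model: each Namespace object stands in for a namespace PMC.
# Namespace, child and get_global are illustrative names, not Parrot APIs.

class Namespace:
    def __init__(self):
        self.children = {}   # nested namespaces, keyed by name
        self.globals = {}    # functions and global variables in this namespace

    def child(self, *keys):
        """Walk (and create on demand) the nested namespace named by keys."""
        ns = self
        for key in keys:
            ns = ns.children.setdefault(key, Namespace())
        return ns


def get_global(namespace, name):
    """Look up a global by its string name in the given namespace."""
    return namespace.globals[name]


root = Namespace()
root.globals["create"] = lambda: "root create"
root.child("Foo").globals["create"] = lambda: "Foo create"
root.child("Foo", "Bar").globals["create"] = lambda: "Foo::Bar create"

print(get_global(root, "create")())
print(get_global(root.child("Foo"), "create")())
print(get_global(root.child("Foo", "Bar"), "create")())
```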
Now that we've discussed namespaces, we can start to discuss object-oriented programming and method calls. The basic syntax is similar to the single line subroutine call above, but instead of a subroutine label name it takes a variable for the invocant PMC and a string with the name of the method:
object."methodname"(arguments)
The invocant can be a variable or register, and the method name can be a literal string, string variable, or method object register. This tiny bit of code sets up all the registers for a method call and makes the call, saving and restoring the top half of the register frames around the call. Internally, the call is a callmethodcc opcode, so it also generates a return continuation.
This example defines two methods in the Foo class. It calls one from the main body of the subroutine and the other from within the first method:
.sub _main
    .local pmc class
    .local pmc obj
    newclass class, "Foo"       # create a new Foo class
    new obj, "Foo"              # instantiate a Foo object
    obj."_meth"()               # call obj."_meth" which is actually
    print "done\n"              # "_meth" in the "Foo" namespace
    end
.end

.namespace [ "Foo" ]            # start namespace "Foo"

.sub _meth :method              # define Foo::_meth global
    print "in meth\n"
    $S0 = "_other_meth"         # method names can be in a register too
    self.$S0()                  # self is the invocant
.end

.sub _other_meth :method        # define another method
    print "in other_meth\n"     # as above Parrot provides a return
.end                            # statement
Each method call looks up the method name in the symbol table of the object's class. Like .pccsub in PASM, .sub makes a symbol table entry for the subroutine in the current namespace.
When a .sub is declared as a method, it automatically creates a local variable named self and assigns it the object passed in P2.
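The lookup-then-invoke behaviour described here can be mimicked in a few lines of Python. This is only a sketch under simplifying assumptions: a dictionary per class stands in for the class's namespace, objects are plain dictionaries, and `call_method` is a hypothetical helper rather than anything Parrot provides. The method is found by name in the invocant's class namespace and then called with the invocant as `self`.

```python
# Conceptual model only: one dict per class stands in for the class namespace.
# call_method, namespaces and log are illustrative names, not Parrot APIs.

namespaces = {"Foo": {}}
log = []


def call_method(invocant, method_name, *args):
    """Find the method by name in the invocant's class namespace and invoke it."""
    method = namespaces[invocant["class"]][method_name]
    return method(invocant, *args)   # the invocant becomes 'self'


def meth(self):
    log.append("in meth")
    name = "_other_meth"             # the method name can be held in a variable
    call_method(self, name)


def other_meth(self):
    log.append("in other_meth")


# Registering the functions mirrors a .sub adding itself to the namespace.
namespaces["Foo"]["_meth"] = meth
namespaces["Foo"]["_other_meth"] = other_meth

obj = {"class": "Foo"}
call_method(obj, "_meth")
print(log)
```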
You can pass multiple arguments to a method and retrieve multiple return values just like a single line subroutine call:
(res1, res2) = obj."method"(arg1, arg2)