
This chapter details the architecture and internal workings of Parrot, the interpreter behind Perl 6. Parrot is a register-based, bytecode-driven, object-oriented, multithreaded, dynamically typed, self-modifying, asynchronous interpreter. While that's an awful lot of buzzwords, the design fits together remarkably well.

Core Design Principles

Three main principles drive the design of Parrot--speed, abstraction, and stability.

Speed is a paramount concern. Parrot absolutely must be as fast as possible, since the engine effectively imposes an upper limit on the speed of any program running on it. It doesn't matter how efficient your program is or how clever your program's algorithms are if the engine it runs on limps along. While Parrot can't make a poorly written program run fast, it could make a well-written program run slowly, a possibility we find entirely unacceptable.

Speed encompasses more than just raw execution time. It extends to resource usage. It's irrelevant how fast the engine can run through its bytecode if it uses so much memory in the process that the system spends half its time swapping to disk. While we're not averse to using resources to gain speed benefits, we try not to use more than we need, and to share what we do use.

Abstraction means that the design limits how much anyone needs to keep in their head at any one time. This is important because Parrot is conceptually very large, as you'll see when you read the rest of the chapter. There's a lot going on, too much to keep the whole thing in mind at once. The design is such that you don't have to remember what everything does and how it all works. This is true regardless of whether you're writing code that runs on top of Parrot or working on one of its internal subsystems.

Parrot also uses abstraction boundaries as places to cheat for speed. As long as it looks like an abstraction is being completely fulfilled, it doesn't matter if it actually is being fulfilled, something we take advantage of in many places within the engine. For example, variables are required to be able to return a string representation of themselves, and each variable type has a "give me your string representation" function we can call. That lets each class have custom stringification code, optimized for that particular type. The engine has no idea what goes on beneath the covers and doesn't care--it just knows to call that function when it needs the string value of a variable. Objects are another good case in point--while they look like nice, clean black boxes on the surface, under the hood we cheat profoundly.

Stability is important for a number of reasons. We're building the Parrot engine to be a good backend for many language compilers to target, so we must maintain a stable interface that lets compiled programs continue to run as time goes by. We're also working hard to make Parrot a good interpreter for embedded languages, so we must expose a stable interface to anyone who wants to embed us. Finally, we want to avoid one of the problems Perl 5 has had over the years: backward-incompatible changes to the interpreter have sometimes forced C extensions to be recompiled after an upgrade, which is annoying at best and potentially fraught with danger.

Parrot's Architecture

The Parrot system is divided into four main parts, each with its own specific task. Figure 8-1 shows the parts, and the way source code and control flow through Parrot. Each of the four parts is covered briefly here, with the features and parts of the interpreter covered in more detail afterward.

The flow starts with source code, which is passed into the parser module. The parser processes that source into a form that the compiler module can handle. The compiler module takes the processed source and emits bytecode, which Parrot can directly execute. That bytecode is passed into the optimizer module, which processes the bytecode and produces bytecode that is hopefully faster than what the compiler emitted. Finally, the bytecode is handed off to the interpreter module, which interprets the bytecode. Since compilation and execution are so tightly woven in Perl, the control may well end up back at the parser to parse more code.

Parrot's compiler module also has the capability to freeze bytecode to disk and read that frozen bytecode back again, bypassing the parser and compilation phases entirely. The bytecode can be directly executed, or handed to the optimizer to work on before execution. This may happen if you've loaded in a precompiled library and want Parrot to optimize the combination of your code and the library code. The bytecode loader is interesting in its own right, and also warrants a small section.

Parser

The parser module is responsible for taking source code in and turning it into an Abstract Syntax Tree (AST). An AST is a digested form of the program, one that's much more amenable to manipulation. In some systems this task is split into two parts--the lexing and the parsing--but since the tasks are so closely bound, Parrot combines them into a single module.

Lexing (or tokenizing) turns a stream of characters into a stream of tokens. It doesn't assign any meaning to those tokens--that's the job of the parser--but it is smart enough to see that $a = 1 + 2; is composed of 6 tokens ($, a, =, 1, +, and 2).
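
A toy tokenizer makes the idea concrete. The following is an illustrative Perl sketch, not Parrot's lexer; the token patterns are assumptions chosen just to split the example above.

  # Minimal tokenizer sketch: split a tiny expression into tokens.
  my $source = '$a = 1 + 2';
  my @tokens;
  while ($source =~ /\G\s*([\$=+]|[A-Za-z_]\w*|\d+)/gc) {
      push @tokens, $1;
  }
  print join(' ', @tokens), "\n";    # $ a = 1 + 2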

Parsing is the task of taking the tokens that the lexer has found and assigning some meaning to them. Sometimes the parsed output can be directly executed.

Parsing can be a chore, as anyone who's done it before knows. In some cases it can be downright maddening--Perl 5's parser has over ten thousand lines of C code. Utility programs such as lex and yacc are often used to automate the generation of parser code, and Perl 5 itself uses a yacc-processed grammar to handle some of the task of parsing Perl code. yacc can handle only part of the task, though. As the quote goes, "The task of parsing Perl is divided between lex, yacc, smoke, and mirrors."

Rather than going with a custom-built parser for each language, Parrot provides a general-purpose parser built on top of Perl 6's grammar engine, with hooks for calling out to special-purpose code where necessary. Perl 6 grammars are designed to be powerful enough to handle parsing Perl, so it made good sense to leverage the engine as a general-purpose parser. Parrot also provides utility code to transform a yacc grammar into a Perl 6 grammar, so languages that already use yacc can be moved over to Parrot's parser with a minimum of fuss. This lets you describe a language with a yacc grammar instead of a Perl 6 grammar: many languages already have their grammars described with yacc, and a yacc grammar is sometimes a more natural way to describe things.

Parrot does support independent parsers for cases where the Perl 6 grammar engine isn't the appropriate choice. A language might already have an existing parser available, or different techniques might be in order. The Perl 5 parsing engine may get embedded this way, as it's easier to embed a quirky existing parser than it is to recreate all the quirks in a new parser.

Compiler

The compiler module takes the AST that the parser generates and turns it into code that the interpreter engine can execute. This translation is very straightforward. It involves little more than flattening the AST and running the flattened tree through a series of substitutions.

The compiler is the least interesting part of Parrot. It transforms one machine representation of your program--the AST that the parser generated--into another machine representation of your program--the bytecode that the interpreter needs. It's little more than a simple, rule-based filter module, albeit one that's necessary for Parrot to understand your source code.

For many languages the parser and compiler are essentially a single unit. Like the parser, the compiler is pluggable, so you can load in your own compiler. When not using the Perl 6 grammar engine, the compiler and parser modules will usually be loaded together. Parrot itself comes with two compiler modules, for Parrot assembly and for PIR (Parrot Intermediate Representation; see Chapter 11). It's likely many compilers will actually emit either assembly or PIR code, rather than directly emitting bytecode.

Optimizer

The optimizer module takes the AST that the parser generated and the bytecode that the compiler generated, and transforms the bytecode to make it run faster.

Optimizing code for dynamic languages such as Perl, Python, and Ruby is an interesting task. The languages are so dynamic that the optimizer can't be sure how a program will actually run. For example, the code:

  $a = 0;
  for (1..10000) {
      $a++;
  }

looks straightforward enough. The variable $a starts at 0, gets incremented ten thousand times, and has an end value of 10000. A standard optimizer would turn that code into the single line:

  $a = 10000;

and remove the loop entirely. Unfortunately, that's not necessarily appropriate for Perl. $a could easily be tied, perhaps representing the position of some external hardware. If incrementing the variable ten thousand times smoothly moves a stepper motor from 0 to 10000 in increments of one, just assigning a value of 10000 to the variable might whip the motor forward in one step, damaging the hardware. A tied variable might also keep track of the number of times it has been accessed or modified. Either way, optimizing the loop away changes the semantics of the program in ways the original programmer didn't want.

Because of the potential for active or tied data, especially for languages as dynamically typed as Perl, optimizing is a non-trivial task. Other languages, such as C or Pascal, are more statically typed and lack active data, so an aggressive optimizer is in order for them. Breaking out the optimizer into a separate module allows us to add in optimizations piecemeal without affecting the compiler. There's a lot of exciting work going into the problem of optimizing dynamic languages, and we fully expect to take advantage of it where we can.

Optimization is potentially an expensive operation, another good reason to have it in a separate module. Spending ten seconds optimizing a program that will run in five seconds is a huge waste of time when using Perl's traditional compile-and-go model--optimizing the code will make the program run slower. On the other hand, spending ten seconds to optimize a program makes sense if you save the optimized version to disk and use it over and over again. Even if you save only one second per program run, it doesn't take long for the ten-second optimization time to pay off. The default is to optimize heavily when freezing bytecode to disk and lightly when running directly, but this can be changed with a command-line switch.

Perl 5, Python, and Ruby all lack a robust optimizer (outside their regular expression engines), so any optimizations we add will increase their performance. This, we feel, is a good thing.

Interpreter

The interpreter module is the part of the engine that executes the generated bytecode. Calling it an interpreter is something of a misnomer, since Parrot's core includes both a traditional bytecode interpreter module as well as a high-performance JIT engine, but you can consider that an implementation detail.

All the interesting things happen inside the interpreter, and the remainder of the chapter is dedicated to the interpreter and the functions it provides. It's not out of line to consider the interpreter as the real core of Parrot, and to consider the parser, compiler, and optimizer as utility modules whose ultimate purpose is to feed bytecode to the interpreter.

Bytecode Loader

The bytecode loader isn't part of our block diagram, but it is interesting enough to warrant brief coverage.

The bytecode loader handles loading in bytecode that's been frozen to disk. The Parrot bytecode loader is clever enough to handle loading in Parrot bytecode regardless of the sort of system that it was saved on, so we have cross-platform portability. You can generate bytecode on a 32-bit x86 system and load it up on a 64-bit Alpha or SPARC system without any problems.

The bytecode loading system also has a heuristic engine built into it, so it can identify the bytecode format it's reading. Parrot can not only tell what sort of system its own bytecode was generated on, so it can process it properly, but can also identify bytecode generated for other bytecode-driven systems, such as .NET, the JVM, and the Z-machine (the interpreter for Infocom text adventures, such as Zork and The Lurking Horror).

In addition to loading in bytecode, the loader is sufficiently clever to recognize source files for any language that has a registered compiler. It loads and compiles that source as if it were frozen bytecode.

Together with Parrot's loadable opcode library system (something we'll talk about later), this gives Parrot the capability to load in foreign bytecode formats and transform them into something Parrot can execute. With a sophisticated enough loader, Parrot can load and execute Java and .NET bytecode and present Java and .NET library code to languages that generate native Parrot bytecode. This is something of a happy accident. The original purpose of the architecture was to allow Parrot to load and execute Z-machine bytecode, but happy accidents are the best kind.

The Interpreter

The interpreter is the engine that actually runs the code emitted by the parser, compiler, and optimizer modules. The Parrot execution engine is a virtual CPU done completely in software. We've drawn on research in CPU and interpreter design over the past forty years to try and build the best engine to run dynamic languages.

That emphasis on dynamic languages is important. We are not trying to build the fastest C, Forth, Lisp, or Prolog engine. Each class of languages has its own quirks and emphasis, and no single engine will handle all the different types of languages well. Trying to design an engine that works equally well for all languages will get you an engine that executes all of them poorly.

That doesn't mean that we've ignored languages outside our area of primary focus--far from it. We've worked hard to make sure that we can accommodate as many languages as possible without compromising the performance of our core language set. We feel that even though we may not run Prolog or Scheme code as fast as a dedicated engine would, the flexibility Parrot provides to mix and match languages more than makes up for that.

Parrot's core design is that of a register-rich CISC CPU, like many of the CISC machines of the past such as the VAX, Motorola 68000, and IBM System/3x0. Many of Parrot's basic instructions perform complex operations. It also bears some resemblance to modern RISC CPUs such as the IBM Power series and the Alpha (formerly HP's, formerly Compaq's, formerly Digital's), as it does all its operations on data in registers. Using a core design similar to older systems gives us decades of compiler research to draw on. Most compiler research since the early 1970s deals with targeting register systems of one sort or another.

Using a register architecture as the basis for Parrot goes against the current trends in virtual machines, which favor stack-based approaches. While a stack approach is simpler to implement, a register system provides a richer set of semantics. It's also just more pleasant for us assembly old-timers to write code for. Combined with the decades of sophisticated compiler research, we feel that it's the correct design decision.

Registers

Parrot has four basic types of registers: PMC, string, integer, and floating-point, one for each of the core data types in Parrot. PMCs, short for Parrot Magic Cookies, are the structures that represent high level variables such as arrays, hashes, scalars, and objects. We separate the register types for ease of implementation, garbage collection, and space efficiency. Since PMCs and strings are garbage-collectable entities, restricting what can access them--strings in string registers and PMCs in PMC registers--makes the garbage collector a bit faster and simpler. Having integers and floats in separate register sets makes sense from a space standpoint, since floats are normally larger than integers.

The current Parrot architecture provides 32 of each register type, for a total of 128 registers. While this may seem like overkill, compensating for running out of registers can be a significant speed hit, so it's in our best interests to make sure it happens rarely. 32 is a good compromise between performance and memory usage.

Stacks

Parrot has seven separate stacks, each with a specific purpose. Each of the four register sets has its own stack for quickly saving register contents. There is a separate stack dedicated to saving and restoring individual integers, which the regular expression system uses heavily. The control stack keeps track of control information, exception handlers, and other such things. Finally, the general-purpose typed stack stores individual values.

The backing stacks for the register sets are somewhat special. Operations on the register stacks don't act on single registers; the engine pushes and pops entire register sets in one operation. This may seem unusual, but it makes the primary use of these stacks, saving registers across function calls, very fast. A save or restore is essentially a single memory copy, an operation that's highly optimized just about everywhere. (The SPARC processor, for example, has a cache-friendly memory copy as a core operation.) The integer stack is specifically designed to hold integers. Since it doesn't have to be general purpose, integer stack operations can be faster than operations on the general-purpose stack, a speed gain the regular expression code takes advantage of: as regular expressions move back and forth within strings, they use the integer stack heavily to manage backtracking information.

The control stack is private to the interpreter, so user code can't directly access it. The interpreter engine uses it to manage exception handlers, return locations for function calls, and track other internal data. User code can inspect the stack through Parrot's introspective features.

Finally, the general-purpose stack is used to save and restore individual registers. It's a typed stack, so it doesn't allow you to do things like push an integer register onto the stack and pop the value into a string register. For compiled code, this stack is used if a routine needs more than 32 registers of the same type. The extra values are pushed on and popped off the stack in an operation called register spilling. This stack is also used when Parrot runs code designed for a stack machine such as the JVM or .NET. Stack-based code is less efficient than register-based code, but we can still run it.

All of Parrot's stacks are segmented--they're composed of a set of stack pieces instead of a single chunk of memory. Segmenting has a small performance impact, but it allows us to make better usage of available memory. Traditional stacks are composed of a single chunk of memory, since this makes it faster to read from and write to the stack. Usually when you run off the end of that chunk of memory your program crashes. To avoid this, most systems allocate a large stack. This isn't much of a problem if you have only a single stack, but it doesn't work well in today's multithreaded world, where each thread has to have its own stack.

Another pleasant benefit of segmenting the stacks is that it makes supporting coroutines and continuations much easier. It is much easier to save off part of a segmented stack. Combined with Parrot's copy-on-write features, this makes for efficient continuations and coroutines. It may not be a feature that many folks will use, but it's a pleasant fall-out from other things.

Interestingly, while Parrot's stacks look and act like stacks in all but the most extreme circumstances, they're really trees. Each subroutine (and potentially each block, as they're occasionally the same thing) gets a fresh stack frame, linked to the stack of its caller. Those stack frames will be cleaned up by the garbage collector when there are no outstanding references to them, though it's not guaranteed to happen immediately.

Strings

Text data is deceptively complex, so Parrot has strings as a fundamental data type. We do this out of sheer practicality. We know strings are complex and error-prone, so we implement them only once. All languages that target Parrot can share the same implementation, and don't have to make their own mistakes.

The big problem with text is the vast number of human languages and the variety of conventions around the world for dealing with it. Long ago, 7-bit ASCII, with its 128 characters, was sufficient. Computers were limited and mostly used in English, regardless of the user's native language. These heavy restrictions were acceptable because the machines of the day were so limited that any other option was too slow, and because most people using computers at the time were fluent in English, either as their native language or as a comfortable second language.

That day passed quite a few years ago. Many different ways of representing text have sprung up, from the various multibyte Japanese and Chinese representations--designed for languages with many thousands of characters--to a half dozen or so European representations, which take only a byte but disagree on what characters fit into that byte. The Unicode consortium has been working for years on the Unicode standard to try and unify all the different schemes, but full unification is still years away, if it ever happens.

In the abstract, strings are a series of integers with meaning attached to them, but getting from real-world data to abstract integers isn't as simple as you might want. There are three important things associated with string data--encoding, character set, and language--and Parrot's string system knows how to deal with them.

A string's encoding says how to turn data from a stream of bytes to a stream of characters represented by integers. Something like ASCII data is simple to deal with, since each character is a single byte with a value from 0 to 127. UTF-8, one of the Unicode encodings, is more complex--a single character can take anywhere from 1 to 6 bytes.
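
Perl 5's Encode module (nothing to do with Parrot, but handy for illustration) shows the difference between characters and the bytes of a particular encoding:

  use Encode qw(encode);

  # One character each, but 1, 2, and 3 bytes respectively in UTF-8.
  my $a_byte  = encode('UTF-8', 'A');          # 0x41
  my $e_acute = encode('UTF-8', "\x{00E9}");   # 0xC3 0xA9
  my $snowman = encode('UTF-8', "\x{2603}");   # 0xE2 0x98 0x83
  printf "%d %d %d\n", length($a_byte), length($e_acute), length($snowman);   # 1 2 3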

The character set for a string tells Parrot what each of the integers actually represents. Parrot won't get too far if it doesn't know that 65 is a capital "A" in an ASCII or Unicode character stream, for example.

Finally, the language for a string determines how the string behaves in some contexts. Different languages have different rules for sorting and case-folding characters. Whether an accented character keeps its accent when upper-cased or lowercased depends on the language that the string came from.

The capability to translate strings from one encoding to another or one character set to another, and to determine when such translation is needed, is built into Parrot. The I/O and regular expression systems fully exploit Parrot's core string capabilities, so any language that uses Parrot's built-in string functionality gets this for free. Since properly implementing even a single system like Unicode is fraught with peril, this makes the job of people writing languages that target Parrot (including Perl 6) much easier.

While Parrot provides these facilities, languages aren't required to make use of them. Perl 6, for example, generally mandates that all strings will be treated as if they are Unicode. In this case Parrot's multi-lingual capabilities mainly act as filters to translate to and from Unicode. Parrot presents all the data as if it were Unicode, but only translates non-Unicode data to Unicode in situations where your program may notice.

Unicode is Parrot's character set of last resort when it needs one. We use IBM's ICU Unicode library to do all the heavy lifting, since writing a properly done Unicode library is a non-trivial undertaking. It makes more sense to use a well-tested and debugged library than it does to try and reimplement Unicode again.

Variables

Variables are a fundamental construct in almost all computer languages. (Functional languages are the exception, though variables can be useful there as well.) With low-level languages such as C, variables are straightforward--they are either basic hardware constructs like a 32-bit integer, a 64-bit IEEE floating-point number, or the address of some location in memory, or they're a structure containing basic hardware constructs. Exchanging variables between low-level languages is simple because all the languages operate on essentially the same things.

Once you get to higher-level languages, variables get more interesting. OO languages have the concept of the object as a fundamental construct, but no two OO languages seem to agree on exactly how objects should behave or how they should be implemented. Then there are higher-level languages like Perl, with complex constructs like hashes, arrays, and polymorphic scalars as fundamental constructs.

The first big issue that Parrot had to face was implementing these constructs. The second was doing it in a way that allowed Perl code to use Ruby objects, Ruby code to use Python objects, and Lisp code to use both. Parrot's solution is the PMC, or Parrot Magic Cookie.

A PMC is an abstract variable and a base data type--the same way that integers and floating-point numbers are base data types for hardware CPUs. The languages we're working to support--Perl, Python, and Ruby--have base variables that are far more complex than just an integer or floating-point number. If we want them to exchange any sort of real data, they must have a common base variable type. Parrot provides that with the PMC construct. Each language can build on this common base. More importantly, each language can make sure that their variables behave properly regardless of which language is using them.

When you think about it, there is a large list of things that a variable should be able to do. You should, for example, be able to load or store a value, add or subtract it from another variable, call a method or set a property on it, get its integer or floating-point representation, and so on. What we did was make a list of these functions and make them mandatory.

Each PMC has a vtable (short for "virtual table") attached to it. This table of function pointers is fixed--the list of functions, and where they are in the table, is the same for each PMC. All the common operations a program might perform on a variable--as well as all the operators that might be overloaded for a PMC--have vtable entries.
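
In Perl terms, a vtable works much like a hash of code references where every type fills in the same fixed set of slots. This is only an analogy; Parrot's vtables are C-level structures, and the slot names below are invented for illustration.

  # A fixed set of slots that every "type" must fill in (names invented).
  my %int_vtable = (
      get_string => sub { my ($val) = @_; "$val" },
      get_number => sub { my ($val) = @_; $val + 0 },
      add        => sub { my ($x, $y) = @_; $x + $y },
  );

  # The caller never looks behind the slot; it just calls through it.
  print $int_vtable{get_string}->(42), "\n";   # "42"
  print $int_vtable{add}->(40, 2), "\n";       # 44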

Bytecode

Like any CPU, software, or hardware, Parrot needs a set of instructions to tell it what to do. For hardware, this is a stream of executable code or machine language. For Parrot, this is bytecode. Calling it bytecode isn't strictly accurate, since the individual instructions are 32 bits each rather than 8 bits each, but since it's the common term for most other virtual machines, it's the term we use.

Each instruction--also known as an opcode--tells the interpreter engine what to do. Some opcodes are very low level, such as the one to add two integers together. Others are significantly more complex, like the opcode to take a continuation.

Parrot's bytecode is designed to be directly executable. The code on disk can be run by the interpreter without needing any translation. This gets us a number of benefits. Loading is much faster, of course, since we don't have to do much (if any) processing on the bytecode as it's loaded. It also means we can use some special OS calls that map a file directly into the memory space of a process. Because of the way the operating system handles memory-mapped files (conveniently, this works the same way on all the flavors of Unix, Windows, and VMS), the bytecode file will be loaded into the system's memory only once, no matter how many processes use the file. This can save a significant amount of real RAM on server systems. Files loaded this way also get their parts loaded on demand. Since we don't need to process the bytecode in any way to execute it, if you map in a large bytecode library file, only those bits of the file your program actually executes will get read in from disk. This can save a lot of time.

Parrot creates bytecode in a format optimized for the platform it's built on, since the common case by far is executing bytecode that's been built on the system you're using. This means that floating-point numbers are stored in the current platform's native format, integers are in the native size, and both are stored in the byte order for the current platform. Parrot does have the capability of executing bytecode that uses 32-bit integers and IEEE floating-point numbers on any platform, so you can build and ship bytecode that can be run by anyone with a Parrot interpreter.

If you do use a bytecode file that doesn't match the current platform's requirements (perhaps the integers are a different size), Parrot automatically translates the bytecode file as it reads it in. In this case, Parrot does have to read in the entire file and process it. The sharing and load speed benefits are lost, but it's a small price to pay for the portability. Parrot ships with a utility to turn a portable bytecode file into a native format bytecode file if the overhead is too onerous.

I/O, Events, and Threads

Parrot has comprehensive support for I/O, threads, and events. These three systems are interrelated, so we'll treat them together. The systems we talk about in this section are less mature than other parts of the engine, so they may change by the time we roll out the final design and implementation.

I/O

Parrot's base I/O system is fully asynchronous I/O with callbacks and per-request private data. Since this is massive overkill in many cases, we have a plain vanilla synchronous I/O layer that your programs can use if they don't need the extra power.

Asynchronous I/O is conceptually pretty simple. Your program makes an I/O request. The system takes that request and returns control to your program, which keeps running. Meanwhile the system works on satisfying the I/O request. When the request is satisfied, the system notifies your program in some way. Since there can be multiple requests outstanding, and you can't be sure exactly what your program will be doing when a request is satisfied, programs that make use of asynchronous I/O can be complex.

Synchronous I/O is even simpler. Your program makes a request to the system and then waits until that request is done. There can be only one request in process at a time, and you always know what you're doing (waiting) while the request is being processed. It makes your program much simpler, since you don't have to do any sort of coordination or synchronization.

The big benefit of asynchronous I/O systems is that they generally have a much higher throughput than a synchronous system. They move data around much faster--in some cases three or four times faster. This is because the system can be busy moving data to or from disk while your program is busy processing data that it got from a previous request.

For disk devices, having multiple outstanding requests--especially on a busy system--allows the system to order read and write requests to take better advantage of the underlying hardware. For example, many disk devices have built-in track buffers. No matter how small a request you make to the drive, it always reads a full track. With synchronous I/O, if your program makes two small requests to the same track, and they're separated by a request for some other data, the disk will have to read the full track twice. With asynchronous I/O, on the other hand, the disk may be able to read the track just once, and satisfy the second request from the track buffer.

Parrot's I/O system revolves around a request. A request has three parts: a buffer for data, a completion routine, and a piece of data private to the request. Your program issues the request, then goes about its business. When the request is completed, Parrot will call the completion routine, passing it the request that just finished. The completion routine extracts out the buffer and the private data, and does whatever it needs to do to handle the request. If your request doesn't have a completion routine, then your program will have to explicitly check to see if the request was satisfied.
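
In Perl terms, a request is little more than a structure bundling those three parts with the code to run on completion. The sketch below is purely conceptual; the field names and the issue_read routine are invented, and a real asynchronous system would return immediately and call the completion routine later rather than completing the request inline.

  # A request: buffer, completion routine, and per-request private data.
  my $request = {
      buffer  => '',
      private => { filename => 'data.txt' },
      on_done => sub {
          my ($req) = @_;
          print "read ", length($req->{buffer}), " bytes from ",
                $req->{private}{filename}, "\n";
      },
  };

  sub issue_read {                 # stand-in for "issue the request"
      my ($req) = @_;
      open my $fh, '<', $req->{private}{filename} or die "open: $!";
      local $/;                    # slurp the whole file
      $req->{buffer} = <$fh>;
      $req->{on_done}->($req);     # "completion" happens here
  }

  issue_read($request);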

Your program can choose to sleep and wait for the request to finish, essentially blocking. Parrot will continue to process events while your program is waiting, so it isn't completely unresponsive. This is how Parrot implements synchronous I/O--it issues the asynchronous request, then immediately waits for that request to complete.

The reason we made Parrot's I/O system asynchronous by default was sheer pragmatism. Network I/O is all asynchronous, as is GUI programming, so we knew we had to deal with asynchrony in some form. It's also far easier to make an asynchronous system pretend to be synchronous than it is the other way around. We could have decided to treat GUI events, network I/O, and file I/O all separately, but there are plenty of systems around that demonstrate what a bad idea that is.

Events

An event is a notification that something has happened: the user has manipulated a GUI element, an I/O request has completed, a signal has been triggered, or a timer has expired. Most systems these days have an event handler (often two or three, which is something of a problem), because handling events is so fundamental to modern GUI programming. Unfortunately, the event handling system is usually poorly integrated with the I/O system, or not integrated at all, leading to nasty code and unpleasant workarounds to keep a program responsive to network, file, and GUI events simultaneously. Parrot presents a unified event handling system, integrated with its I/O system, which makes it possible to write cross-platform programs that work well in a complex environment.

Parrot's events are fairly simple. An event has an event type, some event data, an event handler, and a priority. Each thread has an event queue, and when an event happens it's put into the right thread's queue (or the default thread queue in those cases where we can't tell which thread an event was destined for) to wait for something to process it.

Any operation that would potentially block drains the event queue while it waits, as do a number of the cleanup opcodes that Parrot uses to tidy up on scope exit. Parrot doesn't check each opcode for an outstanding event for pure performance reasons, as that check gets expensive quickly. Still, Parrot generally ensures timely event handling, and events shouldn't sit in a queue for more than a few milliseconds unless event handling has been explicitly disabled.

When Parrot does extract an event from the event queue, it calls that event's handler, if it has one. If the event doesn't have its own handler, Parrot looks for a generic handler for that event type and calls that. If there's no handler for the event type either, Parrot falls back to the default event handler, which throws an exception when it gets an event it doesn't know how to handle. You can override the default event handler if you want Parrot to do something else with unhandled events, perhaps silently discarding them instead.
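
That lookup order can be sketched in a few lines of Perl. This is not Parrot's API; the handler table and event structure are invented to show the per-event, per-type, then default fallback.

  my %type_handlers   = ( io_done => sub { print "I/O finished\n" } );
  my $default_handler = sub { die "unhandled event of type $_[0]{type}\n" };

  sub dispatch_event {
      my ($event) = @_;
      my $handler = $event->{handler}                # the event's own handler...
                 || $type_handlers{ $event->{type} } # ...else one for its type...
                 || $default_handler;                # ...else the default
      $handler->($event);
  }

  dispatch_event({ type => 'io_done' });                                  # type handler fires
  dispatch_event({ type => 'timer', handler => sub { print "tick\n" } }); # own handler fires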

Because events are handled in mainline code, they don't have the restrictions commonly associated with interrupt-level code. It's safe and acceptable for an event handler to throw an exception, allocate memory, or manipulate thread or global state. Event handlers can even acquire locks if they need to, though it's not a good idea to have an event handler block on lock acquisition.

Parrot uses the priority on events for two purposes. First, the priority is used to order the events in the event queue. Events for a particular priority are handled in a FIFO manner, but higher-priority events are always handled before lower-priority events. Parrot also allows a user program or event handler to set a minimum event priority that it will handle. If an event with a priority lower than the current minimum arrives, it won't be handled, instead sitting in the queue until the minimum priority level is dropped. This allows an event handler that's dealing with a high-priority event to ignore lower-priority events.

User code generally doesn't need to deal with prioritized events, and should adjust event priorities with care when it does. Adjusting the default priority of an event, or adjusting the current minimum priority level, is a rare occurrence. It's almost always a mistake to change them, but the capability is there for those rare occasions where it's the correct thing to do.

Signals

Signals are a special form of event, based on the Unix signal mechanism. Parrot presents them as mildly special, as a remnant of Perl's Unix heritage, but under the hood they're not treated any differently from any other event.

The Unix signaling mechanism is something of a mish-mash, having been extended and worked on over the years by a small legion of undergrad programmers. At this point, signals can be divided into two categories: those that are fatal, and those that aren't.

Fatal signals are things like SIGKILL, which unconditionally kills a process, or SIGSEGV, which indicates that your process has tried to access memory that isn't part of its address space. There's no good way for Parrot to catch these signals, so they remain fatal and will kill your process. On some systems it's possible to catch some of the fatal signals, but Parrot code itself operates at too high a level for a user program to do anything with them--they must be handled with special-purpose code written in C or some other low-level language. Parrot itself may catch them in special circumstances for its own use, but that's an implementation detail that isn't exposed to a user program.

Non-fatal signals are things like SIGCHLD, indicating that a child process has died, or SIGINT, indicating that the user has hit ^C on the keyboard. Parrot turns these signals into events and puts them in the event queue. Your program's event handler for the signal will be called as soon as Parrot gets to the event in the queue, and your code can do what it needs to with it.
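
For comparison, this is how plain Perl 5 catches the same signals today; under Parrot the handler would be an ordinary event handler pulled from the queue rather than code called at an arbitrary point in the program.

  # Perl 5 signal handlers for two common non-fatal signals.
  $SIG{INT}  = sub { print "caught SIGINT, cleaning up\n"; exit 0 };
  $SIG{CHLD} = sub { print "a child process exited\n" };

  print "press Ctrl-C within the next 30 seconds\n";
  sleep 30;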

SIGALRM, the timer expiration signal, is treated specially by Parrot. Generated when a timer set with the alarm() system call expires, this signal is normally used to provide timeouts for system calls that would otherwise block forever, which is very useful. The big downside is that on most systems there can be only one outstanding alarm() request, and while you can get around this somewhat with the setitimer call (which allows up to three pending timers), it's still quite limited.
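
The classic Perl 5 idiom shows both the usefulness and the limitation: one timer, used to abort one blocking call.

  # Time out a blocking read after five seconds using the lone alarm() timer.
  my $line = eval {
      local $SIG{ALRM} = sub { die "timed out\n" };
      alarm 5;                # only one alarm can be pending at a time
      my $input = <STDIN>;    # a call that might block forever
      alarm 0;                # cancel the timer
      $input;
  };
  print defined $line ? "got: $line" : "gave up: $@";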

Since Parrot's I/O system is fully asynchronous and never blocks--even what looks like a blocking request still drains the event queue--the alarm signal isn't needed for this. Parrot instead grabs SIGALRM for its own use, and provides a fully generic timer system that allows any number of timer events, each with its own callback function and private data, to be outstanding.

Threads

Threads are a means of splitting a process into multiple pieces that execute simultaneously. It's a relatively easy way to get some parallelism without too much work. Threads don't solve all the parallelism problems your program may have. Sometimes multiple processes on a single system, multiple processes on a cluster, or processes on multiple separate systems are better. But threads do present a good solution for many common cases.

All the resources in a threaded process are shared between threads. This is simultaneously the great strength and great weakness of threads. Easy sharing is fast sharing, making it far faster to exchange data between threads or access shared global data than to share data between processes on a single system or on multiple systems. Easy sharing is dangerous, though, since without some sort of coordination between threads it's easy to corrupt that shared data. And, because all the threads are contained within a single process, if any one of them fails for some reason the entire process, with all its threads, dies.

With a low-level language such as C, these issues are manageable. The core data types (integers, floats, and pointers) are all small enough to be handled atomically. Composite data can be protected with mutexes, special structures that a thread can get exclusive access to. The composite data elements that need protecting can each have a mutex associated with them, and when a thread needs to touch the data it just acquires the mutex first. By default there's very little data that must be shared between threads, so it's relatively easy, barring program errors, to write thread-safe code if a little thought is given to the program structure.

Things aren't this easy for Parrot, unfortunately. A PMC, Parrot's native data type, is a complex structure, so we can't count on the hardware to provide us atomic access. That means Parrot has to provide atomicity itself, which is expensive. Getting and releasing a mutex isn't really that expensive in itself; it has been heavily optimized by platform vendors because they want threaded code to run quickly. It's not free, though, and when you consider that Parrot running flat-out does roughly one PMC operation per 100 CPU cycles, even adding 10 cycles per operation can slow Parrot down by 10%.

For any threading scheme, it's important that your program isn't hindered by the platform and libraries it uses. This is a common problem with writing threaded code in C, for example. Many libraries you might use aren't thread-safe, and if you aren't careful with them your program will crash. While we can't make low-level libraries any safer, we can make sure that Parrot itself won't be a danger. There is very little data shared between Parrot interpreters and threads, and access to all the shared data is done with coordinating mutexes. This is invisible to your program, and just makes sure that Parrot itself is thread-safe.

When you think about it, there are really three different threading models. In the first one, multiple threads have no interaction among themselves. This essentially does with threads the same thing that's done with processes. This works very well in Parrot, with the isolation between interpreters helping to reduce the overhead of this scheme. There's no possibility of data sharing at the user level, so there's no need to lock anything.

In the second threading model, multiple threads run and pass messages back and forth between each other. Parrot supports this as well, via the event mechanism. The event queues are thread-safe, so one thread can safely inject an event into another thread's event queue. This is similar to a multiple-process model of programming, except that communication between threads is much faster, and it's easier to pass around structured data.

In the third threading model, multiple threads run and share data between themselves. While Parrot can't guarantee that data at the user level remains consistent, it can make sure that access to shared data is at least safe. We do this with two mechanisms.

First, Parrot presents an advisory lock system to user code. Any piece of user code running in a thread can lock a variable. Any attempt to lock a variable that another thread has locked will block until the lock is released. Locking a variable only blocks other lock attempts. It does not block plain access. This may seem odd, but it's the same scheme used by threading systems that obey the POSIX thread standard, and has been well tested in practice.
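
Perl 5's ithreads use the same advisory scheme, which makes a convenient illustration: lock() blocks only other lock() calls, and unsynchronized access remains possible (though unwise).

  use threads;
  use threads::shared;

  my $counter :shared = 0;

  sub bump {
      for (1 .. 1000) {
          lock($counter);          # advisory lock, released at the end of this block
          $counter++;
      }
  }

  my @workers = map { threads->create(\&bump) } 1 .. 4;
  $_->join for @workers;
  print "$counter\n";              # 4000, because every writer took the lock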

Secondly, Parrot forces all shared PMCs to be marked as such, and all access to shared PMCs must first acquire that PMC's private lock. This is done by installing an alternate vtable for shared PMCs, one that acquires locks on all its parameters. These locks are held only for the duration of the vtable function, but ensure that the PMCs affected by the operation aren't altered by another thread while the vtable function is in progress.

Objects

Perl 5, Perl 6, Python, and Ruby are all object-oriented languages in some form or other, so Parrot has to have core support for objects and classes. Unfortunately, all these languages have somewhat different object systems, which made the design of Parrot's object system somewhat tricky. (As I write this it's still in progress, though it should be done by the time this book is in print.) It turns out that if you draw the abstraction lines in the right places, support for the different systems is easily possible. This is especially true if you provide core support for things like method dispatch that the different object systems can use and override.

Generic Object Interfacing

Parrot's object system is very simple--in fact, a PMC only has to handle method calls to be considered an object. Just handling methods covers well over 90% of the object functionality that most programs use, since the vast majority of object access is via method calls. This means that user code that does the following:

  object = some_constructor(1, 2, "foo");  
  object.bar(12);

will work just fine, no matter what language the class that backs object is written in, if object even has a class backing it. It could be Perl 5, Perl 6, Python, Ruby, or even Java, C#, or Common Lisp; it doesn't matter.

Objects may override other functionality as well. For example, Python objects use the basic PMC property mechanism to implement object attributes. Both Python and Perl 6 mandate that methods and properties share the same namespace, with methods overriding properties of the same name.

Parrot Objects

When we refer to Parrot objects we're really talking about Parrot's default base object system. Any PMC can have methods called on it and act as an object, and Parrot is sufficiently flexible to allow for alternate object systems, such as the one Perl 5 uses. In this section, though, we're talking about what we provide in our standard object system. Parrot's standard object system is pretty traditional--it's a class-based system with multiple inheritance, interface declarations, and slot-based objects.

Each object is a member of a class, which defines how the object behaves. Each class in an object's hierarchy can have one or more attributes--that is, named slots that are guaranteed to be in each object of that class. The names are all class-private so there's no chance of collision. Objects are essentially little fixed-sized arrays that know what class they belong to. Most of the "smarts" for an object lives in that object's class. Parrot allows you to add attributes to a class at run time. If you do, all objects with that class in their inheritance hierarchy will get the new attribute added to them. While this is potentially expensive, it's a very useful feature for languages that may extend a class at run time.

Parrot uses a multiple inheritance scheme for classes. Each class can have one or more parent classes, and each of those classes can have multiple parents. A class has control over how methods are searched for, but the default search is a left-most, depth-first search, the same way that Perl 5 does it. Individual class implementers may change this if they wish, but only the class an object is instantiated into controls the search order. Parrot also fully supports correct method redispatch, so a method may properly call the next method in the hierarchy even in the face of multiple parents. One limitation we place on inheritance is that a class is only instantiated in the hierarchy once, no matter how many times it appears in class and parent class inheritance lists.
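
Perl 5's own method resolution shows what a left-most, depth-first search means in practice; Parrot's default search behaves the same way.

  package A;     sub new   { bless {}, shift }
  package B;     our @ISA = ('A');  sub hello { "B::hello" }
  package C;     our @ISA = ('A');  sub hello { "C::hello" }
  package Child; our @ISA = ('B', 'C');

  package main;
  my $obj = Child->new;
  print $obj->hello, "\n";   # "B::hello": B's branch (the left-most parent)
                             # is searched completely before C is considered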

Each class has its own vtable, which all objects of that class share. This means that with the right vtable methods every object can behave like a basic PMC type in addition to an object. For unary operations such as load or store, the default class vtable first looks for the appropriately named method in the class hierarchy. For binary operators such as addition and subtraction, it first looks in the multimethod dispatch table. This is only the default, and individual languages may make different choices. Objects that implement the proper methods can also act as arrays or hashes.

Finally, Parrot implements an interface declaration scheme. You may declare that a class does one or more named interfaces, and later query objects at run time to see if they implement an interface. This doesn't put any methods in a class. For that you need to either inherit from a class that does or implement them by hand. All it does is make a declaration of what your class does. Interface declarations are inheritable as well, so if one of your parent classes declares that it implements an interface then your class will as well. This is used in part to implement Perl 6's roles.

Mixed Class-Type Support

The final piece of Parrot's object system is the support for inheriting from classes of different types. This could be a Perl 6 class inheriting from a Perl 5 class, or a Ruby class inheriting from a .NET class. It could even involve inheriting from a fully compiled language such as C++ or Objective C, if proper wrapping is established. (Objective C is particularly simple here, as it has a fully introspective class system that allows for run-time class creation; inheritance can go both ways between it and Parrot.) As we talked about earlier, as long as a class either descends from the base Parrot class or has a small number of required properties, Parrot can subclass it. This potentially goes both ways, as any class system that knows how to subclass from Parrot's base class can inherit from it.

Allowing classes to inherit from other classes of a different base type does present some interesting technical issues. The inheritance isn't 100% invisible, though you have to head off into the corner cases to find the cracks. It's an important feature to design into Parrot, so we can subclass Perl 5 style classes, and they can subclass Parrot classes. Being able to subclass C++ and Objective C classes is a potential bonus. Python, Ruby, and Perl 6 all share a common (but hidden) base class in Parrot's base object type, so they can inherit from each other without difficulty.

Advanced Features

Since the languages Parrot targets (like Perl and Ruby) have sophisticated concepts as core features, it's in Parrot's best interest to have core support for them. This section covers some (but not all) of these features.

Garbage Collection

It's expected that modern languages have garbage collection built in. The programmer shouldn't have to worry about explicitly cleaning up after dead variables, or even identifying them. For interpreted languages, this requires support from the interpreter engine, so Parrot provides that support.

Parrot has two separate allocation systems built into it. Each allocation system has its own garbage collection scheme. Parrot also has some strict rules over what can be referenced and from where. This allows it to have a more efficient garbage collection system.

The first allocation system is responsible for PMC and string structures. These are fixed-sized objects that Parrot allocates out of arenas, which are pools of identically sized things. Using arenas makes it easy for Parrot to find and track them, and speeds up the detection of dead objects.

Parrot's dead object detection system works by first running through all the arenas and marking all strings and PMCs as dead. It then runs through the stacks and registers, marking all strings and PMCs they reference as alive. Next, it iteratively runs through all the live PMCs and strings and marks everything they reference as alive. Finally, it sweeps through all the arenas looking for newly dead PMCs and strings, which it puts on the free list. At this point, any PMC that has a custom destruction routine, such as an object with a DESTROY method, has its destruction routine called. The dead object detector is triggered whenever Parrot runs out of free objects, and can be explicitly triggered by running code. Often a language compiler will force a dead object sweep when leaving a block or subroutine.
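
The mark-and-trace phase can be sketched with ordinary Perl data structures. This toy version bears no resemblance to Parrot's actual arenas; it just shows the mark, trace, and sweep steps in order.

  # Five "objects"; each may reference others and starts out unmarked.
  my @arena = map { { live => 0, refs => [] } } 1 .. 5;
  push @{ $arena[0]{refs} }, $arena[1];    # object 0 references object 1
  push @{ $arena[1]{refs} }, $arena[2];    # object 1 references object 2
  my @roots = ($arena[0]);                 # registers and stacks are the real roots

  $_->{live} = 0 for @arena;               # 1. mark everything dead
  my @todo = @roots;
  while (my $obj = shift @todo) {          # 2. trace everything reachable from the roots
      next if $obj->{live};
      $obj->{live} = 1;
      push @todo, @{ $obj->{refs} };
  }
  my @free = grep { !$_->{live} } @arena;  # 3. sweep: whatever wasn't reached is dead
  print scalar(@free), " of ", scalar(@arena), " objects freed\n";   # 2 of 5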

Parrot's memory allocation system is used to allocate space for the contents of strings and PMCs. Allocations don't have a fixed size; they come from pools of memory that Parrot maintains. Whenever Parrot runs out of memory in its memory pools, it makes a compacting run--squeezing out unused sections from the pools. When it's done, one end of each pool is entirely actively used memory, and the other end is one single chunk of free memory. This makes allocating memory from the pools faster, as there's no need to walk a free list looking for a segment of memory large enough to satisfy the request for memory. It also makes more efficient use of memory, as there's less overhead than in a traditional memory allocation system.

Splitting memory pool compaction from dead object detection has a nice performance benefit for Perl and languages like it. For most Perl programs, the interpreter allocates and reallocates far more memory for string and variable contents than it does actual string and variable structures. The structures are reused over and over as their contents change. With a traditional single-collector system, each time the interpreter runs out of memory it has to do a full scan for dead objects and compact the pools after. With a split system, Parrot can just sweep through the variables it thinks are live and compact their contents. This does mean that Parrot will sometimes move data for variables and strings that are really dead because it hasn't found that out yet. That expense is normally much less than the expense of doing a full tracing run to find out which variables are actually dead.

Parrot's allocation and collection systems have some compromises that make interfacing with low-level code easier. The structure that describes a PMC or string is guaranteed not to move over the lifetime of the string or variable. This allows C code to store pointers to variables in internal structures without worrying that what they're referencing may move. It also means that the garbage collection system doesn't have to worry about updating pointers that C code might hold, which it would have to do if PMC or string structures could move.

Multimethod Dispatching

Multimethod dispatching (also known as signature-based dispatching) is a powerful technique that uses the parameters of a function or method call to help decide at run time what function or method Parrot should call. This is one of the new features being built into Perl 6. It allows you to have two or more subroutines or methods with the same name that differ only in the types of their arguments.

In a standard dispatch system, each subroutine or method name must be unique within a namespace. Attempting to create a second routine with the same name either throws an error or overlays the original one. This is certainly straightforward, but in some circumstances it leads to code that looks like:

  sub foo {
      my ($self, $arg) = @_;
      if ($arg->isa("Foo")) {
          # Do something with a Foo arg
      } elsif ($arg->isa("Bar")) {
          # Do something with a Bar arg
      } elsif ($arg->isa("Baz")) {
          # Do something with a Baz arg
      } else {
          #...
      }
  }

This method effectively dispatches both on the type of the object and on the type of the argument to the method. This sort of thing is common, especially in operator overloading functions. Manually checking the types of the arguments to select an action is both error-prone and difficult to extend. Multimethod dispatch solves this problem.

With multimethod dispatch, there can be more than one method or subroutine with the same name as long as each variant has different parameters in its declaration. When code calls a method or subroutine that participates in multiple dispatch, the system chooses the variant that most closely matches the types of the parameters in the call.
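
A hand-rolled dispatcher in Perl 5 gives the flavor of what the engine does for you. The table and routines below are invented for illustration, and a real dispatcher would also walk the inheritance tree looking for the closest match.

  # One table of variants for "foo", keyed by the argument's class.
  my %foo_variants = (
      Foo => sub { print "foo called with a Foo\n" },
      Bar => sub { print "foo called with a Bar\n" },
  );

  sub foo {
      my ($self, $arg) = @_;
      my $variant = $foo_variants{ ref $arg }
          or die 'no applicable variant of foo for ', ref($arg) || 'a non-reference';
      $variant->($self, $arg);
  }

  my $bar = bless {}, 'Bar';
  foo(undef, $bar);                # picks the Bar variant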

One very notable thing about subs and methods that do multimethod dispatch is that the named subroutines and methods live outside of any namespace. By default, when searching for a method or subroutine, Parrot first looks for an explicit sub or method of that name in the current namespace (or the inheritance hierarchy of an object), then for the default subroutine or method (AUTOLOAD or its equivalent) in the inheritance hierarchy, and only when those fail will it look for a multimethod dispatch version of the subroutine or method. Since Parrot allows individual PMC classes to control how their dispatching is done, this sequence may be changed on a per-class basis if need be.

Parrot itself makes heavy use of multimethod dispatch, with most of the core PMC classes using it to provide operator overloading. The only reason we don't use it for all our operator dispatching is that some of the languages we're interested in require a left-side wins scheme. It's so heavily used for operator overloading, in fact, that we actually have two separate versions of multiple dispatch built into Parrot, one specially tailored to operator overloading and a more general version for normal subroutine and method dispatch.

Continuations

Continuations are possibly the most powerful high-level flow control construct. Originating in the lambda calculus, and built into Lisp over thirty years ago, a continuation can be thought of as a closure for control flow. Continuations not only capture their lexical scope, which gets restored when they're invoked, but also capture their call stack, so when they're invoked it's as if you never left the spot where they were created. Like closures, though, they capture the variables in scope when the continuation is taken, not the values of those variables. Invoking a continuation isn't like rolling back a transaction.

Continuations are phenomenally powerful, and have an undeserved reputation for being bizarre and mind-warping; in practice they turn out to be neither. Originally we put continuations into Parrot to support Ruby, which has them. That decision turned out to be fortuitous.

In a simple call/return system, which many languages use, making a subroutine call pushes the return address onto a stack somewhere. When the subroutine is done it takes the address off the stack and returns there. This is a simple, straightforward, and quite fast operation. The one disadvantage is that, in a secure system, the calling routine needs to preserve any important information before making the call and restore it on return.

An alternative calling scheme is called Continuation Passing Style, or CPS. With CPS, rather than pushing a return address onto the stack you create a return continuation and pass that into the subroutine as a parameter. When the subroutine is done it invokes the return continuation, effectively returning to the caller with the caller's environment automatically restored. This includes not only things like the call stack and lexical variables, but also meta-information like security credentials.
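
To make the contrast concrete, here is a minimal Perl 5 sketch in which an ordinary code reference stands in for the return continuation; a real Parrot return continuation also restores registers, lexical pads, and the rest of the caller's environment:

  # Direct style: the result comes back to the caller via return.
  sub add_direct {
      my ($x, $y) = @_;
      return $x + $y;
  }

  # Continuation-passing style: instead of returning, the sub
  # invokes the continuation it was handed as an extra argument.
  sub add_cps {
      my ($x, $y, $k) = @_;
      $k->($x + $y);
  }

  add_cps(2, 3, sub { print "got $_[0]\n" });   # prints "got 5"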

When we were originally designing Parrot we'd planned on the simpler call/return style, with the caller preserving everything important before the call, and restoring it afterwards. Three things soon became clear: we were saving and restoring a lot of individual pieces; we were going to have to add new pieces in the future; and there wasn't any difference between what we were doing for a call and what we were doing for a continuation, except that the call was a lot more manual.

The future-proofing was what finally made the decision. Parrot is making a strong guarantee of backward compatibility, which means that code compiled to Parrot bytecode once we've released will run safely and unchanged on all future versions of Parrot. If we require all the individual pieces of the environment (registers, lexical pads, nested namespaces, opcode libraries, stack pointers, exception handlers, and assorted other things) to be saved manually for a subroutine call, we can't add any new pieces in the future, as old code would then no longer work properly. We briefly toyed with the idea of an opcode to package up the entire environment in one go. Then we realized that such a package was a continuation, and as such we might as well go all the way and use them.

As a result, Parrot implements a full CPS system internally, and uses it for all subroutine and method calls. We also have the simpler call/return style of flow control available for languages that don't need the heavier-weight call system, as well as for compilers to use for internal processing and optimization. We do go to some lengths to hide the continuations. PIR code, for example, allows compiler writers to create subroutines and methods (and calls to them) that conform to Parrot's CPS mechanism without ever touching continuations directly. We then have the benefits of what appears to be a simple calling scheme, secure future-proofing, and the full power of continuations for languages that want them.

Coroutines

A coroutine is a subroutine or method that can suspend itself partway through, then later pick up where it left off. This isn't quite the same thing as a continuation, though it may seem so at first. Coroutines are often used to implement iterators and generators, as well as threads on systems that don't have native threading support. Since they are so useful, and since Perl 6 and Python provide them either directly or as generators, Parrot has support for them built in.
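
For contrast, here's a minimal Perl 5 sketch of a generator faked with a closure (make_counter is just an illustrative name). The counter's state survives between calls, but control always re-enters at the top of the anonymous sub; a true coroutine would instead resume exactly where it last suspended, with its control state intact.

  # A generator faked with a closure: $n survives between calls,
  # but there is no suspended control flow to resume.
  sub make_counter {
      my $n = shift;
      return sub { return $n++ };
  }

  my $next = make_counter(1);
  print $next->(), "\n" for 1 .. 3;   # prints 1, 2, 3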

Coroutines present some interesting technical challenges. Calling into an existing coroutine requires reestablishing not only the lexical state, and potentially the hypothetical state of variables, but also the control state for just that routine. In the presence of exceptions they're a bit more complex than plain subroutines and continuations, but they're still very useful, and as such we have full support for them.

Conclusion

We've touched on much of Parrot's core functionality, but certainly not all. Hopefully we've given you enough of a feel for how Parrot works to expand your knowledge with the Parrot documentation and source.
