
Opcodes and Runcores

The smallest executable component is not the compilation unit or even the subroutine, but the opcode. Opcodes in PASM, like opcodes in other assembly languages, are individual instructions that implement low-level operations in Parrot. In the world of microprocessors, the word "opcode" typically refers to the numeric identifier for each instruction. The human-readable word used in the associated assembly language is called the "mnemonic". An assembler, among other tasks, is responsible for converting mnemonics into opcodes for execution. In Parrot, instead of referring to an instruction by different names depending on what form it's in, we just call them all "opcodes". Of course, the list of things that qualify as "low-level" in Parrot can be pretty advanced compared to the functionality supplied by regular assembly language opcodes.

Before we talk about opcodes, we need to spend a little time talking about the various runcores that invoke them.

Runcores

During execution, the runcore is the heart of Parrot. The runcore calls the various opcodes with the correct data and makes sure that program flow moves properly. Some runcores, such as the precomputed goto runcore, are optimized for speed and don't perform many tasks beyond finding and dispatching opcodes. Other runcores, such as the GC-debug, debug, and profiling runcores, help with typical software maintenance and analysis tasks. We'll talk about all of these throughout the chapter.

Runcores must pass execution to each opcode in the incoming bytecode stream. This is called dispatching the opcodes. Because the different runcores are structured in different ways, the opcodes themselves must be formatted differently. The opcode compiler compiles opcodes into a number of separate formats, depending on which runcores are included in the compiled Parrot. Because of this, understanding opcodes first requires an understanding of the Parrot runcores.

Types of Runcores

Parrot has multiple runcores. Some are useful for particular maintenance tasks, some are only available as optimizations with certain compilers, some are intended for general use, and some are just interesting flights of fancy with no practical benefit. Here we list the various runcores, their uses, and their benefits.

    =item* Slow Core

    The slow core is a basic runcore design that treats each opcode as a separate function at the C level. Each function is called and returns the address of the next opcode to be called by the core. The slow core performs bounds checking to ensure that the next opcode to be called is properly in bounds, and not somewhere random in memory. Because of this modular approach, where opcodes are treated as separate executable entities, many other runcores, especially diagnostic and maintenance cores, are based on this design. The program counter pc is the current index into the bytecode stream. Here is a pseudocode representation of how the slow core works:

      while (1) {
          pc = NEXT_OPCODE;
          if (pc < LOW_BOUND || pc > HIGH_BOUND)
              throw exception;
          DISPATCH_OPCODE(pc);
          UPDATE_INTERPRETER();
      }

    =item* Fast Core

    The fast core is a bare-bones core that doesn't do any of the bounds checking or context updating that the slow core does. The fast core is the way Parrot should normally run; the slow core's checks are used to find and debug places where execution strays outside of its normal bounds. In pseudocode, the fast core is very much like the slow core, except that it doesn't do the bounds checking between each instruction and doesn't update the interpreter's current context for each dispatch.

      while (1) {
          pc = NEXT_OPCODE;
          DISPATCH_OPCODE(pc);
      }

    =item* Switch Core

    As its name implies, the switch core uses a gigantic C switch/case structure to execute opcodes. Here's a brief example of how this architecture works:

      for ( ; ; current_opcode++) {
          switch (*current_opcode) {
              case opcode_1:
                  ...
              case opcode_2:
                  ...
              case opcode_3:
                  ...
          }
      }

    This is quite a fast architecture for dispatching opcodes because everything happens within a single function. The only operations performed between opcodes are a jump back to the top of the loop, incrementing the opcode pointer, dereferencing the opcode pointer, and then a jump to the case statement for the next opcode.

    =item* Computed Goto Core

    Computed goto is a feature of some C compilers where a label is treated as a piece of data that can be stored in an array. Each opcode is simply a label in a very large function, and the labels are stored in an array. Calling an opcode is as easy as taking that opcode's number as an index into the label array and jumping to the associated label. Sound complicated? It is, a little, especially to C programmers who are not used to these kinds of features and who have been taught that the goto keyword is to be avoided.

    As was mentioned earlier, not all compilers support computed goto, which means that this core will not be built on platforms that don't support it. However, it's still an interesting topic to study, so we will look at it briefly here. For compilers that support it, a computed goto label is a void * value. In the computed goto core, all the labels represent different opcodes, so they are stored in an array:

      void *my_labels[] = {
          &&label1,
          &&label2,
          &&label3
      };

      label1:
          ...
      label2:
          ...
      label3:
          ...

    Jumping to one of these labels is done with a command like this:

      goto *my_labels[opcode_number];

    Actually, opcodes are pointed to by an opcode_t * pointer, and all opcodes are stored sequentially in memory, so the actual jump in the computed goto core must increment the pointer and then jump to the new location. In C it looks something like this:

      goto *my_labels[*(current_opcode += 1)];

    Each opcode is an index into the array of labels. At the end of each opcode, either an instruction like this is performed to move to the next opcode in the series, or else some kind of control flow occurs that moves execution to a non-sequential location:

      goto *my_labels[*(current_opcode = destination)];

    These are simplifications of what really happens in this core, because the actual code has been optimized quite a bit beyond what is presented here. However, as we shall see with the precomputed goto core, it isn't optimized as aggressively as is possible.

    =item* Precomputed Goto Core

    The precomputed goto core is a heavily optimized fast core that uses the same computed goto feature, but performs the array dereferencing before the core even starts. In the computed goto core, you have this operation to move to the next opcode:

      goto *my_labels[*(current_opcode += 1)];

    This single line of code is deceptively complex. A number of machine code operations must be performed to complete this step: the value of current_opcode must be incremented, and that value must be dereferenced to find the opcode number. In C, arrays are pointers, so my_labels gets dereferenced and offset by the opcode number to find the stored label reference. That label reference is then dereferenced, and the jump is performed.

    That's a lot of steps to execute before we can jump to the next opcode. Now, what if each opcode value were replaced with the value of its jump label beforehand? If current_opcode already points to a label reference, we don't need the array at all. We can replace that entire mess above with this line:

      goto **(current_opcode += 1);

    That's far fewer machine instructions to execute before we can move to the next opcode, which means faster throughput. Remember that whatever dispatch mechanism is used will be called after every single opcode, and some large programs may have millions of opcodes! Every machine instruction that can be cut out of the dispatch mechanism can increase the execution speed of Parrot in a significant and noticeable way. The precomputed goto core, while not available with all compilers, takes this idea of optimization to the extreme.

    =item* Tracing Core

    =item* Profiling Core

    The profiling core analyzes the performance of Parrot and helps to determine where the bottlenecks and trouble spots are in the programs that run on top of Parrot. When Parrot calls a PIR subroutine, it sets up the environment, allocates storage for the passed parameters and the return values, passes the parameters, and calls a new runcore to execute the subroutine. To calculate the amount of time that each subroutine takes, we need to measure the time spent in each runcore from when the core begins to when it finishes. The profiling core does exactly this, acting very much like the slow core but also measuring the time it takes for the core to complete. The profiling core keeps track of a few additional values as well, including the number of GC cycles run while in the subroutine, the number of times each opcode is called, and the number of calls made to each subroutine. All this information is helpfully printed to the STDERR output for later analysis.

    =item* GC Debug Core

    Parrot's garbage collector was known as a weakness in the system for several years. In fact, the garbage collector and memory management subsystem was one of the last systems to be improved and rewritten before the release of version 1.0. It's not that garbage collection isn't important, but that it was so hard to get right earlier in the project.

    Early on, when the GC was such a weakness, and later, when the GC was under active development, it was useful to have an operational mode that would really exercise the GC and find bugs that could otherwise hide by sheer chance. The GC debug runcore is this tool. The core executes a complete collection cycle between every single opcode. The throughput performance is terrible, but that's not the point: it's almost guaranteed to find problems in the memory system if they exist.

    =item* Debug Core

    The debug core works like a normal software debugger, such as GDB. The debug core executes each opcode, and then prompts the user to enter a command. These commands can be used to continue execution, step to the next opcode, or examine and manipulate data from the executing program.

Opcodes

Opcodes are the smallest logical execution element in Parrot. An individual opcode corresponds, in an abstract way, to a single machine code instruction for a particular hardware processor architecture. The difference is that Parrot's opcodes can perform some very complex and high-level tasks. Also, Parrot's opcodes can be dynamically loaded from a special library file called a dynop library. We'll talk about dynops a little bit later.

Opcode naming

To PIR and PASM programmers, opcodes appear to be polymorphic; that is, some opcodes appear to accept multiple argument formats. This is just an illusion, however: Parrot opcodes are not polymorphic, although certain features make them appear that way. Different argument list formats are detected during parsing and translated into separate, unique opcode names.

Opcode Multiple Dispatch

Writing Opcodes

Writing opcodes, like writing PMCs, is done in a C-like language that is later compiled into C code by the opcode compiler. Opcode script is a thin overlay on top of ordinary C code: all valid C code is valid opcode script, with a few neat additions that make writing opcodes easier. This script is very similar to the one used to define PMCs. The INTERP constant, for instance, is always available in opcodes, just as it is in VTABLE and METHOD declarations. Unlike VTABLEs and METHODs, opcodes are defined with the op keyword.

Opcodes are written in files with the .ops extension. The core operation files are stored in the src/ops/ directory.
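To make the discussion concrete, here is a sketch of what a simple opcode definition in opcode script might look like. The body and signature here are illustrative assumptions modeled on the conventions described in this chapter, not copied from Parrot's real .ops files, which carry additional flags and details:

```
op add(out INT, in INT, in INT) {
    $1 = $2 + $3;
}
```

The numbered variables $1, $2, and $3 refer to the opcode's declared parameters in order, and the body between the braces is ordinary C code.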

Opcode Parameters

Each opcode can take a fixed number of input and output arguments. These arguments can be any of the four primary data types--INTVALs, PMCs, NUMBERs, and STRINGs--but can also be one of several other types of values, including LABELs, KEYs, and INTKEYs.

Each parameter can be an input, an output or both, using the in, out, and inout keywords respectively. Here is an example:

  op Foo (out INT, in NUM)

This opcode could be called like this:

  $I0 = Foo $N0     # in PIR syntax
  Foo $I0, $N0      # in PASM syntax

When Parrot parses the file and sees the Foo operation, it converts it to the real name Foo_i_n. The real name of an opcode is its name followed by an underscore-separated, ordered list of the parameters to that opcode. This is how Parrot appears to use polymorphism: it translates an overloaded opcode's common name into a longer, unique name depending on the parameter list of that opcode. Here is a list of some of the variants of the add opcode:

  add_i_i      # $I0 += $I1
  add_n_n      # $N0 += $N1
  add_p_p      # $P0 += $P1
  add_i_i_i    # $I0 = $I1 + $I2
  add_p_p_i    # $P0 = $P1 + $I0
  add_p_p_n    # $P0 = $P1 + $N0

This isn't a complete list, but you should get the picture. Each different combination of parameters translates to a different unique operation, and each operation is remarkably simple to implement. In some cases, Parrot can even use its multi-method dispatch system to call opcodes that are heavily overloaded, or for which there is no exact fit but the parameters could be coerced into different types to complete the operation. For instance, attempting to add a STRING to a PMC might coerce the string into a numerical type first and then dispatch to the add_p_p_n opcode. This is just an example, and the exact mechanisms may change as more opcodes are added or old ones are deleted.

Opcode Control Flow

Some opcodes have the ability to alter the control flow of the program they are in. There are a number of control behaviors that can be implemented, such as the unconditional jump of the goto opcode, the subroutine call of the call opcode, or the conditional behavior implemented by if.

At the end of each opcode you can call a goto operation to jump to the next opcode to execute. If no goto is performed, control flow continues as normal to the next operation in the program. In this way, opcodes can easily manipulate control flow. Opcode script provides a number of keywords to alter control flow:

  • NEXT()

    NEXT contains the address of the next opcode in memory. You don't need to call goto NEXT() at the end of every opcode, because the default behavior is to jump to the next opcode in the program automatically. You can do this explicitly if you really want to, but it wouldn't gain you anything. The NEXT keyword is frequently used in places like the invoke opcode to create a continuation to the next opcode, to return to after the subroutine returns.
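    As a sketch of how such keywords appear in opcode script, a jump and a conditional branch might look something like the following. The OFFSET() keyword and the exact op bodies here are illustrative assumptions in the style of this chapter, not copied from Parrot's real core ops:

```
op branch(in LABEL) {
    goto OFFSET($1);      /* jump relative to the current position */
}

op if(in INT, in LABEL) {
    if ($1)
        goto OFFSET($2);  /* taken branch */
    goto NEXT();          /* fall through to the following opcode */
}
```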

The Opcode Compiler

Dynops


