TITLE

The Parrot Bytecode Format

Format of the Parrot bytecode

  0          1          2          3
  +----------+----------+----------+----------+
  | Wordsize | Byteorder|  Major   |  Minor   |
  +----------+----------+----------+----------+

Wordsize is given in bytes and must be at least 4 (32 bits). The loader is responsible for transforming the file into the VM's native wordsize on the fly. For performance, a utility should be provided to convert PBC files on disk if they cannot be recompiled.

Byteorder currently supports two values: 0 (little-endian) and 1 (big-endian).
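
As an illustration of the first four header bytes, here is a minimal Python sketch. The names parse_header_prefix and HeaderPrefix are hypothetical and are not part of Parrot; the checks simply mirror the constraints stated above (Wordsize at least 4, Byteorder 0 or 1).

```python
from typing import NamedTuple


class HeaderPrefix(NamedTuple):
    """The four single-byte fields at offsets 0..3 (hypothetical container)."""
    wordsize: int
    byteorder: int
    major: int
    minor: int


def parse_header_prefix(data: bytes) -> HeaderPrefix:
    """Read and sanity-check the first four bytes of a bytecode file."""
    if len(data) < 4:
        raise ValueError("need at least 4 header bytes")
    wordsize, byteorder, major, minor = data[0], data[1], data[2], data[3]
    if wordsize < 4:
        raise ValueError(f"wordsize {wordsize} is below the minimum of 4")
    if byteorder not in (0, 1):
        raise ValueError(f"unknown byteorder {byteorder}")
    return HeaderPrefix(wordsize, byteorder, major, minor)
```

Because these fields are single bytes, they can be read before the byte order of the rest of the file is known.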

  4          5
  +----------+----------+----------+----------+
  |  Flags   | FloatType| Pad - For future use|
  +----------+----------+----------+----------+
  |           Pad - For future use            |
  +----------+----------+----------+----------+
  |           Pad - For future use            |
  +----------+----------+----------+----------+

  16 
  +----------+----------+----------+----------+
  |         Parrot Magic = 0x013155a1         |
  +----------+----------+----------+----------+

The magic number is stored in the native byte order of the machine that wrote the file. The loader uses the Byteorder header field to convert the magic number to its own byte order before verifying it. More specifically, ALL words (non-bytes) in the bytecode file are stored in the writer's native order, unless otherwise specified.
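
The magic check can be sketched as follows. This is an illustration only: it assumes 4-byte words (a file with a larger Wordsize would need a different read), takes the magic value from the diagram above, and uses the hypothetical helper names read_word and check_magic.

```python
import struct

PARROT_MAGIC = 0x013155A1  # value from the header diagram
MAGIC_OFFSET = 16          # byte offset of the magic word in the header


def read_word(data: bytes, offset: int, byteorder: int) -> int:
    """Read one 4-byte word using the file's declared byte order."""
    fmt = "<I" if byteorder == 0 else ">I"  # 0 = little-endian, 1 = big-endian
    return struct.unpack_from(fmt, data, offset)[0]


def check_magic(data: bytes) -> bool:
    """Return True if the magic word matches once byte order is accounted for."""
    byteorder = data[1]  # the Byteorder field is the second header byte
    return read_word(data, MAGIC_OFFSET, byteorder) == PARROT_MAGIC
```

The same read_word approach applies to every other word in the file, since all words share the byte order declared in the header.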

  20
  +----------+----------+----------+----------+
  |         Opcode Type (Perl = 0x5045524c)   |
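
As a small worked example, the bytes of the Opcode Type value shown above spell "PERL" in ASCII when read from most significant to least significant. Whether other opcode types follow the same four-character convention is an assumption; the function name opcode_type_tag is hypothetical.

```python
import struct

OPCODE_TYPE_PERL = 0x5045524C  # value from the diagram


def opcode_type_tag(value: int) -> str:
    """Render a 32-bit opcode type as its four ASCII characters."""
    return struct.pack(">I", value).decode("ascii")


print(opcode_type_tag(OPCODE_TYPE_PERL))  # prints "PERL"
```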
  +----------+----------+----------+----------+

For each segment:

  4, 4 + (4 + S0), 4 + (4 + S0) + (4 + S1)
  +----------+----------+----------+----------+
  |       Segment length in bytes (S)         |
  +----------+----------+----------+----------+
  |                                           |
  :        S bytes of segment content         :
  |                                           |
  +----------+----------+----------+----------+

Currently there are three segment types defined, and they must occur in precisely this order: FIXUP, CONSTANT TABLE, BYTE CODE. Every segment must be present, even if empty.
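
The length-prefixed segment layout can be walked with a loop like the one below. This is a sketch under stated assumptions: words are 4 bytes, the caller supplies the byte offset at which the first segment begins (the start parameter), and split_segments is a hypothetical name.

```python
import struct

# Segment order required by the format.
SEGMENT_NAMES = ("FIXUP", "CONSTANT TABLE", "BYTE CODE")


def split_segments(data: bytes, start: int, byteorder: int = 0) -> dict:
    """Return a mapping of segment name to raw segment content."""
    fmt = "<I" if byteorder == 0 else ">I"
    segments = {}
    offset = start
    for name in SEGMENT_NAMES:
        if offset + 4 > len(data):
            raise ValueError(f"missing length word for {name} segment")
        (length,) = struct.unpack_from(fmt, data, offset)
        offset += 4
        if offset + length > len(data):
            raise ValueError(f"{name} segment is truncated")
        segments[name] = data[offset:offset + length]  # may be empty
        offset += length
    return segments
```

An empty segment still contributes its 4-byte length word (with value 0), which is why all three names always appear in the result.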

FIXUP SEGMENT

  << The format for the FIXUP segment is not yet defined. >>

CONSTANT TABLE SEGMENT

  0 (relative)
  +----------+----------+----------+----------+
  |            Constant Count (N)             |
  +----------+----------+----------+----------+

For each constant:

  +----------+----------+----------+----------+
  |             Constant Type (T)             |
  +----------+----------+----------+----------+
  |             Constant Size (S)             |
  +----------+----------+----------+----------+
  |                                           |
  |        S bytes of constant content        |
  :       appropriate for representing        :
  |              a value of type T            |
  |                                           |
  +----------+----------+----------+----------+
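
A reader for this table might look like the following sketch. It assumes 4-byte words, treats S as the full number of content bytes to skip (as the diagram shows), and returns each constant as an uninterpreted (type, content) pair because the type codes are not defined here. An empty segment is treated as holding no constants, which is also an assumption. The name parse_constant_table is hypothetical.

```python
import struct


def parse_constant_table(segment: bytes, byteorder: int = 0) -> list:
    """Return a list of (constant_type, raw_content) tuples."""
    if not segment:
        return []  # assumption: an empty segment means no constants
    prefix = "<" if byteorder == 0 else ">"
    (count,) = struct.unpack_from(prefix + "I", segment, 0)
    offset = 4
    constants = []
    for _ in range(count):
        # Each constant starts with two words: its type and its size.
        ctype, size = struct.unpack_from(prefix + "II", segment, offset)
        offset += 8
        content = segment[offset:offset + size]
        if len(content) != size:
            raise ValueError("constant content is truncated")
        constants.append((ctype, content))
        offset += size
    return constants
```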

CONSTANTS

For integer constants:

  << Integer constants are currently represented as manifest constants in
     the byte code stream, limiting them to 32-bit values. >>

For number constants (S is constant, and is equal to sizeof(FLOATVAL)):

  +----------+----------+----------+----------+
  |                                           |
  |             S' bytes of Data              |
  |                                           |
  +----------+----------+----------+----------+

where

  S' = S + ((S % 4) ? (4 - (S % 4)) : 0)

If S' > S, then the extra bytes are filled with zeros.
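
The rounding rule, which pads S up to the next multiple of 4, can be checked with a short sketch. The helper names padded_size and pad_to_word are hypothetical.

```python
def padded_size(size: int, align: int = 4) -> int:
    """Round size up to the next multiple of align (S' in the text)."""
    remainder = size % align
    return size + (align - remainder if remainder else 0)


def pad_to_word(data: bytes) -> bytes:
    """Append zero bytes so the length becomes a multiple of 4."""
    return data + b"\x00" * (padded_size(len(data)) - len(data))


for s in (0, 5, 8, 10):
    print(s, padded_size(s))  # prints: 0 0, 5 8, 8 8, 10 12 (one pair per line)
```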

For string constants (S varies, and is the size of the particular string):

  4, 4 + (16 + S'0), 4 + (16 + S'0) + (16 + S'1)
  +----------+----------+----------+----------+
  |                   Flags                   |
  +----------+----------+----------+----------+
  |                  Encoding                 |
  +----------+----------+----------+----------+
  |                   Type                    |
  +----------+----------+----------+----------+
  |                  Size (S)                 |
  +----------+----------+----------+----------+
  |                                           |
  :             S' bytes of Data              :
  |                                           |
  +----------+----------+----------+----------+

where

  S' = S + ((S % 4) ? (4 - (S % 4)) : 0)

If S' > S, then the extra bytes are filled with zeros.
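
Reading one string constant in this layout could be sketched as below. The sketch assumes 4-byte words and leaves the Flags, Encoding and Type values uninterpreted, since their meanings are not given here. StringConstant and parse_string_constant are hypothetical names.

```python
import struct
from typing import NamedTuple, Tuple


class StringConstant(NamedTuple):
    flags: int
    encoding: int
    string_type: int
    data: bytes  # exactly S bytes, padding removed


def parse_string_constant(buf: bytes, offset: int = 0,
                          byteorder: int = 0) -> Tuple[StringConstant, int]:
    """Return the string constant at offset and the offset just past it."""
    fmt = "<IIII" if byteorder == 0 else ">IIII"
    flags, encoding, string_type, size = struct.unpack_from(fmt, buf, offset)
    padded = size + ((4 - size % 4) % 4)  # S' from the formula above
    start = offset + 16                   # four header words precede the data
    end = start + padded
    if end > len(buf):
        raise ValueError("string constant is truncated")
    data = buf[start:start + size]        # drop the zero padding
    return StringConstant(flags, encoding, string_type, data), end
```

Returning the end offset lets a caller step through consecutive string constants without recomputing the padding.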

BYTE CODE SEGMENT

The pieces that can be found in the byte code segment are as follows:

  +----------+----------+----------+----------+
  |              Operation Code               |
  +----------+----------+----------+----------+

  +----------+----------+----------+----------+
  |             Register Argument             |
  +----------+----------+----------+----------+

  +----------+----------+----------+----------+
  |    Integer Argument (Manifest Constant)   |
  +----------+----------+----------+----------+

  +----------+----------+----------+----------+
  |   String Argument (Constant Table Index)  |
  +----------+----------+----------+----------+

  +----------+----------+----------+----------+
  |   Number Argument (Constant Table Index)  |
  +----------+----------+----------+----------+

The number and types of the arguments for each opcode can be determined by consulting Parrot::Opcode.
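
To show how an opcode word followed by its argument words might be walked, here is a sketch that uses a made-up opcode table. The table HYPOTHETICAL_OPS, its opcode numbers and its operation names are invented for illustration and are NOT real Parrot opcodes; a real decoder would take this information from Parrot::Opcode. Words are assumed to be 4 bytes and are read as unsigned values.

```python
import struct

# Invented for illustration only: opcode number -> (name, argument kinds).
HYPOTHETICAL_OPS = {
    0: ("halt", ()),
    1: ("load_int", ("register", "integer")),
    2: ("print_str", ("string_index",)),
}


def decode(code: bytes, byteorder: int = 0, ops: dict = HYPOTHETICAL_OPS) -> list:
    """Split a byte code segment into (name, [(kind, value), ...]) entries."""
    if len(code) % 4:
        raise ValueError("byte code length is not a multiple of 4")
    fmt = "<I" if byteorder == 0 else ">I"
    words = [w for (w,) in struct.iter_unpack(fmt, code)]
    result = []
    i = 0
    while i < len(words):
        opcode = words[i]
        if opcode not in ops:
            raise ValueError(f"unknown opcode {opcode}")
        name, kinds = ops[opcode]
        args = words[i + 1:i + 1 + len(kinds)]
        if len(args) != len(kinds):
            raise ValueError(f"truncated arguments for {name}")
        result.append((name, list(zip(kinds, args))))
        i += 1 + len(kinds)  # skip the opcode word and its arguments
    return result
```

The stream itself carries no argument counts, so the decoder depends entirely on the opcode table to know how many words to consume.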

SOURCE CODE SEGMENT

Currently there are no utilities that use this segment, even though it is mentioned in some of the early Parrot documents.

Eventually there will be a more complete and useful PackFile specification, but this simple format works well enough for now (c. Parrot 0.0.5).

AUTHOR

Gregor N. Purdy <gregor@focusresearch.com>