Disassemble::X86::FormatTree - Format machine instructions as a tree
use Disassemble::X86;
$d = Disassemble::X86->new(format => "Tree");
This module returns Intel x86 machine instructions as a tree structure, which is suitable for further processing.
The tree consists of hashrefs. There are three common keys, though only op is required:
op
The operation being performed.
size
The size of the result of the operation, in bits.
arg
The arguments being operated on, in a listref. Each argument is normally represented by its own hashref; register and literal nodes hold plain scalars instead.
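Because every node has the same op/size/arg shape, a tree can be walked with one short recursive routine. The following is a minimal sketch, not part of the module: outline() is a hypothetical helper, and the tree is written out by hand from the mov eax,0x1 example in this document rather than produced by the disassembler.

```perl
use strict;
use warnings;

# Hand-built tree for "mov eax,0x1", copied from this document's example.
my $tree = {
    op  => "mov",
    arg => [
        { op => "reg", size => 32, arg => ["eax", "dword"] },
        { op => "lit", size => 32, arg => [0x1] },
    ],
    start => 1234, len => 5, proc => 386,
};

# outline() is a hypothetical helper, not part of Disassemble::X86.
# It returns one line per node or scalar, indented by depth.
sub outline {
    my ($node, $depth) = @_;
    $depth //= 0;
    my $pad = "  " x $depth;

    # Arguments of reg and lit nodes are plain scalars, not hashrefs.
    return ("$pad$node") unless ref $node eq "HASH";

    my $line = $pad . $node->{op};
    $line .= " (size $node->{size})" if defined $node->{size};

    my @lines = ($line);
    push @lines, outline($_, $depth + 1) for @{ $node->{arg} || [] };
    return @lines;
}

my @lines = outline($tree);
print "$_\n" for @lines;
```

The only branch needed is the check for a hashref, since op is the one key guaranteed to be present; size and arg are treated as optional.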
Top-level nodes may also contain the following keys:
start
The starting address of the instruction.
len
The length of the instruction, in bytes.
proc
The minimum processor model required, as described in Disassemble::X86.
prefix
Set to 1 if this node is an opcode prefix such as rep or lock.
The op field commonly contains an opcode mnemonic. However, the following special values may also appear:
reg
A machine register.
lit
A literal numeric value.
mem
A reference to memory.
seg
A segment prefix.
The argument list for a register contains the register name followed by its type. Register types include dword and word for general-purpose registers, seg for segment registers, and fp for floating-point registers. If the register is really part of a larger register, that register's name appears as a third arg.
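The positional layout of a register's argument list can be unpacked into named fields. This is a sketch only: reg_info() is a hypothetical helper, not part of the module, and it assumes the three-element layout described above (name, type, optional enclosing register).

```perl
use strict;
use warnings;

# reg_info() is a hypothetical helper, not part of Disassemble::X86.
# It turns the positional arg list of a reg node into named fields.
sub reg_info {
    my ($node) = @_;
    die "not a register node\n"
        unless ref $node eq "HASH" && $node->{op} eq "reg";

    my ($name, $type, $parent) = @{ $node->{arg} };
    return +{
        name   => $name,
        type   => $type,
        size   => $node->{size},
        # With no third arg, the register is not part of a larger one.
        parent => defined $parent ? $parent : $name,
    };
}

my $al  = reg_info({ op => "reg", size => 8,  arg => ["al", "lobyte", "eax"] });
my $eax = reg_info({ op => "reg", size => 32, arg => ["eax", "dword"] });

print "$al->{name} is part of $al->{parent}\n";
print "$eax->{name} is a full register\n" if $eax->{parent} eq $eax->{name};
```

Defaulting parent to the register's own name is a design choice of this sketch; it lets callers compare registers by parent without checking for undef.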
That's quite a bit to digest all at once. Here is a simple example:
mov eax,0x1 becomes

    {op=>"mov", arg=>[
        {op=>"reg", size=>32, arg=>["eax", "dword"]},
        {op=>"lit", size=>32, arg=>[0x1]} ],
     start=>1234, len=>5, proc=>386}
That's fairly straightforward. Here's something a bit more involved.
add byte[di+0x4],al becomes

    {op=>"add", arg=>[
        {op=>"mem", size=>8, arg=>[
            {op=>"+", size=>16, arg=>[
                {op=>"reg", size=>16, arg=>["di", "word", "edi"]},
                {op=>"lit", size=>16, arg=>[0x4]} ]} ]},
        {op=>"reg", size=>8, arg=>["al", "lobyte", "eax"]} ],
     start=>5678, len=>3, proc=>86}
Notice that the details of the address calculation are encapsulated within the + node. The address is 16 bits long, but the value fetched from memory is only 8 bits. This distinction is captured cleanly.
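Since the address calculation is an ordinary subtree, it can be evaluated on its own once register values are known. The sketch below is an illustration under stated assumptions: eval_addr() is a hypothetical helper, not part of the module; it handles only the reg, lit, and + nodes that appear in this document; and the register value for di is made up for the demonstration.

```perl
use strict;
use warnings;

# The memory operand from "add byte[di+0x4],al", copied from the example.
my $mem = {
    op => "mem", size => 8,
    arg => [
        { op => "+", size => 16, arg => [
            { op => "reg", size => 16, arg => ["di", "word", "edi"] },
            { op => "lit", size => 16, arg => [0x4] },
        ] },
    ],
};

# eval_addr() is a hypothetical helper, not part of Disassemble::X86.
# It handles only the reg, lit, and + nodes seen in this document.
sub eval_addr {
    my ($node, $regs) = @_;
    my $op = $node->{op};
    my $val;

    if ($op eq "lit") {
        $val = $node->{arg}[0];
    }
    elsif ($op eq "reg") {
        my $name = $node->{arg}[0];
        die "no value for register $name\n" unless exists $regs->{$name};
        $val = $regs->{$name};
    }
    elsif ($op eq "+") {
        $val = 0;
        $val += eval_addr($_, $regs) for @{ $node->{arg} };
    }
    else {
        die "unhandled op '$op'\n";
    }

    # Wrap to the node's declared width when it is narrower than 32 bits,
    # e.g. 16 bits for di+0x4; wider results are returned as computed.
    $val &= (1 << $node->{size}) - 1
        if defined $node->{size} && $node->{size} < 32;
    return $val;
}

# Assumed register contents, chosen so the 16-bit sum wraps around.
my $addr = eval_addr($mem->{arg}[0], { di => 0xFFFE });
printf "address 0x%04x, fetch %d bits\n", $addr, $mem->{size};
```

The two sizes stay separate here just as they do in the tree: the + node's size governs how the address wraps, while the mem node's size says how many bits are fetched from that address.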
Yes, this is fairly complicated to work with. If you don't need all this complexity, try the FormatText module instead.
$tree = Disassemble::X86::FormatTree->format_instr($tree);
The format_instr method is a no-op: it returns the tree it is given, unchanged.
Disassemble::X86
Disassemble::X86::FormatText
Bob Mathews <bobmathews@alumni.calpoly.edu>
Copyright (c) 2002 Bob Mathews. All rights reserved.
This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.