The Perl Toolchain Summit needs more sponsors. If your company depends on Perl, please support this very important event.

TITLE

Garbage Collection and Dead Object Detection

VERSION

CURRENT

    Maintainer: Dan Sugalski
    Class: Internals
    PDD Number: 9
    Version: 1.1
    Status: Developing
    Last Modified: 26 February, 2002
    PDD Format: 1
    Language: English

HISTORY

Version 1.1

26 February 2002

version 1

None. First version

CHANGES

Version 1.1

Started documenting the internal routines

Version 1.0

None. First version

ABSTRACT

This PDD describes how the GC and DOD systems work, and what's required of PMC classes.

DESCRIPTION

Doing GC takes a bit of work--we need to make sure that everything is findable from the root set, and that we don't go messing up data shared between interpreters.

DOD Steps

All PMCs have their live bit unset
All Buffers have their live bit unset
All PMCs in the root set are put on the trace list
Walk the trace list
The Buffer root set (stack and S registers) is walked, and any referred to buffers are marked as live

For each PMC on the trace list we:

Check its live bit. If set, we skip it
Set its live bit
Check its flags. If it has a custom DOD routine, we call it
If there's no custom DOD routine, we check the flags for one of the following:
Points to a PMC. We put that PMC on the trace list
Points to a Buffer. We mark the buffer as live
Points to a Buffer of PMCs. We put all the PMCs on the trace list
Points to a Buffer of Buffers. We mark those buffers as live
Anything else we ignore
We take the PMC off the trace list

Once we're done, we scan the PMC list twice. In the first scan, dead PMCs (i.e no live bit and no free bit) with a destructor have that destructor called. In the second scan, dead PMCs are put on the free list for later reallocation.

Then we scan the Buffer list. Any dead buffers (no live bit and no free bit) are put on the free buffer list.

GC Steps

These are the steps that the GC takes:

Mark all PMCs with the needs_GC flag
Mark all Buffers with the needs_GC flag
Sweep through all the PMCs with the custom_GC flag set
Sweep through all buffers
Any Buffer with needs_GC and live flags set, but without immobile set, gets copied
Any buffer with needs_GC set and sysmem set, but *not* with live set, gets the system free called on its contents
needs_GC flag is unset after copying

PMCs should only have a custom GC routine if there's really a need for the PMC to keep track of the location of the ultimate buffered data.

For PMCs that need to hand data back to a library when their objects are destroyed, a custom DOD routine is in order, *not* a custom GC routine.

Important safety tips

Never mark PMCs or Buffers owned by other interpreters as needing GC. Bad, very very bad.

IMPLEMENTATION

Blocking GC

Being able to block GC and DOD is important--you'd hate to have the newly allocated Buffers or PMCs you've got yanked out from underneath you. That'd be no fun. Use the following routines to control GC:

Parrot_block_DOD(struct Parrot_Interp *interpreter)

Block DOD for the passed interpreter. (But not GC)

Parrot_block_GC(struct Parrot_Interp *interpreter)

Block GC for the passed interpreter. (But not DOD)

Parrot_unblock_DOD(struct Parrot_Interp *interpreter)

Unblock DOD for the passed interpreter. (But not GC)

Parrot_unblock_GC(struct Parrot_Interp *interpreter)

Unblock GC for the passed interpreter. (But not DOD)

Note that the blocking is recursive--if you call Parrot_block_DOD() three times in succession, you need to call Parrot_unblock_DOD() three times to re-enable DOD.

Important flags

For PMCs and Buffers to be collected properly, you must get the flags set on them properly. Otherwise Bad Things Will Happen.

For PMCs:

PMC_active_destroy_FLAG

The PMC has some sort of active destructor, and will have that destructor called when the PMC is destroyed.

PMC_is_buffer_ptr_FLAG

Set if the data pointer points to a Buffer struct, or something like it. If you point to a buffer and don't set this flag, it'll end up getting collected.

PMC_is_PMC_ptr_FLAG

Set if the data pointer points to a PMC. Set this if it does, or you'll find your PMC collected.

If both PMC_is_PMC_ptr_FLAG and PMC_is_buffer_ptr_FLAG are set, we assume the pointer is to a Buffer holding PMC pointers. (Like, say, the standard array, hash, and list PMC types do) We'll run through the Buffer and mark all the non-NULL pointers as live PMCs.

PMC_private_GC_FLAG

Set if you've got your own private GC, either for marking or collecting. The collector will call your vtable collection routine, and the DOD sweep will call your vtable mark routine.

PMC_live_FLAG

Set if the system considers the PMC alive.

PMC_on_free_list_FLAG

Set if the PMC is on the free list.

PMC_constant_FLAG

Set if this is a constant PMC. Constants never die.

For Buffers (and STRINGs)

BUFFER_constant_FLAG

Set if the buffer is a constant. Constants never die.

BUFFER_immobile_FLAG

Set if the contents of the buffer can't move. Do not ever set this on a buffer pointing to memory allocated with Parrot_allocate. Your memory will be scrozzled all over.

BUFFER_external_FLAG

Set if the buffer points to memory that came from outside Parrot's memory system.

BUFFER_sysmem_FLAG

Set if the memory came from the system malloc. When the buffer is considered dead, the memory will be freed back to the system.

BUFFER_COW_FLAG

The buffer's memory is copy on write. Any changes to the buffer must first have the buffer's memory copied. The COW flag should then be removed.

BUFFER_live_FLAG

The system considers the buffer to be alive for collection purposes.

BUFFER_on_free_list_FLAG

The buffer is unused, and on the free list for later allocation.

Internal routines

go_collect

Scans the Buffer header pools and compacts the used memory. Doesn't check for string liveness or anything of the sort, just does a compaction.

ATTACHMENTS

FOOTNOTES

REFERENCES