
Allocation and deallocation of SVs.

An SV (or AV, HV, etc.) is allocated in two parts: the head (struct sv, av, hv...) contains type and reference count information, and for many types, a pointer to the body (struct xrv, xpv, xpviv...), which contains fields specific to each type. Some types store all they need in the head, so don't have a body.

In all but the most memory-paranoid configurations (e.g. PURIFY), heads and bodies are allocated out of arenas, which by default are approximately 4K chunks of memory parcelled up into N heads or bodies. SV-bodies are allocated by their sv-type, which guarantees the size consistency needed to allocate safely from arrays.

For SV-heads, the first slot in each arena is reserved, and holds a link to the next arena, some flags, and a note of the number of slots. Snaked through each arena chain is a linked list of free items; when this becomes empty, an extra arena is allocated and divided up into N items which are threaded into the free list.
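The head-arena layout described above can be sketched in plain C. This is a minimal illustration, not Perl's implementation: the `toy_` names, the slot union, and the fixed count of 8 slots per arena are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define TOY_SLOTS_PER_ARENA 8   /* hypothetical; Perl derives N from the arena size */

typedef union toy_slot {
    union toy_slot *next_free;  /* valid while the slot sits on the free list */
    struct {
        union toy_slot *next_arena;
        size_t nslots;
    } hdr;                      /* bookkeeping, used by slot 0 of each arena only */
    long payload;               /* stand-in for a live head */
} toy_slot;

toy_slot *toy_arena_root;       /* chain of arenas, newest first */
toy_slot *toy_free_root;        /* free list threaded through all arenas */

/* Allocate one arena, reserve slot 0 for the link and slot count,
   and thread the remaining slots onto the free list. */
int toy_add_arena(void) {
    toy_slot *arena = malloc(TOY_SLOTS_PER_ARENA * sizeof *arena);
    size_t i;
    if (!arena)
        return -1;
    arena[0].hdr.next_arena = toy_arena_root;
    arena[0].hdr.nslots = TOY_SLOTS_PER_ARENA;
    toy_arena_root = arena;
    for (i = 1; i < TOY_SLOTS_PER_ARENA; i++) {
        arena[i].next_free = toy_free_root;
        toy_free_root = &arena[i];
    }
    return 0;
}

/* Take a slot off the free list, adding an arena first if the list is empty. */
toy_slot *toy_new_slot(void) {
    toy_slot *slot;
    if (!toy_free_root && toy_add_arena() != 0)
        return NULL;
    slot = toy_free_root;
    toy_free_root = slot->next_free;
    return slot;
}

/* Return a slot to the head of the free list. */
void toy_del_slot(toy_slot *slot) {
    assert(slot != NULL);
    slot->next_free = toy_free_root;
    toy_free_root = slot;
}
```

Because freed slots are pushed onto the head of the list, the most recently freed slot is the next one handed out, and a new arena is only requested once every usable slot in the existing arenas is taken.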

SV-bodies are similar, but they use arena-sets by default, which separate the link and info from the arena itself, and reclaim the 1st slot in the arena. SV-bodies are further described later.

The following global variables are associated with arenas:

    PL_sv_arenaroot     pointer to list of SV arenas
    PL_sv_root          pointer to list of free SV structures

    PL_body_arenas      head of linked-list of body arenas
    PL_body_roots[]     array of pointers to list of free bodies of svtype
                        arrays are indexed by the svtype needed

A few special SV heads are not allocated from an arena, but are instead directly created in the interpreter structure, e.g. PL_sv_undef. The size of arenas can be changed from the default by setting PERL_ARENA_SIZE appropriately at compile time.

The SV arena serves the secondary purpose of allowing still-live SVs to be located and destroyed during final cleanup.

At the lowest level, the macros new_SV() and del_SV() grab and free an SV head. (If debugging with -DD, del_SV() calls the function S_del_sv() to return the SV to the free list with error checking.) new_SV() calls more_sv() / sv_add_arena() to add an extra arena if the free list is empty. SVs in the free list have their SvTYPE field set to all ones.

At the time of very final cleanup, sv_free_arenas() is called from perl_destruct() to physically free all the arenas allocated since the start of the interpreter.

The function visit() scans the SV arenas list, and calls a specified function for each SV it finds which is still live, i.e. which has an SvTYPE other than all 1's, and a non-zero SvREFCNT. visit() is used by the following functions (specified as [function that calls visit()] / [function called by visit() for each SV]):

    sv_report_used() / do_report_used()
                        dump all remaining SVs (debugging aid)

    sv_clean_objs() / do_clean_objs(),do_clean_named_objs(),
                      do_clean_named_io_objs()
                        Attempt to free all objects pointed to by RVs,
                        and try to do the same for all objects indirectly
                        referenced by typeglobs too.  Called once from
                        perl_destruct(), prior to calling sv_clean_all()
                        below.

    sv_clean_all() / do_clean_all()
                        SvREFCNT_dec(sv) each remaining SV, possibly
                        triggering an sv_free(). It also sets the
                        SVf_BREAK flag on the SV to indicate that the
                        refcnt has been artificially lowered, thus
                        stopping sv_free() from giving spurious warnings
                        about SVs which unexpectedly have a refcnt
                        of zero.  Called repeatedly from perl_destruct()
                        until there are no SVs left.
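The scan-and-callback pattern used by visit() can be sketched as follows. This is a simplified, hypothetical model: the `toy_` types, the one-byte type field, and the callback signature are assumptions, and the liveness test is reduced to the two conditions named above.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_TYPE_FREED 0xFFu        /* "all ones" marks a slot on the free list */

typedef struct toy_head {
    unsigned char type;             /* TOY_TYPE_FREED when not live */
    unsigned int refcnt;
} toy_head;

typedef struct toy_arena {
    struct toy_arena *next;
    size_t nslots;
    toy_head *slots;
} toy_arena;

typedef void (*toy_visit_fn)(toy_head *head, void *ctx);

/* Call fn for every live head in every arena; return how many were visited. */
size_t toy_visit(toy_arena *root, toy_visit_fn fn, void *ctx) {
    size_t visited = 0;
    toy_arena *arena;
    assert(fn != NULL);
    for (arena = root; arena; arena = arena->next) {
        size_t i;
        for (i = 0; i < arena->nslots; i++) {
            toy_head *head = &arena->slots[i];
            if (head->type != TOY_TYPE_FREED && head->refcnt != 0) {
                fn(head, ctx);
                visited++;
            }
        }
    }
    return visited;
}

/* Example callback in the spirit of do_clean_all(): drop one reference
   and mark the head as freed once the count reaches zero. */
void toy_drop_ref(toy_head *head, void *ctx) {
    (void)ctx;
    if (--head->refcnt == 0)
        head->type = TOY_TYPE_FREED;
}
```

Calling toy_visit() with toy_drop_ref in a loop until it returns 0 mirrors the way sv_clean_all() is called repeatedly until no SVs are left.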

Arena allocator API Summary

Private API to rest of sv.c

    new_SV(),  del_SV(),

    new_XPVNV(), del_XPVGV(),
    etc

Public API:

    sv_report_used(), sv_clean_objs(), sv_clean_all(), sv_free_arenas()

SV Manipulation Functions

sv_add_arena(): Given a chunk of memory, link it to the head of the list of arenas, and split it into a list of free SVs.

sv_report_used(): Dump the contents of all SVs not yet freed (debugging aid).

sv_clean_objs(): Attempt to destroy all objects not yet freed.

sv_clean_all(): Decrement the refcnt of each remaining SV, possibly triggering a cleanup. This function may have to be called multiple times to free SVs which are in complex self-referential hierarchies.

sv_free_arenas(): Deallocate the memory used by all arenas. Note that all the individual SV heads and bodies within the arenas must already have been freed.

SV-Body Allocation

Allocation of SV-bodies is similar to that of SV-heads, differing as follows: the allocation mechanism is used for many body types, so is somewhat more complicated; it uses arena-sets; and it has no need for still-live SV detection.

At the outermost level, (new|del)_X*V macros return bodies of the appropriate type. These macros call either (new|del)_body_type or (new|del)_body_allocated macro pairs, depending on specifics of the type. Most body types use the former pair; the latter pair is used to allocate body types with "ghost fields".

"Ghost fields" are fields that are unused in certain types, and consequently don't need to actually exist. They are declared because they're part of a "base type", which allows use of functions as methods. The simplest examples are AVs and HVs, two aggregate types which don't use the fields which support SCALAR semantics.

For these types, the arenas are carved up into appropriately sized chunks, so no memory is wasted on those unaccessed members. When bodies are allocated, we adjust the pointer back in memory by the size of the part not allocated, so it's as if we allocated the full structure. (But things will all go boom if you write to the part that is "not there", because you'll be overwriting the last members of the preceding structure in memory.)

We calculate the correction using the STRUCT_OFFSET macro on the first member present. If the allocated structure is smaller (no initial NV actually allocated) then the net effect is to subtract the size of the NV from the pointer, to return a new pointer as if an initial NV were actually allocated. (We were using structures named *_allocated for this, but this turned out to be a subtle bug, because a structure without an NV could have a lower alignment constraint, but the compiler is allowed to optimise accesses based on the alignment constraint of the actual pointer to the full structure, for example, using a single 64 bit load instruction because it "knows" that two adjacent 32 bit members will be 8-byte aligned.)

This is the same trick as was used for NV and IV bodies. Ironically it doesn't need to be used for NV bodies any more, because NV is now at the start of the structure. IV bodies don't need it either, because they are no longer allocated.
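The pointer adjustment for ghost fields can be illustrated with a small standalone sketch. The structure and the `toy_` helpers are hypothetical, and the standard offsetof macro stands in for STRUCT_OFFSET. Note that ISO C does not sanction forming a pointer before the start of an allocation, so this sketch should only be applied to chunks that lie at least TOY_GHOST_OFFSET bytes into a larger buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct toy_full_body {
    double ghost_nv;    /* declared for layout compatibility, never allocated */
    size_t cur;         /* first member that really exists */
    size_t len;
} toy_full_body;

#define TOY_GHOST_OFFSET   offsetof(toy_full_body, cur)
#define TOY_ALLOCATED_SIZE (sizeof(toy_full_body) - TOY_GHOST_OFFSET)

/* Map a chunk of TOY_ALLOCATED_SIZE bytes to a pointer that can be used as
   if the whole structure were there, provided ghost_nv is never touched. */
toy_full_body *toy_body_from_chunk(void *chunk) {
    assert(chunk != NULL);
    return (toy_full_body *)((char *)chunk - TOY_GHOST_OFFSET);
}

/* Inverse mapping, needed when the chunk goes back on its free list. */
void *toy_chunk_from_body(toy_full_body *body) {
    assert(body != NULL);
    return (char *)body + TOY_GHOST_OFFSET;
}
```

Writing to body->ghost_nv through the adjusted pointer would overwrite the tail of the preceding chunk, which is the failure mode the text warns about.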

In turn, the new_body_* allocators call S_new_body(), which invokes the new_body_inline macro, which takes a lock, and takes a body off the linked list at PL_body_roots[sv_type], calling Perl_more_bodies() if necessary to refresh an empty list. Then the lock is released, and the body is returned.

Perl_more_bodies allocates a new arena, and carves it up into an array of N bodies, which it strings into a linked list. It looks up arena-size and body-size from the body_details table described below, thus supporting the multiple body-types.

If PURIFY is defined, or PERL_ARENA_SIZE=0, arenas are not used, and the (new|del)_X*V macros are mapped directly to malloc/free.

For each sv-type, struct body_details bodies_by_type[] carries parameters which control these aspects of SV handling:

Arena_size determines whether arenas are used for this body type, and if so, how big they are. PURIFY or PERL_ARENA_SIZE=0 set this field to zero, forcing individual mallocs and frees.

Body_size determines how big a body is, and therefore how many fit into each arena. Offset carries the body-pointer adjustment needed for "ghost fields", and is used in *_allocated macros.

But its main purpose is to parameterize info needed in Perl_sv_upgrade(). The info here dramatically simplifies the function vs the implementation in 5.8.8, making it table-driven. All fields are used for this, except for arena_size.

For the sv-types that have no bodies, arenas are not used, so those PL_body_roots[sv_type] are unused, and can be overloaded. In something of a special case, SVt_NULL is borrowed for HE arenas; PL_body_roots[HE_SVSLOT=SVt_NULL] is filled by S_more_he, but the bodies_by_type[SVt_NULL] slot is not used, as the table is not available in hv.c.
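The relationship between arena_size and body_size described above comes down to integer arithmetic, sketched here as ordinary functions. The `toy_` names are hypothetical, and the value 4080 for the arena size is an assumption standing in for PERL_ARENA_SIZE.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_ARENA_SIZE 4080     /* assumed default, standing in for PERL_ARENA_SIZE */

/* Largest multiple of body_size that fits in TOY_ARENA_SIZE. */
size_t toy_fit_arena0(size_t body_size) {
    assert(body_size > 0);
    return (TOY_ARENA_SIZE / body_size) * body_size;
}

/* Size an arena for exactly `count` bodies when that fits; a count of 0,
   or a request that is too large, falls back to the maximum fit. */
size_t toy_fit_arena(size_t count, size_t body_size) {
    assert(body_size > 0);
    if (count && count * body_size <= TOY_ARENA_SIZE)
        return count * body_size;
    return toy_fit_arena0(body_size);
}

/* Number of whole bodies that an arena of the given size can hold. */
size_t toy_bodies_per_arena(size_t arena_size, size_t body_size) {
    assert(body_size > 0);
    return arena_size / body_size;
}
```

For example, toy_fit_arena(20, 100) yields 2000 bytes, so a rare 100-byte body type would get small arenas of 20 bodies rather than a full-sized arena.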

*/

struct body_details {
    U8 body_size;                   /* Size to allocate  */
    U8 copy;                        /* Size of structure to copy (may be shorter)  */
    U8 offset;
    unsigned int type : 4;          /* We have space for a sanity check.  */
    unsigned int cant_upgrade : 1;  /* Cannot upgrade this type */
    unsigned int zero_nv : 1;       /* zero the NV when upgrading from this */
    unsigned int arena : 1;         /* Allocated from an arena */
    size_t arena_size;              /* Size of arena to allocate */
};

#define HADNV FALSE
#define NONV TRUE

#ifdef PURIFY
/* With -DPURIFY we allocate everything directly, and don't use arenas.
   This seems a rather elegant way to simplify some of the code below.  */
#define HASARENA FALSE
#else
#define HASARENA TRUE
#endif
#define NOARENA FALSE

/* Size the arenas to exactly fit a given number of bodies.  A count
   of 0 fits the max number of bodies into a PERL_ARENA_SIZE block,
   simplifying the default.  If count > 0, the arena is sized to fit
   only that many bodies, allowing arenas to be used for large, rare
   bodies (XPVFM, XPVIO) without undue waste.  The arena size is
   limited by PERL_ARENA_SIZE, so we can safely oversize the
   declarations.
 */
#define FIT_ARENA0(body_size) \
    ((size_t)(PERL_ARENA_SIZE / body_size) * body_size)
#define FIT_ARENAn(count,body_size) \
    ( count * body_size <= PERL_ARENA_SIZE) \
    ? count * body_size \
    : FIT_ARENA0 (body_size)
#define FIT_ARENA(count,body_size) \
    count \
    ? FIT_ARENAn (count, body_size) \
    : FIT_ARENA0 (body_size)

/* Calculate the length to copy. Specifically work out the length less any final padding the compiler needed to add. See the comment in sv_upgrade for why copying the padding proved to be a bug. */

#define copy_length(type, last_member) \
    STRUCT_OFFSET(type, last_member) \
    + sizeof (((type*)SvANY((const SV *)0))->last_member)

static const struct body_details bodies_by_type[] = {
    /* HEs use this offset for their arena.  */
    { 0, 0, 0, SVt_NULL, FALSE, NONV, NOARENA, 0 },

    /* The bind placeholder pretends to be an RV for now.
       Also it's marked as "can't upgrade" to stop anyone using it before it's
       implemented.  */
    { 0, 0, 0, SVt_BIND, TRUE, NONV, NOARENA, 0 },

    /* IVs are in the head, so the allocation size is 0.  */
    { 0,
      sizeof(IV), /* This is used to copy out the IV body.  */
      STRUCT_OFFSET(XPVIV, xiv_iv), SVt_IV, FALSE, NONV,
      NOARENA /* IVS don't need an arena  */, 0
    },

    { sizeof(NV), sizeof(NV),
      STRUCT_OFFSET(XPVNV, xnv_u),
      SVt_NV, FALSE, HADNV, HASARENA, FIT_ARENA(0, sizeof(NV)) },

    { sizeof(XPV) - STRUCT_OFFSET(XPV, xpv_cur),
      copy_length(XPV, xpv_len) - STRUCT_OFFSET(XPV, xpv_cur),
      + STRUCT_OFFSET(XPV, xpv_cur),
      SVt_PV, FALSE, NONV, HASARENA,
      FIT_ARENA(0, sizeof(XPV) - STRUCT_OFFSET(XPV, xpv_cur)) },

    { sizeof(XPVIV) - STRUCT_OFFSET(XPV, xpv_cur),
      copy_length(XPVIV, xiv_u) - STRUCT_OFFSET(XPV, xpv_cur),
      + STRUCT_OFFSET(XPV, xpv_cur),
      SVt_PVIV, FALSE, NONV, HASARENA,
      FIT_ARENA(0, sizeof(XPVIV) - STRUCT_OFFSET(XPV, xpv_cur)) },

    { sizeof(XPVNV) - STRUCT_OFFSET(XPV, xpv_cur),
      copy_length(XPVNV, xnv_u) - STRUCT_OFFSET(XPV, xpv_cur),
      + STRUCT_OFFSET(XPV, xpv_cur),
      SVt_PVNV, FALSE, HADNV, HASARENA,
      FIT_ARENA(0, sizeof(XPVNV) - STRUCT_OFFSET(XPV, xpv_cur)) },

    { sizeof(XPVMG), copy_length(XPVMG, xnv_u), 0, SVt_PVMG, FALSE, HADNV,
      HASARENA, FIT_ARENA(0, sizeof(XPVMG)) },

    { sizeof(regexp),
      sizeof(regexp),
      0,
      SVt_REGEXP, FALSE, NONV, HASARENA,
      FIT_ARENA(0, sizeof(regexp))
    },

    { sizeof(XPVGV), sizeof(XPVGV), 0, SVt_PVGV, TRUE, HADNV,
      HASARENA, FIT_ARENA(0, sizeof(XPVGV)) },
    
    { sizeof(XPVLV), sizeof(XPVLV), 0, SVt_PVLV, TRUE, HADNV,
      HASARENA, FIT_ARENA(0, sizeof(XPVLV)) },

    { sizeof(XPVAV),
      copy_length(XPVAV, xav_alloc),
      0,
      SVt_PVAV, TRUE, NONV, HASARENA,
      FIT_ARENA(0, sizeof(XPVAV)) },

    { sizeof(XPVHV),
      copy_length(XPVHV, xhv_max),
      0,
      SVt_PVHV, TRUE, NONV, HASARENA,
      FIT_ARENA(0, sizeof(XPVHV)) },

    { sizeof(XPVCV),
      sizeof(XPVCV),
      0,
      SVt_PVCV, TRUE, NONV, HASARENA,
      FIT_ARENA(0, sizeof(XPVCV)) },

    { sizeof(XPVFM),
      sizeof(XPVFM),
      0,
      SVt_PVFM, TRUE, NONV, NOARENA,
      FIT_ARENA(20, sizeof(XPVFM)) },

    { sizeof(XPVIO),
      sizeof(XPVIO),
      0,
      SVt_PVIO, TRUE, NONV, HASARENA,
      FIT_ARENA(24, sizeof(XPVIO)) },
};

#define new_body_allocated(sv_type) \
    (void *)((char *)S_new_body(aTHX_ sv_type) \
             - bodies_by_type[sv_type].offset)

/* return a thing to the free list */

#define del_body(thing, root) \
    STMT_START { \
        void ** const thing_copy = (void **)thing; \
        *thing_copy = *root; \
        *root = (void*)thing_copy; \
    } STMT_END

#ifdef PURIFY

#define new_XNV()       safemalloc(sizeof(XPVNV))
#define new_XPVNV()     safemalloc(sizeof(XPVNV))
#define new_XPVMG()     safemalloc(sizeof(XPVMG))

#define del_XPVGV(p) safefree(p)

#else /* !PURIFY */

#define new_XNV()       new_body_allocated(SVt_NV)
#define new_XPVNV()     new_body_allocated(SVt_PVNV)
#define new_XPVMG()     new_body_allocated(SVt_PVMG)

#define del_XPVGV(p)    del_body(p + bodies_by_type[SVt_PVGV].offset, \
                                 &PL_body_roots[SVt_PVGV])

#endif /* PURIFY */

/* no arena for you! */

#define new_NOARENA(details) \
    safemalloc((details)->body_size + (details)->offset)
#define new_NOARENAZ(details) \
    safecalloc((details)->body_size + (details)->offset, 1)

void *
Perl_more_bodies (pTHX_ const svtype sv_type, const size_t body_size,
                  const size_t arena_size)
{
    dVAR;
    void ** const root = &PL_body_roots[sv_type];
    struct arena_desc *adesc;
    struct arena_set *aroot = (struct arena_set *) PL_body_arenas;
    unsigned int curr;
    char *start;
    const char *end;
    const size_t good_arena_size = Perl_malloc_good_size(arena_size);
#if defined(DEBUGGING) && !defined(PERL_GLOBAL_STRUCT_PRIVATE)
    static bool done_sanity_check;

    /* PERL_GLOBAL_STRUCT_PRIVATE cannot coexist with global
     * variables like done_sanity_check. */
    if (!done_sanity_check) {
        unsigned int i = SVt_LAST;

        done_sanity_check = TRUE;

        while (i--)
            assert (bodies_by_type[i].type == i);
    }
#endif

    assert(arena_size);

    /* may need new arena-set to hold new arena */
    if (!aroot || aroot->curr >= aroot->set_size) {
        struct arena_set *newroot;
        Newxz(newroot, 1, struct arena_set);
        newroot->set_size = ARENAS_PER_SET;
        newroot->next = aroot;
        aroot = newroot;
        PL_body_arenas = (void *) newroot;
        DEBUG_m(PerlIO_printf(Perl_debug_log, "new arenaset %p\n", (void*)aroot));
    }

    /* ok, now have arena-set with at least 1 empty/available arena-desc */
    curr = aroot->curr++;
    adesc = &(aroot->set[curr]);
    assert(!adesc->arena);
    
    Newx(adesc->arena, good_arena_size, char);
    adesc->size = good_arena_size;
    adesc->utype = sv_type;
    DEBUG_m(PerlIO_printf(Perl_debug_log, "arena %d added: %p size %"UVuf"\n", 
                          curr, (void*)adesc->arena, (UV)good_arena_size));

    start = (char *) adesc->arena;

    /* Get the address of the byte after the end of the last body we can fit.
       Remember, this is integer division:  */
    end = start + good_arena_size / body_size * body_size;

    /* computed count doesn't reflect the 1st slot reservation */
#if defined(MYMALLOC) || defined(HAS_MALLOC_GOOD_SIZE)
    DEBUG_m(PerlIO_printf(Perl_debug_log,
                          "arena %p end %p arena-size %d (from %d) type %d "
                          "size %d ct %d\n",
                          (void*)start, (void*)end, (int)good_arena_size,
                          (int)arena_size, sv_type, (int)body_size,
                          (int)good_arena_size / (int)body_size));
#else
    DEBUG_m(PerlIO_printf(Perl_debug_log,
                          "arena %p end %p arena-size %d type %d size %d ct %d\n",
                          (void*)start, (void*)end,
                          (int)arena_size, sv_type, (int)body_size,
                          (int)good_arena_size / (int)body_size));
#endif
    *root = (void *)start;

    while (1) {
        /* Where the next body would start:  */
        char * const next = start + body_size;

        if (next >= end) {
            /* This is the last body:  */
            assert(next == end);

            *(void **)start = 0;
            return *root;
        }

        *(void**) start = (void *)next;
        start = next;
    }
}

/* grab a new thing from the free list, allocating more if necessary.
   The inline version is used for speed in hot routines, and the
   function using it serves the rest (unless PURIFY).
*/
#define new_body_inline(xpv, sv_type) \
    STMT_START { \
        void ** const r3wt = &PL_body_roots[sv_type]; \
        xpv = (PTR_TBL_ENT_t*) (*((void **)(r3wt)) \
          ? *((void **)(r3wt)) : Perl_more_bodies(aTHX_ sv_type, \
                                   bodies_by_type[sv_type].body_size, \
                                   bodies_by_type[sv_type].arena_size)); \
        *(r3wt) = *(void**)(xpv); \
    } STMT_END

#ifndef PURIFY

STATIC void *
S_new_body(pTHX_ const svtype sv_type)
{
    dVAR;
    void *xpv;
    new_body_inline(xpv, sv_type);
    return xpv;
}

#endif

static const struct body_details fake_rv = { 0, 0, 0, SVt_IV, FALSE, NONV, NOARENA, 0 };

/* =for apidoc sv_upgrade

Upgrade an SV to a more complex form. Generally adds a new body type to the SV, then copies across as much information as possible from the old body. It croaks if the SV is already in a more complex form than requested. You generally want to use the SvUPGRADE macro wrapper, which checks the type before calling sv_upgrade, and hence does not croak. See also svtype.

Remove any string offset. You should normally use the SvOOK_off macro wrapper instead.

Expands the character buffer in the SV. If necessary, uses sv_unref and upgrades the SV to SVt_PV. Returns a pointer to the character buffer. Use the SvGROW wrapper instead.

Copies an integer into the given SV, upgrading first if necessary. Does not handle 'set' magic. See also sv_setiv_mg.

Like sv_setiv, but also handles 'set' magic.

Copies an unsigned integer into the given SV, upgrading first if necessary. Does not handle 'set' magic. See also sv_setuv_mg.

Like sv_setuv, but also handles 'set' magic.

Copies a double into the given SV, upgrading first if necessary. Does not handle 'set' magic. See also sv_setnv_mg.

Like sv_setnv, but also handles 'set' magic.

Test if the content of an SV looks like a number (or is a number). Inf and Infinity are treated as numbers (so will not issue a non-numeric warning), even if your atof() doesn't grok them. Get-magic is ignored.

Return the integer value of an SV, doing any necessary string conversion. If flags includes SV_GMAGIC, does an mg_get() first. Normally used via the SvIV(sv) and SvIVx(sv) macros.

Return the unsigned integer value of an SV, doing any necessary string conversion. If flags includes SV_GMAGIC, does an mg_get() first. Normally used via the SvUV(sv) and SvUVx(sv) macros.

Return the numeric (NV) value of an SV, doing any necessary string or integer conversion. If flags includes SV_GMAGIC, does an mg_get() first. Normally used via the SvNV(sv) and SvNVx(sv) macros.

Return an SV with the numeric value of the source SV, doing any necessary reference or overload conversion. You must use the SvNUM(sv) macro to access this function.

Returns a pointer to the string value of an SV, and sets *lp to its length. If flags includes SV_GMAGIC, does an mg_get() first. Coerces sv to a string if necessary. Normally invoked via the SvPV_flags macro. sv_2pv() and sv_2pv_nomg usually end up here too.

Copies a stringified representation of the source SV into the destination SV. Automatically performs any necessary mg_get and coercion of numeric values into strings. Guaranteed to preserve UTF8 flag even from overloaded objects. Similar in nature to sv_2pv[_flags] but operates directly on an SV instead of just the string. Mostly uses sv_2pv_flags to do its work, except when that would lose the UTF-8'ness of the PV.

Return a pointer to the byte-encoded representation of the SV, and set *lp to its length. May cause the SV to be downgraded from UTF-8 as a side-effect.

Usually accessed via the SvPVbyte macro.

Return a pointer to the UTF-8-encoded representation of the SV, and set *lp to its length. May cause the SV to be upgraded to UTF-8 as a side-effect.

Usually accessed via the SvPVutf8 macro.

This macro is only used by sv_true() or its macro equivalent, and only if the latter's argument is neither SvPOK, SvIOK nor SvNOK. It calls sv_2bool_flags with the SV_GMAGIC flag.

This function is only used by sv_true() and friends, and only if the latter's argument is neither SvPOK, SvIOK nor SvNOK. If the flags contain SV_GMAGIC, then it does an mg_get() first.

Converts the PV of an SV to its UTF-8-encoded form. Forces the SV to string form if it is not already. Will mg_get on sv if appropriate. Always sets the SvUTF8 flag to avoid future validity checks even if the whole string is the same in UTF-8 as not. Returns the number of bytes in the converted string.

This is not a general-purpose byte-encoding-to-Unicode interface: use the Encode extension for that.
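The core of such an upgrade, re-encoding native single-byte characters as UTF-8, can be sketched independently of the SV machinery. This is an illustration only: `toy_latin1_to_utf8` is a hypothetical helper, it assumes an ASCII/Latin-1 platform (not EBCDIC), and it writes to a separate caller-supplied buffer rather than growing a string in place.

```c
#include <assert.h>
#include <stddef.h>

/* Encode `len` Latin-1 bytes from src as UTF-8 into dst.
   dst must have room for 2 * len bytes.
   Returns the number of bytes written. */
size_t toy_latin1_to_utf8(const unsigned char *src, size_t len,
                          unsigned char *dst) {
    size_t out = 0, i;
    assert(src != NULL && dst != NULL);
    for (i = 0; i < len; i++) {
        unsigned char c = src[i];
        if (c < 0x80) {
            dst[out++] = c;     /* invariant: same byte in both encodings */
        } else {
            /* code points 0x80..0xFF need a two-byte sequence */
            dst[out++] = (unsigned char)(0xC0 | (c >> 6));
            dst[out++] = (unsigned char)(0x80 | (c & 0x3F));
        }
    }
    return out;
}
```

A string made only of bytes below 0x80 comes out byte-for-byte identical, which is the case where the text notes that the SvUTF8 flag is set anyway.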

Like sv_utf8_upgrade, but doesn't do magic on sv.

Converts the PV of an SV to its UTF-8-encoded form. Forces the SV to string form if it is not already. Always sets the SvUTF8 flag to avoid future validity checks even if all the bytes are invariant in UTF-8. If flags has SV_GMAGIC bit set, will mg_get on sv if appropriate, else not. Returns the number of bytes in the converted string. sv_utf8_upgrade and sv_utf8_upgrade_nomg are implemented in terms of this function.

This is not a general-purpose byte-encoding-to-Unicode interface: use the Encode extension for that.

Attempts to convert the PV of an SV from characters to bytes. If the PV contains a character that cannot fit in a byte, this conversion will fail; in this case, either returns false or, if fail_ok is not true, croaks.

This is not a general-purpose Unicode-to-byte-encoding interface: use the Encode extension for that.

Converts the PV of an SV to UTF-8, but then turns the SvUTF8 flag off so that it looks like octets again.

If the PV of the SV is an octet sequence in UTF-8 and contains a multiple-byte character, the SvUTF8 flag is turned on so that it looks like a character. If the PV contains only single-byte characters, the SvUTF8 flag stays off. Scans PV for validity and returns false if the PV is invalid UTF-8.

Copies the contents of the source SV ssv into the destination SV dsv. The source SV may be destroyed if it is mortal, so don't use this function if the source SV needs to be reused. Does not handle 'set' magic. Loosely speaking, it performs a copy-by-value, obliterating any previous content of the destination.

You probably want to use one of the assortment of wrappers, such as SvSetSV, SvSetSV_nosteal, SvSetMagicSV and SvSetMagicSV_nosteal.

Copies the contents of the source SV ssv into the destination SV dsv. The source SV may be destroyed if it is mortal, so don't use this function if the source SV needs to be reused. Does not handle 'set' magic. Loosely speaking, it performs a copy-by-value, obliterating any previous content of the destination. If the flags parameter has the SV_GMAGIC bit set, will mg_get on ssv if appropriate, else not. If the flags parameter has the SV_NOSTEAL bit set then the buffers of temps will not be stolen. sv_setsv and sv_setsv_nomg are implemented in terms of this function.

You probably want to use one of the assortment of wrappers, such as SvSetSV, SvSetSV_nosteal, SvSetMagicSV and SvSetMagicSV_nosteal.

This is the primary function for copying scalars, and most other copy-ish functions and macros use this underneath.

Like sv_setsv, but also handles 'set' magic.

Copies a string into an SV. The len parameter indicates the number of bytes to be copied. If the ptr argument is NULL the SV will become undefined. Does not handle 'set' magic. See sv_setpvn_mg.

Like sv_setpvn, but also handles 'set' magic.

Copies a string into an SV. The string must be null-terminated. Does not handle 'set' magic. See sv_setpv_mg.

Like sv_setpv, but also handles 'set' magic.

Tells an SV to use ptr to find its string value. Normally the string is stored inside the SV but sv_usepvn allows the SV to use an outside string. The ptr should point to memory that was allocated by malloc. It must be the start of a mallocked block of memory, and not a pointer to the middle of it. The string length, len, must be supplied. By default this function will realloc (i.e. move) the memory pointed to by ptr, so that pointer should not be freed or used by the programmer after giving it to sv_usepvn, and neither should any pointers from "behind" that pointer (e.g. ptr + 1) be used.

If flags & SV_SMAGIC is true, will call SvSETMAGIC. If flags & SV_HAS_TRAILING_NUL is true, then ptr[len] must be NUL, and the realloc will be skipped (i.e. the buffer is actually at least 1 byte longer than len, and already meets the requirements for storing in SvPVX).

Undo various types of fakery on an SV: if the PV is a shared string, make a private copy; if we're a ref, stop refing; if we're a glob, downgrade to an xpvmg; if we're a copy-on-write scalar, this is the on-write time when we do the copy, and is also used locally. If SV_COW_DROP_PV is set then a copy-on-write scalar drops its PV buffer (if any) and becomes SvPOK_off rather than making a copy. (Used where this scalar is about to be set to some other value.) In addition, the flags parameter gets passed to sv_unref_flags() when unreffing. sv_force_normal calls this function with flags set to 0.

Efficient removal of characters from the beginning of the string buffer. SvPOK(sv) must be true and the ptr must be a pointer to somewhere inside the string buffer. The ptr becomes the first character of the adjusted string. Uses the "OOK hack".

Beware: after this function returns, ptr and SvPVX_const(sv) may no longer refer to the same chunk of data.

The unfortunate similarity of this function's name to that of Perl's chop operator is strictly coincidental. This function works from the left; chop works from the right.
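The idea behind removing leading characters without moving any bytes can be sketched with a toy string type. Everything here is hypothetical: the `toy_str` layout, and in particular the choice to keep the offset in a separate field, are assumptions for illustration and do not describe where Perl actually records it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

typedef struct toy_str {
    char *pv;       /* start of the visible string */
    size_t cur;     /* visible length in bytes */
    size_t offset;  /* bytes hidden in front of pv; the allocation
                       starts at pv - offset */
} toy_str;

/* Drop everything before ptr, which must point inside the string.
   No bytes are moved; only the bookkeeping changes. */
void toy_chop(toy_str *s, char *ptr) {
    size_t delta;
    assert(ptr >= s->pv && ptr <= s->pv + s->cur);
    delta = (size_t)(ptr - s->pv);
    s->pv = ptr;
    s->cur -= delta;
    s->offset += delta;
}

/* Undo the offset: slide the string back to the start of the allocation. */
void toy_backoff(toy_str *s) {
    char *base = s->pv - s->offset;
    memmove(base, s->pv, s->cur + 1);   /* + 1 keeps the trailing NUL */
    s->pv = base;
    s->offset = 0;
}
```

After toy_backoff() any pointer previously obtained from s->pv refers to stale data, which is the same hazard the warning above describes for ptr and SvPVX_const(sv).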

Concatenates the string onto the end of the string which is in the SV. The len indicates number of bytes to copy. If the SV has the UTF-8 status set, then the bytes appended should be valid UTF-8. Handles 'get' magic, but not 'set' magic. See sv_catpvn_mg.

Concatenates the string onto the end of the string which is in the SV. The len indicates number of bytes to copy. If the SV has the UTF-8 status set, then the bytes appended should be valid UTF-8. If flags has the SV_SMAGIC bit set, will mg_set on dsv afterwards if appropriate. sv_catpvn and sv_catpvn_nomg are implemented in terms of this function.

Concatenates the string from SV ssv onto the end of the string in SV dsv. Modifies dsv but not ssv. Handles 'get' magic, but not 'set' magic. See sv_catsv_mg.

Concatenates the string from SV ssv onto the end of the string in SV dsv. Modifies dsv but not ssv. If flags has SV_GMAGIC bit set, will mg_get on the ssv, if appropriate, before reading it. If the flags contain SV_SMAGIC, mg_set will be called on the modified SV afterward, if appropriate. sv_catsv and sv_catsv_nomg are implemented in terms of this function.

Concatenates the string onto the end of the string which is in the SV. If the SV has the UTF-8 status set, then the bytes appended should be valid UTF-8. Handles 'get' magic, but not 'set' magic. See sv_catpv_mg.

Concatenates the string onto the end of the string which is in the SV. If the SV has the UTF-8 status set, then the bytes appended should be valid UTF-8. If flags has the SV_SMAGIC bit set, will mg_set on the modified SV if appropriate.

Like sv_catpv, but also handles 'set' magic.

Creates a new SV. A non-zero len parameter indicates the number of bytes of preallocated string space the SV should have. An extra byte for a trailing NUL is also reserved. (SvPOK is not set for the SV even if string space is allocated.) The reference count for the new SV is set to 1.

In 5.9.3, newSV() replaces the older NEWSV() API, and drops the first parameter, x, a debug aid which allowed callers to identify themselves. This aid has been superseded by a new build option, PERL_MEM_LOG (see "PERL_MEM_LOG" in perlhacktips). The older API is still there for use in XS modules supporting older perls.

Adds magic to an SV, upgrading it if necessary. Applies the supplied vtable and returns a pointer to the magic added.

Note that sv_magicext will allow things that sv_magic will not. In particular, you can add magic to SvREADONLY SVs, and add more than one instance of the same 'how'.

If namlen is greater than zero then a savepvn copy of name is stored, if namlen is zero then name is stored as-is and - as another special case - if (name && namlen == HEf_SVKEY) then name is assumed to contain an SV* and is stored as-is with its REFCNT incremented.

(This is now used as a subroutine by sv_magic.)

Adds magic to an SV. First upgrades sv to type SVt_PVMG if necessary, then adds a new magic item of type how to the head of the magic list.

See sv_magicext (which sv_magic now calls) for a description of the handling of the name and namlen arguments.

You need to use sv_magicext to add magic to SvREADONLY SVs and also to add more than one instance of the same 'how'.

Removes all magic of type type from an SV.

Removes all magic of type type with the specified vtbl from an SV.

Weaken a reference: set the SvWEAKREF flag on this RV; give the referred-to SV PERL_MAGIC_backref magic if it hasn't already; and push a back-reference to this RV onto the array of backreferences associated with that magic. If the RV is magical, set magic will be called after the RV is cleared.

Inserts a string at the specified offset/length within the SV. Similar to the Perl substr() function. Handles get magic.

Same as sv_insert, but the extra flags are passed to the SvPV_force_flags that applies to bigstr.

Make the first argument a copy of the second, then delete the original. The target SV physically takes over ownership of the body of the source SV and inherits its flags; however, the target keeps any magic it owns, and any magic in the source is discarded. Note that this is a rather specialist SV copying operation; most of the time you'll want to use sv_setsv or one of its many macro front-ends.

Clear an SV: call any destructors, free up any memory used by the body, and free the body itself. The SV's head is not freed, although its type is set to all 1's so that it won't inadvertently be assumed to be live during global destruction etc. This function should only be called when REFCNT is zero. Most of the time you'll want to call sv_free() (or its macro wrapper SvREFCNT_dec) instead.

Increment an SV's reference count. Use the SvREFCNT_inc() wrapper instead.

Decrement an SV's reference count, and if it drops to zero, call sv_clear to invoke destructors and free up any memory used by the body; finally, deallocate the SV's head itself. Normally called via a wrapper macro SvREFCNT_dec.
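The interplay of incrementing, decrementing and clearing can be modelled in a few lines. This is a deliberately reduced sketch with hypothetical `toy_` names: a value owns at most one reference to another value, and "freeing the head" is left as a comment because no allocator is involved.

```c
#include <assert.h>
#include <stddef.h>

typedef struct toy_sv {
    unsigned int refcnt;
    int cleared;                /* set once the "body" has been released */
    struct toy_sv *referent;    /* optional value this one holds a reference to */
} toy_sv;

void toy_refcnt_dec(toy_sv *sv);

/* Take an extra reference; returns its argument for convenience. */
toy_sv *toy_refcnt_inc(toy_sv *sv) {
    if (sv)
        sv->refcnt++;
    return sv;
}

/* Release what the value owns. Only valid once the count is zero. */
void toy_clear(toy_sv *sv) {
    toy_sv *ref = sv->referent;
    assert(sv->refcnt == 0);
    sv->referent = NULL;
    sv->cleared = 1;
    toy_refcnt_dec(ref);        /* may cascade to the referent */
}

/* Drop one reference; clear the value when the last one goes away. */
void toy_refcnt_dec(toy_sv *sv) {
    if (!sv || sv->refcnt == 0)
        return;
    if (--sv->refcnt == 0)
        toy_clear(sv);          /* a real allocator would now recycle the head */
}
```

Dropping the last reference to a holder also drops the reference it held, so a chain of values can be released by a single decrement at the top.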

Returns the length of the string in the SV. Handles magic and type coercion. See also SvCUR, which gives raw access to the xpv_cur slot.

Returns the number of characters in the string in an SV, counting each multi-byte UTF-8 sequence as a single character. Handles magic and type coercion.
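The difference between a byte length and a character length can be illustrated on a raw buffer. The following is a minimal sketch in plain C, assuming well-formed UTF-8 input; utf8_char_count is a hypothetical helper, not part of the Perl API, and it performs no magic handling or type coercion.

```c
#include <assert.h>
#include <stddef.h>

/* Count the characters in a well-formed UTF-8 buffer by counting every
 * byte that is not a continuation byte (bit pattern 10xxxxxx). */
size_t utf8_char_count(const unsigned char *s, size_t nbytes)
{
    size_t chars = 0;

    assert(s != NULL || nbytes == 0);
    for (size_t i = 0; i < nbytes; i++) {
        if ((s[i] & 0xC0) != 0x80)
            chars++;
    }
    return chars;
}
```

For the six-byte buffer "h\xc3\xa9llo" the byte length is 6, while this function reports 5 characters, because the two bytes 0xC3 0xA9 encode a single character.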

Converts the value pointed to by offsetp from a count of UTF-8 chars from the start of the string, to a count of the equivalent number of bytes; if lenp is non-zero, it does the same to lenp, but this time starting from the offset, rather than from the start of the string. Handles type coercion. flags is passed to SvPV_flags, and usually should be SV_GMAGIC|SV_CONST_RETURN to handle magic.

Converts the value pointed to by offsetp from a count of UTF-8 chars from the start of the string, to a count of the equivalent number of bytes; if lenp is non-zero, it does the same to lenp, but this time starting from the offset, rather than from the start of the string. Handles magic and type coercion.

Use sv_pos_u2b_flags in preference, which correctly handles strings longer than 2 GB.

Converts the value pointed to by offsetp from a count of bytes from the start of the string, to a count of the equivalent number of UTF-8 chars. Handles magic and type coercion.
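The two directions of offset conversion can be sketched on a plain byte buffer. This is an illustration only, assuming well-formed UTF-8; pos_u2b and pos_b2u here are hypothetical standalone functions that mimic the in-place update of offsetp and lenp, and they do not handle magic, coercion, or the position caching a real implementation might use.

```c
#include <assert.h>
#include <stddef.h>

/* Byte length of the UTF-8 sequence introduced by lead byte c
 * (assumes well-formed input). */
static size_t utf8_seq_len(unsigned char c)
{
    if (c < 0x80) return 1;
    if ((c & 0xE0) == 0xC0) return 2;
    if ((c & 0xF0) == 0xE0) return 3;
    return 4;
}

/* Skip up to nchars characters starting at byte position pos and return
 * the number of bytes consumed, clamped to the end of the buffer. */
static size_t skip_chars(const unsigned char *s, size_t nbytes,
                         size_t pos, size_t nchars)
{
    size_t start = pos;

    while (nchars > 0 && pos < nbytes) {
        pos += utf8_seq_len(s[pos]);
        nchars--;
    }
    if (pos > nbytes)
        pos = nbytes;
    return pos - start;
}

/* Characters to bytes: *offsetp is measured from the start of the string;
 * *lenp (if lenp is non-NULL) is measured from that offset. */
void pos_u2b(const unsigned char *s, size_t nbytes,
             size_t *offsetp, size_t *lenp)
{
    assert(offsetp != NULL);
    size_t byte_off = skip_chars(s, nbytes, 0, *offsetp);
    if (lenp != NULL)
        *lenp = skip_chars(s, nbytes, byte_off, *lenp);
    *offsetp = byte_off;
}

/* Bytes to characters: count the character starts before the byte offset. */
void pos_b2u(const unsigned char *s, size_t nbytes, size_t *offsetp)
{
    assert(offsetp != NULL);
    size_t limit = *offsetp < nbytes ? *offsetp : nbytes;
    size_t chars = 0;

    for (size_t i = 0; i < limit; i++) {
        if ((s[i] & 0xC0) != 0x80)
            chars++;
    }
    *offsetp = chars;
}
```

With the buffer "h\xc3\xa9llo", a character offset of 1 and character length of 1 become a byte offset of 1 and a byte length of 2, and a byte offset of 3 converts back to a character offset of 2.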

Returns a boolean indicating whether the strings in the two SVs are identical. Is UTF-8 and 'use bytes' aware, handles get magic, and will coerce its args to strings if necessary.

Returns a boolean indicating whether the strings in the two SVs are identical. Is UTF-8 and 'use bytes' aware and coerces its args to strings if necessary. If the flags include SV_GMAGIC, it handles get-magic, too.

Compares the strings in two SVs. Returns -1, 0, or 1 indicating whether the string in sv1 is less than, equal to, or greater than the string in sv2. Is UTF-8 and 'use bytes' aware, handles get magic, and will coerce its args to strings if necessary. See also sv_cmp_locale.

Compares the strings in two SVs. Returns -1, 0, or 1 indicating whether the string in sv1 is less than, equal to, or greater than the string in sv2. Is UTF-8 and 'use bytes' aware and will coerce its args to strings if necessary. If the flags include SV_GMAGIC, it handles get magic. See also sv_cmp_locale_flags.
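The boolean and three-way return conventions described above can be sketched for raw byte strings. This is a simplified illustration in plain C: bytes_eq and bytes_cmp are hypothetical helpers that compare length-delimited buffers only, with none of the UTF-8 awareness, 'use bytes' handling, magic, or coercion that the SV functions provide.

```c
#include <assert.h>
#include <string.h>

/* True if the two length-delimited byte strings are identical. */
int bytes_eq(const char *a, size_t alen, const char *b, size_t blen)
{
    assert(a != NULL && b != NULL);
    return alen == blen && memcmp(a, b, alen) == 0;
}

/* Three-way comparison normalized to -1, 0 or 1: compare the common
 * prefix first, then let the shorter string sort first. */
int bytes_cmp(const char *a, size_t alen, const char *b, size_t blen)
{
    assert(a != NULL && b != NULL);
    size_t n = alen < blen ? alen : blen;
    int r = memcmp(a, b, n);

    if (r != 0)
        return r < 0 ? -1 : 1;
    if (alen == blen)
        return 0;
    return alen < blen ? -1 : 1;
}
```

Because lengths are passed explicitly, embedded NUL bytes take part in the comparison, which a strcmp-based version would not allow.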

Compares the strings in two SVs in a locale-aware manner. Is UTF-8 and 'use bytes' aware, handles get magic, and will coerce its args to strings if necessary. See also sv_cmp.

Compares the strings in two SVs in a locale-aware manner. Is UTF-8 and 'use bytes' aware and will coerce its args to strings if necessary. If the flags contain SV_GMAGIC, it handles get magic. See also sv_cmp_flags.

This calls sv_collxfrm_flags with the SV_GMAGIC flag. See sv_collxfrm_flags.

Add Collate Transform magic to an SV if it doesn't already have it. If the flags contain SV_GMAGIC, it handles get-magic.

Any scalar variable may carry PERL_MAGIC_collxfrm magic that contains the scalar data of the variable, but transformed to such a format that a normal memory comparison can be used to compare the data according to the locale settings.
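The idea of transforming data so that a plain memory comparison respects locale ordering is the same one behind the C library's strxfrm(). The sketch below uses only standard C; collation_key and sign_of are hypothetical helper names, and it does not reproduce how Perl stores or invalidates the transformed data in magic.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Return a malloc'd collation key for s under the current LC_COLLATE
 * locale, or NULL on allocation failure. The caller frees the result. */
char *collation_key(const char *s)
{
    assert(s != NULL);
    size_t need = strxfrm(NULL, s, 0) + 1;   /* length query, plus NUL */
    char *key = malloc(need);

    if (key != NULL)
        strxfrm(key, s, need);
    return key;
}

/* Normalize a comparison result to -1, 0 or 1. */
int sign_of(int r)
{
    return (r > 0) - (r < 0);
}
```

Comparing two keys with strcmp() gives the same ordering as calling strcoll() on the original strings, so a key computed once can be reused across many comparisons, for example during a sort.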

Get a line from the filehandle and store it into the SV, optionally appending to the currently-stored string.

Auto-increment of the value in the SV, doing string to numeric conversion if necessary. Handles 'get' magic and operator overloading.

Auto-increment of the value in the SV, doing string to numeric conversion if necessary. Handles operator overloading. Skips handling 'get' magic.

Auto-decrement of the value in the SV, doing string to numeric conversion if necessary. Handles 'get' magic and operator overloading.

Auto-decrement of the value in the SV, doing string to numeric conversion if necessary. Handles operator overloading. Skips handling 'get' magic.

Creates a new SV which is a copy of the original SV (using sv_setsv). The new SV is marked as mortal. It will be destroyed "soon", either by an explicit call to FREETMPS, or by an implicit call at places such as statement boundaries. See also sv_newmortal and sv_2mortal.

Creates a new null SV which is mortal. The reference count of the SV is set to 1. It will be destroyed "soon", either by an explicit call to FREETMPS, or by an implicit call at places such as statement boundaries. See also sv_mortalcopy and sv_2mortal.

Creates a new SV and copies a string into it. The reference count for the SV is set to 1. Note that if len is zero, Perl will create a zero length string. You are responsible for ensuring that the source string is at least len bytes long. If the s argument is NULL the new SV will be undefined. Currently the only flag bits accepted are SVf_UTF8 and SVs_TEMP. If SVs_TEMP is set, then sv_2mortal() is called on the result before returning. If SVf_UTF8 is set, s is considered to be in UTF-8 and the SVf_UTF8 flag will be set on the new SV. newSVpvn_utf8() is a convenience wrapper for this function, defined as

    #define newSVpvn_utf8(s, len, u)                    \
        newSVpvn_flags((s), (len), (u) ? SVf_UTF8 : 0)

Marks an existing SV as mortal. The SV will be destroyed "soon", either by an explicit call to FREETMPS, or by an implicit call at places such as statement boundaries. SvTEMP() is turned on which means that the SV's string buffer can be "stolen" if this SV is copied. See also sv_newmortal and sv_mortalcopy.

Creates a new SV and copies a string into it. The reference count for the SV is set to 1. If len is zero, Perl will compute the length using strlen(). For efficiency, consider using newSVpvn instead.

Creates a new SV and copies a buffer into it, which may contain NUL characters (\0) and other binary data. The reference count for the SV is set to 1. Note that if len is zero, Perl will create a zero length (Perl) string. You are responsible for ensuring that the source buffer is at least len bytes long. If the buffer argument is NULL the new SV will be undefined.

Creates a new SV from the hash key structure. It will generate scalars that point to the shared string table where possible. Returns a new (undefined) SV if the hek is NULL.

Creates a new SV with its SvPVX_const pointing to a shared string in the string table. If the string does not already exist in the table, it is created first. Turns on READONLY and FAKE. If the hash parameter is non-zero, that value is used; otherwise the hash is computed. The string's hash can later be retrieved from the SV with the SvSHARED_HASH() macro. The idea here is that as the string table is used for shared hash keys these strings will have SvPVX_const == HeKEY and hash lookup will avoid string compare.

Like newSVpvn_share, but takes a nul-terminated string instead of a string/length pair.

Creates a new SV and initializes it with the string formatted like sprintf.

Creates a new SV and copies a floating point value into it. The reference count for the SV is set to 1.

Creates a new SV and copies an integer into it. The reference count for the SV is set to 1.

Creates a new SV and copies an unsigned integer into it. The reference count for the SV is set to 1.

Creates a new SV, of the type specified. The reference count for the new SV is set to 1.

Creates an RV wrapper for an SV. The reference count for the original SV is not incremented.

Creates a new SV which is an exact duplicate of the original SV. (Uses sv_setsv.)

Underlying implementation for the reset Perl function. Note that the perl-level function is vaguely deprecated.

Using various gambits, try to get an IO from an SV: the IO slot if it's a GV; or the recursive result if we're an RV; or the IO slot of the symbol named after the PV if we're a string.

'Get' magic is ignored on the sv passed in, but will be called on SvRV(sv) if sv is an RV.

Using various gambits, try to get a CV from an SV; in addition, try if possible to set *st and *gvp to the stash and GV associated with it. The flags in lref are passed to gv_fetchsv.

Returns true if the SV has a true value by Perl's rules. Use the SvTRUE macro instead, which may call sv_true() or may instead use an in-line version.

Get a sensible string out of the SV somehow. A private implementation of the SvPV_force macro for compilers which can't cope with complex macro expressions. Always use the macro instead.

Get a sensible string out of the SV somehow. If flags has SV_GMAGIC bit set, will mg_get on sv if appropriate, else not. sv_pvn_force and sv_pvn_force_nomg are implemented in terms of this function. You normally want to use the various wrapper macros instead: see SvPV_force and SvPV_force_nomg.

The backend for the SvPVbytex_force macro. Always use the macro instead.

The backend for the SvPVutf8x_force macro. Always use the macro instead.

Returns a string describing what the SV is a reference to.

Returns an SV describing what the SV passed in is a reference to.

Returns a boolean indicating whether the SV is an RV pointing to a blessed object. If the SV is not an RV, or if the object is not blessed, then this will return false.

Returns a boolean indicating whether the SV is blessed into the specified class. This does not check for subtypes; use sv_derived_from to verify an inheritance relationship.

Creates a new SV for the RV, rv, to point to. If rv is not an RV then it will be upgraded to one. If classname is non-null then the new SV will be blessed in the specified package. The new SV is returned and its reference count is 1.

Copies a pointer into a new SV, optionally blessing the SV. The rv argument will be upgraded to an RV. That RV will be modified to point to the new SV. If the pv argument is NULL then PL_sv_undef will be placed into the SV. The classname argument indicates the package for the blessing. Set classname to NULL to avoid the blessing. The new SV will have a reference count of 1, and the RV will be returned.

Do not use with other Perl types such as HV, AV, SV, CV, because those objects will become corrupted by the pointer copy process.

Note that sv_setref_pvn copies the string while this copies the pointer.

Copies an integer into a new SV, optionally blessing the SV. The rv argument will be upgraded to an RV. That RV will be modified to point to the new SV. The classname argument indicates the package for the blessing. Set classname to NULL to avoid the blessing. The new SV will have a reference count of 1, and the RV will be returned.

Copies an unsigned integer into a new SV, optionally blessing the SV. The rv argument will be upgraded to an RV. That RV will be modified to point to the new SV. The classname argument indicates the package for the blessing. Set classname to NULL to avoid the blessing. The new SV will have a reference count of 1, and the RV will be returned.

Copies a double into a new SV, optionally blessing the SV. The rv argument will be upgraded to an RV. That RV will be modified to point to the new SV. The classname argument indicates the package for the blessing. Set classname to NULL to avoid the blessing. The new SV will have a reference count of 1, and the RV will be returned.

Copies a string into a new SV, optionally blessing the SV. The length of the string must be specified with n. The rv argument will be upgraded to an RV. That RV will be modified to point to the new SV. The classname argument indicates the package for the blessing. Set classname to NULL to avoid the blessing. The new SV will have a reference count of 1, and the RV will be returned.

Note that sv_setref_pv copies the pointer while this copies the string.

Blesses an SV into a specified package. The SV must be an RV. The package must be designated by its stash (see gv_stashpv()). The reference count of the SV is unaffected.

Unsets the RV status of the SV, and decrements the reference count of whatever was being referenced by the RV. This can almost be thought of as a reversal of newSVrv. The cflags argument can contain SV_IMMEDIATE_UNREF to force the reference count to be decremented (otherwise the decrementing is conditional on the reference count being different from one or the reference being a readonly SV). See SvROK_off.

Untaint an SV. Use SvTAINTED_off instead.

Test an SV for taintedness. Use SvTAINTED instead.

Copies an integer into the given SV, also updating its string value. Does not handle 'set' magic. See sv_setpviv_mg.

Like sv_setpviv, but also handles 'set' magic.

Works like sv_catpvf but copies the text into the SV instead of appending it. Does not handle 'set' magic. See sv_setpvf_mg.

Works like sv_vcatpvf but copies the text into the SV instead of appending it. Does not handle 'set' magic. See sv_vsetpvf_mg.

Usually used via its frontend sv_setpvf.

Like sv_setpvf, but also handles 'set' magic.

Like sv_vsetpvf, but also handles 'set' magic.

Usually used via its frontend sv_setpvf_mg.

Processes its arguments like sprintf and appends the formatted output to an SV. If the appended data contains "wide" characters (including, but not limited to, SVs with a UTF-8 PV formatted with %s, and characters >255 formatted with %c), the original SV might get upgraded to UTF-8. Handles 'get' magic, but not 'set' magic. See sv_catpvf_mg. If the original SV was UTF-8, the pattern should be valid UTF-8; if the original SV was bytes, the pattern should be too.

Processes its arguments like vsprintf and appends the formatted output to an SV. Does not handle 'set' magic. See sv_vcatpvf_mg.

Usually used via its frontend sv_catpvf.

Like sv_catpvf, but also handles 'set' magic.

Like sv_vcatpvf, but also handles 'set' magic.

Usually used via its frontend sv_catpvf_mg.

Works like sv_vcatpvfn but copies the text into the SV instead of appending it.

Usually used via one of its frontends sv_vsetpvf and sv_vsetpvf_mg.

Processes its arguments like vsprintf and appends the formatted output to an SV. Uses an array of SVs if the C style variable argument list is missing (NULL). When running with taint checks enabled, indicates via maybe_tainted if results are untrustworthy (often due to the use of locales).

Usually used via one of its frontends sv_vcatpvf and sv_vcatpvf_mg.
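The layering described above (a va_list core that appends, variadic frontends, and "set" variants that copy instead of appending) can be sketched with standard C. This is an analogy, not Perl's code: strbuf and the strbuf_* functions are hypothetical, and the sketch has no UTF-8 upgrading, magic, taint tracking, or SV-array argument support.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable string: data is NUL-terminated, or NULL if empty. */
typedef struct {
    char *data;
    size_t len;
} strbuf;

/* Core routine: format like vsprintf and append. Returns 0 on success. */
int strbuf_vcatf(strbuf *b, const char *fmt, va_list ap)
{
    assert(b != NULL && fmt != NULL);

    va_list ap2;
    va_copy(ap2, ap);
    int need = vsnprintf(NULL, 0, fmt, ap2);   /* measure the output */
    va_end(ap2);
    if (need < 0)
        return -1;

    char *p = realloc(b->data, b->len + (size_t)need + 1);
    if (p == NULL)
        return -1;
    b->data = p;
    vsnprintf(b->data + b->len, (size_t)need + 1, fmt, ap);
    b->len += (size_t)need;
    return 0;
}

/* Variadic frontend that appends. */
int strbuf_catf(strbuf *b, const char *fmt, ...)
{
    va_list ap;
    va_start(ap, fmt);
    int r = strbuf_vcatf(b, fmt, ap);
    va_end(ap);
    return r;
}

/* Variadic frontend that replaces the contents: truncate, then append. */
int strbuf_setf(strbuf *b, const char *fmt, ...)
{
    va_list ap;
    b->len = 0;
    va_start(ap, fmt);
    int r = strbuf_vcatf(b, fmt, ap);
    va_end(ap);
    return r;
}
```

Starting from an empty strbuf, strbuf_setf(&b, "%s=%d", "x", 1) followed by strbuf_catf(&b, ";%s=%d", "y", 22) leaves "x=1;y=22" in the buffer; a later strbuf_setf call discards that text before formatting.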

Cloning an interpreter

All the macros and functions in this section are for the private use of the main function, perl_clone().

The foo_dup() functions make an exact copy of an existing foo thingy. During the course of a cloning, a hash table is used to map old addresses to new addresses. The table is created and manipulated with the ptr_table_* functions.
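The role of the old-to-new address table can be sketched with a toy graph copier. Everything here is hypothetical and simplified: node, ptr_map, map_fetch, map_store and node_dup are invented for illustration, and a small linear array stands in for the hash table that the text describes. The point is only that consulting the table before copying keeps shared structure shared in the clone.

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_ENTRIES 64

typedef struct node {
    int value;
    struct node *left, *right;
} node;

/* Toy mapping from old addresses to new addresses (linear search). */
typedef struct {
    const void *old_addr[MAX_ENTRIES];
    void *new_addr[MAX_ENTRIES];
    size_t count;
} ptr_map;

static void *map_fetch(const ptr_map *m, const void *old)
{
    for (size_t i = 0; i < m->count; i++) {
        if (m->old_addr[i] == old)
            return m->new_addr[i];
    }
    return NULL;
}

static void map_store(ptr_map *m, const void *old, void *fresh)
{
    assert(m->count < MAX_ENTRIES);
    m->old_addr[m->count] = old;
    m->new_addr[m->count] = fresh;
    m->count++;
}

/* Duplicate a node and everything reachable from it, copying each
 * distinct node exactly once. */
node *node_dup(const node *src, ptr_map *m)
{
    if (src == NULL)
        return NULL;

    node *seen = map_fetch(m, src);
    if (seen != NULL)
        return seen;              /* already cloned: reuse the copy */

    node *copy = malloc(sizeof *copy);
    if (copy == NULL)
        return NULL;
    copy->value = src->value;
    map_store(m, src, copy);      /* record before recursing so cycles end */
    copy->left = node_dup(src->left, m);
    copy->right = node_dup(src->right, m);
    return copy;
}
```

If a root node points to the same child through both left and right, the copy's left and right also point to a single cloned child rather than to two separate copies.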

Create and return a new interpreter by cloning the current one.

perl_clone takes these flags as parameters:

CLONEf_COPY_STACKS - copies the stacks as well. Without it, only the data is cloned and the stacks are zeroed; with it, the stacks are copied and the new perl interpreter is ready to run at the exact same point as the previous one. The pseudo-fork code uses COPY_STACKS, while threads->create doesn't.

CLONEf_KEEP_PTR_TABLE - perl_clone keeps a ptr_table with the pointer of the old variable as a key and the new variable as a value. This allows it to check whether something has already been cloned and, if so, to use the existing value and increase its refcount rather than cloning it again. If KEEP_PTR_TABLE is not set, perl_clone frees the ptr_table with ptr_table_free(PL_ptr_table); PL_ptr_table = NULL;. The reason to keep it around is to dup some of your own variables that are outside the graph perl scans; an example of this is in threads.xs create.

CLONEf_CLONE_HOST - This is a win32 thing and is ignored on unix. It tells perl's win32host code (which is C++) to clone itself. This is needed on win32 if you want to run two threads at the same time; if you just want to do some stuff in a separate perl interpreter and then throw it away and return to the original one, you don't need to do anything.

Unicode Support

The encoding is assumed to be an Encode object. On entry, the PV of the sv is assumed to be octets in that encoding, and the sv will be converted into Unicode (and UTF-8).

If the sv already is UTF-8 (or if it is not POK), or if the encoding is not a reference, nothing is done to the sv. If the encoding is not an Encode::XS Encoding object, bad things will happen. (See lib/encoding.pm and Encode.)

The PV of the sv is returned.

The encoding is assumed to be an Encode object. The PV of the ssv is assumed to be octets in that encoding, and decoding of the input starts from the position that (PV + *offset) points to. The decoded UTF-8 string from ssv is appended to dsv. Decoding terminates when the string tstr appears in the decoding output or when the input in the PV of the ssv ends. The value that offset points to is updated to the last input position on the ssv.

Returns TRUE if the terminator was found, else returns FALSE.

Find the name of the undefined variable (if any) that caused the operator to issue a "Use of uninitialized value" warning. If match is true, only return a name if its value matches uninit_sv. So roughly speaking, if a unary operator (such as OP_COS) generates a warning, then following the direct child of the op may yield an OP_PADSV or OP_GV that gives the name of the undefined variable. On the other hand, with OP_ADD there are two branches to follow, so we only print the variable name if we get an exact match.

The name is returned as a mortal SV.

Assumes that PL_op is the op that originally triggered the error, and that PL_comppad/PL_curpad points to the currently executing pad.

Print appropriate "Use of uninitialized value" warning.