The Perl Toolchain Summit needs more sponsors. If your company depends on Perl, please support this very important event.

NAME

perldelta - what is new for perl v5.8.1

DESCRIPTION

This document describes differences between the 5.8.0 release and the 5.8.1 release.

If you are upgrading from an earlier release like 5.6.1, read first the perl58delta, which describes differences between 5.6.0 and 5.8.0.

Incompatible Changes

Hash Randomisation

Mainly due to security reasons, the "random ordering" of hashes has been made even more random. Previously while the order of hash elements from keys(), values(), and each() was essentially random, it was still repeatable. Now, however, the order varies between different runs of Perl.

Perl has never guaranteed any ordering of the hash keys, and the ordering has already changed several times during the lifetime of Perl 5. Also, the ordering of hash keys has always been, and continues to be, affected by the insertion order.

The added randomness may affect applications.

One possible scenario is when output of an application has included hash data. For example, if you have used the Data::Dumper module to dump data into different files, and then compared the files to see whether the data has changed, now you will have false positives since the order in which hashes are dumped will vary. In general the cure is to sort the keys (or the values); in particular for Data::Dumper to use the Sortkeys option; or if some particular order is really important, use tied hashes.

More subtle problem is reliance on the order of "global destruction". That is what happens at the end of execution: Perl destroys all data structures, including user data. If your destructors (the DESTROY subroutines) have assumed any particular ordering to the global destruction, there might be problems ahead. For example, in a destructor of one object you cannot assume that objects of any other class are still available, unless you hold a reference to them. If the environment variable PERL_DESTRUCT_LEVEL is set to a non-zero value, or if Perl is exiting a spawned thread, it will also destruct the ordinary references and the symbol tables that are no longer in use. You can't call a class method or an ordinary function on a class that has been collected that way.

The hash randomisation is certain to reveal hidden assumptions about some particular ordering of hash elements, and outright bugs: it revealed a few bugs in the Perl core and core modules.

To disable the hash randomisation in runtime, set the environment variable PERL_HASH_SEED to 0 (zero) before running Perl (for more information see "PERL_HASH_SEED" in perlrun), or to disable the feature completely in compile time, compile with -DNO_HASH_SEED (see INSTALL).

See "Algorithmic Complexity Attacks" in perlsec for the original rationale behind this change.

UTF-8 On Filehandles No Longer Activated By Locale

In Perl 5.8.0 all filehandles, including the standard filehandles, were implicitly set to be in Unicode UTF-8 if the locale settings indicated the use of UTF-8. This feature caused too many problems, so the feature was turned off and redesigned: see "Core Enhancements".

Single-number v-strings are no longer v-strings before "=>"

The version strings or v-strings (see "Version Strings" in perldata) feature introduced in Perl 5.6.0 has been a source of some confusion-- especially when the user did not want to use it, but Perl thought it knew better. Especially troublesome has been the feature that before a "=>" a version string (a "v" followed by digits) has been interpreted as a v-string instead of a string literal. In other words:

        %h = ( v65 => 42 );

has meant since Perl 5.6.0

        %h = ( 'A' => 42 );

(at least in platforms of ASCII progeny) Perl 5.8.1 restores the more natural interpretation

        %h = ( 'v65' => 42 );

The multi-number v-strings like v65.66 and 65.66.67 still continue to be v-strings.

(Win32) The -C Switch Has Been Repurposed

The -C switch has changed in an incompatible way. The old semantics of this switch only made sense in Win32 and only in the "use utf8" universe in 5.6.x releases, and do not make sense for the Unicode implementation in 5.8.0. Since this switch could not have been used by anyone, it has been repurposed. The behavior that this switch enabled in 5.6.x releases may be supported in a transparent, data-dependent fashion in a future release.

For the new life of this switch, see "UTF-8 no longer default under UTF-8 locales", and "-C" in perlrun.

(Win32) The /d Switch Of cmd.exe

Perl 5.8.1 uses the /d switch when running the cmd.exe shell internally for system(), backticks, and when opening pipes to external programs. The extra switch disables the execution of AutoRun commands from the registry, which is generally considered undesirable when running external programs. If you wish to retain compatibility with the older behavior, set PERL5SHELL in your environment to cmd /x/c.

Core Enhancements

UTF-8 no longer default under UTF-8 locales

In Perl 5.8.0 many Unicode features were introduced. One of them was found to be of more nuisance than benefit: the automagic (and silent) "UTF-8-ification" of filehandles, including the standard filehandles, if the user's locale settings indicated use of UTF-8.

For example, if you had en_US.UTF-8 as your locale, your STDIN and STDOUT were automatically "UTF-8", in other words an implicit binmode(..., ":utf8") was made. This meant that trying to print, say, chr(0xff), ended up printing the bytes 0xc3 0xbf. Hardly what you had in mind unless you were aware of this feature of Perl 5.8.0. The problem is that the vast majority of people weren't: for example in RedHat releases 8 and 9 the default locale setting is UTF-8, so all RedHat users got UTF-8 filehandles, whether they wanted it or not. The pain was intensified by the Unicode implementation of Perl 5.8.0 (still) having nasty bugs, especially related to the use of s/// and tr///. (Bugs that have been fixed in 5.8.1)

Therefore a decision was made to backtrack the feature and change it from implicit silent default to explicit conscious option. The new Perl command line option -C and its counterpart environment variable PERL_UNICODE can now be used to control how Perl and Unicode interact at interfaces like I/O and for example the command line arguments. See perlrun for more information.

Unsafe signals again available

In Perl 5.8.0 the so-called "safe signals" were introduced. This means that Perl no longer handles signals immediately but instead "between opcodes", when it is safe to do so. The earlier immediate handling easily could corrupt the internal state of Perl, resulting in mysterious crashes.

However, the new safer model has its problems too. Because now an opcode, a basic unit of Perl execution, is never interrupted but instead let to run to completion, certain operations that can take a long time now really do take a long time. For example, certain network operations have their own blocking and timeout mechanisms, and being able to interrupt them immediately would be nice.

Therefore perl 5.8.1 introduces a "backdoor" to restore the pre-5.8.0 (5.7.3, really) signal behaviour. Just set the environment variable PERL_SIGNALS to unsafe, and the old immediate (and unsafe) signal handling behaviour returns.

Tied Arrays with Negative Array Indices

Formerly, the indices passed to FETCH, STORE, EXISTS, and DELETE methods in tied array class were always non-negative. If the actual argument was negative, Perl would call FETCHSIZE implicitly and add the result to the index before passing the result to the tied array method. This behaviour is now optional. If the tied array class contains a package variable named $NEGATIVE_INDICES which is set to a true value, negative values will be passed to FETCH, STORE, EXISTS, and DELETE unchanged.

local ${$x}

The syntaxes

        local ${$x}
        local @{$x}
        local %{$x}

now do localise variables, given that the $x is a valid variable name.

Unicode Character Database 4.0.0

Unicode Character Database has been updated to 4.0.0 from 3.2.0.

Warnings

Perl 5.8.0 forgot to add some deprecation warnings. These warnings have now been added.

Pseudo-hashes really are deprecated

Pseudo-hashes were deprecated in Perl 5.8.0 and will be removed in Perl 5.10.0, see perl58delta for details. Each attempt to access pseudo-hashes will trigger the warning Pseudo-hashes are deprecated. If you really want to continue using pseudo-hashes but not to see the deprecation warnings, add:

    no warnings 'deprecated';

Or you can continue to use the fields pragma, but please don't expect the data structures to be pseudohashes any more.

5.005-style threads really are deprecated

5.005-style threads (activated by use Thread;) were deprecated in Perl 5.8.0 and will be removed in Perl 5.10.0, see perl58delta for details. Each attempt to create a 5.005-style thread will trigger the warning 5.005 threads are deprecated. If you really want to continue using 5.005 threads but not to see the deprecation warnings, add:

    no warnings 'deprecated';

Version Strings (v-strings) Are Deprecated

Version Strings (v-strings) have been deprecated. They will not be available after Perl 5.8. The marginal benefits of v-strings were greatly outweighed by the potential for Surprise and Confusion.

The $* variable Is Deprecated

The $* variable controlling multi-line matching has been deprecated. It will not be available after Perl 5.8. The variable has been deprecated for a long time, now it will just finally be removed. The functionality has been supplanted by the /s and /m modifiers on pattern matching.

Modules and Pragmata

Updated Modules

B::Concise

B::Deparse

Benchmark - An optional feature, :hireswallclock, now allows for high resolution wall clock times (uses Time::HiRes).

CGI

charnames - One can now have custom character name aliases.

CPAN - There is now a simple command line frontend to the CPAN.pm module called cpan.

Data::Dumper - A new option, Pair, allows choosing the separator between hash keys and values.

DB_File

Devel::PPPort

Digest::MD5

Encode - significant updates on the encoding pragma functionality (tr/// and the DATA filehandle, formats). The ISO 8859-6 conversion table has been corrected (the 0x30..0x39 errorneously mapped to U+0660..U+0669, instead of U+0030..U+0039), the UTF7 encoding has been added.

libnet

Math::BigInt

MIME::Base64

Net::Ping

podlators

Pod::LaTeX

PodParsers

Pod::Perldoc

Scalar::Util - New utilities: refaddr, isvstring, looks_like_number, set_prototype.

Storable - can now store code references (via B::Deparse, so not foolproof).

Term::ANSIcolor

Test::Harness - now much more picky about extra or missing output from test scripts.

Test::More

Test::Simple

Text::Balanced

Time::HiRes - Use of nanosleep(), if available, allows mixing subsecond sleeps with alarms.

threads - Several fixes, for example for join() problems and memory leaks. In some platforms (like Linux) that use glibc the memory footprint of one ithread has been reduced by several hundred kilobytes.

threads::shared - Many memory leaks have been fixed.

Unicode::Collate

Unicode::Normalize

Win32::GetFolderPath

Utility Changes

The Perl debugger (lib/perl5db.pl) has now been extensively documented and bugs found while documenting have been fixed.

perldoc has been rewritten from scratch to be more robust and featureful.

New Documentation

perl573delta has been added to list the differences between the (now quite obsolete) development releases 5.7.2 and 5.7.3.

perl58delta has been added: it is the perldelta of 5.8.0, detailing the differences between 5.6.0 and 5.8.0.

perlartistic has been added: it is the Artistic License in pod format, making it easier for modules to refer to it.

perlgpl has been added: it is the GNU General Public License in pod format, making it easier for modules to refer to it.

perlos400 has been added to tell about the installation and use of Perl in OS/400 PASE.

Installation and Configuration Improvements

The UNIX standard Perl location, /usr/bin/perl, is no longer overwritten by default if it exists. This change was very prudent because so many UNIX vendors already provide a /usr/bin/perl, but simultaneously many system utilities may depend on that exact version of Perl, so better not to overwrite it.

One can now specify installation directories for site and vendor man and HTML pages, and site and vendor scripts. See INSTALL.

gcc versions 3.x introduced a new warning that caused a lot of noise during Perl compilation: gcc -Ialreadyknowndirectory (warning: changing search order). This warning has now been avoided by Configure weeding out such directories before the compilation.

One can now build subsets of Perl core modules by using the Configure flags -Dnoextensions= and -Donlyextensions=, see INSTALL

Mac OS X now installs with Perl version number embedded in installation directory names for easier upgrading of user-compiled Perl, and the installation directories in general are more standard. (In other words, the default installation no longer breaks the Apple-provided Perl.)

In newer FreeBSD releases Perl 5.8.0 compilation failed because of trying to use <malloc.h>, which in FreeBSD is just a dummy file, and a fatal error to even try to use. Now <malloc.h> is not used.

Tru64 gcc 3.2.1 -O3 toke.c dropped to -O2 because of gigantic memory use otherwise.

Tru64 can now build Perl with the newer Berkeley DBs.

Perl has been ported to IBM's OS/400 PASE environment. The best way to build a Perl for PASE is to use an AIX host as a cross-compilation environment. See README.os400.

Perl is now known to build also in Hitachi HI-UXMPP.

Building Perl on WinCE has been much enhanced, see the wince subdirectory.

Yet another cross-compilation option has been added, now Perl builds on OpenZaurus, see the Cross subdirectory.

Selected Bug Fixes

Closures, eval and lexicals

There have been many fixes in the area of anonymous subs, lexicals and closures. Although this means that Perl is now more "correct", it is possible that some existing code will break that happens to rely on the faulty behaviour. In practice this is unlikely unless your code contains a very complex nesting of anonymous subs, evals and lexicals.

Generic fixes

binmode(SOCKET, ":utf8") only worked on the input side, not on the output side of the socket. Now it works both ways.

For threaded Perls certain system database functions like getpwent() and getgrent() now grow their result buffer dynamically, instead of failing. This means that at sites with lots of users and groups the functions no longer fail by returning only partial results.

Perl 5.8.0 had accidentally broken the capability for users to define their own uppercase<->lowercase Unicode mappings (as advertised by the Camel). This feature has been fixed and is also documented better.

In 5.8.0 this

        $some_unicode .= <FH>;

didn't work correctly but instead corrupted the data. This has now been fixed.

FETCH etc may now safely access tied values (ie resulting in a recursive call to FETCH etc).

Linenumbers in Perl scripts may now be greater than 65536, or 2**16. (Perl scripts have always been able to be larger than that, it's just that the linenumber for reported errors and warnings have "wrapped around".) While scripts that large usually indicate a need to rethink your code a bit, such Perl scripts do exist, for example as results from generated code. Now linenumbers can go all the way to 4294967296, or 2**32.

Module-specific fixes

Encode: if a filehandle has been marked as to have an encoding, unmappable characters are detected already during input, not later (when the corrupted data is being used).

PerlIO::scalar; reading from non-string scalars (like the special variables, see perlvar) now works.

Platform-specific fixes

Linux

  • Setting $0 works again (with certain limitations that Perl cannot do much about: see "$0" in perlvar)

HP-UX

  • Setting $0 now works.

VMS

  • IO::Poll now works.

Win32

  • A memory leak in the fork() emulation has been fixed.

  • The return value of the ioctl() built-in function was accidentally broken in 5.8.0. This has been corrected.

  • The internal message loop executed by perl during blocking operations sometimes interfered with messages that were external to Perl. This often resulted in blocking operations terminating prematurely or returning incorrect results, when Perl was executing under environments that could generate Windows messages. This has been corrected.

  • Pipes and sockets are now automatically in binary mode.

  • The four-argument form of select() did not preserve $! (errno) properly when there were errors in the underlying call. This is now fixed.

New or Changed Diagnostics

All the warnings related to pack() and unpack() were made more informative and consistent.

Changed "A thread exited while %d threads were running"

The old version

    A thread exited while %d other threads were still running

was misleading because the "other" included also the thread giving the warning.

Removed "Attempt to clear a restricted hash"

It is not illegal to clear a restricted hash, so the warning was removed.

New "Can't provide tied hash usage; use keys(%hash) to test if empty"

Use of %hash in scalar context (like for example as the test of if) is not available for tied hashes. Currently any tied hash, elements or not, returns a true value in scalar context. This is probably not what you had in mind, so Perl aborts.

New "Illegal declaration of anonymous subroutine"

You must specify the block of code for sub.

Changed "Invalid range "%s" in transliteration operator"

The old version

    Invalid [] range "%s" in transliteration operator

was simply wrong because there are no "[] ranges" in tr///.

New "Missing control char name in \c"

Self-explanatory.

New "Newline in left-justified string for %s"

The padding spaces would appear after the newline, which is probably not what you had in mind.

New "Possible precedence problem on bitwise %c operator"

If you think this

    $x & $y == 0

tests whether the bitwise AND of $x and $y is zero, you will like this warning.

New "Pseudo-hashes are deprecated"

This warning should have been already in 5.8.0, since they are.

New "read() on %s filehandle %s"

You cannot read() (or sysread()) from a closed or unopened filehandle.

New "5.005 threads are deprecated"

This warning should have been already in 5.8.0, since they are.

New "Tied variable freed while still in use"

Something pulled the plug on a live free variable, Perl plays safe by bailing out.

New "To%s: illegal mapping '%s'"

An illegal user-defined Unicode casemapping was specified.

New "Use of freed value in iteration (perhaps you modified the iterated array within the loop?)"

Something modified the values being iterated over. This is not good.

Changed Internals

These news matter to you only if you either write XS code or like to hack Perl internals, or like to run Perl with the -D option.

The embedding examples of perlembed have been reviewed to be uptodate and consistent: for example, the correct use of PERL_SYS_INIT3() and PERL_SYS_TERM().

Extensive reworking of the pad code has been conducted by Dave Mitchell.

Extensive work on the v-strings by John Peacock.

UTF-8 length and position cache: to speed up the handling of Unicode (UTF-8) scalars, a cache was introduced. Potential problems if an extension bypasses the official APIs and directly modifies the PV of an SV: the UTF-8 cache does not get cleared as it should.

APIs obsoleted in Perl 5.8.0s like sv_2pv, sv_catpvn, sv_catsv, sv_setsv, are again available.

Certain Perl core C APIs like cxinc and regatom are no longer available. They never should have been, and if you application depends on them, you should (be ashamed and) contact perl5-porters to discuss what are the proper APIs.

Certain APIs like Perl_list are no longer available without their Perl_ prefix. If your XS module stops working because some functions cannot be found, in many cases a simple fix is to add the Perl_ prefix to the function and the thread context aTHX_ as the first argument of the function call. For cleaner embedding you can also force this for all APIs by defining at compile time the cpp define PERL_NO_SHORT_NAMES.

Perl_save_bool() has been added.

utf8::is_utf8() has been added as a quick way to test whether a scalar is encoded internally in UTF-8 (Unicode).

-DL removed (the leaktest had been broken and unsupported for years, use alternative debugging mallocs or tools like valgrind and Purify).

Verbose modifier v added for -DXv and -Dsv, see perlrun.

New Tests

In Perl 5.8.0 there were about 69000 separate tests in about 700 test files, in Perl 5.8.1 there are about 75000 separate tests in about 750 test files. The exact numbers depend on the Perl configuration and on the operating system platform.

Known Problems

The hash randomisation mentioned in "Incompatible Changes" is definitely problematic: it will wake dormant bugs and shake out bad assumptions.

Many of the rarer platforms that worked 100% or pretty close to it with perl 5.8.0 have been left a little bit untended since their maintainers have been otherwise busy lately, and therefore there will be more failures on those platforms. Such platforms include Mac OS Classic, IBM z/OS (and other EBCDIC platforms), and NetWare. The most common Perl platforms (Unix and Unix-like, Microsoft platforms, and VMS) have large enough testing and expert population that they are doing well.

Platform Specific Problems

(FreeBSD)

The choice of malloc (the C-level memory management interface) when building Perl is problematic in FreeBSD.

Using FreeBSD's system malloc for Perl was found to be very slow: in some cases that was 200 times slower than using the Perl malloc. One such case is file input: for example

    # slurping the whole compressed Perl source code into $a
    if (open F,"perl-5.8.1.tar.gz") { local $/; $a=<F> }

is about 200-250 times slower with the system malloc than with the Perl malloc.

One could use Perl's malloc (Configure -Dusemymalloc), but that was found to cause random core dumps in FreeBSD with multithreaded programs. No such problems were found in other platforms, however.

A decision was made to stick with the system malloc, regardless of the performance problems.

(Win32) sysopen, sysread, syswrite

As of the 5.8.0 release, sysopen()/sysread()/syswrite() do not behave like they used to in 5.6.1 and earlier with respect to "text" mode. These built-ins now always operate in "binary" mode (even if sysopen() was passed the O_TEXT flag, or if binmode() was used on the file handle). Note that this issue should only make a difference for disk files, as sockets and pipes have always been in "binary" mode in the Windows port. As this behavior is currently considered a bug, compatible behavior may be re-introduced in a future release. Until then, the use of sysopen(), sysread() and syswrite() is not supported for "text" mode operations.

EBCDIC Platforms

IBM z/OS and other EBCDIC platforms continue to be problematic regarding Unicode support.

Reporting Bugs

If you find what you think is a bug, you might check the articles recently posted to the comp.lang.perl.misc newsgroup and the perl bug database at http://bugs.perl.org/ . There may also be information at http://www.perl.com/ , the Perl Home Page.

If you believe you have an unreported bug, please run the perlbug program included with your release. Be sure to trim your bug down to a tiny but sufficient test case. Your bug report, along with the output of perl -V, will be sent off to perlbug@perl.org to be analysed by the Perl porting team. You can browse and search the Perl 5 bugs at http://bugs.perl.org/

SEE ALSO

The Changes file for exhaustive details on what changed.

The INSTALL file for how to build Perl.

The README file for general stuff.

The Artistic and Copying files for copyright information.