=head1 NAME
perlfunc - Perl builtin functions
=head1 DESCRIPTION
The functions in this section can serve as terms in an expression.
They fall into two major categories: list operators and named unary
operators. These differ in their precedence relationship with a
following comma. (See the precedence table in L<perlop>.) List
operators take more than one argument, while unary operators can never
take more than one argument. Thus, a comma terminates the argument of
a unary operator, but merely separates the arguments of a list
operator. A unary operator generally provides a scalar context to its
argument, while a list operator may provide either scalar or list
contexts for its arguments. If it does both, the scalar arguments will
be first, and the list argument will follow. (Note that there can only
ever be one such list argument.) For instance, splice() has three scalar
arguments followed by a list, whereas gethostbyname() has four scalar
arguments.

In the syntax descriptions that follow, list operators that expect a
list (and provide list context for the elements of the list) are shown
with LIST as an argument. Such a list may consist of any combination
of scalar arguments or list values; the list values will be included
in the list as if each individual element were interpolated at that
point in the list, forming a longer single-dimensional list value.
Commas should separate elements of the LIST.
Any function in the list below may be used either with or without
parentheses around its arguments. (The syntax descriptions omit the
parentheses.) If you use the parentheses, the simple (but occasionally
surprising) rule is this: It I<looks> like a function, therefore it I<is> a
function, and precedence doesn't matter. Otherwise it's a list
operator or unary operator, and precedence does matter. And whitespace
between the function and left parenthesis doesn't count--so you need to
be careful sometimes:

    print 1+2+4;        # Prints 7.
    print(1+2) + 4;     # Prints 3.
    print (1+2)+4;      # Also prints 3!
    print +(1+2)+4;     # Prints 7.
    print ((1+2)+4);    # Prints 7.

If you run Perl with the B<-w> switch it can warn you about this. For
example, the third line above produces:

    print (...) interpreted as function at - line 1.
    Useless use of integer addition in void context at - line 1.

A few functions take no arguments at all, and therefore work as neither
unary nor list operators. These include such functions as C<time>
and C<endpwent>. For example, C<time+86_400> always means
C<time() + 86_400>.
For functions that can be used in either a scalar or list context,
nonabortive failure is generally indicated in a scalar context by
returning the undefined value, and in a list context by returning the
null list.

Remember the following important rule: There is B<no rule> that relates
the behavior of an expression in list context to its behavior in scalar
context, or vice versa. It might do two totally different things.
Each operator and function decides which sort of value it would be most
appropriate to return in scalar context. Some operators return the
length of the list that would have been returned in list context. Some
operators return the first value in the list. Some operators return the
last value in the list. Some operators return a count of successful
operations. In general, they do what you want, unless you want
consistency.
A named array in scalar context is quite different from what would at
first glance appear to be a list in scalar context. You can't get a list
like C<(1,2,3)> into being in scalar context, because the compiler knows
the context at compile time. It would generate the scalar comma operator
there, not the list construction version of the comma. That means it
was never a list to start with.
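
The difference between a named array and a parenthesized comma
expression in scalar context can be seen in a short sketch. The variable
names are made up for illustration, and the C<no warnings 'void'> block
is only there to silence the warnings Perl would otherwise give about
the discarded operands.

```perl
use strict;
use warnings;

my @array = (10, 20, 30);

# A named array in scalar context yields its element count.
my $count = @array;

# On the right of a scalar assignment the commas are scalar comma
# operators, so only the last operand survives.
my $last;
{
    no warnings 'void';   # the discarded operands would otherwise warn
    $last = (10, 20, 30);
}

# A list assignment evaluated in scalar context yields the number of
# elements on its right-hand side.
my $assigned = () = (10, 20, 30);

print "count=$count last=$last assigned=$assigned\n";
```

The third form is a common way to count the elements of a list without
storing them in a named array first.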
In general, functions in Perl that serve as wrappers for system calls
of the same name (like chown(2), fork(2), closedir(2), etc.) all return
true when they succeed and C<undef> otherwise, as is usually mentioned
in the descriptions below. This is different from the C interfaces,
which return C<-1> on failure. Exceptions to this rule are C<wait>,
C<waitpid>, and C<syscall>. System calls also set the special C<$!>
variable on failure. Other functions do not, except accidentally.
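
As a rough illustration of this convention, the sketch below calls
C<rmdir> on a path that is deliberately never created, then inspects the
return value and C<$!>. The path and variable names are assumptions made
for the example only.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir     = tempdir(CLEANUP => 1);          # scratch directory
my $missing = "$dir/no-such-subdirectory";    # hypothetical path, never created

# rmdir wraps rmdir(2): it returns false on failure and leaves the
# reason in $!.
my $ok = rmdir $missing;
my $error_text   = $ok ? '' : "$!";     # $! read as a string
my $error_number = $ok ? 0  : $! + 0;   # $! read as an errno value

print "rmdir failed: $error_text\n" unless $ok;
```

Testing the return value for truth, rather than comparing it with
C<-1> as one would in C, is the idiom the paragraph above describes.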
=head2 Perl Functions by Category
Here are Perl's functions (including things that look like
functions, like some keywords and named operators)
arranged by category. Some functions appear in more
than one place.
=over 4
=item Functions for SCALARs or strings

C<chomp>, C<chop>, C<chr>, C<crypt>, C<hex>, C<index>, C<lc>, C<lcfirst>,
C<length>, C<oct>, C<ord>, C<pack>, C<q/STRING/>, C<qq/STRING/>, C<reverse>,
C<rindex>, C<sprintf>, C<substr>, C<tr///>, C<uc>, C<ucfirst>, C<y///>

=item Regular expressions and pattern matching

C<m//>, C<pos>, C<quotemeta>, C<s///>, C<split>, C<study>, C<qr//>

=item Numeric functions

C<abs>, C<atan2>, C<cos>, C<exp>, C<hex>, C<int>, C<log>, C<oct>, C<rand>,
C<sin>, C<sqrt>, C<srand>

=item Functions for real @ARRAYs

C<pop>, C<push>, C<shift>, C<splice>, C<unshift>

=item Functions for list data

C<grep>, C<join>, C<map>, C<qw/STRING/>, C<reverse>, C<sort>, C<unpack>

=item Functions for real %HASHes

C<delete>, C<each>, C<exists>, C<keys>, C<values>
=item Input and output functions

C<binmode>, C<close>, C<closedir>, C<dbmclose>, C<dbmopen>, C<die>, C<eof>,
C<fileno>, C<flock>, C<format>, C<getc>, C<print>, C<printf>, C<read>,
C<readdir>, C<rewinddir>, C<seek>, C<seekdir>, C<select>, C<syscall>,
C<sysread>, C<sysseek>, C<syswrite>, C<tell>, C<telldir>, C<truncate>,
C<warn>, C<write>

=item Functions for fixed length data or records

C<pack>, C<read>, C<syscall>, C<sysread>, C<syswrite>, C<unpack>, C<vec>

=item Functions for filehandles, files, or directories

C<-I<X>>, C<chdir>, C<chmod>, C<chown>, C<chroot>, C<fcntl>, C<glob>,
C<ioctl>, C<link>, C<lstat>, C<mkdir>, C<open>, C<opendir>,
C<readlink>, C<rename>, C<rmdir>, C<stat>, C<symlink>, C<sysopen>,
C<umask>, C<unlink>, C<utime>
=item Keywords related to the control flow of your Perl program

C<caller>, C<continue>, C<die>, C<do>, C<dump>, C<eval>, C<exit>,
C<goto>, C<last>, C<next>, C<redo>, C<return>, C<sub>, C<wantarray>

=item Keywords related to scoping

C<caller>, C<import>, C<local>, C<my>, C<our>, C<package>, C<use>

=item Miscellaneous functions

C<defined>, C<dump>, C<eval>, C<formline>, C<local>, C<my>, C<our>, C<reset>,
C<scalar>, C<undef>, C<wantarray>

=item Functions for processes and process groups

C<alarm>, C<exec>, C<fork>, C<getpgrp>, C<getppid>, C<getpriority>, C<kill>,
C<pipe>, C<qx/STRING/>, C<setpgrp>, C<setpriority>, C<sleep>, C<system>,
C<times>, C<wait>, C<waitpid>

=item Keywords related to perl modules

C<do>, C<import>, C<no>, C<package>, C<require>, C<use>

=item Keywords related to classes and object-orientedness

C<bless>, C<dbmclose>, C<dbmopen>, C<package>, C<ref>, C<tie>, C<tied>,
C<untie>, C<use>
=item Low-level socket functions

C<accept>, C<bind>, C<connect>, C<getpeername>, C<getsockname>,
C<getsockopt>, C<listen>, C<recv>, C<send>, C<setsockopt>, C<shutdown>,
C<socket>, C<socketpair>

=item System V interprocess communication functions

C<msgctl>, C<msgget>, C<msgrcv>, C<msgsnd>, C<semctl>, C<semget>, C<semop>,
C<shmctl>, C<shmget>, C<shmread>, C<shmwrite>

=item Fetching user and group info

C<endgrent>, C<endhostent>, C<endnetent>, C<endpwent>, C<getgrent>,
C<getgrgid>, C<getgrnam>, C<getlogin>, C<getpwent>, C<getpwnam>,
C<getpwuid>, C<setgrent>, C<setpwent>

=item Fetching network info

C<endprotoent>, C<endservent>, C<gethostbyaddr>, C<gethostbyname>,
C<gethostent>, C<getnetbyaddr>, C<getnetbyname>, C<getnetent>,
C<getprotobyname>, C<getprotobynumber>, C<getprotoent>,
C<getservbyname>, C<getservbyport>, C<getservent>, C<sethostent>,
C<setnetent>, C<setprotoent>, C<setservent>

=item Time-related functions

C<gmtime>, C<localtime>, C<time>, C<times>
=item Functions new in perl5

C<abs>, C<bless>, C<chomp>, C<chr>, C<exists>, C<formline>, C<glob>,
C<import>, C<lc>, C<lcfirst>, C<map>, C<my>, C<no>, C<our>, C<prototype>,
C<qx>, C<qw>, C<readline>, C<readpipe>, C<ref>, C<sub*>, C<sysopen>, C<tie>,
C<tied>, C<uc>, C<ucfirst>, C<untie>, C<use>

* - C<sub> was a keyword in perl4, but in perl5 it is an
operator, which can be used in expressions.

=item Functions obsoleted in perl5

C<dbmclose>, C<dbmopen>
=back
=head2 Portability
Perl was born in Unix and can therefore access all common Unix
system calls. In non-Unix environments, the functionality of some
Unix system calls may not be available, or details of the available
functionality may differ slightly. The Perl functions affected
by this are:

C<-X>, C<binmode>, C<chmod>, C<chown>, C<chroot>, C<crypt>,
C<dbmclose>, C<dbmopen>, C<dump>, C<endgrent>, C<endhostent>,
C<endnetent>, C<endprotoent>, C<endpwent>, C<endservent>, C<exec>,
C<fcntl>, C<flock>, C<fork>, C<getgrent>, C<getgrgid>, C<gethostbyname>,
C<gethostent>, C<getlogin>, C<getnetbyaddr>, C<getnetbyname>, C<getnetent>,
C<getppid>, C<getpgrp>, C<getpriority>, C<getprotobynumber>,
C<getprotoent>, C<getpwent>, C<getpwnam>, C<getpwuid>,
C<getservbyport>, C<getservent>, C<getsockopt>, C<glob>, C<ioctl>,
C<kill>, C<link>, C<lstat>, C<msgctl>, C<msgget>, C<msgrcv>,
C<msgsnd>, C<open>, C<pipe>, C<readlink>, C<rename>, C<select>, C<semctl>,
C<semget>, C<semop>, C<setgrent>, C<sethostent>, C<setnetent>,
C<setpgrp>, C<setpriority>, C<setprotoent>, C<setpwent>,
C<setservent>, C<setsockopt>, C<shmctl>, C<shmget>, C<shmread>,
C<shmwrite>, C<socket>, C<socketpair>,
C<stat>, C<symlink>, C<syscall>, C<sysopen>, C<system>,
C<times>, C<truncate>, C<umask>, C<unlink>,
C<utime>, C<wait>, C<waitpid>
For more information about the portability of these functions, see
L<perlport> and other available platform-specific documentation.
=head2 Alphabetical Listing of Perl Functions
=over 8
=item -X FILEHANDLE
=item -X EXPR
=item -X
A file test, where X is one of the letters listed below. This unary
operator takes one argument, either a filename or a filehandle, and
tests the associated file to see if something is true about it. If the
argument is omitted, tests C<$_>, except for C<-t>, which tests STDIN.
Unless otherwise documented, it returns C<1> for true and C<''> for false, or
the undefined value if the file doesn't exist. Despite the funny
names, precedence is the same as any other named unary operator, and
the argument may be parenthesized like any other unary operator. The
operator may be any of:
X<-r>X<-w>X<-x>X<-o>X<-R>X<-W>X<-X>X<-O>X<-e>X<-z>X<-s>X<-f>X<-d>X<-l>X<-p>
X<-S>X<-b>X<-c>X<-t>X<-u>X<-g>X<-k>X<-T>X<-B>X<-M>X<-A>X<-C>

    -r  File is readable by effective uid/gid.
    -w  File is writable by effective uid/gid.
    -x  File is executable by effective uid/gid.
    -o  File is owned by effective uid.

    -R  File is readable by real uid/gid.
    -W  File is writable by real uid/gid.
    -X  File is executable by real uid/gid.
    -O  File is owned by real uid.

    -e  File exists.
    -z  File has zero size (is empty).
    -s  File has nonzero size (returns size in bytes).

    -f  File is a plain file.
    -d  File is a directory.
    -l  File is a symbolic link.
    -p  File is a named pipe (FIFO), or Filehandle is a pipe.
    -S  File is a socket.
    -b  File is a block special file.
    -c  File is a character special file.
    -t  Filehandle is opened to a tty.

    -u  File has setuid bit set.
    -g  File has setgid bit set.
    -k  File has sticky bit set.

    -T  File is an ASCII text file (heuristic guess).
    -B  File is a "binary" file (opposite of -T).

    -M  Script start time minus file modification time, in days.
    -A  Same for access time.
    -C  Same for inode change time (Unix, may differ for other platforms)

Example:

    while (<>) {
        chomp;
        next unless -f $_;      # ignore specials
        # ... handle the plain file here
    }
The interpretation of the file permission operators C<-r>, C<-R>,
C<-w>, C<-W>, C<-x>, and C<-X> is by default based solely on the mode
of the file and the uids and gids of the user. There may be other
reasons you can't actually read, write, or execute the file. Such
reasons may be, for example, network filesystem access controls, ACLs
(access control lists), read-only filesystems, and unrecognized
executable formats.

Also note that, for the superuser on the local filesystems, the C<-r>,
C<-R>, C<-w>, and C<-W> tests always return 1, and C<-x> and C<-X> return 1
if any execute bit is set in the mode. Scripts run by the superuser
may thus need to do a stat() to determine the actual mode of the file,
or temporarily set their effective uid to something else.

If you are using ACLs, there is a pragma called C<filetest> that may
produce more accurate results than the bare stat() mode bits.
Under C<use filetest 'access'>, the above-mentioned filetests
will test whether the permission can (not) be granted using the
access() family of system calls. Also note that the C<-x> and C<-X> may
under this pragma return true even if there are no execute permission
bits set (nor any extra execute permission ACLs). This strangeness is
due to the underlying system calls' definitions. Read the
documentation for the C<filetest> pragma for more information.
Note that C<-s/a/b/> does not do a negated substitution. Saying
C<-exp($foo)> still works as expected, however--only single letters
following a minus are interpreted as file tests.

The C<-T> and C<-B> switches work as follows. The first block or so of the
file is examined for odd characters such as strange control codes or
characters with the high bit set. If too many strange characters (>30%)
are found, it's a C<-B> file; otherwise it's a C<-T> file. Also, any file
containing null in the first block is considered a binary file. If C<-T>
or C<-B> is used on a filehandle, the current IO buffer is examined
rather than the first block. Both C<-T> and C<-B> return true on a null
file, or a file at EOF when testing a filehandle. Because you have to
read a file to do the C<-T> test, on most occasions you want to use a C<-f>
against the file first, as in C<next unless -f $file && -T $file>.

If any of the file tests (or either the C<stat> or C<lstat> operators) are given
the special filehandle consisting of a solitary underline, then the stat
structure of the previous file test (or stat operator) is used, saving
a system call. (This doesn't work with C<-t>, and you need to remember
that lstat() and C<-l> will leave values in the stat structure for the
symbolic link, not the real file.) (Also, if the stat buffer was filled by
an C<lstat> call, C<-T> and C<-B> will reset it with the results of C<stat _>).
Example:

    print "Can do.\n" if -r $a || -w _ || -x _;

    stat($filename);
    print "Readable\n" if -r _;
    print "Writable\n" if -w _;
    print "Executable\n" if -x _;
    print "Setuid\n" if -u _;
    print "Setgid\n" if -g _;
    print "Sticky\n" if -k _;
    print "Text\n" if -T _;
    print "Binary\n" if -B _;
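
For a version that can be run as is, the following sketch creates its
own scratch file with the core C<File::Temp> module and applies a few of
the tests to it, reusing the stat buffer through C<_>. The C<%result>
hash and the file contents are assumptions made for this example.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile tempdir);

my $dir = tempdir(CLEANUP => 1);
my ($fh, $file) = tempfile(DIR => $dir);
print {$fh} "twelve bytes";              # 12 bytes of plain text
close $fh or die "Can't close $file: $!";

my %result;
$result{exists}     = -e $file ? 1 : 0;  # this test performs the stat
$result{plain}      = -f _     ? 1 : 0;  # "_" reuses that stat buffer
$result{directory}  = -d _     ? 1 : 0;
$result{size}       = -s _;              # size in bytes when nonzero
$result{dir_is_dir} = -d $dir  ? 1 : 0;

# A missing file makes the test return the undefined value.
$result{missing_defined} = defined(-e "$dir/absent") ? 1 : 0;

print "$_=$result{$_}\n" for sort keys %result;
```

Only the first test on C<$file> costs a system call; the tests on C<_>
read the saved stat structure.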
=item abs VALUE

=item abs

Returns the absolute value of its argument.
If VALUE is omitted, uses C<$_>.

=item accept NEWSOCKET,GENERICSOCKET

Accepts an incoming socket connect, just as the accept(2) system call
does. Returns the packed address if it succeeded, false otherwise.
See the example in L<perlipc/"Sockets: Client/Server Communication">.

On systems that support a close-on-exec flag on files, the flag will
be set for the newly opened file descriptor, as determined by the
value of $^F. See L<perlvar/$^F>.
=item alarm SECONDS

=item alarm

Arranges to have a SIGALRM delivered to this process after the
specified number of wallclock seconds has elapsed. If SECONDS is not
specified, the value stored in C<$_> is used. (On some machines,
unfortunately, the elapsed time may be up to one second less or more
than you specified because of how seconds are counted, and process
scheduling may delay the delivery of the signal even further.)

Only one timer may be counting at once. Each call disables the
previous timer, and an argument of C<0> may be supplied to cancel the
previous timer without starting a new one. The returned value is the
amount of time remaining on the previous timer.

For delays of finer granularity than one second, you may use Perl's
four-argument version of select() leaving the first three arguments
undefined, or you might be able to use the C<syscall> interface to
access setitimer(2) if your system supports it. The Time::HiRes
module (from CPAN, and starting from Perl 5.8 part of the standard
distribution) may also prove useful.

It is usually a mistake to intermix C<alarm> and C<sleep> calls.
(C<sleep> may be internally implemented in your system with C<alarm>.)

If you want to use C<alarm> to time out a system call you need to use an
C<eval>/C<die> pair. You can't rely on the alarm causing the system call to
fail with C<$!> set to C<EINTR> because Perl sets up signal handlers to
restart system calls on some systems. Using C<eval>/C<die> always works,
modulo the caveats given in L<perlipc/"Signals">.

    eval {
        local $SIG{ALRM} = sub { die "alarm\n" }; # NB: \n required
        alarm $timeout;
        $nread = sysread SOCKET, $buffer, $size;
        alarm 0;
    };
    if ($@) {
        die unless $@ eq "alarm\n";   # propagate unexpected errors
        # timed out
    }
    else {
        # didn't
    }

For more information see L<perlipc>.
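
The C<eval>/C<die> pattern can be wrapped in a small helper so that it
is easy to try out. This is only a sketch: C<run_with_timeout> is a
hypothetical name, not a builtin, and the slow operation is simulated
with four-argument select() rather than C<sleep>, in keeping with the
advice above about not mixing C<alarm> and C<sleep>.

```perl
use strict;
use warnings;

# Hypothetical helper: run $code, giving up after $seconds.
sub run_with_timeout {
    my ($seconds, $code) = @_;
    my $result;
    my $completed = eval {
        local $SIG{ALRM} = sub { die "alarm\n" };   # "\n" keeps the message exact
        alarm $seconds;
        $result = $code->();
        alarm 0;
        1;
    };
    alarm 0;                              # make sure no timer is left running
    return $result if $completed;
    die $@ unless $@ eq "alarm\n";        # propagate unexpected errors
    return undef;                         # timed out
}

# Finishes well inside the limit.
my $fast = run_with_timeout(5, sub { 6 * 7 });

# Simulates a three-second wait with a one-second limit.
my $slow = run_with_timeout(1, sub { select(undef, undef, undef, 3); 'finished' });

print defined $slow ? "slow call finished\n" : "slow call timed out\n";
```

The helper returns the undefined value on timeout, so a caller that
needs to tell a timeout apart from a legitimately undefined result would
have to signal it some other way.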
=item atan2 Y,X

Returns the arctangent of Y/X in the range -PI to PI.

For the tangent operation, you may use the C<Math::Trig::tan>
function, or use the familiar relation:

    sub tan { sin($_[0]) / cos($_[0]) }
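
Because C<atan2> takes the signs of both arguments into account, it is
also a convenient way to obtain pi and to see how the quadrant affects
the result. The sketch below uses made-up variable names and repeats the
C<tan> relation shown above.

```perl
use strict;
use warnings;

# atan2(1, 1) is pi/4, so four times that is pi.
my $pi = 4 * atan2(1, 1);

# The tangent relation from the text, defined locally.
sub tan { sin($_[0]) / cos($_[0]) }

# The result depends on the quadrant of the point (X, Y).
my $upper_left = atan2( 1, -1);   # second quadrant:  3*pi/4
my $lower_left = atan2(-1, -1);   # third quadrant:  -3*pi/4

my $tan_45 = tan($pi / 4);        # very close to 1

printf "pi=%.6f upper_left=%.6f lower_left=%.6f tan(pi/4)=%.6f\n",
    $pi, $upper_left, $lower_left, $tan_45;
```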
=item bind SOCKET,NAME

Binds a network address to a socket, just as the bind system call
does. Returns true if it succeeded, false otherwise. NAME should be a
packed address of the appropriate type for the socket. See the examples in
L<perlipc/"Sockets: Client/Server Communication">.
=item binmode FILEHANDLE, LAYER

=item binmode FILEHANDLE

Arranges for FILEHANDLE to be read or written in "binary" or "text"
mode on systems where the run-time libraries distinguish between
binary and text files. If FILEHANDLE is an expression, the value is
taken as the name of the filehandle. Returns true on success,
otherwise it returns C<undef> and sets C<$!> (errno).

On some systems (in general, DOS and Windows-based systems) binmode()
is necessary when you're not working with a text file. For the sake
of portability it is a good idea to always use it when appropriate,
and to never use it when it isn't appropriate. Also, people can
set their I/O to be by default UTF-8 encoded Unicode, not bytes.

In other words: regardless of platform, use binmode() on binary data,
like for example images.

If LAYER is present it is a single string, but may contain multiple
directives. The directives alter the behaviour of the file handle.
When LAYER is present using binmode on a text file makes sense.

If LAYER is omitted or specified as C<:raw> the filehandle is made
suitable for passing binary data. This includes turning off possible CRLF
translation and marking it as bytes (as opposed to Unicode characters).
Note that, despite what may be implied in I<"Programming Perl"> (the
Camel) or elsewhere, C<:raw> is I<not> simply the inverse of C<:crlf>
-- other layers which would affect the binary nature of the stream are
I<also> disabled. See L<PerlIO>, L<perlrun> and the discussion about the
PERLIO environment variable.

The C<:bytes>, C<:crlf>, and C<:utf8>, and any other directives of the
form C<:...>, are called I/O I<layers>. The C<open> pragma can be used to
establish default I/O layers. See L<open>.

I<The LAYER parameter of the binmode() function is described as "DISCIPLINE"
in "Programming Perl, 3rd Edition". However, since the publishing of this
book, by many known as "Camel III", the consensus of the naming of this
functionality has moved from "discipline" to "layer". All documentation
of this version of Perl therefore refers to "layers" rather than to
"disciplines". Now back to the regularly scheduled documentation...>

To mark FILEHANDLE as UTF-8, use C<:utf8>.

In general, binmode() should be called after open() but before any I/O
is done on the filehandle. Calling binmode() will normally flush any
pending buffered output data (and perhaps pending input data) on the
handle. An exception to this is the C<:encoding> layer that
changes the default character encoding of the handle, see L<open>.
The C<:encoding> layer sometimes needs to be called in
mid-stream, and it doesn't flush the stream. The C<:encoding>
also implicitly pushes on top of itself the C<:utf8> layer because
internally Perl will operate on UTF-8 encoded Unicode characters.
The operating system, device drivers, C libraries, and Perl run-time
system all work together to let the programmer treat a single
character (C<\n>) as the line terminator, irrespective of the external
representation. On many operating systems, the native text file
representation matches the internal representation, but on some
platforms the external representation of C<\n> is made up of more than
one character.

Mac OS, all variants of Unix, and Stream_LF files on VMS use a single
character to end each line in the external representation of text (even
though that single character is CARRIAGE RETURN on Mac OS and LINE FEED
on Unix and most VMS files). In other systems like OS/2, DOS and the
various flavors of MS-Windows your program sees a C<\n> as a simple C<\cJ>,
but what's stored in text files are the two characters C<\cM\cJ>. That
means that, if you don't use binmode() on these systems, C<\cM\cJ>
sequences on disk will be converted to C<\n> on input, and any C<\n> in
your program will be converted back to C<\cM\cJ> on output. This is what
you want for text files, but it can be disastrous for binary files.

Another consequence of using binmode() (on some systems) is that
special end-of-file markers will be seen as part of the data stream.
For systems from the Microsoft family this means that if your binary
data contains C<\cZ>, the I/O subsystem will regard it as the end of
the file, unless you use binmode().

binmode() is not only important for readline() and print() operations,
but also when using read(), seek(), sysread(), syswrite() and tell()
(see L<perlport> for more details). See the C<$/> and C<$\> variables
in L<perlvar> for how to manually set your input and output
line-termination sequences.
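
A minimal round trip shows the intended use: call binmode() right after
opening the handle and before any I/O. The sketch below writes a few
bytes that text-mode translation could disturb on some platforms and
reads them back unchanged. The payload and variable names are
assumptions for illustration, and the scratch file comes from the core
C<File::Temp> module.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# CR, LF, ^Z, NUL and a high-bit byte.
my $payload = "\x0D\x0A\x1A\x00\xFF";

my ($out, $filename) = tempfile(UNLINK => 1);
binmode($out) or die "Can't binmode $filename: $!";   # before any output
print {$out} $payload;
close $out or die "Can't close $filename: $!";

open(my $in, '<', $filename) or die "Can't open $filename: $!";
binmode($in) or die "Can't binmode $filename: $!";    # before any input
my $read_back = do { local $/; <$in> };               # slurp the whole file
close $in;

my $size_on_disk = -s $filename;
print "wrote ", length($payload), " bytes, file holds $size_on_disk bytes\n";
```

On Unix the calls make no visible difference, but leaving them in keeps
the same code correct on systems that do distinguish text from binary
files.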
=item bless REF,CLASSNAME

=item bless REF

This function tells the thingy referenced by REF that it is now an object
in the CLASSNAME package. If CLASSNAME is omitted, the current package
is used. Because a C<bless> is often the last thing in a constructor,
it returns the reference for convenience. Always use the two-argument
version if a derived class might inherit the function doing the blessing.
See L<perltoot> and L<perlobj> for more about the blessing (and blessings)
of objects.

Consider always blessing objects in CLASSNAMEs that are mixed case.
Namespaces with all lowercase names are considered reserved for
Perl pragmata. Builtin types have all uppercase names. To prevent
confusion, you may wish to avoid such package names as well. Make sure
that CLASSNAME is a true value.

See L<perlmod/"Perl Modules">.
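
The advice about the two-argument form can be illustrated with a
constructor that a derived class inherits. The class names C<Animal> and
C<Dog> and the C<name> attribute are invented for this sketch.

```perl
use strict;
use warnings;

package Animal;     # hypothetical base class

sub new {
    my ($class, %args) = @_;
    my $self = { name => defined $args{name} ? $args{name} : 'unknown' };
    return bless $self, $class;   # two-argument form honours the invoking class
}

sub name { $_[0]{name} }

package Dog;        # hypothetical derived class; inherits new()
our @ISA = ('Animal');

package main;

my $dog = Dog->new(name => 'Rex');
print ref($dog), " named ", $dog->name, "\n";
```

Had C<new> used the one-argument form, the object would have been
blessed into C<Animal>, the package in which the constructor was
compiled, even when called as C<< Dog->new >>.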
=item caller EXPR

=item caller

Returns the context of the current subroutine call. In scalar context,
returns the caller's package name if there is a caller, that is, if
we're in a subroutine or C<eval> or C<require>, and the undefined value
otherwise. In list context, returns

    ($package, $filename, $line) = caller;

With EXPR, it returns some extra information that the debugger uses to
print a stack trace. The value of EXPR indicates how many call frames
to go back before the current one.

    ($package, $filename, $line, $subroutine, $hasargs,
    $wantarray, $evaltext, $is_require, $hints, $bitmask) = caller($i);

Here $subroutine may be C<(eval)> if the frame is not a subroutine
call, but an C<eval>. In such a case additional elements $evaltext and
C<$is_require> are set: C<$is_require> is true if the frame is created by a
C<require> or C<use> statement, $evaltext contains the text of the
C<eval EXPR> statement. In particular, for an C<eval BLOCK> statement,
$filename is C<(eval)>, but $evaltext is undefined. (Note also that
each C<use> statement creates a C<require> frame inside an C<eval EXPR>
frame.) $subroutine may also be C<(unknown)> if this particular
subroutine happens to have been deleted from the symbol table.
C<$hasargs> is true if a new instance of C<@_> was set up for the frame.
C<$hints> and C<$bitmask> contain pragmatic hints that the caller was
compiled with. The C<$hints> and C<$bitmask> values are subject to change
between versions of Perl, and are not meant for external use.

Furthermore, when called from within the DB package, caller returns more
detailed information: it sets the list variable C<@DB::args> to be the
arguments with which the subroutine was invoked.

Be aware that the optimizer might have optimized call frames away before
C<caller> had a chance to get the information. That means that C<caller(N)>
might not return information about the call frame you expect it to, for
C<< N > 1 >>. In particular, C<@DB::args> might have information from the
previous time C<caller> was called.
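
To see how the frame number selects a call frame, consider this small
sketch. The three subroutine names are hypothetical and exist only for
the demonstration.

```perl
use strict;
use warnings;

sub current_sub {
    # Frame 0 describes the call to current_sub itself.
    my ($package, $filename, $line, $subroutine) = caller(0);
    return $subroutine;
}

sub calling_sub {
    # Frame 1 is one call further back: whoever called our caller.
    my @frame = caller(1);
    return $frame[3];
}

sub outer { return calling_sub() }

my $own_name    = current_sub();   # fully qualified name of current_sub
my $callers_sub = outer();         # fully qualified name of outer

print "$own_name / $callers_sub\n";
```

The fourth element of the returned list is the package-qualified
subroutine name, which is why both results carry a package prefix.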
=item chdir EXPR

=item chdir FILEHANDLE

=item chdir DIRHANDLE

=item chdir

Changes the working directory to EXPR, if possible. If EXPR is omitted,
changes to the directory specified by C<$ENV{HOME}>, if set; if not,
changes to the directory specified by C<$ENV{LOGDIR}>. (Under VMS, the
variable C<$ENV{SYS$LOGIN}> is also checked, and used if it is set.) If
neither is set, C<chdir> does nothing. It returns true upon success,
false otherwise. See the example under C<die>.

On systems that support fchdir, you might pass a file handle or
directory handle as argument. On systems that don't support fchdir,
passing handles produces a fatal error at run time.
=item chmod LIST

Changes the permissions of a list of files. The first element of the
list must be the numerical mode, which should probably be an octal
number, and which definitely should I<not> be a string of octal digits:
C<0644> is okay, C<'0644'> is not. Returns the number of files
successfully changed. See also L</oct>, if all you have is a string.

    $cnt = chmod 0755, 'foo', 'bar';
    chmod 0755, @executables;
    $mode = '0644'; chmod $mode, 'foo';      # !!! sets mode to
                                             # --w----r-T
    $mode = '0644'; chmod oct($mode), 'foo'; # this is better
    $mode = 0644;   chmod $mode, 'foo';      # this is best

On systems that support fchmod, you might pass file handles among the
files. On systems that don't support fchmod, passing file handles
produces a fatal error at run time.

    open(my $fh, "<", "foo");
    my $perm = (stat $fh)[2] & 07777;
    chmod($perm | 0600, $fh);

You can also import the symbolic C<S_I*> constants from the Fcntl
module:

    use Fcntl ':mode';

    chmod S_IRWXU|S_IRGRP|S_IXGRP|S_IROTH|S_IXOTH, @executables;
    # This is identical to the chmod 0755 of the above example.
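
The string-versus-number pitfall is easy to check on a scratch file.
The following sketch assumes a Unix-like filesystem where permission
bits are honoured; the mode string and variable names are invented for
the example.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my ($fh, $file) = tempfile(UNLINK => 1);   # scratch file for the demonstration
close $fh or die "Can't close $file: $!";

my $mode_string = '0640';                  # e.g. read from a configuration file

# oct() turns the string of octal digits into the number chmod expects.
my $changed = chmod oct($mode_string), $file;
my $actual  = (stat $file)[2] & 07777;     # keep only the permission bits

printf "%d file changed, mode is now %04o\n", $changed, $actual;
```

Passing C<$mode_string> directly would use its decimal value, 640,
which is a different set of permission bits.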
=item chomp VARIABLE

=item chomp( LIST )

=item chomp

This safer version of L</chop> removes any trailing string
that corresponds to the current value of C<$/> (also known as
$INPUT_RECORD_SEPARATOR in the C<English> module). It returns the total
number of characters removed from all its arguments. It's often used to
remove the newline from the end of an input record when you're worried
that the final record may be missing its newline. When in paragraph
mode (C<$/ = "">), it removes all trailing newlines from the string.
When in slurp mode (C<$/ = undef>) or fixed-length record mode (C<$/> is
a reference to an integer or the like, see L<perlvar>) chomp() won't
remove anything.
If VARIABLE is omitted, it chomps C<$_>. Example:

    while (<>) {
        chomp;  # avoid \n on last field
        @array = split(/:/);
        # ...
    }

If VARIABLE is a hash, it chomps the hash's values, but not its keys.

You can actually chomp anything that's an lvalue, including an assignment:

    chomp($cwd = `pwd`);
    chomp($answer = <STDIN>);

If you chomp a list, each element is chomped, and the total number of
characters removed is returned.

If the C<encoding> pragma is in scope then the lengths returned are
calculated from the length of C<$/> in Unicode characters, which is not
always the same as the length of C<$/> in the native encoding.

Note that parentheses are necessary when you're chomping anything
that is not a simple variable. This is because C<chomp $cwd = `pwd`;>
is interpreted as C<(chomp $cwd) = `pwd`;>, rather than as
C<chomp( $cwd = `pwd` )> which you might expect. Similarly,
C<chomp $a, $b> is interpreted as C<chomp($a), $b> rather than
as C<chomp($a, $b)>.
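
The behaviours described above can be tried without any input files.
The strings, the array and the hash in this sketch are made-up sample
data.

```perl
use strict;
use warnings;

my $line    = "last field\n";
my $removed = chomp($line);                    # one trailing newline removed

my @records = ("alpha\n", "beta\n", "gamma");  # final record lacks its newline
my $total   = chomp(@records);                 # characters removed overall

my $paragraph = "some text\n\n\n";
{
    local $/ = "";                             # paragraph mode
    chomp($paragraph);                         # strips every trailing newline
}

my %config = ("host\n" => "example\n");        # hypothetical data
chomp(%config);                                # values are chomped, keys are not

print "removed=$removed total=$total paragraph='$paragraph'\n";
```

Note that every call uses parentheses, which sidesteps the precedence
surprises mentioned in the last paragraph.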
=item chop VARIABLE

=item chop( LIST )

=item chop

Chops off the last character of a string and returns the character
chopped. It is much more efficient than C<s/.$//s> because it neither
scans nor copies the string. If VARIABLE is omitted, chops C<$_>.
If VARIABLE is a hash, it chops the hash's values, but not its keys.

You can actually chop anything that's an lvalue, including an assignment.

If you chop a list, each element is chopped. Only the value of the
last C<chop> is returned.

Note that C<chop> returns the last character. To return all but the last
character, use C<substr($string, 0, -1)>.

See also L</chomp>.
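
A short sketch contrasts what C<chop> returns with what it leaves
behind, and shows the C<substr> alternative. The sample strings are
arbitrary.

```perl
use strict;
use warnings;

my $word    = "hello";
my $chopped = chop($word);                 # returns "o"; $word is now "hell"

my $original     = "hello";
my $all_but_last = substr($original, 0, -1);   # "hell"; $original is untouched

my @items        = ("ab", "cd");
my $last_chopped = chop(@items);           # every element loses a character;
                                           # only the last chopped one is returned

print "$chopped $word $all_but_last $last_chopped @items\n";
```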
=item chown LIST

Changes the owner (and group) of a list of files. The first two
elements of the list must be the I<numeric> uid and gid, in that
order. A value of -1 in either position is interpreted by most
systems to leave that value unchanged. Returns the number of files
successfully changed.

    $cnt = chown $uid, $gid, 'foo', 'bar';
    chown $uid, $gid, @filenames;

On systems that support fchown, you might pass file handles among the
files. On systems that don't support fchown, passing file handles
produces a fatal error at run time.

Here's an example that looks up nonnumeric uids in the passwd file:

    print "User: ";
    chomp($user = <STDIN>);
    print "Files: ";
    chomp($pattern = <STDIN>);

    ($login,$pass,$uid,$gid) = getpwnam($user)
        or die "$user not in passwd file";

    @ary = glob($pattern);      # expand filenames
    chown $uid, $gid, @ary;

On most systems, you are not allowed to change the ownership of the
file unless you're the superuser, although you should be able to change
the group to any of your secondary groups. On insecure systems, these
restrictions may be relaxed, but this is not a portable assumption.
On POSIX systems, you can detect this condition this way:

    use POSIX qw(sysconf _PC_CHOWN_RESTRICTED);
    $can_chown_giveaway = not sysconf(_PC_CHOWN_RESTRICTED);
=item chr NUMBER

=item chr

Returns the character represented by that NUMBER in the character set.
For example, C<chr(65)> is C<"A"> in either ASCII or Unicode, and
chr(0x263a) is a Unicode smiley face. Note that characters from 128
to 255 (inclusive) are by default not encoded in UTF-8 Unicode for
backward compatibility reasons (but see L<encoding>).

If NUMBER is omitted, uses C<$_>.

For the reverse, use L</ord>.

Note that under the C<bytes> pragma the NUMBER is masked to
the low eight bits.

See L<perlunicode> and L<encoding> for more about Unicode.
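
The following sketch exercises the points above: the round trip through
C<ord>, a code point beyond 255, and the masking under the C<bytes>
pragma. The value C<0x141> is chosen only because its low eight bits are
C<0x41>, the code for C<"A">; the variable names are illustrative.

```perl
use strict;
use warnings;

my $letter = chr(65);            # "A"
my $smiley = chr(0x263a);        # a single character above 255
my $code   = ord($smiley);       # back to the number

my $masked;
{
    use bytes;                   # lexically scoped to this block
    $masked = chr(0x141);        # only the low eight bits, 0x41, are kept
}

printf "letter=%s code=0x%04x length=%d masked=%s\n",
    $letter, $code, length($smiley), $masked;
```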
=item chroot FILENAME

=item chroot

This function works like the system call by the same name: it makes the
named directory the new root directory for all further pathnames that
begin with a C</> by your process and all its children. (It doesn't
change your current working directory, which is unaffected.) For security
reasons, this call is restricted to the superuser. If FILENAME is
omitted, does a C<chroot> to C<$_>.

=item close FILEHANDLE

=item close

Closes the file or pipe associated with the file handle, returning
true only if IO buffers are successfully flushed and the system file
descriptor is closed.  Closes the currently selected filehandle if the
argument is omitted.

You don't have to close FILEHANDLE if you are immediately going to do
another C<open> on it, because C<open> will close it for you.  (See
C<open>.)  However, an explicit C<close> on an input file resets the line
counter (C<$.>), while the implicit close done by C<open> does not.

If the file handle came from a piped open, C<close> will additionally
return false if one of the other system calls involved fails, or if the
program exits with non-zero status.  (If the only problem was that the
program exited non-zero, C<$!> will be set to C<0>.)  Closing a pipe
also waits for the process executing on the pipe to complete, in case you
want to look at the output of the pipe afterwards, and
implicitly puts the exit status value of that command into C<$?>.

Prematurely closing the read end of a pipe (i.e. before the process
writing to it at the other end has closed it) will result in a
SIGPIPE being delivered to the writer.  If the other end can't
handle that, be sure to read all the data before closing the pipe.

Example:

    open(OUTPUT, '|sort >foo')  # pipe to sort
        or die "Can't start sort: $!";
    #...                        # print stuff to output
    close OUTPUT                # wait for sort to finish
        or warn $! ? "Error closing sort pipe: $!"
                   : "Exit status $? from sort";
    open(INPUT, 'foo')          # get sort's results
        or die "Can't open 'foo' for input: $!";

FILEHANDLE may be an expression whose value can be used as an indirect
filehandle, usually the real filehandle name.
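The distinction between a failed system call and a non-zero exit status on a piped close can be sketched as follows. This is only an illustration under stated assumptions: the child command is hypothetical (the running perl binary, C<$^X>, asked to exit with status 3), and the list form of piped C<open> is assumed to be available, as it is on Unix-like systems.

```perl
use strict;
use warnings;

# Hypothetical child command: the running perl binary ($^X) exiting with 3.
my @cmd = ($^X, '-e', 'exit 3');

# List-form piped open, so no shell is involved.
open(my $pipe, '-|', @cmd) or die "Can't start child: $!";
my @output = <$pipe>;          # read everything before closing

my ($errno, $exit_code) = (0, 0);
unless (close $pipe) {
    $errno     = $! + 0;       # 0 when the only problem was the exit status
    $exit_code = $? >> 8;      # high byte of $? holds the exit status
}

print $errno
    ? "close failed with errno $errno\n"
    : "child exit status $exit_code\n";
```

Capturing C<$!> and C<$?> immediately after the C<close> matters, because later system calls may overwrite them.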

=item closedir DIRHANDLE

Closes a directory opened by C<opendir> and returns the success of that
system call.

=item connect SOCKET,NAME

Attempts to connect to a remote socket, just as the connect system call
does.  Returns true if it succeeded, false otherwise.  NAME should be a
packed address of the appropriate type for the socket.  See the examples in
L<perlipc/"Sockets: Client/Server Communication">.

=item continue BLOCK

C<continue> is actually a flow control statement rather than a function.  If
there is a C<continue> BLOCK attached to a BLOCK (typically in a C<while> or
C<foreach>), it is always executed just before the conditional is about to
be evaluated again, just like the third part of a C<for> loop in C.  Thus
it can be used to increment a loop variable, even when the loop has been
continued via the C<next> statement (which is similar to the C C<continue>
statement).

C<last>, C<next>, or C<redo> may appear within a C<continue>
block.  C<last> and C<redo> will behave as if they had been executed within
the main block.  So will C<next>, but since it will execute a C<continue>
block, it may be more entertaining.

    while (EXPR) {
        ### redo always comes here
        do_something;
    } continue {
        ### next always comes here
        do_something_else;
        # then back to the top to re-check EXPR
    }
    ### last always comes here

Omitting the C<continue> section is semantically equivalent to using an
empty one, logically enough.  In that case, C<next> goes directly back
to check the condition at the top of the loop.
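A small sketch of the point above, that the C<continue> block still runs when an iteration is cut short by C<next>. The variable names are illustrative only.

```perl
use strict;
use warnings;

my $i = 0;
my @odd;

while ($i < 6) {
    next if $i % 2 == 0;   # jumps to the continue block, not past it
    push @odd, $i;
} continue {
    $i++;                  # runs after every iteration, including skipped ones
}

print "@odd\n";            # prints "1 3 5"
```

Without the C<continue> block, the increment would have to be repeated before the C<next>, or the loop would never terminate.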

=item cos EXPR

=item cos

Returns the cosine of EXPR (expressed in radians).  If EXPR is omitted,
takes cosine of C<$_>.

For the inverse cosine operation, you may use the C<Math::Trig::acos()>
function, or use this relation:

    sub acos { atan2( sqrt(1 - $_[0] * $_[0]), $_[0] ) }
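To see the relation at work, here is a short check at three well-known points. It reuses the C<acos> definition from the text; C<$pi> is computed locally for the comparison and is not part of the original example.

```perl
use strict;
use warnings;

# Inverse cosine via atan2, as given in the text.  Valid for -1 <= x <= 1;
# outside that range sqrt() is handed a negative number and dies.
sub acos { atan2( sqrt(1 - $_[0] * $_[0]), $_[0] ) }

my $pi = 4 * atan2(1, 1);

# acos(1) is 0, acos(0) is pi/2, acos(-1) is pi.
printf "%.6f %.6f %.6f\n", acos(1), acos(0), acos(-1);
```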

=item crypt PLAINTEXT,SALT

Creates a digest string exactly like the crypt(3) function in the C
library (assuming that you actually have a version there that has not
been extirpated as a potential munition).

crypt() is a one-way hash function.  The PLAINTEXT and SALT are turned
into a short string, called a digest, which is returned.  The same
PLAINTEXT and SALT will always return the same string, but there is no
(known) way to get the original PLAINTEXT from the hash.  Small
changes in the PLAINTEXT or SALT will result in large changes in the
digest.

There is no decrypt function.  This function isn't all that useful for
cryptography (for that, look for F<Crypt> modules on your nearby CPAN
mirror) and the name "crypt" is a bit of a misnomer.  Instead it is
primarily used to check if two pieces of text are the same without
having to transmit or store the text itself.  An example is checking
if a correct password is given.  The digest of the password is stored,
not the password itself.  The user types in a password that is
crypt()'d with the same salt as the stored digest.  If the two digests
match, the password is correct.

When verifying an existing digest string you should use the digest as
the salt (like C<crypt($plain, $digest) eq $digest>).  The SALT used
to create the digest is visible as part of the digest.  This ensures
crypt() will hash the new string with the same salt as the digest.
This allows your code to work with the standard L<crypt|/crypt> and
with more exotic implementations.  In other words, do not assume
anything about the returned string itself, or how many bytes in the
digest matter.

Traditionally the result is a string of 13 bytes: the first two bytes of
the salt, followed by 11 bytes from the set C<[./0-9A-Za-z]>, and only
the first eight bytes of PLAINTEXT mattered, but alternative
hashing schemes (like MD5), higher level security schemes (like C2),
and implementations on non-UNIX platforms may produce different
strings.

When choosing a new salt create a random two character string whose
characters come from the set C<[./0-9A-Za-z]> (like C<join '', ('.',
'/', 0..9, 'A'..'Z', 'a'..'z')[rand 64, rand 64]>).  This set of
characters is just a recommendation; the characters allowed in
the salt depend solely on your system's crypt library, and Perl can't
restrict what salts C<crypt()> accepts.

Here's an example that makes sure that whoever runs this program knows
their password:

    $pwd = (getpwuid($<))[1];

    system "stty -echo";
    print "Password: ";
    chomp($word = <STDIN>);
    print "\n";
    system "stty echo";

    if (crypt($word, $pwd) ne $pwd) {
        die "Sorry...\n";
    } else {
        print "ok\n";
    }

Of course, typing in your own password to whoever asks you
for it is unwise.

The L<crypt|/crypt> function is unsuitable for hashing large quantities
of data, not least of all because you can't get the information
back.  Look at the L<Digest> module for more robust algorithms.

If using crypt() on a Unicode string (which I<potentially> has
characters with codepoints above 255), Perl tries to make sense
of the situation by downgrading a copy of the string back to an
eight-bit byte string before calling crypt() on that copy.  If that
works, good.  If not, crypt() dies with C<Wide character in crypt>.

=item dbmclose HASH

[This function has been largely superseded by the C<untie> function.]

Breaks the binding between a DBM file and a hash.

=item dbmopen HASH,DBNAME,MASK

[This function has been largely superseded by the C<tie> function.]

This binds a dbm(3), ndbm(3), sdbm(3), gdbm(3), or Berkeley DB file to a
hash.  HASH is the name of the hash.  (Unlike normal C<open>, the first
argument is I<not> a filehandle, even though it looks like one).  DBNAME
is the name of the database (without the F<.dir> or F<.pag> extension if
any).  If the database does not exist, it is created with protection
specified by MASK (as modified by the C<umask>).  If your system supports
only the older DBM functions, you may perform only one C<dbmopen> in your
program.  In older versions of Perl, if your system had neither DBM nor
ndbm, calling C<dbmopen> produced a fatal error; it now falls back to
sdbm(3).

If you don't have write access to the DBM file, you can only read hash
variables, not set them.  If you want to test whether you can write,
either use file tests or try setting a dummy hash entry inside an C<eval>,
which will trap the error.

Note that functions such as C<keys> and C<values> may return huge lists
when used on large DBM files.  You may prefer to use the C<each>
function to iterate over large DBM files.  Example:

    # print out history file offsets
    dbmopen(%HIST,'/usr/lib/news/history',0666);
    while (($key,$val) = each %HIST) {
        print $key, ' = ', unpack('L',$val), "\n";
    }
    dbmclose(%HIST);

See also L<AnyDBM_File> for a more general description of the pros and
cons of the various dbm approaches, as well as L<DB_File> for a particularly
rich implementation.

You can control which DBM library you use by loading that library
before you call dbmopen():

    use DB_File;
    dbmopen(%NS_Hist, "$ENV{HOME}/.netscape/history.db", 0666)
        or die "Can't open netscape history file: $!";

=item defined EXPR

=item defined

Returns a Boolean value telling whether EXPR has a value other than
the undefined value C<undef>.  If EXPR is not present, C<$_> will be
checked.

Many operations return C<undef> to indicate failure, end of file,
system error, uninitialized variable, and other exceptional
conditions.  This function allows you to distinguish C<undef> from
other values.  (A simple Boolean test will not distinguish among
C<undef>, zero, the empty string, and C<"0">, which are all equally
false.)  Note that since C<undef> is a valid scalar, its presence
doesn't I<necessarily> indicate an exceptional condition: C<pop>
returns C<undef> when its argument is an empty array, I<or> when the
element to return happens to be C<undef>.

You may also use C<defined(&func)> to check whether subroutine C<&func>
has ever been defined.  The return value is unaffected by any forward
declarations of C<&func>.  Note that a subroutine which is not defined
may still be callable: its package may have an C<AUTOLOAD> method that
makes it spring into existence the first time that it is called -- see
L<perlsub>.

Use of C<defined> on aggregates (hashes and arrays) is deprecated.  It
used to report whether memory for that aggregate has ever been
allocated.  This behavior may disappear in future versions of Perl.
You should instead use a simple test for size:

    if (@an_array) { print "has array elements\n" }
    if (%a_hash)   { print "has hash members\n"   }

When used on a hash element, it tells you whether the value is defined,
not whether the key exists in the hash.  Use L</exists> for the latter
purpose.

Examples:

    print if defined $switch{'D'};
    print "$val\n" while defined($val = pop(@ary));
    die "Can't readlink $sym: $!"
        unless defined($value = readlink $sym);
    sub foo { defined &$bar ? &$bar(@_) : die "No bar"; }
    $debugging = 0 unless defined $debugging;

Note:  Many folks tend to overuse C<defined>, and then are surprised to
discover that the number C<0> and C<""> (the zero-length string) are, in fact,
defined values.  For example, if you say

    "ab" =~ /a(.*)b/;

The pattern match succeeds, and C<$1> is defined, despite the fact that it
matched "nothing".  It didn't really fail to match anything.  Rather, it
matched something that happened to be zero characters long.  This is all
very above-board and honest.  When a function returns an undefined value,
it's an admission that it couldn't give you an honest answer.  So you
should use C<defined> only when you're questioning the integrity of what
you're trying to do.  At other times, a simple comparison to C<0> or C<""> is
what you want.

See also L</undef>, L</exists>, L</ref>.
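The three questions "does the key exist?", "is the value defined?" and "is the value true?" can be compared side by side. The sketch below uses a made-up C<%opt> hash; none of these names come from the text above.

```perl
use strict;
use warnings;

# Hypothetical option hash: one false number, one empty string, one undef.
my %opt = (verbose => 0, name => '', depth => undef);

my @report;
for my $key (qw(verbose name depth missing)) {
    push @report, sprintf "%s: exists=%d defined=%d true=%d", $key,
        exists  $opt{$key} ? 1 : 0,
        defined $opt{$key} ? 1 : 0,
        $opt{$key}         ? 1 : 0;
}

print "$_\n" for @report;
```

Only C<verbose> and C<name> are defined, yet neither is true, which is why a plain Boolean test cannot stand in for C<defined>.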

=item delete EXPR

Given an expression that specifies a hash element, array element, hash slice,
or array slice, deletes the specified element(s) from the hash or array.
In the case of an array, if the array elements happen to be at the end,
the size of the array will shrink to the highest element that tests
true for exists() (or 0 if no such element exists).

Returns a list with the same number of elements as the number of elements
for which deletion was attempted.  Each element of that list consists of
either the value of the element deleted, or the undefined value.  In scalar
context, this means that you get the value of the last element deleted (or
the undefined value if that element did not exist).

    %hash = (foo => 11, bar => 22, baz => 33);
    $scalar = delete $hash{foo};             # $scalar is 11
    $scalar = delete @hash{qw(foo bar)};     # $scalar is 22
    @array  = delete @hash{qw(foo bar baz)}; # @array  is (undef,undef,33)

Deleting from C<%ENV> modifies the environment.  Deleting from
a hash tied to a DBM file deletes the entry from the DBM file.  Deleting
from a C<tie>d hash or array may not necessarily return anything.

Deleting an array element effectively returns that position of the array
to its initial, uninitialized state.  Subsequently testing for the same
element with exists() will return false.  Also, deleting array elements
in the middle of an array will not shift the index of the elements
after them down.  Use splice() for that.  See L</exists>.
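The difference between deleting a middle array element and splicing it out can be sketched like this. The array names are illustrative, and the example assumes a Perl in which C<delete> on array elements is still permitted.

```perl
use strict;
use warnings;

# delete leaves a hole: the length is unchanged and the slot no longer exists.
my @holes = qw(a b c);
delete $holes[1];
my $len_after_delete = @holes;                    # still 3
my $slot_exists      = exists $holes[1] ? 1 : 0;  # 0

# splice removes the element and shifts the later ones down.
my @packed = qw(a b c);
splice(@packed, 1, 1);

print "$len_after_delete $slot_exists (@packed)\n";   # prints "3 0 (a c)"
```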

The following (inefficiently) deletes all the values of %HASH and @ARRAY:

    foreach $key (keys %HASH) {
        delete $HASH{$key};
    }

    foreach $index (0 .. $#ARRAY) {
        delete $ARRAY[$index];
    }

And so do these:

    delete @HASH{keys %HASH};

    delete @ARRAY[0 .. $#ARRAY];

But both of these are slower than just assigning the empty list
or undefining %HASH or @ARRAY:

    %HASH = ();         # completely empty %HASH
    undef %HASH;        # forget %HASH ever existed

    @ARRAY = ();        # completely empty @ARRAY
    undef @ARRAY;       # forget @ARRAY ever existed

Note that the EXPR can be arbitrarily complicated as long as the final
operation is a hash element, array element, hash slice, or array slice
lookup:

    delete $ref->[$x][$y]{$key};
    delete @{$ref->[$x][$y]}{$key1, $key2, @morekeys};

    delete $ref->[$x][$y][$index];
    delete @{$ref->[$x][$y]}[$index1, $index2, @moreindices];

=item die LIST

Outside an C<eval>, prints the value of LIST to C<STDERR> and
exits with the current value of C<$!> (errno).  If C<$!> is C<0>,
exits with the value of C<<< ($? >> 8) >>> (backtick `command`
status).  If C<<< ($? >> 8) >>> is C<0>, exits with C<255>.  Inside
an C<eval()>, the error message is stuffed into C<$@> and the
C<eval> is terminated with the undefined value.  This makes
C<die> the way to raise an exception.

Equivalent examples:

    die "Can't cd to spool: $!\n" unless chdir '/usr/spool/news';
    chdir '/usr/spool/news' or die "Can't cd to spool: $!\n";

If the last element of LIST does not end in a newline, the current
script line number and input line number (if any) are also printed,
and a newline is supplied.  Note that the "input line number" (also
known as "chunk") is subject to whatever notion of "line" happens to
be currently in effect, and is also available as the special variable
C<$.>.  See L<perlvar/"$/"> and L<perlvar/"$.">.

Hint: sometimes appending C<", stopped"> to your message will cause it
to make better sense when the string C<"at foo line 123"> is appended.
Suppose you are running script "canasta".

    die "/etc/games is no good";
    die "/etc/games is no good, stopped";

produce, respectively

    /etc/games is no good at canasta line 123.
    /etc/games is no good, stopped at canasta line 123.

See also exit(), warn(), and the Carp module.

If LIST is empty and C<$@> already contains a value (typically from a
previous eval) that value is reused after appending C<"\t...propagated">.
This is useful for propagating exceptions:

    eval { ... };
    die unless $@ =~ /Expected exception/;

If LIST is empty and C<$@> contains an object reference that has a
C<PROPAGATE> method, that method will be called with additional file
and line number parameters.  The return value replaces the value in
C<$@>, i.e. as if C<< $@ = eval { $@->PROPAGATE(__FILE__, __LINE__) }; >>
were called.

If C<$@> is empty then the string C<"Died"> is used.

die() can also be called with a reference argument.  If this happens to be
trapped within an eval(), $@ contains the reference.  This behavior permits
a more elaborate exception handling implementation using objects that
maintain arbitrary state about the nature of the exception.  Such a scheme
is sometimes preferable to matching particular string values of $@ using
regular expressions.  Here's an example:

    use Scalar::Util 'blessed';

    eval { ... ; die Some::Module::Exception->new( FOO => "bar" ) };
    if ($@) {
        if (blessed($@) && $@->isa("Some::Module::Exception")) {
            # handle Some::Module::Exception
        }
        else {
            # handle all other possible exceptions
        }
    }

Because perl will stringify uncaught exception messages before displaying
them, you may want to overload stringification operations on such custom
exception objects.  See L<overload> for details about that.
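A self-contained variant of the object-exception pattern might look like the following sketch. The C<My::Error> class and its C<message> accessor are invented for the illustration; they stand in for whatever exception class a real program defines.

```perl
use strict;
use warnings;
use Scalar::Util 'blessed';

# Hypothetical exception class, defined inline for this sketch.
{
    package My::Error;
    sub new     { my ($class, %args) = @_; return bless {%args}, $class }
    sub message { return $_[0]{message} }
}

my $handled;
eval {
    die My::Error->new(message => "disk full");
    1;
} or do {
    my $err = $@;                       # copy $@ before anything can reset it
    if (blessed($err) && $err->isa('My::Error')) {
        $handled = $err->message;       # handle the known exception type
    }
    else {
        die $err;                       # re-raise anything unexpected
    }
};

print "caught: $handled\n";             # prints "caught: disk full"
```

Ending the C<eval> block with a true value and testing the result of C<eval> itself, rather than C<$@>, is one common way to avoid being misled when C<$@> is clobbered.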

You can arrange for a callback to be run just before the C<die>
does its deed, by setting the C<$SIG{__DIE__}> hook.  The associated
handler will be called with the error text and can change the error
message, if it sees fit, by calling C<die> again.  See
L<perlvar/$SIG{expr}> for details on setting C<%SIG> entries, and
L<"eval BLOCK"> for some examples.  Although this feature was meant
to be run only right before your program was to exit, this is not
currently the case--the C<$SIG{__DIE__}> hook is currently called
even inside eval()ed blocks/strings!  If one wants the hook to do
nothing in such situations, put

    die @_ if $^S;

as the first line of the handler (see L<perlvar/$^S>).  Because
this promotes strange action at a distance, this counterintuitive
behavior may be fixed in a future release.

=item do BLOCK

Not really a function.  Returns the value of the last command in the
sequence of commands indicated by BLOCK.  When modified by the C<while> or
C<until> loop modifier, executes the BLOCK once before testing the loop
condition.  (On other statements the loop modifiers test the conditional
first.)

C<do BLOCK> does I<not> count as a loop, so the loop control statements
C<next>, C<last>, or C<redo> cannot be used to leave or restart the block.
See L<perlsyn> for alternative strategies.
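The "executes once before testing" rule can be seen by comparing C<do BLOCK while> with an ordinary statement modifier. The counters below are illustrative names only.

```perl
use strict;
use warnings;

my ($n, $runs) = (10, 0);

# The block runs once even though the condition is false from the start.
do { $runs++; $n++ } while $n < 5;

# An ordinary statement with the same modifier tests first and never runs.
my $plain = 0;
$plain++ while $n < 5;

print "$runs $plain\n";    # prints "1 0"
```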

=item do SUBROUTINE(LIST)

This form of subroutine call is deprecated.  See L<perlsub>.

=item do EXPR

Uses the value of EXPR as a filename and executes the contents of the
file as a Perl script.

    do 'stat.pl';

is just like

    eval `cat stat.pl`;

except that it's more efficient and concise, keeps track of the current
filename for error messages, searches the @INC directories, and updates
C<%INC> if the file is found.  See L<perlvar/Predefined Names> for these
variables.  It also differs in that code evaluated with C<do FILENAME>
cannot see lexicals in the enclosing scope; C<eval STRING> does.  It's the
same, however, in that it does reparse the file every time you call it,
so you probably don't want to do this inside a loop.

If C<do> cannot read the file, it returns undef and sets C<$!> to the
error.  If C<do> can read the file but cannot compile it, it
returns undef and sets an error message in C<$@>.  If the file is
successfully compiled, C<do> returns the value of the last expression
evaluated.

Note that inclusion of library modules is better done with the
C<use> and C<require> operators, which also do automatic error checking
and raise an exception if there's a problem.

You might like to use C<do> to read in a program configuration
file.  Manual error checking can be done this way:

    # read in config files: system first, then user
    for $file ("/share/prog/defaults.rc",
               "$ENV{HOME}/.someprogrc")
    {
        unless ($return = do $file) {
            warn "couldn't parse $file: $@" if $@;
            warn "couldn't do $file: $!"    unless defined $return;
            warn "couldn't run $file"       unless $return;
        }
    }

=item dump LABEL

=item dump

This function causes an immediate core dump.  See also the B<-u>
command-line switch in L<perlrun>, which does the same thing.
Primarily this is so that you can use the B<undump> program (not
supplied) to turn your core dump into an executable binary after
having initialized all your variables at the beginning of the
program.  When the new binary is executed it will begin by executing
a C<goto LABEL> (with all the restrictions that C<goto> suffers).
Think of it as a goto with an intervening core dump and reincarnation.
If C<LABEL> is omitted, restarts the program from the top.

B<WARNING>: Any files opened at the time of the dump will I<not>
be open any more when the program is reincarnated, with possible
resulting confusion on the part of Perl.

This function is now largely obsolete, partly because it's very
hard to convert a core file into an executable, and because the
real compiler backends for generating portable bytecode and compilable
C code have superseded it.  That's why you should now invoke it as
C<CORE::dump()>, if you don't want to be warned against a possible
typo.

If you're looking to use L</dump> to speed up your program, consider
generating bytecode or native C code as described in L<perlcc>.  If
you're just trying to accelerate a CGI script, consider using the
C<mod_perl> extension to B<Apache>, or the CPAN module, CGI::Fast.
You might also consider autoloading or selfloading, which at least
make your program I<appear> to run faster.

=item each HASH

When called in list context, returns a 2-element list consisting of the
key and value for the next element of a hash, so that you can iterate over
it.  When called in scalar context, returns only the key for the next
element in the hash.

Entries are returned in an apparently random order.  The actual random
order is subject to change in future versions of perl, but it is
guaranteed to be in the same order as either the C<keys> or C<values>
function would produce on the same (unmodified) hash.  Since Perl
5.8.1 the ordering is different even between different runs of Perl
for security reasons (see L<perlsec/"Algorithmic Complexity Attacks">).

When the hash is entirely read, an empty list is returned in list context
(which when assigned produces a false (C<0>) value), and C<undef> in
scalar context.  The next call to C<each> after that will start iterating
again.  There is a single iterator for each hash, shared by all C<each>,
C<keys>, and C<values> function calls in the program; it can be reset by
reading all the elements from the hash, or by evaluating C<keys HASH> or
C<values HASH>.  If you add or delete elements of a hash while you're
iterating over it, you may get entries skipped or duplicated, so
don't.  Exception: It is always safe to delete the item most recently
returned by C<each()>, which means that the following code will work:

    while (($key, $value) = each %hash) {
        print $key, "\n";
        delete $hash{$key};   # This is safe
    }

The following prints out your environment like the printenv(1) program,
only in a different order:

    while (($key,$value) = each %ENV) {
        print "$key=$value\n";
    }

See also C<keys>, C<values> and C<sort>.
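Because each hash has a single shared iterator, a loop that is abandoned early leaves the iterator part-way through. The sketch below, using a made-up C<%h>, shows one way to reset it with C<keys> before iterating again.

```perl
use strict;
use warnings;

my %h = (a => 1, b => 2, c => 3);

# Abandon a loop early: the iterator is now positioned after the first entry.
while (my ($k, $v) = each %h) {
    last;
}

keys %h;    # evaluating keys resets the iterator

# A fresh pass now visits every entry again.
my $seen = 0;
while (my ($k, $v) = each %h) {
    $seen++;
}

print "$seen\n";    # prints "3"
```

Without the C<keys %h> line, the second loop would pick up where the first one stopped and count only the remaining entries.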

=item eof FILEHANDLE

=item eof ()

=item eof

Returns 1 if the next read on FILEHANDLE will return end of file, or if
FILEHANDLE is not open.  FILEHANDLE may be an expression whose value
gives the real filehandle.  (Note that this function actually
reads a character and then C<ungetc>s it, so isn't very useful in an
interactive context.)  Do not read from a terminal file (or call
C<eof(FILEHANDLE)> on it) after end-of-file is reached.  File types such
as terminals may lose the end-of-file condition if you do.

An C<eof> without an argument uses the last file read.  Using C<eof()>
with empty parentheses is very different.  It refers to the pseudo file
formed from the files listed on the command line and accessed via the
C<< <> >> operator.  Since C<< <> >> isn't explicitly opened,
as a normal filehandle is, an C<eof()> before C<< <> >> has been
used will cause C<@ARGV> to be examined to determine if input is
available.  Similarly, an C<eof()> after C<< <> >> has returned
end-of-file will assume you are processing another C<@ARGV> list,
and if you haven't set C<@ARGV>, will read input from C<STDIN>;
see L<perlop/"I/O Operators">.

In a C<< while (<>) >> loop, C<eof> or C<eof(ARGV)> can be used to
detect the end of each file; C<eof()> will only detect the end of the
last file.  Examples:

    # reset line numbering on each input file
    while (<>) {
        next if /^\s*#/;        # skip comments
        print "$.\t$_";
    } continue {
        close ARGV  if eof;     # Not eof()!
    }

    # insert dashes just before last line of last file
    while (<>) {
        if (eof()) {            # check for end of last file
            print "--------------\n";
        }
        print;
        last if eof();          # needed if we're reading from a terminal
    }

Practical hint: you almost never need to use C<eof> in Perl, because the
input operators typically return C<undef> when they run out of data, or if
there was an error.

=item eval EXPR

=item eval BLOCK

=item eval

In the first form, the return value of EXPR is parsed and executed as if it
were a little Perl program.  The value of the expression (which is itself
determined within scalar context) is first parsed, and if there weren't any
errors, executed in the lexical context of the current Perl program, so
that any variable settings or subroutine and format definitions remain
afterwards.  Note that the value is parsed every time the C<eval> executes.
If EXPR is omitted, evaluates C<$_>.  This form is typically used to
delay parsing and subsequent execution of the text of EXPR until run time.

In the second form, the code within the BLOCK is parsed only once--at the
same time the code surrounding the C<eval> itself was parsed--and executed
within the context of the current Perl program.  This form is typically
used to trap exceptions more efficiently than the first (see below), while
also providing the benefit of checking the code within BLOCK at compile
time.

The final semicolon, if any, may be omitted from the value of EXPR or within
the BLOCK.

In both forms, the value returned is the value of the last expression
evaluated inside the mini-program; a return statement may be also used, just
as with subroutines.  The expression providing the return value is evaluated
in void, scalar, or list context, depending on the context of the C<eval>
itself.  See L</wantarray> for more on how the evaluation context can be
determined.

If there is a syntax error or runtime error, or a C<die> statement is
executed, an undefined value is returned by C<eval>, and C<$@> is set to the
error message.  If there was no error, C<$@> is guaranteed to be the empty
string.  Beware that using C<eval> neither silences perl from printing
warnings to STDERR, nor does it stuff the text of warning messages into C<$@>.
To do either of those, you have to use the C<$SIG{__WARN__}> facility, or
turn off warnings inside the BLOCK or EXPR using S<C<no warnings 'all'>>.
See L</warn>, L<perlvar>, L<warnings> and L<perllexwarn>.

Note that, because C<eval> traps otherwise-fatal errors, it is useful for
determining whether a particular feature (such as C<socket> or C<symlink>)
is implemented.  It is also Perl's exception trapping mechanism, where
the die operator is used to raise exceptions.

If the code to be executed doesn't vary, you may use the eval-BLOCK
form to trap run-time errors without incurring the penalty of
recompiling each time.  The error, if any, is still returned in C<$@>.
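
Examples:

    # make divide-by-zero nonfatal
    eval { $answer = $a / $b; }; warn $@ if $@;

    # same thing, but less efficient
    eval '$answer = $a / $b'; warn $@ if $@;

    # a compile-time error
    eval { $answer = };                 # WRONG

    # a run-time error
    eval '$answer =';   # sets $@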
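Both uses of C<eval BLOCK> mentioned above, probing for a feature and trapping a run-time error, can be sketched together. The variable names are illustrative; the C<symlink> probe only checks that the call does not die as unimplemented, not that a link is actually created.

```perl
use strict;
use warnings;

# Feature probe: symlink dies if unimplemented on this platform; otherwise
# the call with empty names merely fails and the block returns 1.
my $has_symlink = eval { symlink('', ''); 1 } ? 1 : 0;

# Exception trap: division by zero is fatal unless wrapped in eval.
my ($x, $y) = (1, 0);
my $answer = eval { $x / $y };
my $error  = $@;               # copy $@ right away

print "symlink supported: $has_symlink\n";
print "trapped: $error" unless defined $answer;
```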

Using the C<eval{}> form as an exception trap in libraries does have some
issues.  Due to the current arguably broken state of C<__DIE__> hooks, you
may wish not to trigger any C<__DIE__> hooks that user code may have installed.
You can use the C<local $SIG{__DIE__}> construct for this purpose,
as shown in this example:

    # a very private exception trap for divide-by-zero
    eval { local $SIG{'__DIE__'}; $answer = $a / $b; };
    warn $@ if $@;

This is especially significant, given that C<__DIE__> hooks can call
C<die> again, which has the effect of changing their error messages:

    # __DIE__ hooks may modify error messages
    {
        local $SIG{'__DIE__'} =
            sub { (my $x = $_[0]) =~ s/foo/bar/g; die $x };
        eval { die "foo lives here" };
        print $@ if $@;                # prints "bar lives here"
    }

Because this promotes action at a distance, this counterintuitive behavior
may be fixed in a future release.

With an C<eval>, you should be especially careful to remember what's
being looked at when:

    eval $x;            # CASE 1
    eval "$x";          # CASE 2

    eval '$x';          # CASE 3
    eval { $x };        # CASE 4

    eval "\$$x++";      # CASE 5
    $$x++;              # CASE 6

Cases 1 and 2 above behave identically: they run the code contained in
the variable $x.  (Although case 2 has misleading double quotes making
the reader wonder what else might be happening (nothing is).)  Cases 3
and 4 likewise behave in the same way: they run the code C<'$x'>, which
does nothing but return the value of $x.  (Case 4 is preferred for
purely visual reasons, but it also has the advantage of compiling at
compile-time instead of at run-time.)  Case 5 is a place where
normally you I<would> like to use double quotes, except that in this
particular situation, you can just use symbolic references instead, as
in case 6.

C<eval BLOCK> does I<not> count as a loop, so the loop control statements
C<next>, C<last>, or C<redo> cannot be used to leave or restart the block.

Note that as a very special case, an C<eval ''> executed within the C<DB>
package doesn't see the usual surrounding lexical scope, but rather the
scope of the first non-DB piece of code that called it.  You don't normally
need to worry about this unless you are writing a Perl debugger.

=item exec LIST

=item exec PROGRAM LIST

The C<exec> function executes a system command I<and never returns>--
use C<system> instead of C<exec> if you want it to return.  It fails and
returns false only if the command does not exist I<and> it is executed
directly instead of via your system's command shell (see below).

Since it's a common mistake to use C<exec> instead of C<system>, Perl
warns you if there is a following statement which isn't C<die>, C<warn>,
or C<exit> (if C<-w> is set - but you always do that).  If you
I<really> want to follow an C<exec> with some other statement, you
can use one of these styles to avoid the warning:

    exec ('foo')   or print STDERR "couldn't exec foo: $!";
    { exec ('foo') }; print STDERR "couldn't exec foo: $!";

If there is more than one argument in LIST, or if LIST is an array
with more than one value, calls execvp(3) with the arguments in LIST.
If there is only one scalar argument or an array with one element in it,
the argument is checked for shell metacharacters, and if there are any,
the entire argument is passed to the system's command shell for parsing
(this is C</bin/sh -c> on Unix platforms, but varies on other platforms).
If there are no shell metacharacters in the argument, it is split into
words and passed directly to C<execvp>, which is more efficient.
Examples:

    exec '/bin/echo', 'Your arguments are: ', @ARGV;
    exec "sort $outfile | uniq";

If you don't really want to execute the first argument, but want to lie
to the program you are executing about its own name, you can specify
the program you actually want to run as an "indirect object" (without a
comma) in front of the LIST.  (This always forces interpretation of the
LIST as a multivalued list, even if there is only a single scalar in
the list.)  Example:

    $shell = '/bin/csh';
    exec $shell '-sh';          # pretend it's a login shell

or, more directly,

    exec {'/bin/csh'} '-sh';    # pretend it's a login shell

When the arguments get executed via the system shell, results will
be subject to its quirks and capabilities.  See L<perlop/"`STRING`">
for details.

Using an indirect object with C<exec> or C<system> is also more
secure.  This usage (which also works fine with system()) forces
interpretation of the arguments as a multivalued list, even if the
list had just one argument.  That way you're safe from the shell
expanding wildcards or splitting up words with whitespace in them.

    @args = ( "echo surprise" );

    exec @args;               # subject to shell escapes
                              # if @args == 1
    exec { $args[0] } @args;  # safe even with one-arg list

The first version, the one without the indirect object, ran the I<echo>
program, passing it C<"surprise"> as an argument.  The second version
didn't--it tried to run a program literally called I<"echo surprise">,
didn't find it, and set C<$?> to a non-zero value indicating failure.
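Since C<exec> replaces the current process, it is usually paired with C<fork> when the caller needs to keep running. The following sketch uses the indirect-object form so that no shell is involved. It assumes a Unix-like platform with a working C<fork>, and the child command (the running perl, C<$^X>, exiting with status 7) is purely hypothetical.

```perl
use strict;
use warnings;

# Hypothetical command: the running perl binary asked to exit with status 7.
my @cmd = ($^X, '-e', 'exit 7');

my $pid = fork();
die "Can't fork: $!" unless defined $pid;

if ($pid == 0) {
    # Child: the indirect object names the program, so @cmd is always
    # treated as a list and never handed to the shell.
    exec { $cmd[0] } @cmd
        or die "couldn't exec $cmd[0]: $!";
}

# Parent: wait for the child and decode its exit status.
waitpid($pid, 0);
my $status = $? >> 8;
print "child exited with status $status\n";
```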

Beginning with v5.6.0, Perl will attempt to flush all files opened for
output before the exec, but this may not be supported on some platforms
(see L<perlport>).  To be safe, you may need to set C<$|> ($AUTOFLUSH
in English) or call the C<autoflush()> method of C<IO::Handle> on any
open handles in order to avoid lost output.

Note that C<exec> will not call your C<END> blocks, nor will it call
any C<DESTROY> methods in your objects.

=item exists EXPR

Given an expression that specifies a hash element or array element,
returns true if the specified element in the hash or array has ever
been initialized, even if the corresponding value is undefined.  The
element is not autovivified if it doesn't exist.

    print "Exists\n"    if exists $hash{$key};
    print "Defined\n"   if defined $hash{$key};
    print "True\n"      if $hash{$key};

    print "Exists\n"    if exists $array[$index];
    print "Defined\n"   if defined $array[$index];
    print "True\n"      if $array[$index];

A hash or array element can be true only if it's defined, and defined if
it exists, but the reverse doesn't necessarily hold true.
Given an expression that specifies the name of a subroutine,
returns true
if
the specified subroutine
has
ever been declared, even
if
it is undefined. Mentioning a subroutine name
for
exists
or
defined
does not count as declaring it. Note that a subroutine which does not
exist may still be callable: its
package
may have an C<AUTOLOAD>
method that makes it spring into existence the first
time
that it is
called -- see L<perlsub>.

    print "Exists\n"    if exists &subroutine;
    print "Defined\n"   if defined &subroutine;

Note that the EXPR can be arbitrarily complicated as long as the final
operation is a hash or array key lookup or subroutine name:

    if (exists $ref->{A}->{B}->{$key})  { }
    if (exists $hash{A}{B}{$key})       { }

    if (exists $ref->{A}->{B}->[$ix])   { }
    if (exists $hash{A}{B}[$ix])        { }

    if (exists &{$ref->{A}{B}{$key}})   { }

Although the deepest nested array or hash will not spring into existence
just because its existence was tested, any intervening ones will.
Thus C<<
$ref
->{
"A"
} >> and C<<
$ref
->{
"A"
}->{
"B"
} >> will spring
into existence due to the existence test
for
the
$key
element above.
This happens anywhere the arrow operator is used, including even:

    undef $ref;
    if (exists $ref->{"Some key"})    { }
    print $ref;             # prints something like HASH(0x80d3d5c)

This surprising autovivification in what does not at first--or even
second--glance appear to be an lvalue context may be fixed in a future
release.
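
The autovivification described above can be observed directly. The following is a minimal sketch, not part of the original text; the variable names are illustrative only. It contrasts a plain nested test with a guarded test that checks one level at a time.

```perl
use strict;
use warnings;

my $ref = {};

# Testing the innermost key autovivifies the intervening hashes.
my $found         = exists $ref->{A}{B}{key};
my $a_after_naive = exists $ref->{A};       # now true
my $b_after_naive = exists $ref->{A}{B};    # now true

# A guarded test checks one level at a time and creates nothing.
my $safe = {};
my $found_safe = exists $safe->{A}
              && exists $safe->{A}{B}
              && exists $safe->{A}{B}{key};
my $a_after_safe = exists $safe->{A};       # still false
```

The guarded form is more verbose, but it leaves the data structure untouched when the key is absent.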
See L<perlref/
"Pseudo-hashes: Using an array as a hash"
>
for
specifics
on how
exists
() acts
when
used on a pseudo-hash.
Use of a subroutine call, rather than a subroutine name, as an argument
to
exists
() is an error.

    exists &sub;    # OK
    exists &sub();  # Error

=item
exit
EXPR
=item
exit
Evaluates EXPR and exits immediately
with
that value. Example:

    $ans = <STDIN>;
    exit 0 if $ans =~ /^[Xx]/;

See also C<
die
>. If EXPR is omitted, exits
with
C<0> status. The only
universally recognized
values
for
EXPR are C<0>
for
success and C<1>
for
error; other
values
are subject to interpretation depending on the
environment in which the Perl program is running. For example, exiting
69 (EX_UNAVAILABLE) from a I<sendmail> incoming-mail filter will cause
the mailer to
return
the item undelivered, but that's not true everywhere.
Don't use C<exit> to abort a subroutine if there's any chance that
someone might want to trap whatever error happened. Use C<
die
> instead,
which can be trapped by an C<
eval
>.
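
As a rough illustration of why C<die> is the better choice inside a subroutine, the sketch below traps a failure with C<eval>. The C<parse_port> helper is hypothetical and exists only for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: reports failure with die instead of exit.
sub parse_port {
    my ($text) = @_;
    die "not a port number: $text\n" unless $text =~ /^\d+$/;
    return $text + 0;
}

my $port  = eval { parse_port("8080") };   # succeeds
my $bad   = eval { parse_port("http") };   # dies, eval returns undef
my $error = $@;                            # message from the failed call
```

Had C<parse_port> called C<exit>, the caller would have had no chance to recover.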
The
exit
() function does not always
exit
immediately. It calls any
defined
C<END> routines first, but these C<END> routines may not
themselves abort the
exit
. Likewise any object destructors that need to
be called are called
before
the real
exit
. If this is a problem, you
can call C<POSIX::_exit($status)> to avoid END and destructor processing.
See L<perlmod>
for
details.
=item
exp
EXPR
=item
exp
Returns I<e> (the natural logarithm base) to the power of EXPR.
If EXPR is omitted, gives C<
exp
(
$_
)>.
=item
fcntl
FILEHANDLE,FUNCTION,SCALAR
Implements the
fcntl
(2) function. You'll probably have to
say

    use Fcntl;

first to get the correct constant definitions. Argument processing and
value
return
works just like C<
ioctl
> below.
For example:

    use Fcntl;
    fcntl($filehandle, F_GETFL, $packed_return_buffer)
        or die "can't fcntl F_GETFL: $!";

You don't have to check
for
C<
defined
> on the
return
from C<
fcntl
>.
Like C<
ioctl
>, it maps a C<0>
return
from the
system
call into
C<
"0 but true"
> in Perl. This string is true in boolean context and C<0>
in numeric context. It is also exempt from the normal B<-w> warnings
on improper numeric conversions.
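
The behaviour of the special string can be checked without making a system call at all. This is only a sketch: the C<__WARN__> handler and the variable names are assumptions made for the illustration.

```perl
use strict;
use warnings;

my @warnings;
$SIG{__WARN__} = sub { push @warnings, @_ };

my $special = "0 but true";
my $as_bool = $special ? "true" : "false";
my $as_num  = $special + 0;
my $quiet   = scalar @warnings;    # nothing has been warned about yet

my $plain   = "0 apples";
my $n       = $plain + 0;          # an ordinary string does warn
my $noisy   = scalar @warnings;
```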
Note that C<
fcntl
> will produce a fatal error
if
used on a machine that
doesn't implement
fcntl
(2). See the Fcntl module or your
fcntl
(2)
manpage to learn what functions are available on your
system
.
Here's an example of setting a filehandle named C<REMOTE> to be
non-blocking at the
system
level. You'll have to negotiate C<$|>
on your own, though.

    use Fcntl qw(F_GETFL F_SETFL O_NONBLOCK);

    $flags = fcntl(REMOTE, F_GETFL, 0)
                or die "Can't get flags for the socket: $!\n";

    $flags = fcntl(REMOTE, F_SETFL, $flags | O_NONBLOCK)
                or die "Can't set flags for the socket: $!\n";

=item
fileno
FILEHANDLE
Returns the file descriptor
for
a filehandle, or undefined
if
the
filehandle is not
open
. This is mainly useful
for
constructing
bitmaps
for
C<
select
> and low-level POSIX tty-handling operations.
If FILEHANDLE is an expression, the value is taken as an indirect
filehandle, generally its name.
You can
use
this to find out whether two handles refer to the
same underlying descriptor:

    if (fileno(THIS) == fileno(THAT)) {
        print "THIS and THAT are dups\n";
    }

(Filehandles connected to memory objects via new features of C<
open
> may
return
undefined even though they are
open
.)
=item
flock
FILEHANDLE,OPERATION
Calls
flock
(2), or an emulation of it, on FILEHANDLE. Returns true
for
success, false on failure. Produces a fatal error
if
used on a
machine that doesn't implement
flock
(2),
fcntl
(2) locking, or lockf(3).
C<
flock
> is Perl's portable file locking interface, although it locks
only entire files, not records.
Two potentially non-obvious but traditional C<
flock
> semantics are
that it waits indefinitely
until
the
lock
is granted, and that its locks are
B<merely advisory>. Such discretionary locks are more flexible, but offer
fewer guarantees. This means that programs that
do
not also
use
C<
flock
>
may modify files locked
with
C<
flock
>. See L<perlport>,
your port's specific documentation, or your
system
-specific
local
manpages
for
details. It's best to assume traditional behavior if you're writing
portable programs. (But
if
you're not, you should as always feel perfectly
free to
write
for
your own
system's idiosyncrasies (sometimes called "features"). Slavish adherence to portability concerns shouldn't get
in the way of your getting your job done.)
OPERATION is one of LOCK_SH, LOCK_EX, or LOCK_UN, possibly combined
with
LOCK_NB. These constants are traditionally valued 1, 2, 8 and 4, but
you can
use
the symbolic names
if
you
import
them from the Fcntl module,
either individually, or as a group using the
':flock'
tag. LOCK_SH
requests a shared
lock
, LOCK_EX requests an exclusive
lock
, and LOCK_UN
releases a previously requested
lock
. If LOCK_NB is bitwise-or'ed
with
LOCK_SH or LOCK_EX then C<
flock
> will
return
immediately rather than blocking
waiting
for
the
lock
(check the
return
status to see
if
you got it).
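
A non-blocking request might look like the following sketch. It assumes a platform where C<flock> is available and uses a scratch file from C<File::Temp> purely for demonstration.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);            # import LOCK_* constants
use File::Temp qw(tempfile);

# Scratch file for the demonstration; removed at exit.
my ($fh, $path) = tempfile(UNLINK => 1);

# Ask for an exclusive lock, but return at once if it is unavailable.
my $got_lock = flock($fh, LOCK_EX | LOCK_NB);
if ($got_lock) {
    print {$fh} "exclusive access\n";
    flock($fh, LOCK_UN) or die "Can't unlock $path: $!";
}
else {
    warn "$path is locked by another process: $!\n";
}
```

Checking the return value is what distinguishes "lock acquired" from "would have blocked".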
To avoid the possibility of miscoordination, Perl now flushes FILEHANDLE
before
locking or unlocking it.
Note that the emulation built
with
lockf(3) doesn't provide shared
locks, and it requires that FILEHANDLE be
open
with
write
intent. These
are the semantics that lockf(3) implements. Most
if
not all systems
implement lockf(3) in terms of
fcntl
(2) locking, though, so the
differing semantics shouldn't bite too many people.
Note that the fcntl(2) emulation of flock(3) requires that FILEHANDLE
be open with read intent to use LOCK_SH and requires that it be open
with write intent to use LOCK_EX.

Note also that some versions of C<
flock
> cannot
lock
things over the
network; you would need to
use
the more
system
-specific C<
fcntl
>
for
that. If you like you can force Perl to ignore your system's flock(2)
function, and so provide its own
fcntl
(2)-based emulation, by passing
the switch C<-Ud_flock> to the F<Configure> program
when
you configure
perl.
Here's a mailbox appender
for
BSD systems.

    use Fcntl ':flock'; # import LOCK_* constants

    sub lock {
        flock(MBOX,LOCK_EX);
        # and, in case someone appended
        # while we were waiting...
        seek(MBOX, 0, 2);
    }

    sub unlock {
        flock(MBOX,LOCK_UN);
    }

    open(MBOX, ">>/usr/spool/mail/$ENV{'USER'}")
            or die "Can't open mailbox: $!";

    lock();
    print MBOX $msg,"\n\n";
    unlock();

On systems that support a real
flock
(), locks are inherited across
fork
()
calls, whereas those that must resort to the more capricious
fcntl
()
function lose the locks, making it harder to
write
servers.
See also L<DB_File>
for
other
flock
() examples.
=item
fork
Does a
fork
(2)
system
call to create a new process running the
same program at the same point. It returns the child pid to the
parent process, C<0> to the child process, or C<
undef
>
if
the
fork
is
unsuccessful. File descriptors (and sometimes locks on those descriptors)
are shared,
while
everything
else
is copied. On most systems supporting
fork
(), great care
has
gone into making it extremely efficient (
for
example, using copy-on-
write
technology on data pages), making it the
dominant paradigm
for
multitasking over the
last
few decades.
Beginning
with
v5.6.0, Perl will attempt to flush all files opened
for
output
before
forking the child process, but this may not be supported
on some platforms (see L<perlport>). To be safe, you may need to set
C<$|> (
$AUTOFLUSH
in English) or call the C<autoflush()> method of
C<IO::Handle> on any
open
handles in order to avoid duplicate output.
If you C<
fork
> without ever waiting on your children, you will
accumulate zombies. On some systems, you can avoid this by setting
C<
$SIG
{CHLD}> to C<
"IGNORE"
>. See also L<perlipc>
for
more examples of
forking and reaping moribund children.
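
The basic fork-and-reap pattern can be sketched as follows. The exit status of 7 is arbitrary, and C<POSIX::_exit> is used in the child only so that it skips C<END> blocks and destructors.

```perl
use strict;
use warnings;
use POSIX ();

my $pid = fork();
die "fork failed: $!" unless defined $pid;

if ($pid == 0) {
    # Child: leave at once, skipping END blocks and destructors.
    POSIX::_exit(7);
}

# Parent: reap the child so it does not linger as a zombie.
my $reaped      = waitpid($pid, 0);
my $exit_status = $? >> 8;
```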
Note that
if
your forked child inherits
system
file descriptors like
STDIN and STDOUT that are actually connected by a
pipe
or
socket
, even
if
you
exit
, then the remote server (such as,
say
, a CGI script or a
backgrounded job launched from a remote shell) won't think you're done.
You should reopen those to F</dev/null>
if
it's any issue.
=item
format
Declare a picture
format
for
use
by the C<
write
> function. For
example:

    format Something =
        Test: @<<<<<<<< @||||| @>>>>>
              $str,     $%,    '$' . int($num)
    .

    $str = "widget";
    $num = $cost/$quantity;
    $~ = 'Something';
    write;

See L<perlform>
for
many details and examples.
=item
formline
PICTURE,LIST
This is an internal function used by C<
format
>s, though you may call it,
too. It formats (see L<perlform>) a list of
values
according to the
contents of PICTURE, placing the output into the
format
output
accumulator, C<$^A> (or C<
$ACCUMULATOR
> in English).
Eventually,
when
a C<
write
> is done, the contents of
C<$^A> are written to some filehandle. You could also
read
C<$^A>
and then set C<$^A> back to C<
""
>. Note that a
format
typically
does one C<
formline
> per line of form, but the C<
formline
> function itself
doesn't care how many newlines are embedded in the PICTURE. This means
that the C<~> and C<~~> tokens will treat the entire PICTURE as a single line.
You may therefore need to
use
multiple formlines to implement a single
record
format
, just like the
format
compiler.
Be careful
if
you put double quotes
around
the picture, because an C<@>
character may be taken to mean the beginning of an array name.
C<
formline
> always returns true. See L<perlform>
for
other examples.
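
Calling C<formline> by hand might look like this minimal sketch. The picture and the values are made up for illustration; single quotes keep the C<@> characters from being read as array names.

```perl
use strict;
use warnings;

$^A = "";                                   # start with an empty accumulator
formline('@<<<< @>>>>' . "\n", "ab", 42);   # one left and one right field
my $line = $^A;                             # read the accumulated output
$^A = "";                                   # and reset it for the next use
```

Each field is five characters wide, so C<$line> holds the two values padded to those widths.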
=item
getc
FILEHANDLE
=item
getc
Returns the
next
character from the input file attached to FILEHANDLE,
or the undefined value at end of file, or
if
there was an error (in
the latter case C<$!> is set). If FILEHANDLE is omitted, reads from
STDIN. This is not particularly efficient. However, it cannot be
used by itself to fetch single characters without waiting
for
the user
to hit enter. For that,
try
something more like:

    if ($BSD_STYLE) {
        system "stty cbreak </dev/tty >/dev/tty 2>&1";
    }
    else {
        system "stty", '-icanon', 'eol', "\001";
    }

    $key = getc(STDIN);

    if ($BSD_STYLE) {
        system "stty -cbreak </dev/tty >/dev/tty 2>&1";
    }
    else {
        system "stty", 'icanon', 'eol', '^@'; # ASCII null
    }
    print "\n";

Determination of whether
$BSD_STYLE
should be set
is left as an exercise to the reader.
The C<POSIX::getattr> function can
do
this more portably on
systems purporting POSIX compliance. See also the C<Term::ReadKey>
module from your nearest CPAN site; details on CPAN can be found on
L<perlmodlib/CPAN>.
=item
getlogin
This implements the C library function of the same name, which on most
systems returns the current login from F</etc/utmp>,
if
any. If null, use C<getpwuid>.

    $login = getlogin || getpwuid($<) || "Kilroy";

Do not consider C<
getlogin
>
for
authentication: it is not as
secure as C<
getpwuid
>.
=item
getpeername
SOCKET
Returns the packed sockaddr address of the other end of the SOCKET connection.

    use Socket;
    $hersockaddr    = getpeername(SOCK);
    ($port, $iaddr) = sockaddr_in($hersockaddr);
    $herhostname    = gethostbyaddr($iaddr, AF_INET);
    $herstraddr     = inet_ntoa($iaddr);

=item
getpgrp
PID
Returns the current process group
for
the specified PID. Use
a PID of C<0> to get the current process group
for
the
current process. Will raise an exception
if
used on a machine that
doesn't implement
getpgrp
(2). If PID is omitted, returns process
group of current process. Note that the POSIX version of C<
getpgrp
>
does not
accept
a PID argument, so only C<PID==0> is truly portable.
=item
getppid
Returns the process id of the parent process.
Note
for
Linux users: on Linux, the C functions C<getpid()> and
C<
getppid
()>
return
different
values
from different threads. In order to
be portable, this behavior is not reflected by the perl-level function
C<
getppid
()>, that returns a consistent value across threads. If you want
to call the underlying C<
getppid
()>, you may
use
the CPAN module
C<Linux::Pid>.
=item
getpriority
WHICH,WHO
Returns the current priority
for
a process, a process group, or a user.
(See L<
getpriority
(2)>.) Will raise a fatal exception
if
used on a
machine that doesn't implement
getpriority
(2).
=item
getpwnam
NAME
=item
getgrnam
NAME
=item
gethostbyname
NAME
=item
getnetbyname
NAME
=item
getprotobyname
NAME
=item
getpwuid
UID
=item
getgrgid
GID
=item
getservbyname
NAME,PROTO
=item
gethostbyaddr
ADDR,ADDRTYPE
=item
getnetbyaddr
ADDR,ADDRTYPE
=item
getprotobynumber
NUMBER
=item
getservbyport
PORT,PROTO
=item
getpwent
=item
getgrent
=item
gethostent
=item
getnetent
=item
getprotoent
=item
getservent
=item
setpwent
=item
setgrent
=item
sethostent
STAYOPEN
=item
setnetent
STAYOPEN
=item
setprotoent
STAYOPEN
=item
setservent
STAYOPEN
=item
endpwent
=item
endgrent
=item
endhostent
=item
endnetent
=item
endprotoent
=item
endservent
These routines perform the same functions as their counterparts in the
system
library. In list context, the
return
values
from the
various get routines are as follows:

    ($name,$passwd,$uid,$gid,
       $quota,$comment,$gcos,$dir,$shell,$expire) = getpw*
    ($name,$passwd,$gid,$members) = getgr*
    ($name,$aliases,$addrtype,$length,@addrs) = gethost*
    ($name,$aliases,$addrtype,$net) = getnet*
    ($name,$aliases,$proto) = getproto*
    ($name,$aliases,$port,$proto) = getserv*

(If the entry doesn't exist you get a null list.)
The exact meaning of the
$gcos
field varies but it usually contains
the real name of the user (as opposed to the login name) and other
information pertaining to the user. Beware, however, that on many
systems users are able to change this information, so it cannot be
trusted; the $gcos field is therefore tainted (see L<perlsec>). The
$passwd and $shell fields, the user's encrypted password and login
shell, are tainted for the same reason.
In
scalar
context, you get the name,
unless
the function was a
lookup by name, in which case you get the other thing, whatever it is.
(If the entry doesn't exist you get the undefined value.) For example:

    $uid   = getpwnam($name);
    $name  = getpwuid($num);
    $name  = getpwent();
    $gid   = getgrnam($name);
    $name  = getgrgid($num);
    $name  = getgrent();

In I<getpw*()> the fields
$quota
,
$comment
, and
$expire
are special
cases in the sense that in many systems they are unsupported. If the
$quota
is unsupported, it is an empty
scalar
. If it is supported, it
usually encodes the disk quota. If the
$comment
field is unsupported,
it is an empty
scalar
. If it is supported it usually encodes some
administrative comment about the user. In some systems the
$quota
field may be
$change
or
$age
, fields that have to
do
with
password
aging. In some systems the
$comment
field may be
$class
. The
$expire
field,
if
present, encodes the expiration period of the account or the
password. For the availability and the exact meaning of these fields
in your
system
, please consult your
getpwnam
(3) documentation and your
F<pwd.h> file. You can also find out from within Perl what your
$quota
and
$comment
fields mean and whether you have the
$expire
field
by using the C<Config> module and the
values
C<d_pwquota>, C<d_pwage>,
C<d_pwchange>, C<d_pwcomment>, and C<d_pwexpire>. Shadow password
files are only supported
if
your vendor
has
implemented them in the
intuitive fashion that calling the regular C library routines gets the
shadow versions
if
you're running under privilege or if the shadow(3) functions exist,
as found in System V (this includes Solaris and Linux). Those systems
that implement a proprietary shadow password
facility are unlikely to be supported.
The
$members
value returned by I<getgr*()> is a space separated list of
the login names of the members of the group.
For the I<gethost*()> functions,
if
the C<h_errno> variable is supported in
C, it will be returned to you via C<$?>
if
the function call fails. The
C<
@addrs
> value returned by a successful call is a list of the raw
addresses returned by the corresponding
system
library call. In the
Internet domain,
each
address is four bytes long and you can
unpack
it
by saying something like:

    ($a,$b,$c,$d) = unpack('C4',$addr[0]);

The Socket library makes this slightly easier:

    use Socket;
    $iaddr = inet_aton("127.1"); # or whatever address
    $name  = gethostbyaddr($iaddr, AF_INET);

    # or going the other way
    $straddr = inet_ntoa($iaddr);

If you get tired of remembering which element of the
return
list
contains which
return
value, by-name interfaces are provided
in standard modules: C<File::stat>, C<Net::hostent>, C<Net::netent>,
C<Net::protoent>, C<Net::servent>, C<Time::gmtime>, C<Time::localtime>,
C<User::grent>, and C<User::pwent>. These
override
the normal built-ins, supplying
versions that
return
objects
with
the appropriate names
for
each
field. For example:

    use File::stat;
    use User::pwent;
    $is_his = (stat($filename)->uid == getpw($whoever)->uid);

Even though it looks like they're the same method calls (uid),
they aren't, because a C<File::stat> object is different from
a C<User::pwent> object.
=item
getsockname
SOCKET
Returns the packed sockaddr address of this end of the SOCKET connection,
in case you don't know the address because you have several different
IPs that the connection might have come in on.

    use Socket;
    $mysockaddr = getsockname(SOCK);
    ($port, $myaddr) = sockaddr_in($mysockaddr);
    printf "Connect to %s [%s]\n",
       scalar gethostbyaddr($myaddr, AF_INET),
       inet_ntoa($myaddr);

=item
getsockopt
SOCKET,LEVEL,OPTNAME
Queries the option named OPTNAME associated
with
SOCKET at a
given
LEVEL.
Options may exist at multiple protocol levels depending on the
socket
type, but at least the uppermost
socket
level SOL_SOCKET (
defined
in the
C<Socket> module) will exist. To query options at another level the
protocol number of the appropriate protocol controlling the option
should be supplied. For example, to indicate that an option is to be
interpreted by the TCP protocol, LEVEL should be set to the protocol
number of TCP, which you can get using
getprotobyname
.
The call returns a packed string representing the requested
socket
option,
or C<
undef
>
if
there is an error (the error reason will be in C<$!>). What
exactly is in the packed string depends on the LEVEL and OPTNAME; consult
your
system
documentation
for
details. A very common case however is that
the option is an integer, in which case the result will be a packed
integer which you can decode using
unpack
with
the C<i> (or C<I>)
format
.
An example testing
if
Nagle's algorithm is turned on on a
socket
:

    use Socket qw(:all);

    defined(my $tcp = getprotobyname("tcp"))
        or die "Could not determine the protocol number for tcp";
    my $packed = getsockopt($socket, $tcp, TCP_NODELAY)
        or die "Could not query TCP_NODELAY socket option: $!";
    my $nodelay = unpack("I", $packed);
    print "Nagle's algorithm is turned ", $nodelay ? "off\n" : "on\n";

=item
glob
EXPR
=item
glob
In list context, returns a (possibly empty) list of filename expansions on
the value of EXPR such as the standard Unix shell F</bin/csh> would
do
. In
scalar
context,
glob
iterates through such filename expansions, returning
undef
when
the list is exhausted. This is the internal function
implementing the C<< <*.c> >> operator, but you can
use
it directly. If
EXPR is omitted, C<
$_
> is used. The C<< <*.c> >> operator is discussed in
more detail in L<perlop/
"I/O Operators"
>.
Beginning
with
v5.6.0, this operator is implemented using the standard
C<File::Glob> extension. See L<File::Glob>
for
details.
=item
gmtime
EXPR
=item
gmtime
Converts a
time
as returned by the
time
function to an 8-element list
with
the
time
localized
for
the standard Greenwich
time
zone.
Typically used as follows:

    #  0    1    2     3     4    5     6     7
    ($sec,$min,$hour,$mday,$mon,$year,$wday,$yday) =
                                            gmtime(time);

All list elements are numeric, and come straight out of the C `struct
tm'.
$sec
,
$min
, and
$hour
are the seconds, minutes, and hours of the
specified
time
.
$mday
is the day of the month, and
$mon
is the month
itself, in the range C<0..11>
with
0 indicating January and 11
indicating December.
$year
is the number of years since 1900. That
is,
$year
is C<123> in year 2023.
$wday
is the day of the week,
with
0 indicating Sunday and 3 indicating Wednesday.
$yday
is the day of
the year, in the range C<0..364> (or C<0..365> in leap years.)
Note that the
$year
element is I<not> simply the
last
two digits of
the year. If you assume it is then you create non-Y2K-compliant
programs--and you wouldn't want to
do
that, would you?
The proper way to get a complete 4-digit year is simply:

    $year += 1900;

And to get the
last
two digits of the year (e.g.,
'01'
in 2001)
do
:

    $year = sprintf("%02d", $year % 100);

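
Putting the adjustments together, a timestamp can be assembled as in the sketch below. It uses the epoch (C<gmtime(0)>) so that the result is fixed; the output format string is just one possible choice.

```perl
use strict;
use warnings;

my ($sec, $min, $hour, $mday, $mon, $year, $wday, $yday) = gmtime(0);

# $year counts from 1900 and $mon from 0, so both need adjusting.
my $stamp = sprintf("%04d-%02d-%02d %02d:%02d:%02d",
                    $year + 1900, $mon + 1, $mday, $hour, $min, $sec);
my $short_year = sprintf("%02d", $year % 100);
```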
If EXPR is omitted, C<
gmtime
()> uses the current
time
(C<
gmtime
(
time
)>).
In
scalar
context, C<
gmtime
()> returns the ctime(3) value:

    $now_string = gmtime;  # e.g., "Thu Oct 13 04:54:34 1994"

If you need
local
time
instead of GMT
use
the L</
localtime
> builtin.
See also the C<timegm> function provided by the C<Time::Local> module,
and the strftime(3) and mktime(3) functions available via the L<POSIX> module.
This
scalar
value is B<not> locale dependent (see L<perllocale>), but is
instead a Perl builtin. To get somewhat similar but locale dependent date
strings, see the example in L</
localtime
>.
See L<perlport/
gmtime
>
for
portability concerns.
=item
goto
LABEL
=item
goto
EXPR
=item
goto
&NAME
The C<goto-LABEL> form finds the statement labeled
with
LABEL and resumes
execution there. It may not be used to go into any construct that
requires initialization, such as a subroutine or a C<
foreach
> loop. It
also can't be used to go into a construct that is optimized away,
or to get out of a block or subroutine
given
to C<
sort
>.
It can be used to go almost anywhere
else
within the dynamic scope,
including out of subroutines, but it's usually better to
use
some other
construct such as C<
last
> or C<
die
>. The author of Perl
has
never felt the
need to
use
this form of C<
goto
> (in Perl, that is--C is another matter).
(The difference being that C does not offer named loops combined
with
loop control. Perl does, and this replaces most structured uses of C<
goto
>
in other languages.)
The C<goto-EXPR> form expects a label name, whose scope will be resolved
dynamically. This allows
for
computed C<
goto
>s per FORTRAN, but isn't
necessarily recommended
if
you're optimizing
for
maintainability:

    goto ("FOO", "BAR", "GLARCH")[$i];

The C<goto-&NAME> form is quite different from the other forms of
C<
goto
>. In fact, it isn't a
goto
in the normal sense at all, and
doesn't have the stigma associated
with
other gotos. Instead, it
exits the current subroutine (losing any changes set by
local
()) and
immediately calls in its place the named subroutine using the current
value of
@_
. This is used by C<AUTOLOAD> subroutines that wish to
load another subroutine and then pretend that the other subroutine had
been called in the first place (except that any modifications to C<
@_
>
in the current subroutine are propagated to the other subroutine.)
After the C<
goto
>, not even C<
caller
> will be able to
tell
that this
routine was called first.
NAME needn't be the name of a subroutine; it can be a
scalar
variable
containing a code reference, or a block that evaluates to a code
reference.
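
One way an C<AUTOLOAD> routine might use this form is sketched below. The C<Lazy> package and the generated subroutine body are invented for the example; real code would load or build something more useful.

```perl
use strict;
use warnings;

package Lazy;

our $AUTOLOAD;

sub AUTOLOAD {
    my $full = $AUTOLOAD;
    (my $name = $full) =~ s/.*:://;
    return if $name eq 'DESTROY';

    # Define the missing subroutine on first use ...
    no strict 'refs';
    *{$full} = sub { return "$name(" . join(",", @_) . ")" };

    # ... then hand control to it with the current @_.
    goto &{$full};
}

package main;

my $first  = Lazy::greet("a", "b");   # goes through AUTOLOAD
my $second = Lazy::greet("c");        # calls the installed sub directly
```

Because C<goto &NAME> replaces the running subroutine, the generated code sees the original arguments as if it had been called in the first place.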
=item
grep
BLOCK LIST
=item
grep
EXPR,LIST
This is similar in spirit to, but not the same as,
grep
(1) and its
relatives. In particular, it is not limited to using regular expressions.
Evaluates the BLOCK or EXPR
for
each
element of LIST (locally setting
C<
$_
> to
each
element) and returns the list value consisting of those
elements
for
which the expression evaluated to true. In
scalar
context, returns the number of
times
the expression was true.

    @foo = grep(!/^#/, @bar);    # weed out comments

or equivalently,

    @foo = grep {!/^#/} @bar;    # weed out comments

Note that C<
$_
> is an alias to the list value, so it can be used to
modify the elements of the LIST. While this is useful and supported,
it can cause bizarre results
if
the elements of LIST are not variables.
Similarly,
grep
returns aliases into the original list, much as a
for
loop's
index
variable aliases the list elements. That is, modifying an
element of a list returned by
grep
(
for
example, in a C<
foreach
>, C<
map
>
or another C<
grep
>) actually modifies the element in the original list.
This is usually something to be avoided
when
writing clear code.
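
The list, scalar, and aliasing behaviours can be seen side by side in this short sketch; the data is made up for illustration.

```perl
use strict;
use warnings;

my @lines = ("# header", "alpha", "# note", "beta");

my @code     = grep { !/^#/ } @lines;   # list context: matching elements
my $comments = grep { /^#/ } @lines;    # scalar context: number of matches

# grep returns aliases, so this loop changes @nums itself.
my @nums = (1, 2, 3, 4);
$_ *= 10 for grep { $_ % 2 == 0 } @nums;
```

The last statement works, but as noted above it is rarely the clearest way to modify a list.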
See also L</
map
>
for
a list composed of the results of the BLOCK or EXPR.
=item
hex
EXPR
=item
hex
Interprets EXPR as a
hex
string and returns the corresponding value.
(To convert strings that might start
with
either C<0>, C<0x>, or C<0b>, see
L</
oct
>.) If EXPR is omitted, uses C<
$_
>.

    print hex '0xAf'; # prints '175'
    print hex 'aF';   # same

Hex strings may only represent integers. Strings that would cause
integer overflow trigger a warning. Leading whitespace is not stripped,
unlike
oct
(). To present something as
hex
, look into L</
printf
>,
L</
sprintf
>, or L</
unpack
>.
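
A brief sketch of converting in both directions, using only C<hex>, C<oct>, and C<sprintf>; the values are arbitrary.

```perl
use strict;
use warnings;

my $from_prefixed = hex("0xAf");              # 175
my $from_bare     = hex("aF");                # 175
my $from_binary   = oct("0b1010");            # 10; oct handles the prefix
my $as_hex        = sprintf("%x", 175);       # "af"
my $padded        = sprintf("0x%04X", 175);   # "0x00AF"
```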
=item
import
LIST
There is
no
builtin C<
import
> function. It is just an ordinary
method (subroutine)
defined
(or inherited) by modules that wish to export
names to another module. The C<
use
> function calls the C<
import
> method
for
the
package
used. See also L</
use
>, L<perlmod>, and L<Exporter>.
=item
index
STR,SUBSTR,POSITION
=item
index
STR,SUBSTR
The
index
function searches
for
one string within another, but without
the wildcard-like behavior of a full regular-expression pattern match.
It returns the position of the first occurrence of SUBSTR in STR at
or
after
POSITION. If POSITION is omitted, starts searching from the
beginning of the string. POSITION
before
the beginning of the string
or
after
its end is treated as
if
it were the beginning or the end,
respectively. POSITION and the
return
value are based at C<0> (or whatever
you've set the C<$[> variable to--but don't do that). If the substring
is not found, C<
index
> returns one less than the base, ordinarily C<-1>.
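
The POSITION argument makes it straightforward to walk through every occurrence. This is a minimal sketch with an arbitrary sample string.

```perl
use strict;
use warnings;

my $text    = "hello world";
my $first   = index($text, "o");               # 4
my $next    = index($text, "o", $first + 1);   # 7
my $missing = index($text, "xyz");             # -1

# Collect the position of every "l" by restarting after each hit.
my @positions;
my $pos = 0;
while (($pos = index($text, "l", $pos)) != -1) {
    push @positions, $pos;
    $pos++;
}
```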
=item
int
EXPR
=item
int
Returns the integer portion of EXPR. If EXPR is omitted, uses C<
$_
>.
You should not
use
this function
for
rounding: one because it truncates
towards C<0>, and two because machine representations of floating point
numbers can sometimes produce counterintuitive results. For example,
C<int(-6.725/0.025)> produces -268 rather than the correct -269; that's
because it's really more like -268.99999999999994315658 instead. Usually,
the C<
sprintf
>, C<
printf
>, or the C<POSIX::floor> and C<POSIX::ceil>
functions will serve you better than will
int
().
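
The difference between truncation and the alternatives shows up most clearly with negative numbers, as in this sketch; the sample values are arbitrary.

```perl
use strict;
use warnings;
use POSIX qw(floor ceil);

my $truncated = int(-7.5);                 # -7: truncates toward zero
my $floored   = floor(-7.5);               # -8: rounds down
my $ceiled    = ceil(-7.5);                # -7: rounds up
my $rounded   = sprintf("%.0f", -7.6);     # "-8": rounds to nearest
my $two_dp    = sprintf("%.2f", 3.14159);  # "3.14"
```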
=item
ioctl
FILEHANDLE,FUNCTION,SCALAR
Implements the
ioctl
(2) function. You'll probably first have to
say

    require "sys/ioctl.ph";

to get the correct function definitions. If F<sys/ioctl.ph> doesn't
exist or doesn't have the correct definitions you'll have to roll your
own, based on your C header files such as F<< <sys/ioctl.h> >>.
(There is a Perl script called B<h2ph> that comes
with
the Perl kit that
may help you in this, but it's nontrivial.) SCALAR will be
read
and/or
written depending on the FUNCTION--a pointer to the string value of SCALAR
will be passed as the third argument of the actual C<
ioctl
> call. (If SCALAR
has
no
string value but does have a numeric value, that value will be
passed rather than a pointer to the string value. To guarantee this to be
true, add a C<0> to the
scalar
before
using it.) The C<
pack
> and C<
unpack
>
functions may be needed to manipulate the
values
of structures used by
C<
ioctl
>.
The
return
value of C<
ioctl
> (and C<
fcntl
>) is as follows:

    if OS returns:      then Perl returns:
        -1                undefined value
         0              string "0 but true"
    anything else           that number

Thus Perl returns true on success and false on failure, yet you can
still easily determine the actual value returned by the operating
system
:

    $retval = ioctl(...) || -1;
    printf "System returned %d\n", $retval;

The special string C<
"0 but true"
> is exempt from B<-w> complaints
about improper numeric conversions.
=item
join
EXPR,LIST
Joins the separate strings of LIST into a single string
with
fields
separated by the value of EXPR, and returns that new string. Example:

    $rec = join(':', $login,$passwd,$uid,$gid,$gcos,$home,$shell);

Beware that unlike C<
split
>, C<
join
> doesn't take a pattern as its
first argument. Compare L</
split
>.
=item
keys
HASH
Returns a list consisting of all the
keys
of the named hash.
(In
scalar
context, returns the number of
keys
.)
The
keys
are returned in an apparently random order. The actual
random order is subject to change in future versions of perl, but it
is guaranteed to be the same order as either the C<
values
> or C<
each
>
function produces (
given
that the hash
has
not been modified). Since
Perl 5.8.1 the ordering is different even between different runs of
Perl
for
security reasons (see L<perlsec/"Algorithmic Complexity
Attacks">).
As a side effect, calling
keys
() resets the HASH's internal iterator
(see L</
each
>). In particular, calling
keys
() in void context resets
the iterator
with
no
other overhead.
Here is yet another way to
print
your environment:

    @keys = keys %ENV;
    @values = values %ENV;
    while (@keys) {
        print pop(@keys), '=', pop(@values), "\n";
    }

or how about sorted by key:

    foreach $key (sort(keys %ENV)) {
        print $key, '=', $ENV{$key}, "\n";
    }

The returned
values
are copies of the original
keys
in the hash, so
modifying them will not affect the original hash. Compare L</
values
>.
To
sort
a hash by value, you'll need to
use
a C<
sort
> function.
Here's a descending numeric
sort
of a hash by its
values
:

    foreach $key (sort { $hash{$b} <=> $hash{$a} } keys %hash) {
        printf "%4d %s\n", $hash{$key}, $key;
    }

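
The same ideas in a self-contained form, as a sketch with a made-up hash. It also checks the guarantee that C<keys> and C<values> agree on order for an unmodified hash.

```perl
use strict;
use warnings;

my %score = (alice => 30, bob => 10, carol => 20);

my @by_value = sort { $score{$b} <=> $score{$a} } keys %score;  # descending
my @by_key   = sort keys %score;
my $count    = keys %score;            # scalar context: number of keys

# keys and values walk the hash in the same order.
my @k = keys %score;
my @v = values %score;
my $same_order = !grep { $score{ $k[$_] } != $v[$_] } 0 .. $#k;
```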
As an lvalue C<
keys
> allows you to increase the number of hash buckets
allocated
for
the
given
hash. This can gain you a measure of efficiency
if
you know the hash is going to get big. (This is similar to pre-extending
an array by assigning a larger number to
$#array.) If you say

    keys %hash = 200;

then C<%hash> will have at least 200 buckets allocated for it--256 of them,
in fact, since it rounds up to the next power of two. These
buckets will be retained even if you do C<%hash = ()>; use C<undef
%hash> if you want to free the storage while C<%hash> is still in scope.
You can't shrink the number of buckets allocated
for
the hash using
C<
keys
> in this way (but you needn't worry about doing this by accident,
as trying
has
no
effect).
See also C<
each
>, C<
values
> and C<
sort
>.
=item
kill
SIGNAL, LIST
Sends a signal to a list of processes. Returns the number of
processes successfully signaled (which is not necessarily the
same as the number actually killed).

    $cnt = kill 1, $child1, $child2;
    kill 9, @goners;

If SIGNAL is zero,
no
signal is sent to the process. This is a
useful way to check that a child process is alive and hasn't changed
its UID. See L<perlport>
for
notes on the portability of this
construct.
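
A minimal sketch of the signal-zero check follows.  The helper name
C<process_alive> is an assumption made for this example, and the check
is run against the script's own pid so that it is self-contained.

```perl
use strict;
use warnings;

# Hypothetical helper: true if a signal could be delivered to $pid.
sub process_alive {
    my ($pid) = @_;
    return kill(0, $pid) ? 1 : 0;   # signal 0 sends nothing, only checks
}

print "still running\n" if process_alive($$);   # $$ is our own pid
```

A false result means either that the process is gone or that the caller
may not signal it, so C<$!> is worth inspecting when the distinction
matters.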

Unlike in the shell, if SIGNAL is negative, it kills
process groups instead of processes.  (On System V, a negative I<PROCESS>
number will also kill process groups, but that's not portable.)  That
means you usually want to use positive not negative signals.  You may also
use a signal name in quotes.

See L<perlipc/"Signals"> for more details.

=item last LABEL

=item last

The C<last> command is like the C<break> statement in C (as used in
loops); it immediately exits the loop in question.  If the LABEL is
omitted, the command refers to the innermost enclosing loop.  The
C<continue> block, if any, is not executed:

    LINE: while (<STDIN>) {
        last LINE if /^$/;
    }

C<last> cannot be used to exit a block which returns a value such as
C<eval {}>, C<sub {}> or C<do {}>, and should not be used to exit
a grep() or map() operation.

Note that a block by itself is semantically identical to a loop
that executes once.  Thus C<last> can be used to effect an early
exit out of such a block.
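
The bare-block case might look like the following sketch.  The
subroutine C<classify>, the label C<CHECK> and the thresholds are
invented for illustration.

```perl
use strict;
use warnings;

# Hypothetical classifier using a labelled bare block as a one-shot loop.
sub classify {
    my ($n) = @_;
    my $label = 'large';
    CHECK: {
        if ($n < 0)  { $label = 'negative'; last CHECK; }   # early exit
        if ($n < 10) { $label = 'small';    last CHECK; }   # early exit
        # falling off the end of the block keeps the default label
    }
    return $label;
}

print classify($_), "\n" for (-1, 5, 50);
```

An C<if> block is not a loop, so each C<last CHECK> leaves the enclosing
bare block rather than the C<if>.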

See also L</continue> for an illustration of how C<last>, C<next>, and
C<redo> work.

=item lc EXPR

=item lc

Returns a lowercased version of EXPR.  This is the internal function
implementing the C<\L> escape in double-quoted strings.  Respects
current LC_CTYPE locale if C<use locale> is in force.  See L<perllocale>
and L<perlunicode> for more details about locale and Unicode support.

If EXPR is omitted, uses C<$_>.

=item lcfirst EXPR

=item lcfirst

Returns the value of EXPR with the first character lowercased.  This
is the internal function implementing the C<\l> escape in
double-quoted strings.  Respects current LC_CTYPE locale if C<use
locale> is in force.  See L<perllocale> and L<perlunicode> for more
details about locale and Unicode support.

If EXPR is omitted, uses C<$_>.

=item length EXPR

=item length

Returns the length in I<characters> of the value of EXPR.  If EXPR is
omitted, returns length of C<$_>.  Note that this cannot be used on
an entire array or hash to find out how many elements these have.
For that, use C<scalar @array> and C<scalar keys %hash> respectively.

Note the I<characters>: if the EXPR is in Unicode, you will get the
number of characters, not the number of bytes.  To get the length
in bytes, use C<do { use bytes; length(EXPR) }>, see L<bytes>.
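
The difference between the two counts can be sketched with a single
non-ASCII character.  The variable names are invented for this example;
the byte count reflects Perl's internal UTF-8 form of the string.

```perl
use strict;
use warnings;

my $smiley = "\x{263A}";                          # one character, U+263A
my $chars  = length($smiley);                     # counts characters
my $bytes  = do { use bytes; length($smiley) };   # counts bytes of the internal form

print "$chars character, $bytes bytes\n";
```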

=item link OLDFILE,NEWFILE

Creates a new filename linked to the old filename.  Returns true for
success, false otherwise.

=item listen SOCKET,QUEUESIZE

Does the same thing that the listen system call does.  Returns true if
it succeeded, false otherwise.  See the example in
L<perlipc/"Sockets: Client/Server Communication">.

=item local EXPR

You really probably want to be using C<my> instead, because C<local> isn't
what most people think of as "local".  See
L<perlsub/"Private Variables via my()"> for details.

A local modifies the listed variables to be local to the enclosing
block, file, or eval.  If more than one value is listed, the list must
be placed in parentheses.  See L<perlsub/"Temporary Values via local()">
for details, including issues with tied arrays and hashes.
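
The contrast with C<my> may be easier to see in a short sketch.  All
names here (C<$level>, C<show>, C<with_local>, C<with_my>) are invented
for illustration.

```perl
use strict;
use warnings;

our $level = 'global';          # package variable, visible to called subs

sub show { return $level }      # reads whatever $main::level currently holds

sub with_local {
    local $level = 'temporary'; # old value is restored when this sub returns
    return show();
}

sub with_my {
    my $level = 'private';      # a new lexical; show() cannot see it
    return show();
}

print with_local(), "\n";
print with_my(), "\n";
print show(), "\n";
```

C<local> changes the value seen by every sub called from the block,
while C<my> creates a variable that only the enclosing block can name.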

=item localtime EXPR

=item localtime

Converts a time as returned by the time function to a 9-element list
with the time analyzed for the local time zone.  Typically used as
follows:

    ($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) =
        localtime(time);

All list elements are numeric, and come straight out of the C `struct
tm'.  C<$sec>, C<$min>, and C<$hour> are the seconds, minutes, and hours
of the specified time.

C<$mday> is the day of the month, and C<$mon> is the month itself, in
the range C<0..11> with 0 indicating January and 11 indicating December.
This makes it easy to get a month name from a list:

    my @abbr = qw( Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec );
    print "$abbr[$mon] $mday";

C<$year> is the number of years since 1900, not just the last two digits
of the year.  That is, C<$year> is C<123> in year 2023.  The proper way
to get a complete 4-digit year is simply:

    $year += 1900;

To get the last two digits of the year (e.g., '01' in 2001) do:

    $year = sprintf("%02d", $year % 100);

C<$wday> is the day of the week, with 0 indicating Sunday and 3 indicating
Wednesday.  C<$yday> is the day of the year, in the range C<0..364>
(or C<0..365> in leap years).

C<$isdst> is true if the specified time occurs during Daylight Saving
Time, false otherwise.
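
Putting the month and year offsets together, a date string could be
built as in the following sketch.  The helper name C<iso_date> is an
assumption made for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: format an epoch time as YYYY-MM-DD in local time.
sub iso_date {
    my ($epoch) = @_;
    my ($mday, $mon, $year) = (localtime($epoch))[3, 4, 5];
    # $mon is zero-based and $year counts from 1900, so adjust both.
    return sprintf("%04d-%02d-%02d", $year + 1900, $mon + 1, $mday);
}

print iso_date(time), "\n";
```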

If EXPR is omitted, C<localtime()> uses the current time (C<localtime(time)>).

In scalar context, C<localtime()> returns the ctime(3) value:

    $now_string = localtime;

This scalar value is B<not> locale dependent but is a Perl builtin.  For GMT
instead of local time use the L</gmtime> builtin.  See also the
C<Time::Local> module (to convert the seconds, minutes, hours, ... back to
the integer value returned by time()), and the L<POSIX> module's strftime(3)
and mktime(3) functions.

To get somewhat similar but locale dependent date strings, set up your
locale environment variables appropriately (please see L<perllocale>) and
try for example:

    use POSIX qw(strftime);
    $now_string = strftime "%a %b %e %H:%M:%S %Y", localtime;
    # or for GMT formatted appropriately for your locale:
    $now_string = strftime "%a %b %e %H:%M:%S %Y", gmtime;

Note that C<%a> and C<%b>, the short forms of the day of the week
and the month of the year, may not necessarily be three characters wide.

See L<perlport/localtime> for portability concerns.

=item lock THING

This function places an advisory lock on a shared variable, or referenced
object contained in I<THING> until the lock goes out of scope.

lock() is a "weak keyword": this means that if you've defined a function
by this name (before any calls to it), that function will be called
instead.  (However, if you've said C<use threads>, lock() is always a
keyword.)  See L<threads>.

=item log EXPR

=item log

Returns the natural logarithm (base I<e>) of EXPR.  If EXPR is omitted,
returns log of C<$_>.  To get the log of another base, use basic algebra:
The base-N log of a number is equal to the natural log of that number
divided by the natural log of N.  For example:

    sub log10 {
        my $n = shift;
        return log($n)/log(10);
    }

See also L</exp> for the inverse operation.
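
The same identity generalizes to any base.  The helper C<log_base> below
is a hypothetical name used only for this sketch.

```perl
use strict;
use warnings;

# Hypothetical helper: logarithm of $n in an arbitrary base.
sub log_base {
    my ($n, $base) = @_;
    return log($n) / log($base);   # change-of-base identity
}

printf "%.3f\n", log_base(8, 2);
printf "%.3f\n", log_base(1000, 10);
```

Because the result is floating point, compare it against an expected
value with a small tolerance rather than with C<==>.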

=item lstat EXPR

=item lstat

Does the same thing as the C<stat> function (including setting the
special C<_> filehandle) but stats a symbolic link instead of the file
the symbolic link points to.  If symbolic links are unimplemented on
your system, a normal C<stat> is done.  For much more detailed
information, please see the documentation for L</stat>.

If EXPR is omitted, stats C<$_>.

=item m//
The match operator. See L<perlop>.

=item map BLOCK LIST

=item map EXPR,LIST

Evaluates the BLOCK or EXPR for each element of LIST (locally setting
C<$_> to each element) and returns the list value composed of the
results of each such evaluation.  In scalar context, returns the
total number of elements so generated.  Evaluates BLOCK or EXPR in
list context, so each element of LIST may produce zero, one, or
more elements in the returned value.

    @chars = map(chr, @nums);

translates a list of numbers to the corresponding characters.  And

    %hash = map { getkey($_) => $_ } @array;

is just a funny way to write

    %hash = ();
    foreach $_ (@array) {
        $hash{getkey($_)} = $_;
    }

Note that C<$_> is an alias to the list value, so it can be used to
modify the elements of the LIST.  While this is useful and supported,
it can cause bizarre results if the elements of LIST are not variables.
Using a regular C<foreach> loop for this purpose would be clearer in
most cases.  See also L</grep> for an array composed of those items of
the original list for which the BLOCK or EXPR evaluates to true.

C<{> starts both hash references and blocks, so C<map { ...> could be either
the start of map BLOCK LIST or map EXPR, LIST.  Because perl doesn't look
ahead for the closing C<}> it has to take a guess at which it's dealing with
based on what it finds just after the C<{>.  Usually it gets it right, but if it
doesn't it won't realize something is wrong until it gets to the C<}> and
encounters the missing (or unexpected) comma.  The syntax error will be
reported close to the C<}> but you'll need to change something near the C<{>
such as using a unary C<+> to give perl some help:

    %hash = map {  "\L$_", 1  } @array  # perl guesses EXPR.  wrong
    %hash = map { +"\L$_", 1  } @array  # perl guesses BLOCK. right
    %hash = map { ("\L$_", 1) } @array  # this also works
    %hash = map {  lc($_), 1  } @array  # as does this.
    %hash = map +( lc($_), 1 ), @array  # this is EXPR and works!

    %hash = map  ( lc($_), 1 ), @array  # evaluates to (1, @array)

or to force an anon hash constructor use C<+{>

    @hashes = map +{ lc($_), 1 }, @array  # EXPR, so needs , at end

and you get a list of anonymous hashes each with only 1 entry.
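
The zero, one, or more elements rule and the scalar-context count can be
sketched as follows.  The word list and variable names are invented for
illustration, and the parenthesized block form is used so that perl
cannot mistake the block for a hash constructor.

```perl
use strict;
use warnings;

my @words = qw(Apple banana Cherry);                     # hypothetical input

my %seen  = map { (lc($_), 1) } @words;                  # two elements per item
my @long  = map { length($_) > 5 ? ($_) : () } @words;   # zero or one element per item
my $count = map { ($_, $_) } @words;                     # scalar context: elements generated

print join(',', sort keys %seen), "\n";
print "@long\n";
print "$count\n";
```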

=item mkdir FILENAME,MASK

=item mkdir FILENAME

Creates the directory specified by FILENAME, with permissions
specified by MASK (as modified by C<umask>).  If it succeeds it
returns true, otherwise it returns false and sets C<$!> (errno).
If omitted, MASK defaults to 0777.

In general, it is better to create directories with permissive MASK,
and let the user modify that with their C<umask>, than it is to supply
a restrictive MASK and give the user no way to be more permissive.
The exceptions to this rule are when the file or directory should be
kept private (mail files, for instance).  The perlfunc(1) entry on
C<umask> discusses the choice of MASK in more detail.

Note that according to POSIX 1003.1-1996 the FILENAME may have any
number of trailing slashes.  Some operating systems and filesystems do
not get this right, so Perl automatically removes all trailing slashes
to keep everyone happy.
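
A minimal sketch of checking the return value follows.  It works inside
a scratch directory from C<File::Temp>, and the directory name
C<reports> is invented for illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Spec;

my $base = tempdir(CLEANUP => 1);                   # scratch area for the sketch
my $dir  = File::Spec->catdir($base, 'reports');    # hypothetical directory name

mkdir($dir, 0777) or die "Can't create $dir: $!";   # umask trims the permissions

my $second = mkdir($dir, 0777);                      # fails: directory already exists
print "second attempt failed: $!\n" unless $second;
```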

=item msgctl ID,CMD,ARG

Calls the System V IPC function msgctl(2).  You'll probably have to say

    use IPC::SysV;

first to get the correct constant definitions.  If CMD is C<IPC_STAT>,
then ARG must be a variable that will hold the returned C<msqid_ds>
structure.  Returns like C<ioctl>: the undefined value for error,
C<"0 but true"> for zero, or the actual return value otherwise.  See also
L<perlipc/"SysV IPC">, C<IPC::SysV>, and C<IPC::Semaphore> documentation.

=item msgget KEY,FLAGS

Calls the System V IPC function msgget(2).  Returns the message queue
id, or the undefined value if there is an error.  See also
L<perlipc/"SysV IPC"> and C<IPC::SysV> and C<IPC::Msg> documentation.

=item msgrcv ID,VAR,SIZE,TYPE,FLAGS

Calls the System V IPC function msgrcv to receive a message from
message queue ID into variable VAR with a maximum message size of
SIZE.  Note that when a message is received, the message type as a
native long integer will be the first thing in VAR, followed by the
actual message.  This packing may be opened with C<unpack("l! a*")>.
Taints the variable.  Returns true if successful, or false if there is
an error.  See also L<perlipc/"SysV IPC">, C<IPC::SysV>, and
C<IPC::Msg> documentation.

=item msgsnd ID,MSG,FLAGS

Calls the System V IPC function msgsnd to send the message MSG to the
message queue ID.  MSG must begin with the native long integer message
type, and be followed by the length of the actual message, and finally
the message itself.  This kind of packing can be achieved with
C<pack("l! a*", $type, $message)>.  Returns true if successful,
or false if there is an error.  See also C<IPC::SysV>
and C<IPC::Msg> documentation.

=item my EXPR

=item my TYPE EXPR

=item my EXPR : ATTRS

=item my TYPE EXPR : ATTRS

A C<my> declares the listed variables to be local (lexically) to the
enclosing block, file, or C<eval>.  If more than one value is listed,
the list must be placed in parentheses.

The exact semantics and interface of TYPE and ATTRS are still
evolving.  TYPE is currently bound to the use of C<fields> pragma,
and attributes are handled using the C<attributes> pragma, or starting
from Perl 5.8.0 also via the C<Attribute::Handlers> module.  See
L<perlsub/"Private Variables via my()"> for details, and L<fields>,
L<attributes>, and L<Attribute::Handlers>.

=item next LABEL

=item next

The C<next> command is like the C<continue> statement in C; it starts
the next iteration of the loop:

    LINE: while (<STDIN>) {
        next LINE if /^#/;      # discard comments
    }

Note that if there were a C<continue> block on the above, it would get
executed even on discarded lines.  If the LABEL is omitted, the command
refers to the innermost enclosing loop.

C<next> cannot be used to exit a block which returns a value such as
C<eval {}>, C<sub {}> or C<do {}>, and should not be used to exit
a grep() or map() operation.

Note that a block by itself is semantically identical to a loop
that executes once.  Thus C<next> will exit such a block early.

See also L</continue> for an illustration of how C<last>, C<next>, and
C<redo> work.

=item no Module VERSION LIST

=item no Module VERSION

=item no Module LIST

=item no Module

See the C<use> function, which C<no> is the opposite of.

=item oct EXPR

=item oct

Interprets EXPR as an octal string and returns the corresponding
value.  (If EXPR happens to start off with C<0x>, interprets it as a
hex string.  If EXPR starts off with C<0b>, it is interpreted as a
binary string.  Leading whitespace is ignored in all three cases.)
The following will handle decimal, binary, octal, and hex in the standard
Perl or C notation:

    $val = oct($val) if $val =~ /^0/;

If EXPR is omitted, uses C<$_>.  To go the other way (produce a number
in octal), use sprintf() or printf():

    $perms = (stat("filename"))[2] & 07777;
    $oct_perms = sprintf "%lo", $perms;

The oct() function is commonly used when a string such as C<644> needs
to be converted into a file mode, for example.  (Although perl will
automatically convert strings into numbers as needed, this automatic
conversion assumes base 10.)
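
The prefix handling can be sketched with a small wrapper.  The helper
name C<to_number> and the sample strings are assumptions made for this
example.

```perl
use strict;
use warnings;

# Hypothetical helper: accept "0x1f", "0b101", "0755" or plain decimal "42".
sub to_number {
    my ($text) = @_;
    return $text =~ /^0/ ? oct($text) : $text + 0;
}

printf "%d %d %d %d\n", map { to_number($_) } qw(0x1f 0b101 0755 42);
printf "%o\n", oct("644");     # and back to octal text with sprintf/printf
```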

=item open FILEHANDLE,EXPR

=item open FILEHANDLE,MODE,EXPR

=item open FILEHANDLE,MODE,EXPR,LIST

=item open FILEHANDLE,MODE,REFERENCE

=item open FILEHANDLE

Opens the file whose filename is given by EXPR, and associates it with
FILEHANDLE.

(The following is a comprehensive reference to open(): for a gentler
introduction you may consider L<perlopentut>.)

If FILEHANDLE is an undefined scalar variable (or array or hash element)
the variable is assigned a reference to a new anonymous filehandle,
otherwise if FILEHANDLE is an expression, its value is used as the name of
the real filehandle wanted.  (This is considered a symbolic reference, so
C<use strict 'refs'> should I<not> be in effect.)

If EXPR is omitted, the scalar variable of the same name as the
FILEHANDLE contains the filename.  (Note that lexical variables--those
declared with C<my>--will not work for this purpose; so if you're
using C<my>, specify EXPR in your call to open.)

If three or more arguments are specified then the mode of opening and
the file name are separate.  If MODE is C<< '<' >> or nothing, the file
is opened for input.  If MODE is C<< '>' >>, the file is truncated and
opened for output, being created if necessary.  If MODE is C<<< '>>' >>>,
the file is opened for appending, again being created if necessary.

You can put a C<'+'> in front of the C<< '>' >> or C<< '<' >> to
indicate that you want both read and write access to the file; thus
C<< '+<' >> is almost always preferred for read/write updates--the C<<
'+>' >> mode would clobber the file first.  You can't usually use
either read-write mode for updating textfiles, since they have
variable length records.  See the B<-i> switch in L<perlrun> for a
better approach.  The file is created with permissions of C<0666>
modified by the process' C<umask> value.

These various prefixes correspond to the fopen(3) modes of C<'r'>,
C<'r+'>, C<'w'>, C<'w+'>, C<'a'>, and C<'a+'>.

In the 2-arguments (and 1-argument) form of the call the mode and
filename should be concatenated (in this order), possibly separated by
spaces.  It is possible to omit the mode in these forms if the mode is
C<< '<' >>.

If the filename begins with C<'|'>, the filename is interpreted as a
command to which output is to be piped, and if the filename ends with a
C<'|'>, the filename is interpreted as a command which pipes output to
us.  See L<perlipc/"Using open() for IPC">
for more examples of this.  (You are not allowed to C<open> to a command
that pipes both in I<and> out, but see L<IPC::Open2>, L<IPC::Open3>,
and L<perlipc/"Bidirectional Communication with Another Process">
for alternatives.)

For three or more arguments if MODE is C<'|-'>, the filename is
interpreted as a command to which output is to be piped, and if MODE
is C<'-|'>, the filename is interpreted as a command which pipes
output to us.  In the 2-arguments (and 1-argument) form one should
replace dash (C<'-'>) with the command.
See L<perlipc/"Using open() for IPC"> for more examples of this.
(You are not allowed to C<open> to a command that pipes both in I<and>
out, but see L<IPC::Open2>, L<IPC::Open3>, and
L<perlipc/"Bidirectional Communication"> for alternatives.)

In the three-or-more argument form of pipe opens, if LIST is specified
(extra arguments after the command name) then LIST becomes arguments
to the command invoked if the platform supports it.  The meaning of
C<open> with more than three arguments for non-pipe modes is not yet
specified.  Experimental "layers" may give extra LIST arguments
meaning.

In the 2-arguments (and 1-argument) form opening C<'-'> opens STDIN
and opening C<< '>-' >> opens STDOUT.

You may use the three-argument form of open to specify IO "layers"
(sometimes also referred to as "disciplines") to be applied to the handle
that affect how the input and output are processed (see L<open> and
L<PerlIO> for more details).  For example

    open(FH, "<:utf8", "file")

will open the UTF-8 encoded file containing Unicode characters,
see L<perluniintro>.  (Note that if layers are specified in the
three-arg form then default layers set by the C<open> pragma are
ignored.)

Open returns nonzero upon success, the undefined value otherwise.  If
the C<open> involved a pipe, the return value happens to be the pid of
the subprocess.

If you're running Perl on a system that distinguishes between text
files and binary files, then you should check out L</binmode> for tips
for dealing with this.  The key distinction between systems that need
C<binmode> and those that don't is their text file formats.  Systems
like Unix, Mac OS, and Plan 9, which delimit lines with a single
character, and which encode that character in C as C<"\n">, do not
need C<binmode>.  The rest need it.

When opening a file, it's usually a bad idea to continue normal execution
if the request failed, so C<open> is frequently used in connection with
C<die>.  Even if C<die> won't do what you want (say, in a CGI script,
where you want to make a nicely formatted error message (but there are
modules that can help with that problem)) you should always check
the return value from opening a file.  The infrequent exception is when
working with an unopened filehandle is actually what you want to do.

As a special case the 3-arg form with a read/write mode and the third
argument being C<undef>:

    open(TMP, "+>", undef) or die ...

opens a filehandle to an anonymous temporary file.  Also using "+<"
works for symmetry, but you really should consider writing something
to the temporary file first.  You will need to seek() to do the
reading.

Since v5.8.0, perl has been built using PerlIO by default.  Unless you've
changed this (i.e. Configure -Uuseperlio), you can open file handles to
"in memory" files held in Perl scalars via:

    open($fh, '>', \$variable) || ..

Though if you try to re-open C<STDOUT> or C<STDERR> as an "in memory"
file, you have to close it first:

    close STDOUT;
    open STDOUT, '>', \$variable or die "Can't open STDOUT: $!";
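
A round trip through an in-memory file might look like the sketch below.
It assumes a perl built with PerlIO, and the variable names are invented
for illustration.

```perl
use strict;
use warnings;

my $buffer = '';
open(my $out, '>', \$buffer) or die "Can't open in-memory file: $!";
print $out "line one\n";
print $out "line two\n";
close($out) or die "Can't close in-memory file: $!";

# The scalar now holds everything that was printed; read it back line by line.
open(my $in, '<', \$buffer) or die "Can't reopen buffer for reading: $!";
my @lines = <$in>;
close($in);

print scalar(@lines), " lines captured\n";
```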

Examples:

    $ARTICLE = 100;
    open ARTICLE or die "Can't find article $ARTICLE: $!\n";
    while (<ARTICLE>) {...

    open(LOG, '>>/usr/spool/news/twitlog');

    open(DBASE, '+<', 'dbase.mine')             # open for update
        or die "Can't open 'dbase.mine' for update: $!";

    open(DBASE, '+<dbase.mine')                 # ditto
        or die "Can't open 'dbase.mine' for update: $!";

    open(ARTICLE, '-|', "caesar <$article")     # decrypt article
        or die "Can't start caesar: $!";

    open(ARTICLE, "caesar <$article |")         # ditto
        or die "Can't start caesar: $!";

    open(EXTRACT, "|sort >Tmp$$")               # $$ is our process id
        or die "Can't start sort: $!";

    # in memory files
    open(MEMORY, '>', \$var)
        or die "Can't open memory file: $!";
    print MEMORY "foo!\n";                      # output will end up in $var

    # process argument list of files along with any includes

    foreach $file (@ARGV) {
        process($file, 'fh00');
    }

    sub process {
        my($filename, $input) = @_;
        $input++;               # this is a string increment
        unless (open($input, $filename)) {
            print STDERR "Can't open $filename: $!\n";
            return;
        }

        local $_;
        while (<$input>) {              # note use of indirection
            if (/^#include "(.*)"/) {
                process($1, $input);
                next;
            }
        }
    }

See L<perliol> for detailed info on PerlIO.

You may also, in the Bourne shell tradition, specify an EXPR beginning
with C<< '>&' >>, in which case the rest of the string is interpreted
as the name of a filehandle (or file descriptor, if numeric) to be
duped (as L<dup(2)>) and opened.  You may use C<&> after C<< > >>,
C<<< >> >>>, C<< < >>, C<< +> >>, C<<< +>> >>>, and C<< +< >>.
The mode you specify should match the mode of the original filehandle.
(Duping a filehandle does not take into account any existing contents
of IO buffers.)  If you use the 3-arg form then you can pass either a
number, the name of a filehandle or the normal "reference to a glob".

Here is a script that saves, redirects, and restores C<STDOUT> and
C<STDERR> using various methods:

    open my $oldout, ">&STDOUT"     or die "Can't dup STDOUT: $!";
    open OLDERR,     ">&", \*STDERR or die "Can't dup STDERR: $!";

    open STDOUT, '>', "foo.out" or die "Can't redirect STDOUT: $!";
    open STDERR, ">&STDOUT"     or die "Can't dup STDOUT: $!";

    select STDERR; $| = 1;      # make unbuffered
    select STDOUT; $| = 1;      # make unbuffered

    print STDOUT "stdout 1\n";
    print STDERR "stderr 1\n";

    open STDOUT, ">&", $oldout or die "Can't dup \$oldout: $!";
    open STDERR, ">&OLDERR"    or die "Can't dup OLDERR: $!";

    print STDOUT "stdout 2\n";
    print STDERR "stderr 2\n";

If you specify C<< '<&=X' >>, where C<X> is a file descriptor number
or a filehandle, then Perl will do an equivalent of C's C<fdopen> of
that file descriptor (and not call L<dup(2)>); this is more
parsimonious of file descriptors.  For example:

    open(FILEHANDLE, "<&=$fd")

or

    open(FILEHANDLE, "<&=", $fd)

or

    open(FH, ">>&=", OLDFH)

or

    open(FH, ">>&=OLDFH")

Being parsimonious on filehandles is also useful (besides being
parsimonious) for example when something is dependent on file
descriptors, like for example locking using flock().  If you do just
C<< open(A, '>>&B') >>, the filehandle A will not have the same file
descriptor as B, and therefore flock(A) will not flock(B), and vice
versa.  But with C<< open(A, '>>&=B') >> the filehandles will share
the same file descriptor.

Note that if you are using Perls older than 5.8.0, Perl will be using
the standard C libraries' fdopen() to implement the "=" functionality.
On many UNIX systems fdopen() fails when file descriptors exceed a
certain value, typically 255.  For Perls 5.8.0 and later, PerlIO is
most often the default.

You can see whether Perl has been compiled with PerlIO or not by
running C<perl -V> and looking for the C<useperlio=> line.  If
C<useperlio> is C<define>, you have PerlIO, otherwise you don't.

If you open a pipe on the command C<'-'>, i.e., either C<'|-'> or C<'-|'>
with 2-arguments (or 1-argument) form of open(), then
there is an implicit fork done, and the return value of open is the pid
of the child within the parent process, and C<0> within the child
process.  (Use C<defined($pid)> to determine whether the open was successful.)
The filehandle behaves normally for the parent, but i/o to that
filehandle is piped from/to the STDOUT/STDIN of the child process.
In the child process the filehandle isn't opened--i/o happens from/to
the new STDOUT or STDIN.  Typically this is used like the normal
piped open when you want to exercise more control over just how the
pipe command gets executed, such as when you are running setuid, and
don't want to have to scan shell commands for metacharacters.
The following blocks are more or less equivalent:

    open(FOO, "|tr '[a-z]' '[A-Z]'");
    open(FOO, '|-', "tr '[a-z]' '[A-Z]'");
    open(FOO, '|-') || exec 'tr', '[a-z]', '[A-Z]';
    open(FOO, '|-', "tr", '[a-z]', '[A-Z]');

    open(FOO, "cat -n '$file'|");
    open(FOO, '-|', "cat -n '$file'");
    open(FOO, '-|') || exec 'cat', '-n', $file;
    open(FOO, '-|', "cat", '-n', $file);

The last example in each block shows the pipe as "list form", which is
not yet supported on all platforms.  A good rule of thumb is that if
your platform has true C<fork()> (in other words, if your platform is
UNIX) you can use the list form.

See L<perlipc/"Safe Pipe Opens"> for more examples of this.
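
A sketch of the list form reading from a child follows.  It assumes a
platform with a true C<fork()>, as noted above, and uses the running
perl (C<$^X>) with a one-line script as a stand-in for a real command;
the variable names are invented for illustration.

```perl
use strict;
use warnings;

# $^X is the path of the running perl; the one-liner is a stand-in command.
my @command = ($^X, '-e', 'print "hello from child\n"');

# List form: no shell is involved, so the arguments need no quoting.
open(my $from_child, '-|', @command) or die "Can't start child: $!";
my $output = do { local $/; <$from_child> };     # slurp everything the child prints
close($from_child) or die "Child failed: status $?";

print $output;
```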

Beginning with v5.6.0, Perl will attempt to flush all files opened for
output before any operation that may do a fork, but this may not be
supported on some platforms (see L<perlport>).  To be safe, you may need
to set C<$|> ($AUTOFLUSH in English) or call the C<autoflush()> method
of C<IO::Handle> on any open handles.

On systems that support a close-on-exec flag on files, the flag will
be set for the newly opened file descriptor as determined by the value
of $^F.  See L<perlvar/$^F>.

Closing any piped filehandle causes the parent process to wait for the
child to finish, and returns the status value in C<$?>.

The filename passed to 2-argument (or 1-argument) form of open() will
have leading and trailing whitespace deleted, and the normal
redirection characters honored.  This property, known as "magic open",
can often be used to good effect.  A user could specify a filename of
F<"rsh cat file |">, or you could change certain filenames as needed:

    $filename =~ s/(.*\.gz)\s*$/gzip -dc < $1|/;
    open(FH, $filename) or die "Can't open $filename: $!";

Use 3-argument form to open a file with arbitrary weird characters in it,

    open(FOO, '<', $file);

otherwise it's necessary to protect any leading and trailing whitespace:

    $file =~ s#^(\s)#./$1#;
    open(FOO, "< $file\0");

(this may not work on some bizarre filesystems).  One should
conscientiously choose between the I<magic> and 3-arguments form
of open():

    open IN, $ARGV[0];

will allow the user to specify an argument of the form C<"rsh cat file |">,
but will not work on a filename which happens to have a trailing space, while

    open IN, '<', $ARGV[0];

will have exactly the opposite restrictions.

If you want a "real" C C<open> (see L<open(2)> on your system), then you
should use the C<sysopen> function, which involves no such magic (but
may use subtly different filemodes than Perl open(), which is mapped
to C fopen()).  This is
another way to protect your filenames from interpretation.  For example:

    use Fcntl;          # for the O_* constants
    sysopen(HANDLE, $path, O_RDWR|O_CREAT|O_EXCL)
        or die "sysopen $path: $!";
    $oldfh = select(HANDLE); $| = 1; select($oldfh);
    print HANDLE "stuff $$\n";
    seek(HANDLE, 0, 0);
    print "File contains: ", <HANDLE>;

Using the constructor from the C<IO::Handle> package (or one of its
subclasses, such as C<IO::File> or C<IO::Socket>), you can generate anonymous
filehandles that have the scope of whatever variables hold references to
them, and automatically close whenever and however you leave that scope:

    use IO::File;
    sub read_myfile_munged {
        my $ALL = shift;
        my $handle = new IO::File;
        open($handle, "myfile") or die "myfile: $!";
        $first = <$handle>
            or return ();     # Automatically closed here.
        mung $first or die "mung failed";       # Or here.
        return $first, <$handle> if $ALL;       # Or here.
        $first;                                 # Or here.
    }

See L</seek> for some details about mixing reading and writing.

=item opendir DIRHANDLE,EXPR

Opens a directory named EXPR for processing by C<readdir>, C<telldir>,
C<seekdir>, C<rewinddir>, and C<closedir>.  Returns true if successful.
DIRHANDLE may be an expression whose value can be used as an indirect
dirhandle, usually the real dirhandle name.  If DIRHANDLE is an undefined
scalar variable (or array or hash element), the variable is assigned a
reference to a new anonymous dirhandle.
DIRHANDLEs have their own namespace separate from FILEHANDLEs.
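
Reading a directory through an anonymous dirhandle might look like the
sketch below.  It builds its own scratch directory with C<File::Temp>,
and the file names are invented for illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Spec;

my $dir = tempdir(CLEANUP => 1);                 # scratch directory for the sketch
for my $name (qw(a.txt b.txt)) {                 # hypothetical file names
    open(my $fh, '>', File::Spec->catfile($dir, $name))
        or die "Can't create $name: $!";
    close($fh);
}

# $dh is undefined, so opendir stores a new anonymous dirhandle in it.
opendir(my $dh, $dir) or die "Can't open directory $dir: $!";
my @entries = sort grep { $_ ne '.' && $_ ne '..' } readdir($dh);
closedir($dh);

print "@entries\n";
```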

=item ord EXPR

=item ord

Returns the numeric (the native 8-bit encoding, like ASCII or EBCDIC,
or Unicode) value of the first character of EXPR.  If EXPR is omitted,
uses C<$_>.

For the reverse, see L</chr>.
See L<perlunicode> and L<encoding> for more about Unicode.
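
A few values, as a sketch; the numbers in the comments assume an
ASCII-based platform rather than EBCDIC, and the variable names are
invented for illustration.

```perl
use strict;
use warnings;

my $code  = ord("A");           # 65 on ASCII-based platforms
my $wide  = ord("\x{263A}");    # 0x263A, a Unicode code point
my $first = ord("hello");       # only the first character counts: 104 for "h"
my $round = chr(ord("z"));      # chr undoes ord

printf "%d %#x %d %s\n", $code, $wide, $first, $round;
```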

=item our EXPR

=item our TYPE EXPR

=item our EXPR : ATTRS

=item our TYPE EXPR : ATTRS

C<our> associates a simple name with a package variable in the current
package for use within the current scope.  When C<use strict 'vars'> is in
effect, C<our> lets you use declared global variables without qualifying
them with package names, within the lexical scope of the C<our> declaration.
In this way C<our> differs from C<use vars>, which is package scoped.

Unlike C<my>, which both allocates storage for a variable and associates
a simple name with that storage for use within the current scope, C<our>
associates a simple name with a package variable in the current package,
for use within the current scope.  In other words, C<our> has the same
scoping rules as C<my>, but does not necessarily create a
variable.

If more than one value is listed, the list must be placed
in parentheses.

    our $foo;
    our($bar, $baz);

An C<our> declaration declares a global variable that will be visible
across its entire lexical scope, even across package boundaries.  The
package in which the variable is entered is determined at the point
of the declaration, not at the point of use.  This means the following
behavior holds:

    package Foo;
    our $bar;           # declares $Foo::bar for rest of lexical scope
    $bar = 20;

    package Bar;
    print $bar;         # prints 20

Multiple C<our> declarations with the same name in the same lexical
scope are allowed if they are in different packages.  If they happen
to be in the same package, Perl will emit warnings if you have asked
for them, just like multiple C<my> declarations.  Unlike a second
C<my> declaration, which will bind the name to a fresh variable, a
second C<our> declaration in the same package, in the same scope, is
merely redundant.

    use warnings;
    package Foo;
    our $bar;           # declares $Foo::bar for rest of lexical scope
    $bar = 20;

    package Bar;
    our $bar = 30;      # declares $Bar::bar for rest of lexical scope
    print $bar;         # prints 30

    our $bar;           # emits warning but has no other effect
    print $bar;         # still prints 30

An C<our> declaration may also have a list of attributes associated
with it.

The exact semantics and interface of TYPE and ATTRS are still
evolving.  TYPE is currently bound to the use of C<fields> pragma,
and attributes are handled using the C<attributes> pragma, or starting
from Perl 5.8.0 also via the C<Attribute::Handlers> module.  See
L<perlsub/"Private Variables via my()"> for details, and L<fields>,
L<attributes>, and L<Attribute::Handlers>.

The only currently recognized C<our()> attribute is C<unique> which
indicates that a single copy of the global is to be used by all
interpreters should the program happen to be running in a
multi-interpreter environment.  (The default behaviour would be for
each interpreter to have its own copy of the global.)  Examples:

    our @EXPORT : unique = qw(foo);
    our %EXPORT_TAGS : unique = (bar => [qw(aa bb cc)]);
    our $VERSION : unique = "1.00";

Note that this attribute also has the effect of making the global
readonly when the first new interpreter is cloned (for example,
when the first new thread is created).

Multi-interpreter environments can come to being either through the
fork() emulation on Windows platforms, or by embedding perl in a
multi-threaded application.  The C<unique> attribute does nothing in
all other environments.

Warning: the current implementation of this attribute operates on the
typeglob associated with the variable; this means that C<our $x : unique>
also has the effect of C<our @x : unique; our %x : unique>.  This may be
subject to change.

=item
pack
TEMPLATE,LIST
Takes a LIST of
values
and converts it into a string using the rules
given
by the TEMPLATE. The resulting string is the concatenation of
the converted
values
. Typically,
each
converted value looks
like its machine-level representation. For example, on 32-bit machines
a converted integer may be represented by a sequence of 4 bytes.
The TEMPLATE is a sequence of characters that give the order and type
of values, as follows:

    a   A string with arbitrary binary data, will be null padded.
    A   A text (ASCII) string, will be space padded.
    Z   A null terminated (ASCIZ) string, will be null padded.

    b   A bit string (ascending bit order inside each byte, like vec()).
    B   A bit string (descending bit order inside each byte).
    h   A hex string (low nybble first).
    H   A hex string (high nybble first).

    c   A signed char value.
    C   An unsigned char value.  Only does bytes.  See U for Unicode.

    s   A signed short value.
    S   An unsigned short value.
          (This 'short' is _exactly_ 16 bits, which may differ from
           what a local C compiler calls 'short'.  If you want
           native-length shorts, use the '!' suffix.)

    i   A signed integer value.
    I   An unsigned integer value.
          (This 'integer' is _at_least_ 32 bits wide.  Its exact
           size depends on what a local C compiler calls 'int',
           and may even be larger than the 'long' described in
           the next item.)

    l   A signed long value.
    L   An unsigned long value.
          (This 'long' is _exactly_ 32 bits, which may differ from
           what a local C compiler calls 'long'.  If you want
           native-length longs, use the '!' suffix.)

    n   An unsigned short in "network" (big-endian) order.
    N   An unsigned long in "network" (big-endian) order.
    v   An unsigned short in "VAX" (little-endian) order.
    V   An unsigned long in "VAX" (little-endian) order.
          (These 'shorts' and 'longs' are _exactly_ 16 bits and
           _exactly_ 32 bits, respectively.)

    q   A signed quad (64-bit) value.
    Q   An unsigned quad value.
          (Quads are available only if your system supports 64-bit
           integer values _and_ if Perl has been compiled to support those.
           Causes a fatal error otherwise.)

    j   A signed integer value (a Perl internal integer, IV).
    J   An unsigned integer value (a Perl internal unsigned integer, UV).

    f   A single-precision float in the native format.
    d   A double-precision float in the native format.

    F   A floating point value in the native format
           (a Perl internal floating point value, NV).
    D   A long double-precision float in the native format.
          (Long doubles are available only if your system supports long
           double values _and_ if Perl has been compiled to support those.
           Causes a fatal error otherwise.)

    p   A pointer to a null-terminated string.
    P   A pointer to a structure (fixed-length string).

    u   A uuencoded string.
    U   A Unicode character number.  Encodes to UTF-8 internally
        (or UTF-EBCDIC in EBCDIC platforms).

    w   A BER compressed integer (not an ASN.1 BER, see perlpacktut for
        details).  Its bytes represent an unsigned integer in base 128,
        most significant digit first, with as few digits as possible.  Bit
        eight (the high bit) is set on each byte except the last.

    x   A null byte.
    X   Back up a byte.
    @   Null fill to absolute position, counted from the start of
        the innermost ()-group.
    (   Start of a ()-group.

The following rules apply:
=over 8
=item *
Each letter may optionally be followed by a number giving a repeat
count. With all types except C<a>, C<A>, C<Z>, C<b>, C<B>, C<h>,
C<H>, C<@>, C<x>, C<X> and C<P> the
pack
function will gobble up that
many
values
from the LIST. A C<*>
for
the repeat count means to
use
however many items are left, except
for
C<@>, C<x>, C<X>, where it is
equivalent to C<0>, and C<u>, where it is equivalent to 1 (or 45, which
is the same).  A numeric repeat count may optionally be enclosed in
brackets, as in C<
pack
'C[80]'
,
@arr
>.
One can replace the numeric repeat count by a template enclosed in brackets;
then the packed
length
of this template in bytes is used as a count.
For example, C<x[L]> skips a long (it skips the number of bytes in a long);
the template C<
$t
X[
$t
]
$t
>
unpack
()s twice what
$t
unpacks.
If the template in brackets contains alignment commands (such as C<x![d]>),
its packed
length
is calculated as
if
the start of the template
has
the maximal
possible alignment.
When used
with
C<Z>, C<*> results in the addition of a trailing null
byte (so the packed result will be one longer than the byte C<
length
>
of the item).
The repeat count
for
C<u> is interpreted as the maximal number of bytes
to encode per line of output,
with
0 and 1 replaced by 45.
=item *
The C<a>, C<A>, and C<Z> types gobble just one value, but
pack
it as a
string of
length
count, padding
with
nulls or spaces as necessary. When
unpacking, C<A> strips trailing spaces and nulls, C<Z> strips everything
after
the first null, and C<a> returns data verbatim. When packing,
C<a>, and C<Z> are equivalent.
If the value-to-
pack
is too long, it is truncated. If too long and an
explicit count is provided, C<Z> packs only C<
$count
-1> bytes, followed
by a null byte. Thus C<Z> always packs a trailing null byte under
all circumstances.
=item *
Likewise, the C<b> and C<B> fields
pack
a string that many bits long.
Each byte of the input field of
pack
() generates 1 bit of the result.
Each result bit is based on the least-significant bit of the corresponding
input byte, i.e., on C<
ord
(
$byte
)%2>. In particular, bytes C<
"0"
> and
C<
"1"
> generate bits 0 and 1, as
do
bytes C<
"\0"
> and C<
"\1"
>.
Starting from the beginning of the input string of
pack
(),
each
8-tuple
of bytes is converted to 1 byte of output. With
format
C<b>
the first byte of the 8-tuple determines the least-significant bit of a
byte, and
with
format
C<B> it determines the most-significant bit of
a byte.
If the
length
of the input string is not exactly divisible by 8, the
remainder is packed as
if
the input string were padded by null bytes
at the end. Similarly, during
unpack
()ing the
"extra"
bits are ignored.
If the input string of
pack
() is longer than needed, extra bytes are ignored.
A C<*>
for
the repeat count of
pack
() means to
use
all the bytes of
the input field. On
unpack
()ing the bits are converted to a string
of C<
"0"
>s and C<
"1"
>s.
=item *
The C<h> and C<H> fields
pack
a string that many nybbles (4-bit groups,
representable as hexadecimal digits, 0-9a-f) long.
Each byte of the input field of
pack
() generates 4 bits of the result.
For non-alphabetical bytes the result is based on the 4 least-significant
bits of the input byte, i.e., on C<
ord
(
$byte
)%16>. In particular,
bytes C<
"0"
> and C<
"1"
> generate nybbles 0 and 1, as
do
bytes
C<
"\0"
> and C<
"\1"
>. For bytes C<
"a"
..
"f"
> and C<
"A"
..
"F"
> the result
is compatible
with
the usual hexadecimal digits, so that C<
"a"
> and
C<
"A"
> both generate the nybble C<0xa==10>. The result
for
bytes
C<
"g"
..
"z"
> and C<
"G"
..
"Z"
> is not well-
defined
.
Starting from the beginning of the input string of
pack
(),
each
pair
of bytes is converted to 1 byte of output. With
format
C<h> the
first byte of the pair determines the least-significant nybble of the
output byte, and
with
format
C<H> it determines the most-significant
nybble.
If the
length
of the input string is not even, it behaves as
if
padded
by a null byte at the end. Similarly, during
unpack
()ing the
"extra"
nybbles are ignored.
If the input string of
pack
() is longer than needed, extra bytes are ignored.
A C<*>
for
the repeat count of
pack
() means to
use
all the bytes of
the input field. On
unpack
()ing the bits are converted to a string
of hexadecimal digits.
=item *
The C<p> type packs a pointer to a null-terminated string. You are
responsible
for
ensuring the string is not a temporary value (which can
potentially get deallocated
before
you get
around
to using the packed result).
The C<P> type packs a pointer to a structure of the size indicated by the
length
. A NULL pointer is created
if
the corresponding value
for
C<p> or
C<P> is C<
undef
>, similarly
for
unpack
().
=item *
The C</> template character allows packing and unpacking of strings where
the packed structure contains a byte count followed by the string itself.
You
write
I<
length
-item>C</>I<string-item>.
The I<
length
-item> can be any C<
pack
> template letter, and describes
how the
length
value is packed. The ones likely to be of most
use
are
integer-packing ones like C<n> (
for
Java strings), C<w> (
for
ASN.1 or
SNMP) and C<N> (
for
Sun XDR).
For C<
pack
>, the I<string-item> must, at present, be C<
"A*"
>, C<
"a*"
> or
C<
"Z*"
>. For C<
unpack
> the
length
of the string is obtained from the
I<
length
-item>, but
if
you put in the
'*'
it will be ignored. For all other
codes, C<
unpack
> applies the
length
value to the
next
item, which must not
have a repeat count.

    unpack 'C/a', "\04Gurusamy";        gives 'Guru'
    unpack 'a3/A* A*', '007 Bond  J ';  gives (' Bond','J')
    pack 'n/a* w/a*','hello,','world';  gives "\000\006hello,\005world"

The I<
length
-item> is not returned explicitly from C<
unpack
>.
Adding a count to the I<
length
-item> letter is unlikely to
do
anything
useful,
unless
that letter is C<A>, C<a> or C<Z>. Packing
with
a
I<
length
-item> of C<a> or C<Z> may introduce C<
"\000"
> characters,
which Perl does not regard as legal in numeric strings.
=item *
The integer types C<s>, C<S>, C<l>, and C<L> may be
immediately followed by a C<!> suffix to signify native shorts or
longs.  As the table above shows, a bare C<l> means
exactly 32 bits, while the native C<long> (as seen by the local C compiler)
may be larger.  This is an issue mainly on 64-bit platforms.  You can
see whether using C<!> makes any difference by

    print length(pack("s")), " ", length(pack("s!")), "\n";
    print length(pack("l")), " ", length(pack("l!")), "\n";

C<i!> and C<I!> also work, but only for completeness;
they are identical to C<i> and C<I>.
The actual sizes (in bytes) of native shorts, ints, longs, and long
longs on the platform where Perl was built are also available via
L<Config>:

    use Config;
    print $Config{shortsize},    "\n";
    print $Config{intsize},      "\n";
    print $Config{longsize},     "\n";
    print $Config{longlongsize}, "\n";

(The C<$Config{longlongsize}> will be undefined if your system does
not support long longs.)
=item *
The integer formats C<s>, C<S>, C<i>, C<I>, C<l>, C<L>, C<j>, and C<J>
are inherently non-portable between processors and operating systems
because they obey the native byteorder and endianness. For example a
4-byte integer 0x12345678 (305419896 decimal) would be ordered natively
(arranged in and handled by the CPU registers) into bytes as

    0x12 0x34 0x56 0x78    # big-endian
    0x78 0x56 0x34 0x12    # little-endian

Basically, the Intel and VAX CPUs are little-endian,
while
everybody
else
,
for
example Motorola m68k/88k, PPC, Sparc, HP PA, Power, and
Cray are big-endian. Alpha and MIPS can be either: Digital/Compaq
used/uses them in little-endian mode; SGI/Cray uses them in big-endian
mode.
The names `big-endian' and `little-endian' are comic references to
the classic "Gulliver's Travels" (via the paper "On Holy Wars and a
Plea for Peace" by Danny Cohen, USC/ISI IEN 137, April 1, 1980) and
the egg-eating habits of the Lilliputians.
Some systems may have even weirder byte orders such as

    0x56 0x78 0x12 0x34
    0x34 0x12 0x78 0x56

You can see your system's preference with

    print join(" ", map { sprintf "%#02x", $_ }
                        unpack("C*", pack("L", 0x12345678))), "\n";

The byteorder on the platform where Perl was built is also available
via L<Config>:

    use Config;
    print $Config{byteorder}, "\n";

Byteorders C<
'1234'
> and C<
'12345678'
> are little-endian, C<
'4321'
>
and C<
'87654321'
> are big-endian.
If you want portable packed integers, use the formats C<n>, C<N>,
C<v>, and C<V>; their byte endianness and size are known.
See also L<perlport>.
=item *
Real numbers (floats and doubles) are in the native machine
format
only;
due to the multiplicity of floating formats
around
, and the lack of a
standard
"network"
representation,
no
facility
for
interchange
has
been
made. This means that packed floating point data written on one machine
may not be readable on another - even
if
both
use
IEEE floating point
arithmetic (as the endian-ness of the memory representation is not part
of the IEEE spec). See also L<perlport>.
Note that Perl uses doubles internally for all numeric calculation, and
converting from double into float and thence back to double again will
lose precision (i.e., C<unpack("f", pack("f", $foo))> will not in general
equal $foo).
=item *
If the pattern begins
with
a C<U>, the resulting string will be
treated as UTF-8-encoded Unicode. You can force UTF-8 encoding on in a
string
with
an initial C<U0>, and the bytes that follow will be
interpreted as Unicode characters. If you don't want this to happen,
you can begin your pattern
with
C<C0> (or anything
else
) to force Perl
not to UTF-8 encode your string, and then follow this
with
a C<U*>
somewhere in your pattern.
=item *
You must yourself do any alignment or padding by inserting, for example,
enough C<'x'>es while packing.  There is no way that pack() and unpack()
could know where the bytes are going to or coming from.  Therefore
C<pack> (and C<unpack>) handle their output and input as flat
sequences of bytes.
=item *
A ()-group is a sub-TEMPLATE enclosed in parentheses.  A group may
take a repeat count, both as postfix, and for unpack() also via the C</>
template character.  Within each repetition of a group, positioning with
C<@> starts again at 0.  Therefore, the result of

    pack( '@1A((@2A)@3A)', 'a', 'b', 'c' )

is the string "\0a\0\0bc".
=item *
C<x> and C<X> accept the C<!> modifier.  In this case they act as
alignment commands: they jump forward/back to the closest position
aligned at a multiple of C<count> bytes.  For example, to pack() or
unpack() C's C<struct {char c; double d; char cc[2]}> one may need to
use the template C<C x![d] d C[2]>; this assumes that doubles must be
aligned on the double's size.

For alignment commands, a C<count> of 0 is equivalent to a C<count> of 1;
both result in no-ops.
=item *
A comment in a TEMPLATE starts with C<#> and goes to the end of line.
White space may be used to separate pack codes from each other, but
a C<!> modifier and a repeat count must follow immediately.
=item *
If TEMPLATE requires more arguments to
pack
() than actually
given
,
pack
()
assumes additional C<
""
> arguments. If TEMPLATE requires fewer arguments
to
pack
() than actually
given
, extra arguments are ignored.
=back
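The padding, bit-order and length-prefix rules in the list above can be
checked with a few one-line packs.  This is only a sketch; the variable
names are invented for the illustration, and the values in the comments
follow directly from the rules as stated.

```perl
use strict;
use warnings;

# a/A/Z with an explicit count: pad (or truncate) to exactly that many bytes.
my $a_field = pack("a5", "hi");       # "hi\0\0\0"  (null padded)
my $A_field = pack("A5", "hi");       # "hi   "     (space padded)
my $Z_field = pack("Z5", "hello");    # "hell\0"    (count-1 bytes plus a null)

# b and B differ only in the bit order inside each byte.
my $ascending  = pack("b8", "10000000");   # first input byte is the low bit
my $descending = pack("B8", "10000000");   # first input byte is the high bit

# h and H differ only in the nybble order inside each byte.
my $low_first  = pack("h2", "1f");    # byte 0xf1
my $high_first = pack("H2", "1f");    # byte 0x1f

# length-item/string-item: a 16-bit big-endian count, then the string.
my $counted  = pack("n/a*", "hello");         # "\0\5hello"
my ($string) = unpack("n/a*", $counted);      # "hello"

print unpack("H*", $_), "\n"
    for $a_field, $A_field, $Z_field, $ascending, $descending,
        $low_first, $high_first, $counted;
```

Printing through C<unpack("H*", ...)> keeps the output readable even
though several of the packed strings contain null bytes.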
Examples:
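
    $foo = pack("CCCC",65,66,67,68);
    # foo eq "ABCD"
    $foo = pack("C4",65,66,67,68);
    # same thing
    $foo = pack("U4",0x24b6,0x24b7,0x24b8,0x24b9);
    # same thing with Unicode circled letters

    $foo = pack("ccxxcc",65,66,67,68);
    # foo eq "AB\0\0CD"

    $foo = pack("s2",1,2);
    # "\1\0\2\0" on little-endian
    # "\0\1\0\2" on big-endian

    $foo = pack("a4","abcd","x","y","z");
    # "abcd"

    $foo = pack("aaaa","abcd","x","y","z");
    # "axyz"

    $foo = pack("a14","abcdefg");
    # "abcdefg\0\0\0\0\0\0\0"

    $foo = pack("i9pl", gmtime);
    # a real struct tm (on my system anyway)

    $utmp_template = "Z8 Z8 Z16 L";
    $utmp = pack($utmp_template, @utmp1);
    # a struct utmp (BSDish)

    @utmp2 = unpack($utmp_template, $utmp);
    # "@utmp1" eq "@utmp2"

    sub bintodec {
        unpack("N", pack("B32", substr("0" x 32 . shift, -32)));
    }

    $foo = pack('sx2l', 12, 34);
    # short 12, two zero bytes padding, long 34
    $bar = pack('s@4l', 12, 34);
    # short 12, zero fill to position 4, long 34
    # $foo eq $bar
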
The same template may generally also be used in
unpack
().
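As a small illustration of reusing one template in both directions, the
following sketch packs and unpacks a record using only the portable
formats C<n> and C<N> plus a fixed-width C<A> field.  The record layout
and the variable names are hypothetical, chosen only for the example.

```perl
use strict;
use warnings;

# Hypothetical layout: 16-bit id, 32-bit timestamp, 8-byte name field.
my $template = "n N A8";

my $record = pack($template, 513, 1_000_000, "perl");
my ($id, $stamp, $name) = unpack($template, $record);

# The packed size is fixed by the template: 2 + 4 + 8 bytes.
print length($record), " bytes: ", unpack("H*", $record), "\n";
print "$id $stamp '$name'\n";
```

Because C<n> and C<N> have a defined size and byte order, the packed
string should be the same on any platform; C<A> strips the padding
spaces again when unpacking.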
=item
package
Declares the compilation unit as being in the
given
namespace. The scope
of the
package
declaration is from the declaration itself through the end
of the enclosing block, file, or
eval
(the same as the C<
my
> operator).
All further unqualified dynamic identifiers will be in this namespace.
A
package
statement affects only dynamic variables--including those
you've used C<
local
> on--but I<not> lexical variables, which are created
with
C<
my
>. Typically it would be the first declaration in a file to
be included by the C<
require
> or C<
use
> operator. You can switch into a
package
in more than one place; it merely influences which symbol table
is used by the compiler
for
the rest of that block. You can refer to
variables and filehandles in other packages by prefixing the identifier
with
the
package
name and a double colon: C<
$Package::Variable
>.
If the package name is null, the C<main> package is assumed.  That is,
C<$::sail> is equivalent to C<$main::sail> (as well as to C<$main'sail>,
still seen in older code).
If NAMESPACE is omitted, then there is
no
current
package
, and all
identifiers must be fully qualified or lexicals. However, you are
strongly advised not to make
use
of this feature. Its
use
can cause
unexpected behaviour, even crashing some versions of Perl. It is
deprecated, and will be removed from a future release.
See L<perlmod/
"Packages"
>
for
more information about packages, modules,
and classes. See L<perlsub>
for
other scoping issues.
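The scoping rule can be seen in a short sketch.  The package name
C<Hypothetical::Counter> and the variables are invented for the example;
the point is only that the declaration stops applying at the end of the
enclosing block.

```perl
use strict;
use warnings;

our $count = 10;                     # this is $main::count

{
    package Hypothetical::Counter;   # in effect until the closing brace
    our $count = 0;                  # this is $Hypothetical::Counter::count
    sub bump { return ++$count }
}

# Back in package main: the unqualified name means $main::count again.
Hypothetical::Counter::bump() for 1 .. 3;
print "$count $::count $Hypothetical::Counter::count\n";   # prints "10 10 3"
```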
=item
pipe
READHANDLE,WRITEHANDLE
Opens a pair of connected pipes like the corresponding
system
call.
Note that
if
you set up a loop of piped processes, deadlock can occur
unless
you are very careful. In addition, note that Perl's pipes
use
IO buffering, so you may need to set C<$|> to flush your WRITEHANDLE
after
each
command, depending on the application.
See L<IPC::Open2>, L<IPC::Open3>, and L<perlipc/
"Bidirectional Communication"
>
for
examples of such things.
On systems that support a
close
-on-
exec
flag on files, the flag will be set
for
the newly opened file descriptors as determined by the value of $^F.
See L<perlvar/$^F>.
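A minimal single-process sketch of the call follows; real programs would
normally C<fork> between creating the pipe and using it.  The handle
names are arbitrary, and the message is kept short so that it fits in
the pipe buffer without a reader running concurrently.

```perl
use strict;
use warnings;

pipe(my $reader, my $writer) or die "pipe failed: $!";

# Unbuffer the write end, as the note about IO buffering suggests.
my $previous = select($writer);
$| = 1;
select($previous);

print {$writer} "ping\n";
close($writer) or die "close failed: $!";

my $line = <$reader>;          # reads back "ping\n"
close($reader);
print "read back: $line";
```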
=item
pop
ARRAY
=item
pop
Pops and returns the last value of the array, shortening the array by
one element.  Has an effect similar to

    $ARRAY[$#ARRAY--]

If there are no elements in the array, returns the undefined value
(although this may happen at other times as well).  If ARRAY is
omitted, pops the C<@ARGV> array in the main program, and the C<@_>
array in subroutines, just like C<shift>.
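For instance (a sketch with made-up data):

```perl
use strict;
use warnings;

my @stack = (1, 2, 3);
my $top   = pop @stack;      # 3; @stack is now (1, 2)

my @empty;
my $none  = pop @empty;      # undef: nothing to pop

print "top=$top remaining=@stack\n";
```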
=item
pos
SCALAR
=item
pos
Returns the offset of where the
last
C<m//g> search left off
for
the variable
in question (C<
$_
> is used
when
the variable is not specified). Note that
0 is a valid match offset. C<
undef
> indicates that the search position
is
reset
(usually due to match failure, but can also be because
no
match
has
yet been performed on the
scalar
). C<
pos
> directly accesses the location used
by the regexp engine to store the offset, so assigning to C<
pos
> will change
that offset, and so will also influence the C<\G> zero-width assertion in
regular expressions. Because a failed C<m//gc> match doesn't
reset
the offset,
the
return
from C<
pos
> won't change either in this case. See L<perlre> and
L<perlop>.
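The following sketch shows the three behaviours described above: the
offset after each successful C<m//g> match, the reset to C<undef> after
a failed match, and the effect of assigning to C<pos> on C<\G>.  The
sample string is arbitrary.

```perl
use strict;
use warnings;

my $text = "aXbXc";

my @offsets;
while ($text =~ /X/g) {
    push @offsets, pos($text);       # offset just past each match: 2, then 4
}
my $after_loop = pos($text);         # undef: the final failed match reset it

pos($text) = 3;                      # move the search position by hand
my $rest = $text =~ /\G(.+)/g ? $1 : undef;   # \G anchors at offset 3: "Xc"

print "@offsets $rest\n";
```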
=item
print
FILEHANDLE LIST
=item
print
LIST
=item
print
Prints a string or a list of strings. Returns true
if
successful.
FILEHANDLE may be a
scalar
variable name, in which case the variable
contains the name of or a reference to the filehandle, thus introducing
one level of indirection. (NOTE: If FILEHANDLE is a variable and
the
next
token is a term, it may be misinterpreted as an operator
unless
you interpose a C<+> or put parentheses
around
the arguments.)
If FILEHANDLE is omitted, prints by
default
to standard output (or
to the
last
selected output channel--see L</
select
>). If LIST is
also omitted, prints C<
$_
> to the currently selected output channel.
To set the
default
output channel to something other than STDOUT
use
the
select
operation. The current value of C<$,> (
if
any) is
printed between
each
LIST item. The current value of C<$\> (
if
any) is printed
after
the entire LIST
has
been printed. Because
print
takes a LIST, anything in the LIST is evaluated in list
context, and any subroutine that you call will have one or more of
its expressions evaluated in list context. Also be careful not to
follow the
print
keyword
with
a left parenthesis
unless
you want
the corresponding right parenthesis to terminate the arguments to
the
print
--interpose a C<+> or put parentheses
around
all the
arguments.
Note that if you're storing FILEHANDLEs in an array, or if you're using
any other expression more complex than a scalar variable to retrieve it,
you will have to use a block returning the filehandle value instead:

    print { $files[$i] } "stuff\n";
    print { $OK ? STDOUT : STDERR } "stuff\n";

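A runnable sketch of the block form, together with C<$,> and C<$\>,
is shown below.  It assumes a Perl that supports in-memory filehandles
(opening a reference to a scalar), which is used here only so that the
output can be inspected; the variable names are invented.

```perl
use strict;
use warnings;

my ($out_text, $err_text) = ('', '');
open(my $out, '>', \$out_text) or die "open failed: $!";
open(my $err, '>', \$err_text) or die "open failed: $!";
my @handles = ($out, $err);
my $ok = 0;

{
    local $, = "-";       # printed between LIST items
    local $\ = "!\n";     # printed after the whole LIST
    print { $handles[0] } "a", "b", "c";      # writes "a-b-c!\n"
    print { $ok ? $out : $err } "stuff";      # writes "stuff!\n" to $err
}
close($_) for @handles;

print $out_text, $err_text;
```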
=item
printf
FILEHANDLE FORMAT, LIST
=item
printf
FORMAT, LIST
Equivalent to C<
print
FILEHANDLE
sprintf
(FORMAT, LIST)>, except that C<$\>
(the output record separator) is not appended. The first argument
of the list will be interpreted as the C<
printf
>
format
. See C<
sprintf
>
for
an explanation of the
format
argument. If C<
use
locale> is in effect,
the character used
for
the decimal point in formatted real numbers is
affected by the LC_NUMERIC locale. See L<perllocale>.
Don't fall into the trap of using a C<
printf
>
when
a simple
C<
print
> would
do
. The C<
print
> is more efficient and less
error prone.
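The difference in the handling of C<$\> can be seen in this sketch,
which again assumes in-memory filehandle support and writes to a scalar
so the result can be examined.  No C<use locale> is in effect.

```perl
use strict;
use warnings;

my $captured = '';
open(my $fh, '>', \$captured) or die "open failed: $!";
{
    local $\ = "<EOR>";                                   # output record separator
    printf {$fh} "%-5s|%03d|%.2f\n", "ab", 7, 3.14159;    # $\ is not appended
    print  {$fh} "plain";                                 # $\ is appended
}
close($fh);

print $captured, "\n";
```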
=item
prototype
FUNCTION
Returns the
prototype
of a function as a string (or C<
undef
>
if
the
function
has
no
prototype
). FUNCTION is a reference to, or the name of,
the function whose
prototype
you want to retrieve.
If FUNCTION is a string starting with C<CORE::>, the rest is taken as a
name for a Perl builtin.  If the builtin is not I<overridable> (such as
C<qw//>) or its arguments cannot be expressed by a prototype (such as
C<system>), C<prototype> returns C<undef> because the builtin does not
really behave like a Perl function.  Otherwise, the string describing
the equivalent prototype is returned.
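A brief sketch of the cases described above; C<first_of> and
C<no_proto> are hypothetical subroutines defined only for the
illustration.

```perl
use strict;
use warnings;

sub first_of(\@) { return $_[0][0] }    # declared with a prototype
sub no_proto     { return 1 }           # declared without one

my $by_ref  = prototype(\&first_of);       # '\@'
my $by_name = prototype('first_of');       # '\@', name looked up in this package
my $none    = prototype(\&no_proto);       # undef: no prototype
my $builtin = prototype('CORE::system');   # undef: not expressible

print "$by_ref $by_name\n";
```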
=item
push
ARRAY,LIST
Treats ARRAY as a stack, and pushes the values of LIST
onto the end of ARRAY.  The length of ARRAY increases by the length of
LIST.  Has the same effect as

    for $value (LIST) {
        $ARRAY[++$#ARRAY] = $value;
    }

but is more efficient.  Returns the new number of elements in the array.
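For example (a sketch with made-up data), LIST is flattened before the
values are appended, and the return value is the new element count:

```perl
use strict;
use warnings;

my @queue = ('a');
my @more  = ('b', 'c');
my $size  = push @queue, @more, 'd';    # returns 4

print "$size: @queue\n";                # prints "4: a b c d"
```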
=item
q/STRING/
=item
qq/STRING/
=item
qr/STRING/
=item
qx/STRING/
=item
qw/STRING/
Generalized quotes. See L<perlop/
"Regexp Quote-Like Operators"
>.
=item
quotemeta
EXPR
=item
quotemeta
Returns the value of EXPR
with
all non-
"word"
characters backslashed. (That is, all characters not matching
C</[A-Za-z_0-9]/> will be preceded by a backslash in the
returned string, regardless of any locale settings.)
This is the internal function implementing
the C<\Q> escape in double-quoted strings.
If EXPR is omitted, uses C<
$_
>.
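A typical use is to match a string literally inside a pattern.  The
sample strings below are arbitrary:

```perl
use strict;
use warnings;

my $literal = "1+1=2 (really?)";
my $quoted  = quotemeta($literal);      # 1\+1\=2\ \(really\?\)

# Without quoting, "+", "(", "?" and ")" would act as metacharacters.
my $found = "is 1+1=2 (really?) true" =~ /$quoted/ ? 1 : 0;

# \Q...\E in a double-quoted string gives the same result.
my $same = "\Q$literal\E" eq $quoted ? 1 : 0;

print "$quoted\n";
```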
=item
rand
EXPR
=item
rand
Returns a random fractional number greater than or equal to C<0> and less
than the value of EXPR. (EXPR should be positive.) If EXPR is
omitted, the value C<1> is used. Currently EXPR
with
the value C<0> is
also special-cased as C<1> - this
has
not been documented
before
perl 5.8.0
and is subject to change in future versions of perl. Automatically calls
C<
srand
>
unless
C<
srand
>
has
already been called. See also C<
srand
>.
Apply C<int()> to the value returned by C<rand()> if you want random
integers instead of random fractional numbers.  For example,

    int(rand(10))

returns a random integer between C<0> and C<9>, inclusive.
(Note: If your
rand
function consistently returns numbers that are too
large or too small, then your version of Perl was probably compiled
with
the wrong number of RANDBITS.)
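Two common idioms built on C<int(rand(...))> are sketched below.  The
fixed seed is there only to make a run repeatable and is not needed in
ordinary use; the list of colours is made up.

```perl
use strict;
use warnings;

srand(42);    # fixed seed, only so that repeated runs behave the same

# Simulated die rolls: shift the 0..5 range up to 1..6.
my @rolls = map { 1 + int(rand(6)) } 1 .. 1000;

# Pick a random element: rand() sees the array in scalar context (its size).
my @colors = qw(red green blue);
my $pick   = $colors[ int(rand(@colors)) ];

print "first rolls: @rolls[0 .. 4], pick: $pick\n";
```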
=item
read
FILEHANDLE,SCALAR,LENGTH,OFFSET
=item
read
FILEHANDLE,SCALAR,LENGTH
Attempts to
read
LENGTH I<characters> of data into variable SCALAR
from the specified FILEHANDLE. Returns the number of characters
actually
read
, C<0> at end of file, or
undef
if
there was an error (in
the latter case C<$!> is also set). SCALAR will be grown or shrunk
so that the
last
character actually
read
is the
last
character of the
scalar
after
the
read
.
An OFFSET may be specified to place the
read
data at some place in the
string other than the beginning. A negative OFFSET specifies
placement at that many characters counting backwards from the end of
the string. A positive OFFSET greater than the
length
of SCALAR
results in the string being padded to the required size
with
C<
"\0"
>
bytes
before
the result of the
read
is appended.
The call is actually implemented in terms of either Perl
's or system'
s
fread() call. To get a true
read
(2)
system
call, see C<
sysread
>.
Note the I<characters>: depending on the status of the filehandle,
either (8-bit) bytes or characters are
read
. By
default
all
filehandles operate on bytes, but
for
example
if
the filehandle
has
been opened
with
the C<:utf8> I/O layer (see L</
open
>, and the C<
open
>
pragma, L<
open
>), the I/O will operate on UTF-8 encoded Unicode
characters, not bytes. Similarly
for
the C<:encoding> pragma:
in that case pretty much any characters can be
read
.
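The effect of OFFSET can be sketched with an in-memory filehandle
(assumed to be available in the Perl being used), so that the input is
known exactly.  The buffer names are invented for the example.

```perl
use strict;
use warnings;

my $data = "abcdefghij";
open(my $in, '<', \$data) or die "open failed: $!";

my $buf = "XY";
my $n1 = read($in, $buf, 4);                 # 4; $buf is "abcd" (old content gone)
my $n2 = read($in, $buf, 3, length($buf));   # 3; appended, $buf is "abcdefg"
my $n3 = read($in, $buf, 100, 10);           # 3; only "hij" was left, placed at
                                             # offset 10 after "\0" padding
my $scratch;
my $n4 = read($in, $scratch, 4);             # 0 at end of file
close($in);

print "$n1 $n2 $n3 $n4 ", length($buf), "\n";
```

Checking for a defined but zero return value, as with C<$n4>,
distinguishes end of file from an error, which returns C<undef>.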
=item
readdir
DIRHANDLE
Returns the
next
directory entry
for
a directory opened by C<
opendir
>.
If used in list context, returns all the rest of the entries in the
directory. If there are
no
more entries, returns an undefined value in
scalar
context or a null list in list context.
If you're planning to filetest the return values out of a C<readdir>, you'd
better prepend the directory in question.  Otherwise, because we didn't
C<chdir> there, it would have been testing the wrong file.

    opendir(DIR, $some_dir) || die "can't opendir $some_dir: $!";
    @dots = grep { /^\./ && -f "$some_dir/$_" } readdir(DIR);
    closedir DIR;

=item
readline
EXPR
Reads from the filehandle whose typeglob is contained in EXPR. In
scalar
context,
each
call reads and returns the
next
line,
until
end-of-file is
reached, whereupon the subsequent call returns
undef
. In list context,
reads
until
end-of-file is reached and returns a list of lines.  Note that
the notion of "line" used here is however you may have defined it
with C<$/> (or C<$INPUT_RECORD_SEPARATOR>).  See L<perlvar/"$/">.
When C<$/> is set to C<
undef
>,
when
readline
() is in
scalar
context (i.e. file slurp mode), and
when
an empty file is
read
, it
returns C<
''
> the first
time
, followed by C<
undef
> subsequently.
This is the internal function implementing the C<< <EXPR> >>
operator, but you can use it directly.  The C<< <EXPR> >>
operator is discussed in more detail in L<perlop/"I/O Operators">.

    $line = <STDIN>;
    $line = readline(*STDIN);           # same thing

If readline encounters an operating system error, C<$!> will be set with the
corresponding error message.  It can be helpful to check C<$!> when you are
reading from filehandles you don't trust, such as a tty or a socket.  The
following example uses the operator form of C<readline>, and takes the necessary
steps to ensure that C<readline> was successful.

    for (;;) {
        undef $!;
        unless (defined( $line = <> )) {
            die $! if $!;
            last; # reached EOF
        }
        # ...
    }

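The scalar, list and slurp behaviours described for C<readline> can be
compared side by side.  This sketch reads from an in-memory filehandle
(assumed to be supported) so that the input is fixed; the text is
arbitrary.

```perl
use strict;
use warnings;

my $text = "one\ntwo\nthree\n";

open(my $fh, '<', \$text) or die "open failed: $!";
my $first = readline($fh);        # scalar context: just "one\n"
my @rest  = readline($fh);        # list context: all remaining lines
my $eof   = readline($fh);        # undef once end-of-file was reached
close($fh);

open($fh, '<', \$text) or die "open failed: $!";
my $all = do { local $/; readline($fh) };   # $/ undefined: whole file at once
close($fh);

print scalar(@rest), " lines after the first; ", length($all), " bytes in all\n";
```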
=item
readlink
EXPR
=item
readlink
Returns the value of a symbolic
link
,
if
symbolic links are
implemented. If not, gives a fatal error. If there is some
system
error, returns the undefined value and sets C<$!> (errno). If EXPR is
omitted, uses C<
$_
>.
=item
readpipe
EXPR
EXPR is executed as a
system
command.
The collected standard output of the command is returned.
In
scalar
context, it comes back as a single (potentially
multi-line) string. In list context, returns a list of lines
(however you've
defined
lines
with
C<$/> or C<
$INPUT_RECORD_SEPARATOR
>).
This is the internal function implementing the C<
qx/EXPR/
>
operator, but you can
use
it directly. The C<
qx/EXPR/
>
operator is discussed in more detail in L<perlop/
"I/O Operators"
>.
=item
recv
SOCKET,SCALAR,LENGTH,FLAGS
Receives a message on a
socket
. Attempts to receive LENGTH characters
of data into variable SCALAR from the specified SOCKET filehandle.
SCALAR will be grown or shrunk to the
length
actually
read
. Takes the
same flags as the
system
call of the same name. Returns the address
of the sender
if
SOCKET's protocol supports this; returns an empty
string otherwise. If there's an error, returns the undefined value.
This call is actually implemented in terms of recvfrom(2)
system
call.
See L<perlipc/
"UDP: Message Passing"
>
for
examples.
Note the I<characters>: depending on the status of the
socket
, either
(8-bit) bytes or characters are received. By
default
all sockets
operate on bytes, but
for
example
if
the
socket
has
been changed using
binmode
() to operate
with
the C<:utf8> I/O layer (see the C<
open
>
pragma, L<
open
>), the I/O will operate on UTF-8 encoded Unicode
characters, not bytes. Similarly
for
the C<:encoding> pragma:
in that case pretty much any characters can be
read
.
=item
redo
LABEL
=item
redo
The C<
redo
> command restarts the loop block without evaluating the
conditional again. The C<
continue
> block,
if
any, is not executed. If
the LABEL is omitted, the command refers to the innermost enclosing
loop.  Programs that want to lie to themselves about what was just input
normally use this command:

    # a simpleminded Pascal comment stripper
    # (warning: assumes no { or } in strings)
    LINE: while (<STDIN>) {
        while (s|({.*}.*){.*}|$1 |) {}
        s|{.*}| |;
        if (s|{.*| |) {
            $front = $_;
            while (<STDIN>) {
                if (/}/) {      # end of comment?
                    s|^|$front\{|;
                    redo LINE;
                }
            }
        }
        print;
    }

C<
redo
> cannot be used to retry a block which returns a value such as
C<
eval
{}>, C<
sub
{}> or C<
do
{}>, and should not be used to
exit
a
grep
() or
map
() operation.
Note that a block by itself is semantically identical to a loop
that executes once. Thus C<
redo
> inside such a block will effectively
turn it into a looping construct.
See also L</
continue
>
for
an illustration of how C<
last
>, C<
next
>, and
C<
redo
> work.
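A smaller sketch of the same idea: C<redo> repeats the current
iteration of a C<foreach> loop without moving on to the next element.
The canned C<@inputs> list stands in for interactive input and is purely
hypothetical.

```perl
use strict;
use warnings;

my @inputs = ("abc", "42", "x", "7");    # stand-in for answers typed by a user
my @accepted;

for my $slot (1 .. 2) {
    my $answer = shift @inputs;
    last unless defined $answer;         # ran out of input
    redo unless $answer =~ /^\d+$/;      # not a number: ask again for this slot
    push @accepted, "$slot=$answer";
}

print "@accepted\n";                     # prints "1=42 2=7"
```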
=item
ref
EXPR
=item
ref
Returns a non-empty string
if
EXPR is a reference, the empty
string otherwise. If EXPR
is not specified, C<
$_
> will be used. The value returned depends on the
type of thing the reference is a reference to.
Builtin types include:

    SCALAR
    ARRAY
    HASH
    CODE
    REF
    GLOB
    LVALUE

If the referenced object has been blessed into a package, then that package
name is returned instead.  You can think of C<ref> as a C<typeof> operator.

    if (ref($r) eq "HASH") {
        print "r is a reference to a hash.\n";
    }
    unless (ref($r)) {
        print "r is not a reference at all.\n";
    }

See also L<perlref>.
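The return values for the most common cases can be tabulated with a
short sketch.  The class name C<Hypothetical::Class> is invented for the
example.

```perl
use strict;
use warnings;

my %kind = (
    scalar => ref(\ "text"),                          # SCALAR
    array  => ref([]),                                # ARRAY
    hash   => ref({}),                                # HASH
    code   => ref(sub { 1 }),                         # CODE
    ref    => ref(\ \ "text"),                        # REF (reference to a reference)
    glob   => ref(\*STDOUT),                          # GLOB
    plain  => ref("text"),                            # empty string: not a reference
    object => ref(bless({}, 'Hypothetical::Class')),  # the package name
);

printf "%-6s => '%s'\n", $_, $kind{$_} for sort keys %kind;
```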
=item
rename
OLDNAME,NEWNAME
Changes the name of a file; an existing file NEWNAME will be
clobbered. Returns true
for
success, false otherwise.
Behavior of this function varies wildly depending on your
system
implementation. For example, it will usually not work across file
system
boundaries, even though the
system
I<mv> command sometimes compensates
for
this. Other restrictions include whether it works on directories,
open
files, or pre-existing files. Check L<perlport> and either the
rename
(2) manpage or equivalent
system
documentation
for
details.
=item
require
Demands a version of Perl specified by VERSION, or demands some semantics
specified by EXPR or by C<
$_
>
if
EXPR is not supplied.
VERSION may be either a numeric argument such as 5.006, which will be
compared to C<$]>, or a literal of the form v5.6.1, which will be compared
to C<$^V> (aka
$PERL_VERSION
). A fatal error is produced at run
time
if
VERSION is greater than the version of the current Perl interpreter.
Compare
with
L</
use
>, which can
do
a similar check at compile
time
.
Specifying VERSION as a literal of the form v5.6.1 should generally be
avoided, because it leads to misleading error messages under earlier
versions of Perl that do not support this syntax.  The equivalent numeric
version should be used instead.

    require v5.6.1;     # run time version check
    require 5.6.1;      # ditto
    require 5.006_001;  # ditto; preferred for backwards compatibility

Otherwise, C<require> demands that a library file be included if it hasn't
already been included.  The file is included via the do-FILE mechanism, which is
essentially just a variety of C<eval>.  Has semantics similar to the
following subroutine:

    sub require {
        my ($filename) = @_;
        if (exists $INC{$filename}) {
            return 1 if $INC{$filename};
            die "Compilation failed in require";
        }
        my ($realfilename,$result);
        ITER: {
            foreach $prefix (@INC) {
                $realfilename = "$prefix/$filename";
                if (-f $realfilename) {
                    $INC{$filename} = $realfilename;
                    $result = do $realfilename;
                    last ITER;
                }
            }
            die "Can't find $filename in \@INC";
        }
        if ($@) {
            $INC{$filename} = undef;
            die $@;
        } elsif (!$result) {
            delete $INC{$filename};
            die "$filename did not return true value";
        } else {
            return $result;
        }
    }

Note that the file will not be included twice under the same specified
name.
The file must
return
true as the
last
statement to indicate
successful execution of any initialization code, so it's customary to
end such a file
with
C<1;>
unless
you
're sure it'
ll
return
true
otherwise. But it's better just to put the C<1;>, in case you add more
statements.

If EXPR is a bareword, C<require> assumes a "F<.pm>" extension and
replaces "F<::>" with "F</>" in the filename for you,
to make it easy to load standard modules. This form of loading of
modules does not risk altering your namespace.

In other words, if you try this:

    require Foo::Bar;     # a bareword

the C<require> function will actually look for the "F<Foo/Bar.pm>" file
in the directories specified in the C<@INC> array.

But if you try this:

    $class = 'Foo::Bar';
    require $class;       # $class is not a bareword

or:

    require "Foo::Bar";   # not a bareword because of the ""

the C<require> function will look for the "F<Foo::Bar>" file in the @INC
array and will complain about not finding "F<Foo::Bar>" there. In this
case you can do:

    eval "require $class";
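
As an alternative to the string C<eval>, the bareword translation described above can be done by hand. The following is only a sketch: C<File::Basename> is picked as the example because it ships with Perl, and the variable names are illustrative.

```perl
use strict;
use warnings;

# Example module name; any installed module would do.
my $class = 'File::Basename';

# Do by hand what require does for a bareword:
# turn '::' into '/' and append '.pm'.
(my $file = "$class.pm") =~ s{::}{/}g;

require $file;    # searches @INC for File/Basename.pm

print "$class was loaded from $INC{$file}\n";
```

This avoids compiling a string at run time, at the cost of assuming that "F</>" is an acceptable directory separator for C<require>, which is how the bareword form behaves as well.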

Now that you understand how C<require> looks for files in the case of
a bareword argument, there is a little extra functionality going on
behind the scenes. Before C<require> looks for a "F<.pm>" extension,
it will first look for a filename with a "F<.pmc>" extension. A file
with this extension is assumed to be Perl bytecode generated by
L<B::Bytecode|B::Bytecode>. If this file is found, and its modification
time is newer than a coinciding "F<.pm>" non-compiled file, it will be
loaded in place of that non-compiled file ending in a "F<.pm>" extension.

You can also insert hooks into the import facility by putting Perl code
directly into the @INC array. There are three forms of hooks: subroutine
references, array references and blessed objects.

Subroutine references are the simplest case. When the inclusion system
walks through @INC and encounters a subroutine, this subroutine gets
called with two parameters, the first being a reference to itself, and the
second the name of the file to be included (e.g. "F<Foo/Bar.pm>"). The
subroutine should return C<undef> or a filehandle, from which the file to
include will be read. If C<undef> is returned, C<require> will look at
the remaining elements of @INC.

If the hook is an array reference, its first element must be a subroutine
reference. This subroutine is called as above, but the first parameter is
the array reference. This makes it possible to pass arguments to the
subroutine indirectly.

In other words, you can write:

    push @INC, \&my_sub;
    sub my_sub {
        my ($coderef, $filename) = @_;  # $coderef is \&my_sub
        ...
    }

or:

    push @INC, [ \&my_sub, $x, $y, ... ];
    sub my_sub {
        my ($arrayref, $filename) = @_;
        # Retrieve $x, $y, ...
        my @parameters = @$arrayref[1..$#$arrayref];
        ...
    }

If the hook is an object, it must provide an INC method that will be
called as above, the first parameter being the object itself. (Note that
you must fully qualify the sub's name, as it is always forced into package
C<main>.) Here is a typical code layout:

    # In Foo.pm
    package Foo;
    sub new { ... }
    sub Foo::INC {
        my ($self, $filename) = @_;
        ...
    }

    # In the main program
    push @INC, new Foo(...);

Note that these hooks are also permitted to set the %INC entry
corresponding to the files they have loaded. See L<perlvar/%INC>.
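
To make the subroutine form concrete, here is a minimal sketch of a hook that serves a module from memory. The module name C<Hello::World>, its C<greet> function and the C<%virtual> hash are all made up for illustration, and the sketch assumes a Perl that supports in-memory filehandles (5.8 or later).

```perl
use strict;
use warnings;

# Hypothetical module source, keyed by the filename require will ask for.
my %virtual = (
    'Hello/World.pm' =>
        "package Hello::World;\nsub greet { return 'hello' }\n1;\n",
);

# Subroutine hook: receives (itself, filename) and returns either a
# filehandle to read the source from, or nothing to decline.
unshift @INC, sub {
    my ($self, $filename) = @_;
    return unless exists $virtual{$filename};
    open(my $fh, '<', \$virtual{$filename}) or return;
    return $fh;
};

require Hello::World;    # resolved by the hook, not by the filesystem

my $greeting = Hello::World::greet();
print "$greeting\n";
```

The hook is placed at the front of @INC with C<unshift> so that it is consulted before any directory; using C<push> instead would let a real F<Hello/World.pm> on disk take precedence.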

For a yet-more-powerful import facility, see L</use> and L<perlmod>.

=item reset EXPR

=item reset

Generally used in a C<continue> block at the end of a loop to clear
variables and reset C<??> searches so that they work again. The
expression is interpreted as a list of single characters (hyphens
allowed for ranges). All variables and arrays beginning with one of
those letters are reset to their pristine state. If the expression is
omitted, one-match searches (C<?pattern?>) are reset to match again. Resets
only variables or searches in the current package. Always returns
1. Examples:

    reset 'X';      # reset all X variables
    reset 'a-z';    # reset lower case variables
    reset;          # just reset ?one-time? searches

Resetting C<"A-Z"> is not recommended because you'll wipe out your
C<@ARGV> and C<@INC> arrays and your C<%ENV> hash. Resets only package
variables--lexical variables are unaffected, but they clean themselves
up on scope exit anyway, so you'll probably want to use them instead.
See L</my>.

=item return EXPR

=item return

Returns from a subroutine, C<eval>, or C<do FILE> with the value
given in EXPR. Evaluation of EXPR may be in list, scalar, or void
context, depending on how the return value will be used, and the context
may vary from one execution to the next (see C<wantarray>). If no EXPR
is given, returns an empty list in list context, the undefined value in
scalar context, and (of course) nothing at all in a void context.

(Note that in the absence of an explicit C<return>, a subroutine, eval,
or do FILE will automatically return the value of the last expression
evaluated.)
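
The context rules can be checked with a short sketch. The subroutine names C<nothing> and C<context> are invented for this example.

```perl
use strict;
use warnings;

# A bare return adapts to the caller's context.
sub nothing { return }

# Reports the context this subroutine was called in.
sub context {
    return wantarray ? 'list' : defined(wantarray) ? 'scalar' : 'void';
}

my @list   = nothing();    # empty list in list context
my $scalar = nothing();    # undefined value in scalar context
my @seen   = (context(), scalar(context()));

print scalar(@list), "\n";                          # 0
print defined($scalar) ? "defined\n" : "undef\n";   # undef
print "@seen\n";                                    # list scalar
```

Writing C<return undef;> instead of a bare C<return;> would yield a one-element list in list context, which is rarely what a caller testing C<if (@result)> expects.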

=item reverse LIST

In list context, returns a list value consisting of the elements
of LIST in the opposite order. In scalar context, concatenates the
elements of LIST and returns a string value with all characters
in the opposite order.

    print reverse <>;           # line tac, last line first

    undef $/;                   # for efficiency of <>
    print scalar reverse <>;    # character tac, last line tsrif

Used without arguments in scalar context, reverse() reverses C<$_>.

This operator is also handy for inverting a hash, although there are some
caveats. If a value is duplicated in the original hash, only one of those
can be represented as a key in the inverted hash. Also, this has to
unwind one hash and build a whole new one, which may take some time
on a large hash, such as from a DBM file.

    %by_name = reverse %by_address;     # Invert the hash
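
The difference between the two contexts is easy to miss, so here is a small sketch with made-up data that exercises both, plus the hash inversion.

```perl
use strict;
use warnings;

my @parts = ('ab', 'cd', 'ef');

my @flipped = reverse @parts;    # list context: ('ef', 'cd', 'ab')
my $chars   = reverse @parts;    # scalar context: 'fedcba'

# Inverting a hash whose values are unique.
my %by_address = (a1 => 'ann', b2 => 'bob');
my %by_name    = reverse %by_address;

print "@flipped\n";              # ef cd ab
print "$chars\n";                # fedcba
print "$by_name{bob}\n";         # b2
```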

=item rewinddir DIRHANDLE

Sets the current position to the beginning of the directory for the
C<readdir> routine on DIRHANDLE.

=item rindex STR,SUBSTR,POSITION

=item rindex STR,SUBSTR

Works just like index() except that it returns the position of the LAST
occurrence of SUBSTR in STR. If POSITION is specified, returns the
last occurrence at or before that position.
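
A short sketch of both forms, using an arbitrary path string as the example:

```perl
use strict;
use warnings;

my $path = '/usr/local/lib/perl';

my $last     = rindex($path, '/');               # last slash
my $previous = rindex($path, '/', $last - 1);    # last slash before that one
my $missing  = rindex($path, '#');               # -1 when not found

print "$last $previous $missing\n";              # 14 10 -1
print substr($path, $last + 1), "\n";            # perl
```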

=item rmdir FILENAME

=item rmdir

Deletes the directory specified by FILENAME if that directory is
empty. If it succeeds it returns true, otherwise it returns false and
sets C<$!> (errno). If FILENAME is omitted, uses C<$_>.

=item s///
The substitution operator. See L<perlop>.

=item scalar EXPR

Forces EXPR to be interpreted in scalar context and returns the value
of EXPR.

    @counts = ( scalar @a, scalar @b, scalar @c );

There is no equivalent operator to force an expression to
be interpolated in list context because in practice, this is never
needed. If you really wanted to do so, however, you could use
the construction C<@{[ (some expression) ]}>, but usually a simple
C<(some expression)> suffices.

Because C<scalar> is a unary operator, if you accidentally use for EXPR a
parenthesized list, this behaves as a scalar comma expression, evaluating
all but the last element in void context and returning the final element
evaluated in scalar context. This is seldom what you want.

The following single statement:

    print uc(scalar(&foo,$bar)),$baz;

is the moral equivalent of these two:

    &foo;
    print(uc($bar),$baz);

See L<perlop> for more details on unary operators and the comma operator.

=item seek FILEHANDLE,POSITION,WHENCE

Sets FILEHANDLE's position, just like the C<fseek> call of C<stdio>.
FILEHANDLE may be an expression whose value gives the name of the
filehandle. The values for WHENCE are C<0> to set the new position
I<in bytes> to POSITION, C<1> to set it to the current position plus
POSITION, and C<2> to set it to EOF plus POSITION (typically
negative). For WHENCE you may use the constants C<SEEK_SET>,
C<SEEK_CUR>, and C<SEEK_END> (start of the file, current position, end
of the file) from the Fcntl module. Returns C<1> upon success, C<0>
otherwise.

Note the I<in bytes>: even if the filehandle has been set to
operate on characters (for example by using the C<:utf8> open
layer), tell() will return byte offsets, not character offsets
(because implementing that would render seek() and tell() rather slow).

If you want to position the file for C<sysread> or C<syswrite>, don't use
C<seek>--buffering makes its effect on the file's system position
unpredictable and non-portable. Use C<sysseek> instead.

Due to the rules and rigors of ANSI C, on some systems you have to do a
seek whenever you switch between reading and writing. Amongst other
things, this may have the effect of calling stdio's clearerr(3).
A WHENCE of C<1> (C<SEEK_CUR>) is useful for not moving the file position:

    seek(TEST,0,1);

This is also useful for applications emulating C<tail -f>. Once you hit
EOF on your read, and then sleep for a while, you might have to stick in a
seek() to reset things. The C<seek> doesn't change the current position,
but it I<does> clear the end-of-file condition on the handle, so that the
next C<< <FILE> >> makes Perl try again to read something. We hope.

If that doesn't work (some IO implementations are particularly
cantankerous), then you may need something more like this:

    for (;;) {
        for ($curpos = tell(FILE); $_ = <FILE>;
             $curpos = tell(FILE)) {
            # search for some stuff and put it into files
        }
        sleep($for_a_while);
        seek(FILE, $curpos, 0);
    }
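
The WHENCE constants can be tried out on a scratch file. This sketch assumes a Perl that supports anonymous temporary files via C<open> with an undefined filename (5.8 or later); the file contents are arbitrary.

```perl
use strict;
use warnings;
use Fcntl qw(SEEK_SET SEEK_END);

# Anonymous temporary file opened for reading and writing.
open(my $fh, '+>', undef) or die "Cannot open temporary file: $!";
print {$fh} "hello world\n";                 # 12 bytes

seek($fh, 0, SEEK_SET) or die "seek: $!";    # back to the start
read($fh, my $head, 5);                      # 'hello'

seek($fh, -6, SEEK_END) or die "seek: $!";   # six bytes before EOF
read($fh, my $tail, 5);                      # 'world'
my $pos = tell($fh);                         # 11

close($fh);
print "$head $tail $pos\n";                  # hello world 11
```

Note that a C<seek> sits between the write and the first read, in line with the remark above about switching between reading and writing.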

=item seekdir DIRHANDLE,POS

Sets the current position for the C<readdir> routine on DIRHANDLE. POS
must be a value returned by C<telldir>. C<seekdir> also has the same caveats
about possible directory compaction as the corresponding system library
routine.

=item select FILEHANDLE

=item select

Returns the currently selected filehandle. Sets the current default
filehandle for output, if FILEHANDLE is supplied. This has two
effects: first, a C<write> or a C<print> without a filehandle will
default to this FILEHANDLE. Second, references to variables related to
output will refer to this output channel. For example, if you have to
set the top of form format for more than one output channel, you might
do the following:

    select(REPORT1);
    $^ = 'report1_top';
    select(REPORT2);
    $^ = 'report2_top';

FILEHANDLE may be an expression whose value gives the name of the
actual filehandle. Thus:

    $oldfh = select(STDERR); $| = 1; select($oldfh);

Some programmers may prefer to think of filehandles as objects with
methods, preferring to write the last example as:

    use IO::Handle;
    STDERR->autoflush(1);
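
The save-and-restore idiom can be seen in isolation in the following sketch, which temporarily redirects unqualified C<print> calls into a string. It assumes in-memory filehandles (Perl 5.8 or later); the variable names are illustrative.

```perl
use strict;
use warnings;

my $captured = '';
open(my $mem, '>', \$captured) or die "Cannot open in-memory handle: $!";

my $oldfh = select($mem);    # select returns the previously selected handle
print "to memory";           # no filehandle given, so this goes to $mem
select($oldfh);              # restore the previous default

close($mem);
print "captured: $captured\n";
```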

=item select RBITS,WBITS,EBITS,TIMEOUT

This calls the select(2) system call with the bit masks specified, which
can be constructed using C<fileno> and C<vec>, along these lines:

    $rin = $win = $ein = '';
    vec($rin,fileno(STDIN),1) = 1;
    vec($win,fileno(STDOUT),1) = 1;
    $ein = $rin | $win;

If you want to select on many filehandles you might wish to write a
subroutine:

    sub fhbits {
        my(@fhlist) = split(' ',$_[0]);
        my($bits);
        for (@fhlist) {
            vec($bits,fileno($_),1) = 1;
        }
        $bits;
    }
    $rin = fhbits('STDIN TTY SOCK');

The usual idiom is:

    ($nfound,$timeleft) =
      select($rout=$rin, $wout=$win, $eout=$ein, $timeout);

or to block until something becomes ready just do this:

    $nfound = select($rout=$rin, $wout=$win, $eout=$ein, undef);

Most systems do not bother to return anything useful in $timeleft, so
calling select() in scalar context just returns $nfound.

Any of the bit masks can also be undef. The timeout, if specified, is
in seconds, which may be fractional. Note: not all implementations are
capable of returning the $timeleft. If not, they always return
$timeleft equal to the supplied $timeout.

You can effect a sleep of 250 milliseconds this way:

    select(undef, undef, undef, 0.25);

Note that whether C<select> gets restarted after signals (say, SIGALRM)
is implementation-dependent. See also L<perlport> for notes on the
portability of C<select>.

On error, C<select> behaves like the select(2) system call: it returns
-1 and sets C<$!>.

Note: on some Unixes, the select(2) system call may report a socket file
descriptor as "ready for reading" when actually no data is available,
so that a subsequent read blocks. This can be avoided by always using the
O_NONBLOCK flag on the socket. See select(2) and fcntl(2) for further
details.

B<WARNING>: One should not attempt to mix buffered I/O (like C<read>
or <FH>) with C<select>, except as permitted by POSIX, and even
then only on POSIX systems. You have to use C<sysread> instead.
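
As a self-contained illustration of the bit-mask idiom, the following sketch waits on the read end of a pipe. It assumes a Unix-like system where select(2) works on pipes (on some platforms it works only on sockets), and it uses C<syswrite> and C<sysread> throughout, in keeping with the warning above.

```perl
use strict;
use warnings;

pipe(my $reader, my $writer) or die "pipe: $!";
syswrite($writer, "x") or die "syswrite: $!";   # make the reader ready

# Build a read mask containing only the reader's descriptor.
my $rin = '';
vec($rin, fileno($reader), 1) = 1;

# Wait up to one second; select modifies $rout in place.
my $rout;
my $nfound = select($rout = $rin, undef, undef, 1);
my $ready  = vec($rout, fileno($reader), 1);

sysread($reader, my $byte, 1) if $ready;

print "nfound=$nfound ready=$ready byte=$byte\n";
```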

=item semctl ID,SEMNUM,CMD,ARG

Calls the System V IPC function C<semctl>. You'll probably have to say

    use IPC::SysV;

first to get the correct constant definitions. If CMD is IPC_STAT or
GETALL, then ARG must be a variable that will hold the returned
semid_ds structure or semaphore value array. Returns like C<ioctl>:
the undefined value for error, "C<0 but true>" for zero, or the actual
return value otherwise. The ARG must consist of a vector of native
short integers, which may be created with C<pack("s!",(0)x$nsem)>.
See also L<perlipc/"SysV IPC">, C<IPC::SysV>, C<IPC::Semaphore>
documentation.

=item semget KEY,NSEMS,FLAGS

Calls the System V IPC function semget. Returns the semaphore id, or
the undefined value if there is an error. See also
L<perlipc/"SysV IPC">, C<IPC::SysV>, C<IPC::Semaphore>
documentation.

=item semop KEY,OPSTRING

Calls the System V IPC function semop to perform semaphore operations
such as signalling and waiting. OPSTRING must be a packed array of
semop structures. Each semop structure can be generated with
C<pack("s!3", $semnum, $semop, $semflag)>. The length of OPSTRING
implies the number of semaphore operations. Returns true if
successful, or false if there is an error. As an example, the
following code waits on semaphore $semnum of semaphore id $semid:

    $semop = pack("s!3", $semnum, -1, 0);
    die "Semaphore trouble: $!\n" unless semop($semid, $semop);

To signal the semaphore, replace C<-1> with C<1>. See also
L<perlipc/"SysV IPC">, C<IPC::SysV>, and C<IPC::Semaphore>
documentation.

=item send SOCKET,MSG,FLAGS,TO

=item send SOCKET,MSG,FLAGS

Sends a message on a socket. Attempts to send the scalar MSG to the
SOCKET filehandle. Takes the same flags as the system call of the
same name. On unconnected sockets you must specify a destination to
send TO, in which case it does a C C<sendto>. Returns the number of
characters sent, or the undefined value if there is an error. The C
system call sendmsg(2) is currently unimplemented. See
L<perlipc/"UDP: Message Passing"> for examples.

Note the I<characters>: depending on the status of the socket, either
(8-bit) bytes or characters are sent. By default all sockets operate
on bytes, but for example if the socket has been changed using
binmode() to operate with the C<:utf8> I/O layer (see L</open>, or the
C<open> pragma, L<open>), the I/O will operate on UTF-8 encoded
Unicode characters, not bytes. Similarly for the C<:encoding> pragma:
in that case pretty much any characters can be sent.

=item setpgrp PID,PGRP

Sets the current process group for the specified PID, C<0> for the current
process. Will produce a fatal error if used on a machine that doesn't
implement POSIX setpgid(2) or BSD setpgrp(2). If the arguments are omitted,
it defaults to C<0,0>. Note that the BSD 4.2 version of C<setpgrp> does not
accept any arguments, so only C<setpgrp(0,0)> is portable. See also
C<POSIX::setsid()>.

=item setpriority WHICH,WHO,PRIORITY

Sets the current priority for a process, a process group, or a user.
(See setpriority(2).) Will produce a fatal error if used on a machine
that doesn't implement setpriority(2).

=item setsockopt SOCKET,LEVEL,OPTNAME,OPTVAL

Sets the socket option requested. Returns undefined if there is an
error. OPTVAL may be specified as C<undef> if you don't want to pass an
argument.

=item shift ARRAY

=item shift

Shifts the first value of the array off and returns it, shortening the
array by 1 and moving everything down. If there are no elements in the
array, returns the undefined value. If ARRAY is omitted, shifts the
C<@_> array within the lexical scope of subroutines and formats, and the
C<@ARGV> array at file scopes or within the lexical scopes established by
the C<eval ''>, C<BEGIN {}>, C<INIT {}>, C<CHECK {}>, and C<END {}>
constructs.

See also C<unshift>, C<push>, and C<pop>. C<shift> and C<unshift> do the
same thing to the left end of an array that C<pop> and C<push> do to the
right end.
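
The symmetry between the four operators can be sketched as follows; the array contents and the C<first_arg> subroutine are made up for the example.

```perl
use strict;
use warnings;

my @queue = (1, 2, 3);

my $first = shift @queue;    # 1; @queue is now (2, 3)
unshift @queue, 0;           # @queue is now (0, 2, 3)
my $last = pop @queue;       # 3; @queue is now (0, 2)
push @queue, 9;              # @queue is now (0, 2, 9)

# Inside a subroutine, a bare shift takes its value from @_.
sub first_arg { return shift }
my $arg = first_arg('x', 'y');

print "$first $last @queue $arg\n";    # 1 3 0 2 9 x
```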

=item shmctl ID,CMD,ARG

Calls the System V IPC function shmctl. You'll probably have to say

    use IPC::SysV;

first to get the correct constant definitions. If CMD is C<IPC_STAT>,
then ARG must be a variable that will hold the returned C<shmid_ds>
structure. Returns like ioctl: the undefined value for error, "C<0> but
true" for zero, or the actual return value otherwise.
See also L<perlipc/"SysV IPC"> and C<IPC::SysV> documentation.

=item shmget KEY,SIZE,FLAGS

Calls the System V IPC function shmget. Returns the shared memory
segment id, or the undefined value if there is an error.
See also L<perlipc/"SysV IPC"> and C<IPC::SysV> documentation.

=item shmread ID,VAR,POS,SIZE

=item shmwrite ID,STRING,POS,SIZE

Reads or writes the System V shared memory segment ID starting at
position POS for size SIZE by attaching to it, copying in/out, and
detaching from it. When reading, VAR must be a variable that will
hold the data read. When writing, if STRING is too long, only SIZE
bytes are used; if STRING is too short, nulls are written to fill out
SIZE bytes. Return true if successful, or false if there is an error.
shmread() taints the variable. See also L<perlipc/"SysV IPC">,
C<IPC::SysV> documentation, and the C<IPC::Shareable> module from CPAN.

=item shutdown SOCKET,HOW

Shuts down a socket connection in the manner indicated by HOW, which
has the same interpretation as in the system call of the same name.

    shutdown(SOCKET, 0);    # I/we have stopped reading data
    shutdown(SOCKET, 1);    # I/we have stopped writing data
    shutdown(SOCKET, 2);    # I/we have stopped using this socket

This is useful with sockets when you want to tell the other
side you're done writing but not done reading, or vice versa.
It's also a more insistent form of close because it also
disables the file descriptor in any forked copies in other
processes.

=item sin EXPR

=item sin

Returns the sine of EXPR (expressed in radians). If EXPR is omitted,
returns sine of C<$_>.

For the inverse sine operation, you may use the C<Math::Trig::asin>
function, or use this relation:

    sub asin { atan2($_[0], sqrt(1 - $_[0] * $_[0])) }

=item sleep EXPR

=item sleep

Causes the script to sleep for EXPR seconds, or forever if no EXPR.
May be interrupted if the process receives a signal such as C<SIGALRM>.
Returns the number of seconds actually slept. You probably cannot
mix C<alarm> and C<sleep> calls, because C<sleep> is often implemented
using C<alarm>.

On some older systems, it may sleep up to a full second less than what
you requested, depending on how it counts seconds. Most modern systems
always sleep the full amount. They may appear to sleep longer than that,
however, because your process might not be scheduled right away in a
busy multitasking system.

For delays of finer granularity than one second, you may use Perl's
C<syscall> interface to access setitimer(2) if your system supports
it, or else see L</select> above. The Time::HiRes module (from CPAN,
and starting from Perl 5.8 part of the standard distribution) may also
help.

See also the POSIX module's C<pause> function.

=item socket SOCKET,DOMAIN,TYPE,PROTOCOL

Opens a socket of the specified kind and attaches it to filehandle
SOCKET. DOMAIN, TYPE, and PROTOCOL are specified the same as for
the system call of the same name. You should C<use Socket> first
to get the proper definitions imported. See the examples in
L<perlipc/"Sockets: Client/Server Communication">.

On systems that support a close-on-exec flag on files, the flag will
be set for the newly opened file descriptor, as determined by the
value of $^F. See L<perlvar/$^F>.

=item socketpair SOCKET1,SOCKET2,DOMAIN,TYPE,PROTOCOL

Creates an unnamed pair of sockets in the specified domain, of the
specified type. DOMAIN, TYPE, and PROTOCOL are specified the same as
for the system call of the same name. If unimplemented, yields a fatal
error. Returns true if successful.

On systems that support a close-on-exec flag on files, the flag will
be set for the newly opened file descriptors, as determined by the value
of $^F. See L<perlvar/$^F>.

Some systems define C<pipe> in terms of C<socketpair>, in which a call
to C<pipe(Rdr, Wtr)> is essentially:

    use Socket;
    socketpair(Rdr, Wtr, AF_UNIX, SOCK_STREAM, PF_UNSPEC);
    shutdown(Rdr, 1);        # no more writing for reader
    shutdown(Wtr, 0);        # no more reading for writer

See L<perlipc> for an example of socketpair use. Perl 5.8 and later will
emulate socketpair using IP sockets to localhost if your system implements
sockets but not socketpair.
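
Because both ends of the pair are bidirectional, a round trip can be sketched within a single process. This assumes a system with C<AF_UNIX> socket pairs; the handle names C<$left> and C<$right> and the messages are made up.

```perl
use strict;
use warnings;
use Socket;

socketpair(my $left, my $right, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
    or die "socketpair: $!";

# Each end can both write and read, unlike a pipe.
syswrite($left, "ping") or die "syswrite: $!";
sysread($right, my $request, 4);

syswrite($right, "pong") or die "syswrite: $!";
sysread($left, my $reply, 4);

close($left);
close($right);

print "$request $reply\n";    # ping pong
```

In real programs the two ends usually live in different processes after a C<fork>, with each process closing the end it does not use.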

=item sort SUBNAME LIST

=item sort BLOCK LIST

=item sort LIST

In list context, this sorts the LIST and returns the sorted list value.
In scalar context, the behaviour of C<sort()> is undefined.

If SUBNAME or BLOCK is omitted, C<sort>s in standard string comparison
order. If SUBNAME is specified, it gives the name of a subroutine
that returns an integer less than, equal to, or greater than C<0>,
depending on how the elements of the list are to be ordered. (The C<<
<=> >> and C<cmp> operators are extremely useful in such routines.)
SUBNAME may be a scalar variable name (unsubscripted), in which case
the value provides the name of (or a reference to) the actual
subroutine to use. In place of a SUBNAME, you can provide a BLOCK as
an anonymous, in-line sort subroutine.
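
The default string comparison is a common source of surprises with numbers. A minimal sketch with made-up data:

```perl
use strict;
use warnings;

my @numbers = (10, 9, 100, 1);

my @as_strings = sort @numbers;                # string order
my @as_numbers = sort { $a <=> $b } @numbers;  # numeric order

print "@as_strings\n";    # 1 10 100 9
print "@as_numbers\n";    # 1 9 10 100
```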

If the subroutine's prototype is C<($$)>, the elements to be compared
are passed by reference in C<@_>, as for a normal subroutine. This is
slower than unprototyped subroutines, where the elements to be
compared are passed into the subroutine
as the package global variables $a and $b (see example below). Note that
in the latter case, it is usually counter-productive to declare $a and
$b as lexicals.

In either case, the subroutine may not be recursive. The values to be
compared are always passed by reference and should not be modified.

You also cannot exit out of the sort block or subroutine using any of the
loop control operators described in L<perlsyn> or with C<goto>.

When C<use locale> is in effect, C<sort LIST> sorts LIST according to the
current collation locale. See L<perllocale>.

sort() returns aliases into the original list, much as a for loop's index
variable aliases the list elements. That is, modifying an element of a
list returned by sort() (for example, in a C<foreach>, C<map> or C<grep>)
actually modifies the element in the original list. This is usually
something to be avoided when writing clear code.

Perl 5.6 and earlier used a quicksort algorithm to implement sort.
That algorithm was not stable, and I<could> go quadratic. (A I<stable> sort
preserves the input order of elements that compare equal. Although
quicksort's run time is O(NlogN) when averaged over all arrays of
length N, the time can be O(N**2), I<quadratic> behavior, for some
inputs.) In 5.7, the quicksort implementation was replaced with
a stable mergesort algorithm whose worst-case behavior is O(NlogN).
But benchmarks indicated that for some inputs, on some platforms,
the original quicksort was faster. 5.8 has a sort pragma for
limited control of the sort. Its rather blunt control of the
underlying algorithm may not persist into future Perls, but the
ability to characterize the input or output in implementation
independent ways quite probably will. See L<sort>.
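
Examples:

    # sort lexically
    @articles = sort @files;

    # same thing, but with explicit sort routine
    @articles = sort {$a cmp $b} @files;

    # now case-insensitively
    @articles = sort {uc($a) cmp uc($b)} @files;

    # same thing in reversed order
    @articles = sort {$b cmp $a} @files;

    # sort numerically ascending
    @articles = sort {$a <=> $b} @files;

    # sort numerically descending
    @articles = sort {$b <=> $a} @files;

    # this sorts the %age hash by value instead of key
    # using an in-line function
    @eldest = sort { $age{$b} <=> $age{$a} } keys %age;

    # sort using explicit subroutine name
    sub byage {
        $age{$a} <=> $age{$b};  # presuming numeric
    }
    @sortedclass = sort byage @class;

    sub backwards { $b cmp $a }
    @harry  = qw(dog cat x Cain Abel);
    @george = qw(gone chased yz Punished Axed);
    print sort @harry;
            # prints AbelCaincatdogx
    print sort backwards @harry;
            # prints xdogcatCainAbel
    print sort @george, 'to', @harry;
            # prints AbelAxedCainPunishedcatchaseddoggonetoxyz

    # inefficiently sort by descending numeric compare using
    # the first integer after the first = sign, or the
    # whole record case-insensitively otherwise
    @new = sort {
        ($b =~ /=(\d+)/)[0] <=> ($a =~ /=(\d+)/)[0]
                            ||
                    uc($a)  cmp  uc($b)
    } @old;

    # same thing, but much more efficiently;
    # we'll build auxiliary indices instead
    # for speed
    @nums = @caps = ();
    for (@old) {
        push @nums, /=(\d+)/;
        push @caps, uc($_);
    }

    @new = @old[ sort {
                        $nums[$b] <=> $nums[$a]
                                 ||
                        $caps[$a] cmp $caps[$b]
                       } 0..$#old
               ];

    # same thing, but without any temps
    @new = map { $_->[0] }
           sort { $b->[1] <=> $a->[1]
                           ||
                  $a->[2] cmp $b->[2]
           } map { [$_, /=(\d+)/, uc($_)] } @old;

    # using a prototype allows you to use any comparison subroutine
    # as a sort subroutine (including other package's subroutines)
    package other;
    sub backwards ($$) { $_[1] cmp $_[0]; }  # $a and $b are not set here

    package main;
    @new = sort other::backwards @old;

    # guarantee stability, regardless of algorithm
    use sort 'stable';
    @new = sort { substr($a, 3, 5) cmp substr($b, 3, 5) } @old;

    # force use of mergesort (not portable outside Perl 5.8)
    use sort '_mergesort';
    @new = sort { substr($a, 3, 5) cmp substr($b, 3, 5) } @old;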

If you're using strict, you I<must not> declare $a
and $b as lexicals. They are package globals. That means
if you're in the C<main> package and type

    @articles = sort {$b <=> $a} @files;

then C<$a> and C<$b> are C<$main::a> and C<$main::b> (or C<$::a> and C<$::b>),
but if you're in the C<FooPack> package, it's the same as typing

    @articles = sort {$FooPack::b <=> $FooPack::a} @files;

The comparison function is required to behave. If it returns
inconsistent results (sometimes saying C<$x[1]> is less than C<$x[2]> and
sometimes saying the opposite, for example) the results are not
well-defined.

Because C<< <=> >> returns C<undef> when either operand is C<NaN>
(not-a-number), and because C<sort> will trigger a fatal error unless the
result of a comparison is defined, when sorting with a comparison function
like C<< $a <=> $b >>, be careful about lists that might contain a C<NaN>.
The following example takes advantage of the fact that C<NaN != NaN> to
eliminate any C<NaN>s from the input.

    @result = sort { $a <=> $b } grep { $_ == $_ } @input;

=item splice ARRAY,OFFSET,LENGTH,LIST

=item splice ARRAY,OFFSET,LENGTH

=item splice ARRAY,OFFSET

=item splice ARRAY

Removes the elements designated by OFFSET and LENGTH from an array, and
replaces them with the elements of LIST, if any. In list context,
returns the elements removed from the array. In scalar context,
returns the last element removed, or C<undef> if no elements are
removed. The array grows or shrinks as necessary.
If OFFSET is negative then it starts that far from the end of the array.
If LENGTH is omitted, removes everything from OFFSET onward.
If LENGTH is negative, removes the elements from OFFSET onward
except for -LENGTH elements at the end of the array.
If both OFFSET and LENGTH are omitted, removes everything. If OFFSET is
past the end of the array, perl issues a warning, and splices at the
end of the array.
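
The OFFSET and LENGTH rules are easiest to follow on a concrete array. In this sketch the array contents are arbitrary and the comments track the state after each call.

```perl
use strict;
use warnings;

my @a = (0 .. 9);

my @removed = splice(@a, 2, 3);    # removes 2 3 4;  @a = 0 1 5 6 7 8 9
my @tail    = splice(@a, -2);      # removes 8 9;    @a = 0 1 5 6 7
splice(@a, 1, 0, 'x', 'y');        # pure insertion; @a = 0 x y 1 5 6 7
my @middle  = splice(@a, 1, -1);   # keeps one element at the end;
                                   # removes x y 1 5 6; @a = 0 7

print "@removed | @tail | @middle | @a\n";
```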

The following equivalences hold (assuming C<< $[ == 0 and $#a >= $i >> )

    push(@a,$x,$y)      splice(@a,@a,0,$x,$y)
    pop(@a)             splice(@a,-1)
    shift(@a)           splice(@a,0,1)
    unshift(@a,$x,$y)   splice(@a,0,0,$x,$y)
    $a[$i] = $y         splice(@a,$i,1,$y)

Example, assuming array lengths are passed before arrays:

    sub aeq {   # compare two list values
        my(@a) = splice(@_,0,shift);
        my(@b) = splice(@_,0,shift);
        return 0 unless @a == @b;       # same len?
        while (@a) {
            return 0 if pop(@a) ne pop(@b);
        }
        return 1;
    }
    if (&aeq($len,@foo[1..$len],0+@bar,@bar)) { ... }

=item split /PATTERN/,EXPR,LIMIT

=item split /PATTERN/,EXPR

=item split /PATTERN/

=item split

Splits the string EXPR into a list of strings and returns that list. By
default, empty leading fields are preserved, and empty trailing ones are
deleted. (If all fields are empty, they are considered to be trailing.)

In scalar context, returns the number of fields found and splits into
the C<@_> array. Use of split in scalar context is deprecated, however,
because it clobbers your subroutine arguments.

If EXPR is omitted, splits the C<$_> string. If PATTERN is also omitted,
splits on whitespace (after skipping any leading whitespace). Anything
matching PATTERN is taken to be a delimiter separating the fields. (Note
that the delimiter may be longer than one character.)

If LIMIT is specified and positive, it represents the maximum number
of fields the EXPR will be split into, though the actual number of
fields returned depends on the number of times PATTERN matches within
EXPR. If LIMIT is unspecified or zero, trailing null fields are
stripped (which potential users of C<pop> would do well to remember).
If LIMIT is negative, it is treated as if an arbitrarily large LIMIT
had been specified. Note that splitting an EXPR that evaluates to the
empty string always returns the empty list, regardless of the LIMIT
specified.
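
The three kinds of LIMIT can be compared side by side. The input string in this sketch is made up; fields are joined with C<|> so that empty ones remain visible.

```perl
use strict;
use warnings;

my $record = 'a,b,,c,,';

my @default  = split(/,/, $record);        # trailing empty fields dropped
my @negative = split(/,/, $record, -1);    # trailing empty fields kept
my @limited  = split(/,/, $record, 2);     # at most two fields
my @empty    = split(/,/, '', -1);         # empty string gives empty list

print join('|', @default),  "\n";    # a|b||c
print join('|', @negative), "\n";    # a|b||c||
print join('|', @limited),  "\n";    # a|b,,c,,
print scalar(@empty), "\n";          # 0
```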

A pattern matching the null string (not to be confused with
a null pattern C<//>, which is just one member of the set of patterns
matching a null string) will split the value of EXPR into separate
characters at each point it matches that way. For example:

    print join(':', split(/ */, 'hi there'));

produces the output 'h:i:t:h:e:r:e'.

As a special case for C<split>, using the empty pattern C<//> specifically
matches only the null string, and is not to be confused with the regular
use of C<//> to mean "the last successful pattern match". So, for C<split>,
the following:

    print join(':', split(//, 'hi there'));

produces the output 'h:i: :t:h:e:r:e'.

Empty leading (or trailing) fields are produced when there are positive
width matches at the beginning (or end) of the string; a zero-width match
at the beginning (or end) of the string does not produce an empty field.
For example:

    print join(':', split(/(?=\w)/, 'hi there!'));

produces the output 'h:i :t:h:e:r:e!'.

The LIMIT parameter can be used to split a line partially:

    ($login, $passwd, $remainder) = split(/:/, $_, 3);

When assigning to a list, if LIMIT is omitted, or zero, Perl supplies
a LIMIT one larger than the number of variables in the list, to avoid
unnecessary work. For the list above LIMIT would have been 4 by
default. In time critical applications it behooves you not to split
into more fields than you really need.

If the PATTERN contains parentheses, additional list elements are
created from each matching substring in the delimiter.

    split(/([,-])/, "1-10,20", 3);

produces the list value

    (1, '-', 10, ',', 20)

If you had the entire header of a normal Unix email message in $header,
you could split it up into fields and their values this way:

    $header =~ s/\n\s+/ /g;  # fix continuation lines
    %hdrs   =  (UNIX_FROM => split /^(\S*?):\s*/m, $header);

The pattern C</PATTERN/> may be replaced with an expression to specify
patterns that vary at runtime. (To do runtime compilation only once,
use C</$variable/o>.)

As a special case, specifying a PATTERN of space (S<C<' '>>) will split on
white space just as C<split> with no arguments does. Thus, S<C<split(' ')>> can
be used to emulate B<awk>'s default behavior, whereas S<C<split(/ /)>>
will give you as many null initial fields as there are leading spaces.
A C<split> on C</\s+/> is like a S<C<split(' ')>> except that any leading
whitespace produces a null first field. A C<split> with no arguments
really does a S<C<split(' ', $_)>> internally.

A PATTERN of C</^/> is treated as if it were C</^/m>, since it isn't
much use otherwise.

Example:

    open(PASSWD, '/etc/passwd');
    while (<PASSWD>) {
        chomp;
        ($login, $passwd, $uid, $gid,
         $gcos, $home, $shell) = split(/:/);
        #...
    }

As with regular pattern matching, any capturing parentheses that are not
matched in a C<split()> will be set to C<undef> when returned:

    @fields = split /(A)|B/, "1A2B3";
    # @fields is (1, 'A', 2, undef, 3)

=item sprintf FORMAT, LIST

Returns a string formatted by the usual C<printf> conventions of the C
library function C<sprintf>. See below for more details
and see L<sprintf(3)> or L<printf(3)> on your system for an explanation of
the general principles.

For example:

    # Format number with up to 8 leading zeroes
    $result = sprintf("%08d", $number);

    # Round number to 3 digits after decimal point
    $rounded = sprintf("%.3f", $number);

Perl does its own C<sprintf> formatting--it emulates the C
function C<sprintf>, but it doesn't use it (except for floating-point
numbers, and even then only the standard modifiers are allowed). As a
result, any non-standard extensions in your local C<sprintf> are not
available from Perl.

Unlike C<printf>, C<sprintf> does not do what you probably mean when you
pass it an array as your first argument. The array is given scalar context,
and instead of using the 0th element of the array as the format, Perl will
use the count of elements in the array as the format, which is almost never
useful.
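
The array pitfall can be shown in a few lines. The array contents in this sketch are made up; the second call shows one way to get the intended behavior by separating the format from its arguments first.

```perl
use strict;
use warnings;

my @args = ('%s-%s', 'a', 'b');

# The array is evaluated in scalar context, so the format is "3".
my $surprise = sprintf(@args);

# Pull the format out of the array before calling sprintf.
my ($format, @values) = @args;
my $intended = sprintf($format, @values);

print "$surprise\n";    # 3
print "$intended\n";    # a-b
```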
Perl's C<
sprintf
> permits the following universally-known conversions:
%% a percent sign
%c
a character
with
the
given
number
%s
a string
%d
a signed integer, in decimal
%u
an unsigned integer, in decimal
%o
an unsigned integer, in octal
%x
an unsigned integer, in hexadecimal
%e
a floating-point number, in scientific notation
%f
a floating-point number, in fixed decimal notation
%g
a floating-point number, in
%e
or
%f
notation
In addition, Perl permits the following widely-supported conversions:
%X
like
%x
, but using upper-case letters
%E
like
%e
, but using an upper-case
"E"
%G
like
%g
, but
with
an upper-case
"E"
(
if
applicable)
%b
an unsigned integer, in binary
%p
a pointer (outputs the Perl value's address in hexadecimal)
   %n    special: *stores* the number of characters output so far
         into the next variable in the parameter list
Finally,
for
backward (and we
do
mean
"backward"
) compatibility, Perl
permits these unnecessary but widely-supported conversions:
%i
a synonym
for
%d
%D
a synonym
for
%ld
%U
a synonym
for
%lu
%O
a synonym
for
%lo
%F
a synonym
for
%f
Note that the number of exponent digits in the scientific notation produced
by C<
%e
>, C<
%E
>, C<
%g
> and C<
%G
>
for
numbers
with
the modulus of the
exponent less than 100 is
system
-dependent: it may be three or less
(zero-padded as necessary). In other words, 1.23
times
ten to the
99th may be either
"1.23e99"
or
"1.23e099"
.
Between the C<%> and the
format
letter, you may specify a number of
additional attributes controlling the interpretation of the
format
.
In order, these are:
=over 4
=item
format
parameter
index
An explicit
format
parameter
index
, such as C<2$>. By
default
sprintf
will
format
the
next
unused argument in the list, but this allows you
to take the arguments out of order, e.g.:
printf
'%2$d %1$d'
, 12, 34;
printf
'%3$d %d %1$d'
, 1, 2, 3;
=item flags
one or more of:

   space   prefix positive number with a space
   +       prefix positive number with a plus sign
   -       left-justify within the field
   0       use zeros, not spaces, to right-justify
   #       prefix non-zero octal with "0", non-zero hex with "0x",
           non-zero binary with "0b"

For example:

   printf '<% d>',  12;   # prints "< 12>"
   printf '<%+d>',  12;   # prints "<+12>"
   printf '<%6s>',  12;   # prints "<    12>"
   printf '<%-6s>', 12;   # prints "<12    >"
   printf '<%06s>', 12;   # prints "<000012>"
   printf '<%#x>',  12;   # prints "<0xc>"

=item vector flag
The vector flag C<v>, optionally specifying the
join
string to
use
.
This flag tells perl to interpret the supplied string as a vector
of integers, one
for
each
character in the string, separated by
a
given
string (a dot C<.> by
default
). This can be useful
for
displaying ordinal
values
of characters in arbitrary strings:
printf
"version is v%vd\n"
, $^V;
Put an asterisk C<*>
before
the C<v> to
override
the string to
use
to separate the numbers:
printf
"address is %*vX\n"
,
":"
,
$addr
;
printf
"bits are %0*v8b\n"
,
" "
,
$bits
;
You can also explicitly specify the argument number to use for
the join string using e.g. C<*2$v>:
printf
'%*4$vX %*4$vX %*4$vX'
,
@addr
[1..3],
":"
;
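
A short sketch of the vector flag using C<sprintf>. The input strings are arbitrary sample data chosen for illustration.

```perl
use strict;
use warnings;

# Each character of the string becomes one integer, joined by ".".
my $ords = sprintf("%vd", "ABC");                       # "65.66.67"

# A packed IPv4 address is a four-character string, so %vd shows
# it in dotted-quad form.
my $ip = sprintf("%vd", pack("C4", 192, 168, 0, 1));    # "192.168.0.1"

# "*" before "v" takes the join string from the argument list.
my $hex = sprintf("%*vX", ":", "ABC");                  # "41:42:43"

print "$ords $ip $hex\n";
```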
=item (minimum) width
Arguments are usually formatted to be only as wide as required to
display the
given
value. You can
override
the width by putting
a number here, or get the width from the
next
argument (
with
C<*>)
or from a specified argument (
with
e.g. C<*2$>):
printf
'<%s>'
,
"a"
;
printf
'<%6s>'
,
"a"
;
printf
'<%*s>'
, 6,
"a"
;
printf
'<%*2$s>'
,
"a"
, 6;
printf
'<%2s>'
,
"long"
;
If a field width obtained through C<*> is negative, it
has
the same
effect as the C<-> flag: left-justification.
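
The width rules can be checked with C<sprintf>, as in this sketch:

```perl
use strict;
use warnings;

my $right = sprintf('<%*s>',   6, 'a');    # width from next argument
my $left  = sprintf('<%*s>',  -6, 'a');    # negative width left-justifies
my $index = sprintf('<%*2$s>', 'a', 6);    # width from argument 2
my $long  = sprintf('<%2s>',  'long');     # width is only a minimum

print "$right $left $index $long\n";
# <     a> <a     > <     a> <long>
```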
=item precision, or maximum width
You can specify a precision (
for
numeric conversions) or a maximum
width (
for
string conversions) by specifying a C<.> followed by a number.
For floating point formats,
with
the exception of
'g'
and
'G'
, this specifies
the number of decimal places to show (the
default
being 6), e.g.:
printf
'<%f>'
, 1;
printf
'<%.1f>'
, 1;
printf
'<%.0f>'
, 1;
printf
'<%e>'
, 10;
printf
'<%.1e>'
, 10;
For
'g'
and
'G'
, this specifies the maximum number of digits to show,
including prior to the decimal point as well as
after
it, e.g.:
printf
'<%g>'
, 1;
printf
'<%.10g>'
, 1;
printf
'<%g>'
, 100;
printf
'<%.1g>'
, 100;
printf
'<%.2g>'
, 100.01;
printf
'<%.5g>'
, 100.01;
printf
'<%.4g>'
, 100.01;
For integer conversions, specifying a precision implies that the
output of the number itself should be zero-padded to this width:
printf
'<%.6x>'
, 1;
printf
'<%#.6x>'
, 1; # prints
"<0x000001>"
printf
'<%-10.6x>'
, 1;
For string conversions, specifying a precision truncates the string
to fit in the specified width:
printf
'<%.5s>'
,
"truncated"
;
printf
'<%10.5s>'
,
"truncated"
;
You can also get the precision from the
next
argument using C<.*>:
printf
'<%.6x>'
, 1;
printf
'<%.*x>'
, 6, 1;
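
The precision rules for the different conversion types can be compared side by side. This is a sketch using C<sprintf> so that the results can be inspected:

```perl
use strict;
use warnings;

my $fixed  = sprintf('<%.1f>',   1);              # decimal places
my $digits = sprintf('<%.4g>',   100.01);         # significant digits
my $padded = sprintf('<%.6x>',   1);              # zero-padded integer
my $cut    = sprintf('<%10.5s>', 'truncated');    # truncate, then pad
my $star   = sprintf('<%.*x>',   6, 1);           # precision from argument

print "$fixed $digits $padded $cut $star\n";
# <1.0> <100> <000001> <     trunc> <000001>
```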
You cannot currently get the precision from a specified number,
but it is intended that this will be possible in the future using
e.g. C<.*2$>:
printf
'<%.*2$x>'
, 1, 6;
=item size
For numeric conversions, you can specify the size to interpret the
number as using C<l>, C<h>, C<V>, C<q>, C<L>, or C<ll>. For integer
conversions (C<d u o x X b i D U O>), numbers are usually assumed to be
whatever the default integer size is on your platform (usually 32 or 64
bits), but you can override this to use instead one of the standard C types,
as supported by the compiler used to build Perl:

   l           interpret integer as C type "long" or "unsigned long"
   h           interpret integer as C type "short" or "unsigned short"
   q, L or ll  interpret integer as C type "long long", "unsigned long long",
               or "quads" (typically 64-bit integers)

The
last
will produce errors
if
Perl does not understand
"quads"
in your
installation. (This requires that either the platform natively supports quads
or Perl was specifically compiled to support quads.) You can find out
whether your Perl supports quads via L<Config>:

    use Config;
    ($Config{use64bitint} eq 'define' || $Config{longsize} >= 8) &&
        print "quads\n";

For floating point conversions (C<e f g E F G>), numbers are usually assumed
to be the default floating point size on your platform (double or long double),
but you can force 'long double' with C<q>, C<L>, or C<ll> if your
platform supports them. You can find out whether your Perl supports long
doubles via L<Config>:
$Config
{d_longdbl} eq
'define'
&&
print
"long doubles\n"
;
You can find out whether Perl considers
'long double'
to be the
default
floating point size to
use
on your platform via L<Config>:
(
$Config
{uselongdouble} eq
'define'
) &&
print
"long doubles by default\n"
;
It can also be the case that long doubles and doubles are the same thing:
(
$Config
{doublesize} ==
$Config
{longdblsize}) &&
print
"doubles are long doubles\n"
;
The size specifier C<V>
has
no
effect
for
Perl code, but it is supported
for
compatibility
with
XS code; it means '
use
the standard size
for
a Perl integer (or floating-point number)', which is already the
default
for
Perl code.
=item order of arguments
Normally,
sprintf
takes the
next
unused argument as the value to
format
for
each
format
specification. If the
format
specification
uses C<*> to
require
additional arguments, these are consumed from
the argument list in the order in which they appear in the
format
specification I<
before
> the value to
format
. Where an argument is
specified using an explicit
index
, this does not affect the normal
order
for
the arguments (even
when
the explicitly specified
index
would have been the
next
argument in any case).
So:
printf
'<%*.*s>'
,
$a
,
$b
,
$c
;
would
use
C<
$a
>
for
the width, C<
$b
>
for
the precision and C<
$c
>
as the value to
format
,
while
:

  printf '<%*1$.*s>', $a, $b;

would
use
C<
$a
>
for
the width and the precision, and C<
$b
> as the
value to
format
.
Here are some more examples - beware that
when
using an explicit
index
, the C<$> may need to be escaped:

  printf "%2\$d %d\n",     12, 34;        # will print "34 12\n"
  printf "%2\$d %d %d\n",  12, 34;        # will print "34 12 34\n"
  printf "%3\$d %d %d\n",  12, 34, 56;    # will print "56 12 34\n"
  printf "%2\$*3\$d %d\n", 12, 34, 3;     # will print " 34 12\n"

=back
If C<
use
locale> is in effect, the character used
for
the decimal
point in formatted real numbers is affected by the LC_NUMERIC locale.
See L<perllocale>.
=item
sqrt
EXPR
=item
sqrt
Return the square root of EXPR. If EXPR is omitted, returns square
root of C<
$_
>. Only works on non-negative operands,
unless
you've
loaded the standard Math::Complex module.

    use Math::Complex;
    print sqrt(-2);    # prints 1.4142135623731i

=item
srand
EXPR
=item
srand
Sets the random number seed
for
the C<
rand
> operator.
The point of the function is to
"seed"
the C<
rand
> function so that
C<
rand
> can produce a different sequence
each
time
you run your
program.
If
srand
() is not called explicitly, it is called implicitly at the
first
use
of the C<
rand
> operator. However, this was not the case in
versions of Perl
before
5.004, so
if
your script will run under older
Perl versions, it should call C<
srand
>.
Most programs won't even call
srand
() at all, except those that
need a cryptographically-strong starting point rather than the
generally acceptable
default
, which is based on
time
of day,
process ID, and memory allocation, or the F</dev/urandom> device,
if
available.
You can call
srand
(
$seed
)
with
the same
$seed
to reproduce the
I<same> sequence from
rand
(), but this is usually reserved
for
generating predictable results
for
testing or debugging.
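
A minimal sketch of reseeding for reproducibility. The seed value 42 is arbitrary.

```perl
use strict;
use warnings;

# Seed, then draw five numbers.
srand(42);
my @first = map { int rand 100 } 1 .. 5;

# Reseeding with the same value replays the same sequence.
srand(42);
my @second = map { int rand 100 } 1 .. 5;

print "sequences match\n" if "@first" eq "@second";
```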
Otherwise, don't call
srand
() more than once in your program.
Do B<not> call
srand
() (i.e. without an argument) more than once in
a script. The internal state of the random number generator should
contain more entropy than can be provided by any seed, so calling
srand
() again actually I<loses> randomness.
Most implementations of C<
srand
> take an integer and will silently
truncate
decimal numbers. This means C<
srand
(42)> will usually
produce the same results as C<
srand
(42.1)>. To be safe, always pass
C<
srand
> an integer.
In versions of Perl prior to 5.004 the
default
seed was just the
current C<
time
>. This isn't a particularly good seed, so many old
programs supply their own seed value (often C<
time
^ $$> or C<
time
^
($$ + ($$ << 15))>), but that isn't necessary any more.
For cryptographic purposes, however, you need something much more random
than the
default
seed. Checksumming the compressed output of one or more
rapidly changing operating
system
status programs is the usual method. For
example:
srand
(
time
^ $$ ^
unpack
"%L*"
, `ps axww | gzip`);
If you're particularly concerned
with
this, see the C<Math::TrulyRandom>
module in CPAN.
Frequently called programs (like CGI scripts) that simply
use
time
^ $$
for
a seed can fall prey to the mathematical property that
a^b == (a+1)^(b+1)
one-third of the
time
. So don't
do
that.
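
The one-third figure can be checked by brute force. The sketch below counts, over all pairs of 8-bit values (an arbitrary range chosen to keep the loop small), how often incrementing both operands leaves the XOR unchanged:

```perl
use strict;
use warnings;

my ($same, $total) = (0, 0);
for my $x (0 .. 255) {
    for my $y (0 .. 255) {
        $total++;
        # Same seed from two different (time, pid) pairs.
        $same++ if ($x ^ $y) == (($x + 1) ^ ($y + 1));
    }
}
my $fraction = $same / $total;
printf "collision fraction: %.4f\n", $fraction;
```

The collision occurs exactly when both numbers end in the same number of 1 bits, which is why the fraction approaches one third.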
=item
stat
FILEHANDLE
=item
stat
EXPR
=item
stat
Returns a 13-element list giving the status info
for
a file, either
the file opened via FILEHANDLE, or named by EXPR. If EXPR is omitted,
it stats C<
$_
>. Returns a null list
if
the
stat
fails. Typically used
as follows:
(
$dev
,
$ino
,
$mode
,
$nlink
,
$uid
,
$gid
,
$rdev
,
$size
,
$atime
,
$mtime
,
$ctime
,
$blksize
,
$blocks
)
=
stat
(
$filename
);
Not all fields are supported on all filesystem types. Here are the
meanings of the fields:
0 dev device number of filesystem
1 ino inode number
2 mode file mode (type and permissions)
3 nlink number of (hard) links to the file
4 uid numeric user ID of file's owner
5 gid numeric group ID of file's owner
6 rdev the device identifier (special files only)
7 size total size of file, in bytes
8 atime
last
access
time
in seconds since the epoch
9 mtime
last
modify
time
in seconds since the epoch
10 ctime inode change
time
in seconds since the epoch (*)
11 blksize preferred block size
for
file
system
I/O
12 blocks actual number of blocks allocated
(The epoch was at 00:00 January 1, 1970 GMT.)
(*) Not all fields are supported on all filesystem types. Notably, the
ctime field is non-portable. In particular, you cannot expect it to be a
"creation time"
, see L<perlport/
"Files and Filesystems"
>
for
details.
If C<
stat
> is passed the special filehandle consisting of an underline,
no
stat
is done, but the current contents of the
stat
structure from the
last
C<
stat
>, C<
lstat
>, or filetest are returned. Example:
if
(-x
$file
&& ((
$d
) =
stat
(_)) &&
$d
< 0) {
print
"$file is executable NFS file\n"
;
}
(This works only on machines for which the device number is negative
under NFS.)
Because the mode contains both the file type and its permissions, you
should mask off the file type portion and (s)
printf
using a C<
"%o"
>
if
you want to see the real permissions.
$mode
= (
stat
(
$filename
))[2];
printf
"Permissions are %04o\n"
,
$mode
& 07777;
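
A self-contained sketch of the masking step, assuming a Unix-like filesystem where C<chmod> is honored. The temporary file comes from the standard File::Temp module and is used only for illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a scratch file and give it known permissions.
my ($fh, $path) = tempfile(UNLINK => 1);
chmod 0640, $path or die "chmod failed: $!";

# Field 2 of the stat list holds the file type and the permissions.
my $mode  = (stat($path))[2];
my $perms = sprintf "%04o", $mode & 07777;    # drop the file type bits

print "Permissions are $perms\n";    # 0640 on Unix-like systems
```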
In
scalar
context, C<
stat
> returns a boolean value indicating success
or failure, and,
if
successful, sets the information associated
with
the special filehandle C<_>.
The File::
stat
module provides a convenient, by-name access mechanism:

    use File::stat;
    $sb = stat($filename);
printf
"File is %s, size is %s, perm %04o, mtime %s\n"
,
$filename
,
$sb
->size,
$sb
->mode & 07777,
scalar
localtime
$sb
->mtime;
You can
import
symbolic mode constants (C<S_IF*>) and functions
(C<S_IS*>) from the Fcntl module:

    use Fcntl ':mode';

    $mode = (stat($filename))[2];

$user_rwx
= (
$mode
& S_IRWXU) >> 6;
$group_read
= (
$mode
& S_IRGRP) >> 3;
$other_execute
=
$mode
& S_IXOTH;

    printf "Permissions are %04o\n", S_IMODE($mode);

    $is_setuid    = $mode & S_ISUID;
    $is_directory = S_ISDIR($mode);

You could
write
the
last
two using the C<-u> and C<-d> operators.
The commonly available C<S_IF*> constants are
S_IRWXU S_IRUSR S_IWUSR S_IXUSR
S_IRWXG S_IRGRP S_IWGRP S_IXGRP
S_IRWXO S_IROTH S_IWOTH S_IXOTH
S_ISUID S_ISGID S_ISVTX S_ISTXT
S_IFREG S_IFDIR S_IFLNK S_IFBLK S_IFCHR S_IFIFO S_IFSOCK S_IFWHT S_ENFMT
S_IREAD S_IWRITE S_IEXEC
and the C<S_IF*> functions are
S_IMODE(
$mode
) the part of
$mode
containing the permission bits
and the setuid/setgid/sticky bits
S_IFMT(
$mode
) the part of
$mode
containing the file type
which can be bit-anded
with
e.g. S_IFREG
or
with
the following functions
S_ISREG(
$mode
) S_ISDIR(
$mode
) S_ISLNK(
$mode
)
S_ISBLK(
$mode
) S_ISCHR(
$mode
) S_ISFIFO(
$mode
) S_ISSOCK(
$mode
)
S_ISENFMT(
$mode
) S_ISWHT(
$mode
)
See your native
chmod
(2) and
stat
(2) documentation
for
more details
about the C<S_*> constants. To get status info
for
a symbolic
link
instead of the target file behind the
link
,
use
the C<
lstat
> function.
=item
study
SCALAR
=item
study
Takes extra
time
to
study
SCALAR (C<
$_
>
if
unspecified) in anticipation of
doing many pattern matches on the string
before
it is
next
modified.
This may or may not save
time
, depending on the nature and number of
patterns you are searching on, and on the distribution of character
frequencies in the string to be searched--you probably want to compare
run
times
with
and without it to see which runs faster. Those loops
that scan
for
many short constant strings (including the constant
parts of more complex patterns) will benefit most. You may have only
one C<
study
> active at a
time
--
if
you
study
a different
scalar
the first
is
"unstudied"
. (The way C<
study
> works is this: a linked list of every
character in the string to be searched is made, so we know,
for
example, where all the C<
'k'
> characters are. From
each
search string,
the rarest character is selected, based on some static frequency tables
constructed from some C programs and English text. Only those places
that contain this
"rarest"
character are examined.)
For example, here is a loop that inserts
index
producing entries
before
any line containing a certain pattern:
while
(<>) {
study
;
print
".IX foo\n"
if
/\bfoo\b/;
print
".IX bar\n"
if
/\bbar\b/;
print
".IX blurfl\n"
if
/\bblurfl\b/;
print
;
}
In searching
for
C</\bfoo\b/>, only those locations in C<
$_
> that contain C<f>
will be looked at, because C<f> is rarer than C<o>. In general, this is
a big win except in pathological cases. The only question is whether
it saves you more
time
than it took to build the linked list in the
first place.
Note that
if
you have to look
for
strings that you don't know till
runtime, you can build an entire loop as a string and C<
eval
> that to
avoid recompiling all your patterns all the
time
. Together
with
undefining C<$/> to input entire files as one record, this can be very
fast, often faster than specialized programs like fgrep(1). The following
scans a list of files (C<
@files
>)
for
a list of words (C<
@words
>), and prints
out the names of those files that contain a match:
$search
=
'while (<>) { study;'
;
foreach
$word
(
@words
) {
$search
.=
"++\$seen{\$ARGV} if /\\b$word\\b/;\n"
;
}
$search
.=
"}"
;
@ARGV
=
@files
;
undef
$/;
eval
$search
;
$/ =
"\n"
;
foreach
$file
(
sort
keys
(
%seen
)) {
print
$file
,
"\n"
;
}
=item
sub
NAME BLOCK
=item
sub
NAME (PROTO) BLOCK
=item
sub
NAME : ATTRS BLOCK
=item
sub
NAME (PROTO) : ATTRS BLOCK
This is a subroutine definition, not a real function I<per se>.
Without a BLOCK it's just a forward declaration. Without a NAME,
it's an anonymous function declaration, and does actually
return
a value: the CODE
ref
of the closure you just created.
See L<perlsub> and L<perlref>
for
details about subroutines and
references, and L<attributes> and L<Attribute::Handlers>
for
more
information about attributes.
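
As a brief sketch of the anonymous form, the value returned by C<sub> can be stored and called later. The name C<make_counter> is hypothetical.

```perl
use strict;
use warnings;

# Named definition that returns an anonymous sub (a closure over $count).
sub make_counter {
    my $count = 0;
    return sub { return ++$count };
}

my $counter = make_counter();
$counter->();
my $second = $counter->();      # 2: the closure kept its own $count
my $kind   = ref $counter;      # "CODE"

print "$kind $second\n";
```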
=item
substr
EXPR,OFFSET,LENGTH,REPLACEMENT
=item
substr
EXPR,OFFSET,LENGTH
=item
substr
EXPR,OFFSET
Extracts a substring out of EXPR and returns it. First character is at
offset C<0>, or whatever you've set C<$[> to (but don't do that).
If OFFSET is negative (or more precisely, less than C<$[>), starts
that far from the end of the string. If LENGTH is omitted, returns
everything to the end of the string. If LENGTH is negative, leaves that
many characters off the end of the string.
You can
use
the
substr
() function as an lvalue, in which case EXPR
must itself be an lvalue. If you assign something shorter than LENGTH,
the string will shrink, and
if
you assign something longer than LENGTH,
the string will grow to accommodate it. To keep the string the same
length
you may need to pad or
chop
your value using C<
sprintf
>.
If OFFSET and LENGTH specify a substring that is partly outside the
string, only the part within the string is returned. If the substring
is beyond either end of the string,
substr
() returns the undefined
value and produces a warning. When used as an lvalue, specifying a
substring that is entirely outside the string is a fatal error.
Here's an example showing the behavior
for
boundary cases:

    my $name = 'fred';
    substr($name, 4) = 'dy';          # $name is now 'freddy'
    my $null = substr $name, 6, 2;    # returns '' (no warning)
    my $oops = substr $name, 7;       # returns undef, with warning
    substr($name, 7) = 'gap';         # fatal error

An alternative to using
substr
() as an lvalue is to specify the
replacement string as the 4th argument. This allows you to replace
parts of the EXPR and
return
what was there
before
in one operation,
just as you can
with
splice
().
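
The two replacement styles can be compared in a short sketch. The sample strings are arbitrary.

```perl
use strict;
use warnings;

# Four-argument form: replaces in place and returns the old text.
my $s   = "The black cat";
my $old = substr($s, 4, 5, "white");

# Lvalue form: the string shrinks because "grey" is shorter than 5.
my $t = "The black cat";
substr($t, 4, 5) = "grey";

print "$old | $s | $t\n";    # black | The white cat | The grey cat
```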
=item
symlink
OLDFILE,NEWFILE
Creates a new filename symbolically linked to the old filename.
Returns C<1>
for
success, C<0> otherwise. On systems that don't support
symbolic links, produces a fatal error at run
time
. To check
for
that,
use
eval
:
$symlink_exists
=
eval
{
symlink
(
""
,
""
); 1 };
=item
syscall
NUMBER, LIST
Calls the
system
call specified as the first element of the list,
passing the remaining elements as arguments to the
system
call. If
unimplemented, produces a fatal error. The arguments are interpreted
as follows:
if
a
given
argument is numeric, the argument is passed as
an
int
. If not, the pointer to the string value is passed. You are
responsible for making sure a string is pre-extended long enough to
receive any result that might be written into a string. You can't
use
a
string literal (or other
read
-only string) as an argument to C<
syscall
>
because Perl
has
to assume that any string pointer might be written
through. If your
integer arguments are not literals and have never been interpreted in a
numeric context, you may need to add C<0> to them to force them to look
like numbers. This emulates the C<
syswrite
> function (or vice versa):
require
'syscall.ph'
;
$s
=
"hi there\n"
;
syscall
(
&SYS_write
,
fileno
(STDOUT),
$s
,
length
$s
);
Note that Perl supports passing only up to 14 arguments to your system call,
which in practice should usually suffice.

C<syscall> returns whatever value is returned by the system call it calls.
If the
system
call fails, C<
syscall
> returns C<-1> and sets C<$!> (errno).
Note that some
system
calls can legitimately
return
C<-1>. The proper
way to handle such calls is to assign C<$!=0;>
before
the call and
check the value of C<$!>
if
syscall
returns C<-1>.
There's a problem
with
C<
syscall
(
&SYS_pipe
)>: it returns the file
number of the
read
end of the
pipe
it creates. There is
no
way
to retrieve the file number of the other end. You can avoid this
problem by using C<
pipe
> instead.
=item
sysopen
FILEHANDLE,FILENAME,MODE
=item
sysopen
FILEHANDLE,FILENAME,MODE,PERMS
Opens the file whose filename is
given
by FILENAME, and associates it
with
FILEHANDLE. If FILEHANDLE is an expression, its value is used as
the name of the real filehandle wanted. This function calls the
underlying operating
system
's C<
open
> function
with
the parameters
FILENAME, MODE, PERMS.
The possible
values
and flag bits of the MODE parameter are
system
-dependent; they are available via the standard module C<Fcntl>.
See the documentation of your operating
system
's C<
open
> to see which
values
and flag bits are available. You may combine several flags
using the C<|>-operator.
Some of the most common
values
are C<O_RDONLY>
for
opening the file in
read
-only mode, C<O_WRONLY>
for
opening the file in
write
-only mode,
and C<O_RDWR>
for
opening the file in
read
-
write
mode.
For historical reasons, some
values
work on almost every
system
supported by perl: zero means
read
-only, one means
write
-only, and two
means
read
/
write
. We know that these
values
do
I<not> work under
OS/390 & VM/ESA Unix and on the Macintosh; you probably don't want to
use them in new code.

If the file named by FILENAME does not exist and the C<
open
> call creates
it (typically because MODE includes the C<O_CREAT> flag), then the value of
PERMS specifies the permissions of the newly created file. If you omit
the PERMS argument to C<
sysopen
>, Perl uses the octal value C<0666>.
These permission
values
need to be in octal, and are modified by your
process's current C<
umask
>.
In many systems the C<O_EXCL> flag is available
for
opening files in
exclusive mode. This is B<not> locking: exclusiveness means here that
if
the file already
exists
,
sysopen
() fails. C<O_EXCL> may not work
on network filesystems, and
has
no
effect
unless
the C<O_CREAT> flag
is set as well. Setting C<O_CREAT|O_EXCL> prevents the file from
being opened
if
it is a symbolic
link
. It does not protect against
symbolic links in the file's path.
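
A sketch of exclusive creation, assuming a local filesystem. The file name C<lockfile> and the temporary directory are made up for illustration.

```perl
use strict;
use warnings;
use Fcntl qw(O_WRONLY O_CREAT O_EXCL);
use Errno qw(EEXIST);
use File::Temp qw(tempdir);

my $dir  = tempdir(CLEANUP => 1);
my $path = "$dir/lockfile";          # hypothetical file name

# First attempt creates the file.
my $first = sysopen(my $fh, $path, O_WRONLY | O_CREAT | O_EXCL);

# Second attempt fails because the file now exists.
my $second = sysopen(my $fh2, $path, O_WRONLY | O_CREAT | O_EXCL);
my $errno  = $! + 0;                 # save errno before anything resets it

close $fh if $first;
print "second open refused: file exists\n"
    if !$second && $errno == EEXIST;
```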
Sometimes you may want to
truncate
an already-existing file. This
can be done using the C<O_TRUNC> flag. The behavior of
C<O_TRUNC>
with
C<O_RDONLY> is undefined.
You should seldom
if
ever
use
C<0644> as argument to C<
sysopen
>, because
that takes away the user's option to have a more permissive
umask
.
Better to omit it. See the perlfunc(1) entry on C<
umask
>
for
more
on this.
Note that C<
sysopen
> depends on the fdopen() C library function.
On many UNIX systems, fdopen() is known to fail
when
file descriptors
exceed a certain value, typically 255. If you need more file
descriptors than that, consider rebuilding Perl to
use
the C<sfio>
library, or perhaps using the POSIX::
open
() function.
See L<perlopentut>
for
a kinder, gentler explanation of opening files.
=item
sysread
FILEHANDLE,SCALAR,LENGTH,OFFSET
=item
sysread
FILEHANDLE,SCALAR,LENGTH
Attempts to
read
LENGTH bytes of data into variable SCALAR from the
specified FILEHANDLE, using the
system
call
read
(2). It bypasses
buffered IO, so mixing this
with
other kinds of reads, C<
print
>,
C<
write
>, C<
seek
>, C<
tell
>, or C<
eof
> can cause confusion because the
perlio or stdio layers usually buffer data. Returns the number of
bytes actually
read
, C<0> at end of file, or
undef
if
there was an
error (in the latter case C<$!> is also set). SCALAR will be grown or
shrunk so that the
last
byte actually
read
is the
last
byte of the
scalar
after
the
read
.
An OFFSET may be specified to place the
read
data at some place in the
string other than the beginning. A negative OFFSET specifies
placement at that many characters counting backwards from the end of
the string. A positive OFFSET greater than the
length
of SCALAR
results in the string being padded to the required size
with
C<
"\0"
>
bytes
before
the result of the
read
is appended.
There is
no
syseof() function, which is ok, since
eof
() doesn't work
very well on device files (like ttys) anyway. Use
sysread
() and check
for a return value of 0 to decide whether you're done.
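
A minimal read loop along these lines, using OFFSET to append each chunk to the buffer. The scratch file and the chunk size of 4 are arbitrary choices for the sketch.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Prepare a small file to read back.
my ($out, $path) = tempfile(UNLINK => 1);
binmode $out;
print {$out} "0123456789";
close $out or die "close failed: $!";

open(my $in, '<:raw', $path) or die "open failed: $!";
my $buf = '';
while (1) {
    # OFFSET = current length, so each chunk lands at the end of $buf.
    my $n = sysread($in, $buf, 4, length $buf);
    die "sysread failed: $!" unless defined $n;
    last if $n == 0;                 # 0 means end of file
}
close $in;

print "$buf\n";    # 0123456789
```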
Note that
if
the filehandle
has
been marked as C<:utf8> Unicode
characters are
read
instead of bytes (the LENGTH, OFFSET, and the
return
value of
sysread
() are in Unicode characters).
The C<:encoding(...)> layer implicitly introduces the C<:utf8> layer.
See L</
binmode
>, L</
open
>, and the C<
open
> pragma, L<
open
>.
=item
sysseek
FILEHANDLE,POSITION,WHENCE
Sets FILEHANDLE's
system
position in bytes using the
system
call
lseek(2). FILEHANDLE may be an expression whose value gives the name
of the filehandle. The
values
for
WHENCE are C<0> to set the new
position to POSITION, C<1> to set it to the current position plus
POSITION, and C<2> to set it to EOF plus POSITION (typically
negative).
Note the I<in bytes>: even
if
the filehandle
has
been set to operate
on characters (
for
example by using the C<:utf8> I/O layer),
tell
()
will
return
byte offsets, not character offsets (because implementing
that would render
sysseek
() very slow).
sysseek
() bypasses normal buffered IO, so mixing this
with
reads (other
than C<
sysread
>,
for
example C<< <> >> or
read
()), C<
print
>, C<
write
>,
C<
seek
>, C<
tell
>, or C<
eof
> may cause confusion.
For WHENCE, you may also
use
the constants C<SEEK_SET>, C<SEEK_CUR>,
and C<SEEK_END> (start of the file, current position, end of the file)
from the Fcntl module. Use of the constants is also more portable
than relying on 0, 1, and 2. For example to define a
"systell"
function:
sub
systell {
sysseek
(
$_
[0], 0, SEEK_CUR) }
Returns the new position, or the undefined value on failure. A position
of zero is returned as the string C<
"0 but true"
>; thus C<
sysseek
> returns
true on success and false on failure, yet you can still easily determine
the new position.
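
The return values can be observed in a short sketch that writes six bytes to a scratch file (an arbitrary example) and then seeks around in it:

```perl
use strict;
use warnings;
use Fcntl qw(SEEK_SET SEEK_END);
use File::Temp qw(tempfile);

my ($fh, $path) = tempfile(UNLINK => 1);
defined syswrite($fh, "abcdef") or die "syswrite failed: $!";

# Position zero comes back as a true value that is numerically 0.
my $start = sysseek($fh, 0, SEEK_SET);
my $end   = sysseek($fh, 0, SEEK_END);

print "start is true and zero\n" if $start && $start == 0;
print "end is $end\n";    # 6
```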
=item
system
LIST
=item
system
PROGRAM LIST
Does exactly the same thing as C<
exec
LIST>, except that a
fork
is
done first, and the parent process waits
for
the child process to
complete. Note that argument processing varies depending on the
number of arguments. If there is more than one argument in LIST,
or
if
LIST is an array
with
more than one value, starts the program
given
by the first element of the list
with
arguments
given
by the
rest of the list. If there is only one
scalar
argument, the argument
is checked
for
shell metacharacters, and
if
there are any, the
entire argument is passed to the
system
's command shell
for
parsing
(this is C</bin/sh -c> on Unix platforms, but varies on other
platforms). If there are
no
shell metacharacters in the argument,
it is
split
into words and passed directly to C<execvp>, which is
more efficient.
Beginning
with
v5.6.0, Perl will attempt to flush all files opened
for
output
before
any operation that may
do
a
fork
, but this may not be
supported on some platforms (see L<perlport>). To be safe, you may need
to set C<$|> (
$AUTOFLUSH
in English) or call the C<autoflush()> method
of C<IO::Handle> on any
open
handles.
The
return
value is the
exit
status of the program as returned by the
C<
wait
> call. To get the actual
exit
value,
shift
right by eight (see
below). See also L</
exec
>. This is I<not> what you want to
use
to capture
the output from a command; for that you should merely use backticks or
C<
qx//
>, as described in L<perlop/
"`STRING`"
>. Return value of -1
indicates a failure to start the program or an error of the
wait
(2)
system
call (inspect $!
for
the reason).
Like C<
exec
>, C<
system
> allows you to lie to a program about its name
if
you
use
the C<
system
PROGRAM LIST> syntax. Again, see L</
exec
>.
Since C<SIGINT> and C<SIGQUIT> are ignored during the execution of
C<
system
>,
if
you expect your program to terminate on receipt of these
signals you will need to arrange to
do
so yourself based on the
return
value.
@args
= (
"command"
,
"arg1"
,
"arg2"
);
system
(
@args
) == 0
or
die "system @args failed: $?";
You can check all the failure possibilities by inspecting
C<$?> like this:
if
($? == -1) {
print
"failed to execute: $!\n"
;
}
elsif
($? & 127) {
printf
"child died with signal %d, %s coredump\n"
,
($? & 127), ($? & 128) ?
'with'
:
'without'
;
}
else
{
printf
"child exited with value %d\n"
, $? >> 8;
}
or more portably by using the W*() calls of the POSIX extension;
see L<perlport>
for
more information.
When the arguments get executed via the
system
shell, results
and
return
codes will be subject to its quirks and capabilities.
See L<perlop/
"`STRING`"
> and L</
exec
>
for
details.
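
A runnable sketch of the list form together with status decoding. It uses the running perl binary (C<$^X>) as the child program so that no external command is assumed.

```perl
use strict;
use warnings;

# List form: no shell is involved, arguments are passed as given.
my @cmd    = ($^X, '-e', 'exit 3');
my $status = system(@cmd);

my $report;
if ($status == -1) {
    $report = "failed to execute: $!";
}
elsif ($status & 127) {
    $report = sprintf "died with signal %d", $status & 127;
}
else {
    # The exit value lives in the high byte of the wait status.
    $report = sprintf "exited with value %d", $status >> 8;
}
print "$report\n";    # exited with value 3
```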
=item
syswrite
FILEHANDLE,SCALAR,LENGTH,OFFSET
=item
syswrite
FILEHANDLE,SCALAR,LENGTH
=item
syswrite
FILEHANDLE,SCALAR
Attempts to
write
LENGTH bytes of data from variable SCALAR to the
specified FILEHANDLE, using the
system
call
write
(2). If LENGTH is
not specified, writes whole SCALAR. It bypasses buffered IO, so
mixing this
with
reads (other than C<sysread()>), C<
print
>, C<
write
>,
C<
seek
>, C<
tell
>, or C<
eof
> may cause confusion because the perlio and
stdio layers usually buffer data. Returns the number of bytes
actually written, or C<
undef
>
if
there was an error (in this case the
errno variable C<$!> is also set). If the LENGTH is greater than the
available data in the SCALAR
after
the OFFSET, only as much data as is
available will be written.
An OFFSET may be specified to
write
the data from some part of the
string other than the beginning. A negative OFFSET specifies writing
that many characters counting backwards from the end of the string.
If SCALAR is empty, the only OFFSET you may use is zero.
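
The LENGTH and OFFSET rules can be seen in this sketch, which writes parts of a sample string to a scratch file and reads the result back:

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my ($fh, $path) = tempfile(UNLINK => 1);
my $data = "hello world";

# Write 5 bytes starting at offset 6: "world".
my $n1 = syswrite($fh, $data, 5, 6);

# Negative offset counts from the end; LENGTH beyond the available
# data is cut down to what is there, so this also writes "world".
my $n2 = syswrite($fh, $data, 100, -5);
close $fh or die "close failed: $!";

open(my $in, '<:raw', $path) or die "open failed: $!";
my $content = do { local $/; <$in> };
close $in;

print "$n1 $n2 $content\n";    # 5 5 worldworld
```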
Note that
if
the filehandle
has
been marked as C<:utf8>, Unicode
characters are written instead of bytes (the LENGTH, OFFSET, and the
return
value of
syswrite
() are in UTF-8 encoded Unicode characters).
The C<:encoding(...)> layer implicitly introduces the C<:utf8> layer.
See L</
binmode
>, L</
open
>, and the C<
open
> pragma, L<
open
>.
=item
tell
FILEHANDLE
=item
tell
Returns the current position I<in bytes>
for
FILEHANDLE, or -1 on
error. FILEHANDLE may be an expression whose value gives the name of
the actual filehandle. If FILEHANDLE is omitted, assumes the file
last
read
.
Note the I<in bytes>: even
if
the filehandle
has
been set to
operate on characters (
for
example by using the C<:utf8>
open
layer),
tell
() will
return
byte offsets, not character offsets
(because that would render
seek
() and
tell
() rather slow).
The
return
value of
tell
()
for
the standard streams like the STDIN
depends on the operating
system
: it may
return
-1 or something
else
.
tell
() on pipes, fifos, and sockets usually returns -1.
There is
no
C<systell> function. Use C<
sysseek
(FH, 0, 1)>
for
that.
Do not
use
tell
() (or other buffered I/O operations) on a file handle
that
has
been manipulated by
sysread
(),
syswrite
() or
sysseek
().
Those functions ignore the buffering,
while
tell
() does not.
=item
telldir
DIRHANDLE
Returns the current position of the C<
readdir
> routines on DIRHANDLE.
Value may be
given
to C<
seekdir
> to access a particular location in a
directory. C<
telldir
>
has
the same caveats about possible directory
compaction as the corresponding
system
library routine.

=item tie VARIABLE,CLASSNAME,LIST

This function binds a variable to a package class that will provide the
implementation for the variable.  VARIABLE is the name of the variable
to be enchanted.  CLASSNAME is the name of a class implementing objects
of correct type.  Any additional arguments are passed to the C<new>
method of the class (meaning C<TIESCALAR>, C<TIEHANDLE>, C<TIEARRAY>,
or C<TIEHASH>).  Typically these are arguments such as might be passed
to the C<dbm_open()> function of C.  The object returned by the C<new>
method is also returned by the C<tie> function, which would be useful
if you want to access other methods in CLASSNAME.

Note that functions such as C<keys> and C<values> may return huge lists
when used on large objects, like DBM files.  You may prefer to use the
C<each> function to iterate over such objects.  Example:

    # print out history file offsets
    use NDBM_File;
    tie(%HIST, 'NDBM_File', '/usr/lib/news/history', 1, 0);
    while (($key,$val) = each %HIST) {
        print $key, ' = ', unpack('L',$val), "\n";
    }
    untie(%HIST);

A class implementing a hash should have the following methods:

    TIEHASH classname, LIST
    FETCH this, key
    STORE this, key, value
    DELETE this, key
    CLEAR this
    EXISTS this, key
    FIRSTKEY this
    NEXTKEY this, lastkey
    SCALAR this
    DESTROY this
    UNTIE this

A class implementing an ordinary array should have the following methods:

    TIEARRAY classname, LIST
    FETCH this, key
    STORE this, key, value
    FETCHSIZE this
    STORESIZE this, count
    CLEAR this
    PUSH this, LIST
    POP this
    SHIFT this
    UNSHIFT this, LIST
    SPLICE this, offset, length, LIST
    EXTEND this, count
    DESTROY this
    UNTIE this

A class implementing a file handle should have the following methods:

    TIEHANDLE classname, LIST
    READ this, scalar, length, offset
    READLINE this
    GETC this
    WRITE this, scalar, length, offset
    PRINT this, LIST
    PRINTF this, format, LIST
    BINMODE this
    EOF this
    FILENO this
    SEEK this, position, whence
    TELL this
    OPEN this, mode, LIST
    CLOSE this
    DESTROY this
    UNTIE this

A class implementing a scalar should have the following methods:

    TIESCALAR classname, LIST
    FETCH this
    STORE this, value
    DESTROY this
    UNTIE this

Not all methods indicated above need be implemented.  See L<perltie>,
L<Tie::Hash>, L<Tie::Array>, L<Tie::Scalar>, and L<Tie::Handle>.

Unlike C<dbmopen>, the C<tie> function will not use or require a module
for you--you need to do that explicitly yourself.  See L<DB_File>
or the F<Config> module for interesting C<tie> implementations.

For further details see L<perltie>, L<"tied VARIABLE">.
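
As a rough illustration of the scalar interface, here is a minimal
sketch of a tied scalar that counts how often it is read.  The class
name C<CountingScalar> and its C<reads> method are hypothetical and
exist only for this example; only C<TIESCALAR>, C<FETCH>, and C<STORE>
are implemented, which is enough for simple use.

```perl
use strict;
use warnings;

# Hypothetical class: a scalar that counts how often it is read.
package CountingScalar;

sub TIESCALAR {
    my ($class, $initial) = @_;
    return bless { value => $initial, reads => 0 }, $class;
}

sub FETCH {
    my ($self) = @_;
    $self->{reads}++;
    return $self->{value};
}

sub STORE {
    my ($self, $value) = @_;
    $self->{value} = $value;
}

sub reads { $_[0]{reads} }

package main;

# tie() returns the object created by TIESCALAR.
my $object = tie(my $counter, 'CountingScalar', 10);

my $first  = $counter;      # calls FETCH
my $second = $counter;      # calls FETCH again
$counter   = 5;             # calls STORE

my $reads       = $object->reads;
my $same_object = (tied($counter) == $object) ? 1 : 0;
print "reads=$reads same_object=$same_object\n";

# Drop the extra reference before untying the variable.
undef $object;
untie($counter);
```

Keeping the object returned by C<tie> is what allows the extra
C<reads> method to be called; C<tied> returns the same object later.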

=item tied VARIABLE

Returns a reference to the object underlying VARIABLE (the same value
that was originally returned by the C<tie> call that bound the variable
to a package.)  Returns the undefined value if VARIABLE isn't tied to a
package.

=item time

Returns the number of non-leap seconds since whatever time the system
considers to be the epoch, suitable for feeding to C<gmtime> and
C<localtime>.  On most systems the epoch is 00:00:00 UTC, January 1, 1970;
a prominent exception being Mac OS Classic which uses 00:00:00, January 1,
1904 in the current local time zone for its epoch.

For measuring time in better granularity than one second,
you may use either the Time::HiRes module (from CPAN, and starting from
Perl 5.8 part of the standard distribution), or if you have
gettimeofday(2), you may be able to use the C<syscall> interface of Perl.
See L<perlfaq8> for details.
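
A minimal sketch of sub-second timing, assuming Time::HiRes is
installed.  Importing C<time> and C<sleep> from that module replaces
the builtins within the importing file.

```perl
use strict;
use warnings;
use Time::HiRes qw(time sleep);   # floating-point versions of both

my $start = time();               # fractional seconds since the epoch
sleep(0.05);                      # fractional sleep
my $elapsed = time() - $start;

printf "elapsed: %.3f seconds\n", $elapsed;
```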

=item times

Returns a four-element list giving the user and system times, in
seconds, for this process and the children of this process.

    ($user,$system,$cuser,$csystem) = times;

In scalar context, C<times> returns C<$user>.

=item tr///

The transliteration operator.  Same as C<y///>.  See L<perlop>.

=item truncate FILEHANDLE,LENGTH

=item truncate EXPR,LENGTH

Truncates the file opened on FILEHANDLE, or named by EXPR, to the
specified length.  Produces a fatal error if truncate isn't implemented
on your system.  Returns true if successful, the undefined value
otherwise.

The behavior is undefined if LENGTH is greater than the length of the
file.
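
The following sketch truncates a file by name and checks the return
value.  The scratch file comes from C<File::Temp> and is an assumption
of the example, not part of the C<truncate> interface.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a ten-byte scratch file.
my ($fh, $path) = tempfile(UNLINK => 1);
print {$fh} "0123456789";
close($fh) or die "close: $!";

# Keep only the first four bytes.  truncate() returns undef on
# failure, so the result is checked.
truncate($path, 4) or die "truncate $path: $!";

my $size = -s $path;
print "size after truncate: $size\n";
```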

=item uc EXPR

=item uc

Returns an uppercased version of EXPR.  This is the internal function
implementing the C<\U> escape in double-quoted strings.  Respects
current LC_CTYPE locale if C<use locale> is in force.  See L<perllocale>
and L<perlunicode> for more details about locale and Unicode support.
It does not attempt to do titlecase mapping on initial letters.  See
C<ucfirst> for that.

If EXPR is omitted, uses C<$_>.

=item ucfirst EXPR

=item ucfirst

Returns the value of EXPR with the first character in uppercase
(titlecase in Unicode).  This is the internal function implementing
the C<\u> escape in double-quoted strings.  Respects current LC_CTYPE
locale if C<use locale> is in force.  See L<perllocale> and L<perlunicode>
for more details about locale and Unicode support.

If EXPR is omitted, uses C<$_>.
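
A short sketch comparing C<uc> and C<ucfirst> with the corresponding
string escapes, using plain ASCII input.  The variable names are
illustrative.

```perl
use strict;
use warnings;

my $word = 'perl';

my $shout       = uc($word);        # whole string uppercased
my $proper      = ucfirst($word);   # only the first character
my $same_shout  = "\U$word";        # escape form of uc
my $same_proper = "\u$word";        # escape form of ucfirst

# Combine with lc() to normalize mixed-case input.
my $sentence = ucfirst(lc('hELLO wORLD'));

print "$shout $proper $sentence\n";
```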

=item umask EXPR

=item umask

Sets the umask for the process to EXPR and returns the previous value.
If EXPR is omitted, merely returns the current umask.

The Unix permission C<rwxr-x---> is represented as three sets of three
bits, or three octal digits: C<0750> (the leading 0 indicates octal
and isn't one of the digits).  The C<umask> value is such a number
representing disabled permission bits.  The permission (or "mode")
values you pass C<mkdir> or C<sysopen> are modified by your umask, so
even if you tell C<sysopen> to create a file with permissions C<0777>,
if your umask is C<0022> then the file will actually be created with
permissions C<0755>.  If your C<umask> were C<0027> (group can't
write; others can't read, write, or execute), then passing
C<sysopen> C<0666> would create a file with mode C<0640>
(C<0666 & ~027> is C<0640>).

Here's some advice: supply a creation mode of C<0666> for regular
files (in C<sysopen>) and one of C<0777> for directories (in
C<mkdir>) and executable files.  This gives users the freedom of
choice: if they want protected files, they might choose process umasks
of C<022>, C<027>, or even the particularly antisocial mask of C<077>.
Programs should rarely if ever make policy decisions better left to
the user.  The exception to this is when writing files that should be
kept private: mail files, web browser cookies, I<.rhosts> files, and
so on.

If umask(2) is not implemented on your system and you are trying to
restrict access for I<yourself> (i.e., (EXPR & 0700) > 0), produces a
fatal error at run time.  If umask(2) is not implemented and you are
not trying to restrict access for yourself, returns C<undef>.

Remember that a umask is a number, usually given in octal; it is I<not> a
string of octal digits.  See also L</oct>, if all you have is a string.
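
The mode arithmetic described above can be checked with a short
sketch.  The helper C<effective_mode> is hypothetical and only models
the calculation; it does not call C<umask> or create any files.

```perl
use strict;
use warnings;

# Effective mode = requested mode with the umask bits cleared.
sub effective_mode {
    my ($requested, $mask) = @_;
    return $requested & ~$mask;
}

printf "%04o\n", effective_mode(0777, 0022);
printf "%04o\n", effective_mode(0666, 0027);

# A umask held as a string of octal digits must go through oct() first.
my $mask_string = '027';
my $mask        = oct($mask_string);
printf "%04o\n", effective_mode(0666, $mask);
```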

=item undef EXPR

=item undef

Undefines the value of EXPR, which must be an lvalue.  Use only on a
scalar value, an array (using C<@>), a hash (using C<%>), a subroutine
(using C<&>), or a typeglob (using C<*>).  (Saying C<undef $hash{$key}>
will probably not do what you expect on most predefined variables or
DBM list values, so don't do that; see L<delete>.)  Always returns the
undefined value.  You can omit the EXPR, in which case nothing is
undefined, but you still get an undefined value that you could, for
instance, return from a subroutine, assign to a variable or pass as a
parameter.  Examples:

    undef $foo;
    undef $bar{'blurfl'};      # Compare to: delete $bar{'blurfl'};
    undef @ary;
    undef %hash;
    undef &mysub;
    undef *xyz;                # destroys $xyz, @xyz, %xyz, &xyz, etc.
    return (wantarray ? (undef, $errmsg) : undef) if $they_blew_it;
    select undef, undef, undef, 0.25;
    ($a, $b, undef, $c) = &foo;       # Ignore third value returned

Note that this is a unary operator, not a list operator.

=item unlink LIST

=item unlink

Deletes a list of files.  Returns the number of files successfully
deleted.

    $cnt = unlink 'a', 'b', 'c';
    unlink @goners;
    unlink <*.bak>;

Note: C<unlink> will not delete directories unless you are superuser and
the B<-U> flag is supplied to Perl.  Even if these conditions are
met, be warned that unlinking a directory can inflict damage on your
filesystem.  Use C<rmdir> instead.

If LIST is omitted, uses C<$_>.
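
Because C<unlink> returns a count rather than a simple true or false,
it can be worth comparing that count with the number of files
requested.  The sketch below assumes a scratch directory from
C<File::Temp>; the file names are made up for the example.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Create three scratch files in a temporary directory.
my $dir    = tempdir(CLEANUP => 1);
my @goners = map { "$dir/$_.bak" } qw(a b c);
for my $file (@goners) {
    open(my $fh, '>', $file) or die "create $file: $!";
    close($fh);
}

# Ask for one file that does not exist as well.
my @targets = (@goners, "$dir/missing.bak");
my $deleted = unlink(@targets);

printf "deleted %d of %d files\n", $deleted, scalar @targets;
```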

=item unpack TEMPLATE,EXPR

C<unpack> does the reverse of C<pack>: it takes a string
and expands it out into a list of values.
(In scalar context, it returns merely the first value produced.)

The string is broken into chunks described by the TEMPLATE.  Each chunk
is converted separately to a value.  Typically, either the string is a result
of C<pack>, or the bytes of the string represent a C structure of some
kind.

The TEMPLATE has the same format as in the C<pack> function.
Here's a subroutine that does substring:

    sub substr {
        my($what,$where,$howmuch) = @_;
        unpack("x$where a$howmuch", $what);
    }

and then there's

    sub ordinal { unpack("c",$_[0]); }

In addition to fields allowed in pack(), you may prefix a field with
a %<number> to indicate that
you want a <number>-bit checksum of the items instead of the items
themselves.  Default is a 16-bit checksum.  Checksum is calculated by
summing numeric values of expanded values (for string fields the sum of
C<ord($char)> is taken, for bit fields the sum of zeroes and ones).

For example, the following
computes the same number as the System V sum program:

    $checksum = do {
        local $/;  # slurp!
        unpack("%32C*",<>) % 65535;
    };

The following efficiently counts the number of set bits in a bit vector:

    $setbits = unpack("%32b*", $selectmask);
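
To make the checksum prefix concrete, the sketch below compares
C<%32C*> against a hand-written sum of byte values, and counts bits in
a small vector built with C<vec>.  The sample data is arbitrary.

```perl
use strict;
use warnings;

my $data = "Perl";

# Checksum computed by unpack: the sum of the byte values.
my $checksum = unpack("%32C*", $data);

# The same sum computed by hand for comparison.
my $manual = 0;
$manual += ord($_) for split //, $data;

# Build a bit vector with three bits set, then count them.
my $vector = '';
vec($vector, $_, 1) = 1 for (1, 4, 9);
my $setbits = unpack("%32b*", $vector);

print "checksum=$checksum manual=$manual setbits=$setbits\n";
```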

The C<p> and C<P> formats should be used with care.  Since Perl
has no way of checking whether the value passed to C<unpack()>
corresponds to a valid memory location, passing a pointer value that's
not known to be valid is likely to have disastrous consequences.

If there are more pack codes or if the repeat count of a field or a group
is larger than what the remainder of the input string allows, the result
is not well defined: in some cases, the repeat count is decreased, or
C<unpack()> will produce null strings or zeroes, or terminate with an
error.  If the input string is longer than one described by the TEMPLATE,
the rest is ignored.

See L</pack> for more examples and notes.

=item untie VARIABLE

Breaks the binding between a variable and a package.  (See C<tie>.)
Has no effect if the variable is not tied.

=item unshift ARRAY,LIST

Does the opposite of a C<shift>.  Or the opposite of a C<push>,
depending on how you look at it.  Prepends list to the front of the
array, and returns the new number of elements in the array.

    unshift(@ARGV, '-e') unless $ARGV[0] =~ /^-/;

Note the LIST is prepended whole, not one element at a time, so the
prepended elements stay in the same order.  Use C<reverse> to do the
reverse.
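
The ordering rule can be seen in a small sketch; the array names are
illustrative.

```perl
use strict;
use warnings;

# The LIST is prepended as a unit, so its order is preserved.
my @queue = (3, 4);
my $count = unshift(@queue, 1, 2);

# Reversing the LIST first gives the element-at-a-time order.
my @reversed = (3, 4);
unshift(@reversed, reverse(1, 2));

print "queue:    @queue (now $count elements)\n";
print "reversed: @reversed\n";
```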

=item use Module VERSION LIST

=item use Module VERSION

=item use Module LIST

=item use Module

=item use VERSION

Imports some semantics into the current package from the named module,
generally by aliasing certain subroutine or variable names into your
package.  It is exactly equivalent to

    BEGIN { require Module; import Module LIST; }

except that Module I<must> be a bareword.

VERSION may be either a numeric argument such as 5.006, which will be
compared to C<$]>, or a literal of the form v5.6.1, which will be compared
to C<$^V> (aka $PERL_VERSION).  A fatal error is produced if VERSION is
greater than the version of the current Perl interpreter; Perl will not
attempt to parse the rest of the file.  Compare with L</require>, which can
do a similar check at run time.

Specifying VERSION as a literal of the form v5.6.1 should generally be
avoided, because it leads to misleading error messages under earlier
versions of Perl that do not support this syntax.  The equivalent numeric
version should be used instead.

    use 5.6.1;          # compile time version check
    use 5.006_001;      # ditto; preferred for backwards compatibility

This is often useful if you need to check the current Perl version before
C<use>ing library modules that have changed in incompatible ways from
older versions of Perl.  (We try not to do this more than we have to.)

The C<BEGIN> forces the C<require> and C<import> to happen at compile
time.  The C<require> makes sure the module is loaded into memory if it
hasn't been yet.  The C<import> is not a builtin--it's just an ordinary
static method call into the C<Module> package to tell the module to
import the list of features back into the current package.  The module
can implement its C<import> method any way it likes, though most modules
just choose to derive their C<import> method via inheritance from the
C<Exporter> class that is defined in the C<Exporter> module.  See
L<Exporter>.  If no C<import> method can be found then the call is
skipped.

If you do not want to call the package's C<import> method (for instance,
to stop your namespace from being altered), explicitly supply the empty list:

    use Module ();

That is exactly equivalent to

    BEGIN { require Module }

If the VERSION argument is present between Module and LIST, then the
C<use> will call the VERSION method in class Module with the given
version as an argument.  The default VERSION method, inherited from
the UNIVERSAL class, croaks if the given version is larger than the
value of the variable C<$Module::VERSION>.

Again, there is a distinction between omitting LIST (C<import> called
with no arguments) and an explicit empty LIST C<()> (C<import> not
called).  Note that there is no comma after VERSION!
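
The difference between omitting LIST and supplying an empty LIST can be
observed with a sketch.  The package C<Hypothetical::Module> is invented
for this example and defined inline; its entry in C<%INC> is set by hand
so that C<require> treats it as already loaded.

```perl
use strict;
use warnings;

our @calls;    # records every call to import()

BEGIN {
    package Hypothetical::Module;

    # An import method that only logs its arguments.
    sub import {
        my ($class, @args) = @_;
        push @main::calls, "import(@args)";
    }

    # Mark the module as loaded so that require does not search @INC.
    $INC{'Hypothetical/Module.pm'} = __FILE__;
}

use Hypothetical::Module qw(alpha beta);   # import called with a list
use Hypothetical::Module;                  # import called with no arguments
use Hypothetical::Module ();               # import not called at all

print "$_\n" for @calls;
```

Three C<use> statements appear, but only two calls are logged, since
the empty parentheses suppress the C<import> call.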

Because this is a wide-open interface, pragmas (compiler directives)
are also implemented this way.  Currently implemented pragmas include:

    use strict   qw(subs vars refs);
    use subs     qw(afunc blurfl);
    use sort     qw(stable _quicksort _mergesort);

Some of these pseudo-modules import semantics into the current
block scope (like C<strict> or C<integer>), unlike ordinary modules,
which import symbols into the current package (which are effective
through the end of the file).

There's a corresponding C<no> command that unimports meanings imported
by C<use>, i.e., it calls C<unimport Module LIST> instead of C<import>.

    no integer;
    no strict 'refs';
    no warnings;

See L<perlmodlib> for a list of standard modules and pragmas.  See L<perlrun>
for the C<-M> and C<-m> command-line options to perl that give C<use>
functionality from the command-line.

=item utime LIST

Changes the access and modification times on each file of a list of
files.  The first two elements of the list must be the NUMERICAL access
and modification times, in that order.  Returns the number of files
successfully changed.  The inode change time of each file is set
to the current time.  For example, this code has the same effect as the
Unix touch(1) command when the files I<already exist> and belong to
the user running the program:

    $atime = $mtime = time;
    utime $atime, $mtime, @ARGV;

Since perl 5.7.2, if the first two elements of the list are C<undef>, then
the utime(2) function in the C library will be called with a null second
argument.  On most systems, this will set the file's access and
modification times to the current time (i.e. equivalent to the example
above) and will even work on other users' files where you have write
permission:

    utime undef, undef, @ARGV;

Under NFS this will use the time of the NFS server, not the time of
the local machine.  If there is a time synchronization problem, the
NFS server and local machine will have different times.  The Unix
touch(1) command will in fact normally use this form instead of the
one shown in the first example.

Note that passing only one of the first two elements as C<undef> is
equivalent to passing it as 0 and will not have the same effect as
described when they are both C<undef>.  This case will also trigger an
uninitialized warning.
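
A minimal sketch of setting explicit timestamps and reading them back
with C<stat>.  The scratch file and the chosen timestamp are arbitrary
assumptions of the example.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my ($fh, $path) = tempfile(UNLINK => 1);
close($fh) or die "close: $!";

# An arbitrary timestamp: 365 days after the epoch.
my $stamp = 86400 * 365;

# Access time first, then modification time, then the file list.
my $changed = utime($stamp, $stamp, $path);

my $mtime = (stat $path)[9];
print "files changed: $changed, mtime now: $mtime\n";
```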

=item values HASH

Returns a list consisting of all the values of the named hash.
(In a scalar context, returns the number of values.)

The values are returned in an apparently random order.  The actual
random order is subject to change in future versions of perl, but it
is guaranteed to be the same order as either the C<keys> or C<each>
function would produce on the same (unmodified) hash.  Since Perl
5.8.1 the ordering is different even between different runs of Perl
for security reasons (see L<perlsec/"Algorithmic Complexity Attacks">).

As a side effect, calling values() resets the HASH's internal iterator,
see L</each>.  (In particular, calling values() in void context resets
the iterator with no other overhead.)

Note that the values are not copied, which means modifying them will
modify the contents of the hash:

    for (values %hash)      { s/foo/bar/g }   # modifies %hash values
    for (@hash{keys %hash}) { s/foo/bar/g }   # same

See also C<keys>, C<each>, and C<sort>.
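
Both properties described above, aliasing and the shared ordering with
C<keys>, can be checked with a short sketch.  The hash contents are
arbitrary.

```perl
use strict;
use warnings;

my %price = (apple => 1, pear => 2, plum => 3);

# values() yields aliases, so this updates the hash in place.
$_ *= 10 for values %price;

# keys() and values() agree on order for an unmodified hash.
my @names   = keys %price;
my @amounts = values %price;
my $aligned = 1;
for my $i (0 .. $#names) {
    $aligned = 0 unless $price{ $names[$i] } == $amounts[$i];
}

print "apple is now $price{apple}; orders aligned: $aligned\n";
```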

=item vec EXPR,OFFSET,BITS

Treats the string in EXPR as a bit vector made up of elements of
width BITS, and returns the value of the element specified by OFFSET
as an unsigned integer.  BITS therefore specifies the number of bits
that are reserved for each element in the bit vector.  This must
be a power of two from 1 to 32 (or 64, if your platform supports
that).

If BITS is 8, "elements" coincide with bytes of the input string.

If BITS is 16 or more, bytes of the input string are grouped into chunks
of size BITS/8, and each group is converted to a number as with
pack()/unpack() with big-endian formats C<n>/C<N> (and analogously
for BITS==64).  See L<"pack"> for details.

If BITS is 4 or less, the string is broken into bytes, then the bits
of each byte are broken into 8/BITS groups.  Bits of a byte are
numbered in a little-endian-ish way, as in C<0x01>, C<0x02>,
C<0x04>, C<0x08>, C<0x10>, C<0x20>, C<0x40>, C<0x80>.  For example,
breaking the single input byte C<chr(0x36)> into two groups gives a list
C<(0x6, 0x3)>; breaking it into 4 groups gives C<(0x2, 0x1, 0x3, 0x0)>.
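
The grouping rules above can be reproduced directly.  This sketch reads
the byte C<chr(0x36)> as 4-bit and 2-bit elements, and a two-byte string
as a single 16-bit element; the variable names are illustrative.

```perl
use strict;
use warnings;

my $byte = chr(0x36);

# Two 4-bit groups: the low nibble comes first.
my @nibbles = map { vec($byte, $_, 4) } 0 .. 1;

# Four 2-bit groups, again starting from the low-order bits.
my @pairs = map { vec($byte, $_, 2) } 0 .. 3;

# A 16-bit element is read big-endian, like pack's "n" format.
my $word = vec("\x12\x34", 0, 16);

printf "nibbles: %s\n", join(', ', map { sprintf '0x%X', $_ } @nibbles);
printf "pairs:   %s\n", join(', ', map { sprintf '0x%X', $_ } @pairs);
printf "word:    0x%04X\n", $word;
```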

C<vec> may also be assigned to, in which case parentheses are needed
to give the expression the correct precedence as in

    vec($image, $max_x * $x + $y, 8) = 3;

If the selected element is outside the string, the value 0 is returned.
If an element off the end of the string is written to, Perl will first
extend the string with sufficiently many zero bytes.  It is an error
to try to write off the beginning of the string (i.e. negative OFFSET).

The string should not contain any character with the value > 255 (which
can only happen if you're using UTF-8 encoding).  If it does, it will be
treated as something that is not UTF-8 encoded.  Once the C<vec> has been
assigned to, other parts of your program will also no longer consider the
string to be UTF-8 encoded.  In other words, if you do have such characters
in your string, vec() will operate on the actual byte string, and not the
conceptual character string.

Strings created with C<vec> can also be manipulated with the logical
operators C<|>, C<&>, C<^>, and C<~>.  These operators will assume a bit
vector operation is desired when both operands are strings.
See L<perlop/"Bitwise String Operators">.

The following code will build up an ASCII string saying C<'PerlPerlPerl'>.
The comments show the string after each step.  Note that this code works
in the same way on big-endian or little-endian machines.

    my $foo = '';
    vec($foo,  0, 32) = 0x5065726C;     # 'Perl'

    # $foo eq "Perl" eq "\x50\x65\x72\x6C", 32 bits
    print vec($foo, 0, 8);              # prints 80 == 0x50 == ord('P')

    vec($foo,  2, 16) = 0x5065;         # 'PerlPe'
    vec($foo,  3, 16) = 0x726C;         # 'PerlPerl'
    vec($foo,  8,  8) = 0x50;           # 'PerlPerlP'
    vec($foo,  9,  8) = 0x65;           # 'PerlPerlPe'
    vec($foo, 20,  4) = 2;              # 'PerlPerlPe'   . "\x02"
    vec($foo, 21,  4) = 7;              # 'PerlPerlPer'
                                        # 'r' is "\x72"
    vec($foo, 45,  2) = 3;              # 'PerlPerlPer'  . "\x0c"
    vec($foo, 93,  1) = 1;              # 'PerlPerlPer'  . "\x2c"
    vec($foo, 94,  1) = 1;              # 'PerlPerlPerl'
                                        # 'l' is "\x6c"

To transform a bit vector into a string or list of 0's and 1's, use these:

    $bits = unpack("b*", $vector);
    @bits = split(//, unpack("b*", $vector));

If you know the exact length in bits, it can be used in place of the C<*>.

Here is an example to illustrate how the bits actually fall in place:

    print <<'EOT';
                                      0         1         2         3
                       unpack("V",$_) 01234567890123456789012345678901
    ------------------------------------------------------------------
    EOT

    for $w (0..3) {
        $width = 2**$w;
        for ($shift=0; $shift < $width; ++$shift) {
            for ($off=0; $off < 32/$width; ++$off) {
                $str = pack("B*", "0"x32);
                $bits = (1<<$shift);
                vec($str, $off, $width) = $bits;
                $res = unpack("b*",$str);
                $val = unpack("V", $str);
                write;
            }
        }
    }

    format STDOUT =
    vec($_,@#,@#) = @<< == @######### @>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
    $off, $width, $bits, $val, $res
    .