NAME
Esjis - Run-time routines for Sjis.pm
SYNOPSIS
  use Esjis;
  Esjis::split(...);
  Esjis::tr(...);
  Esjis::chop(...);
  Esjis::index(...);
  Esjis::rindex(...);
  Esjis::lc(...);
  Esjis::lc_;
  Esjis::lcfirst(...);
  Esjis::lcfirst_;
  Esjis::uc(...);
  Esjis::uc_;
  Esjis::ucfirst(...);
  Esjis::ucfirst_;
  Esjis::fc(...);
  Esjis::fc_;
  Esjis::ignorecase(...);
  Esjis::capture(...);
  Esjis::chr(...);
  Esjis::chr_;
  Esjis::X ...;
  Esjis::X_;
  Esjis::glob(...);
  Esjis::glob_;
  Esjis::lstat(...);
  Esjis::lstat_;
  Esjis::opendir(...);
  Esjis::stat(...);
  Esjis::stat_;
  Esjis::unlink(...);
  Esjis::chdir(...);
  Esjis::do(...);
  Esjis::require(...);
  Esjis::telldir(...);

  # "no Esjis;" not supported
ABSTRACT
This module provides the run-time routines for Sjis software. Sjis software loads it automatically, so you do not have to use it yourself.
BUGS AND LIMITATIONS
I have tested and verified this software to the best of my ability. However, software containing many regular expressions is bound to contain some bugs. Thus, if you happen to find a bug that's in Sjis software and not your own program, you can try to reduce it to a minimal test case and then report it to the following author's address. If you have an idea that could make this a more useful tool, please let everyone share it.
HISTORY
This Esjis module first appeared in ActivePerl Build 522 Built under MSWin32 Compiled at Nov 2 1999 09:52:28
AUTHOR
INABA Hitoshi <ina@cpan.org>
This project was originated by INABA Hitoshi. For any questions, use <ina@cpan.org> so we can share this file.
LICENSE AND COPYRIGHT
This module is free software; you can redistribute it and/or modify it under the same terms as Perl itself. See perlartistic.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
EXAMPLES
Split string
  @split = Esjis::split(/pattern/, $string, $limit);
  @split = Esjis::split(/pattern/, $string);
  @split = Esjis::split(/pattern/);
  @split = Esjis::split('', $string, $limit);
  @split = Esjis::split('', $string);
  @split = Esjis::split('');
  @split = Esjis::split();
  @split = Esjis::split;
This subroutine scans the string given by $string for separators and splits it
into a list of substrings, returning the resulting list value in list context
or the count of substrings in scalar context. Scalar context also causes split
to write its result to @_, but this usage is deprecated. The separators are
determined by repeated pattern matching, using the regular expression given in
/pattern/, so the separators may be of any size and need not be the same string
on every match. (The separators are not ordinarily returned; exceptions are
discussed later in this section.) If the /pattern/ doesn't match the string at
all, Esjis::split returns the original string as a single substring. If it
matches once, you get two substrings, and so on. You may supply regular
expression modifiers to the /pattern/, like /pattern/i, /pattern/x, etc. The
//m modifier is assumed when you split on the pattern /^/.

If $limit is specified and positive, the subroutine splits into no more than
that many fields (though it may split into fewer if it runs out of separators).
If $limit is negative, it is treated as if an arbitrarily large $limit had been
specified. If $limit is omitted or zero, trailing null fields are stripped from
the result (which potential users of pop would do well to remember). If $string
is omitted, the subroutine splits the $_ string. If /pattern/ is also omitted
or is the literal space, " ", the subroutine splits on whitespace, /\s+/, after
skipping any leading whitespace.

A /pattern/ of /^/ is secretly treated as if it were /^/m, since it isn't much
use otherwise.
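
The $limit rules described above can be tried out with a short sketch. It uses
Perl's core split as a stand-in, on the assumption that Esjis::split follows the
same $limit rules for plain ASCII input; the sample record and variable names
are made up for illustration.

```perl
use strict;
use warnings;

# Core split is used as a stand-in for Esjis::split (assumption of this
# sketch); the record below is invented sample data.
my $record = "a:b:c:::";

my @default  = split(/:/, $record);       # $limit omitted: trailing null fields stripped
my @negative = split(/:/, $record, -1);   # negative $limit: trailing null fields kept
my @limited  = split(/:/, $record, 2);    # positive $limit: at most two fields

print scalar(@default),  "\n";            # 3
print scalar(@negative), "\n";            # 6
print join("|", @limited), "\n";          # a|b:c:::
```

Passing a negative $limit is the usual way to keep empty trailing fields, for
example when every record must have a fixed number of columns.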
Strings of any length can be split:

  @chars  = Esjis::split(//,  $word);
  @fields = Esjis::split(/:/, $line);
  @words  = Esjis::split(" ", $paragraph);
  @lines  = Esjis::split(/^/, $buffer);
A pattern capable of matching either the null string or something longer than
the null string (for instance, a pattern consisting of any single character
modified by a * or ?) will split the value of $string into separate characters
wherever it matches the null string between characters; nonnull matches will
skip over the matched separator characters in the usual fashion. (In other
words, a pattern won't match in one spot more than once, even if it matched
with a zero width.) For example:

  print join(":" => Esjis::split(/ */, "hi there"));

produces the output "h:i:t:h:e:r:e". The space disappears because it matches
as part of the separator. As a trivial case, the null pattern // simply splits
into separate characters, and spaces do not disappear. (For normal pattern
matches, a // pattern would repeat the last successfully matched pattern, but
Esjis::split's pattern is exempt from that wrinkle.)
The $limit parameter splits only part of a string:

  my ($login, $passwd, $remainder) = Esjis::split(/:/, $_, 3);
We encourage you to split to lists of names like this to make your code
self-documenting. (For purposes of error checking, note that $remainder would
be undefined if there were fewer than three fields.) When assigning to a list,
if $limit is omitted, Perl supplies a $limit one larger than the number of
variables in the list, to avoid unnecessary work. For the split above, $limit
would have been 4 by default, and $remainder would have received only the third
field, not all the rest of the fields. In time-critical applications, it
behooves you not to split into more fields than you really need. (The trouble
with powerful languages is that they let you be powerfully stupid at times.)
We said earlier that the separators are not returned, but if the /pattern/
contains parentheses, then the substring matched by each pair of parentheses is
included in the resulting list, interspersed with the fields that are ordinarily
returned. Here's a simple example:

  Esjis::split(/([-,])/, "1-10,20");

which produces the list value:

  (1, "-", 10, ",", 20)

With more parentheses, a field is returned for each pair, even if some pairs
don't match, in which case undefined values are returned in those positions. So
if you say:

  Esjis::split(/(-)|(,)/, "1-10,20");

you get the value:

  (1, "-", undef, 10, undef, ",", 20)
The /pattern/ argument may be replaced with an expression to specify patterns
that vary at runtime. As with ordinary patterns, to do run-time compilation only
once, use /$variable/o.

As a special case, if the expression is a single space (" "), the subroutine
splits on whitespace just as Esjis::split with no arguments does. Thus,
Esjis::split(" ") can be used to emulate awk's default behavior. In contrast,
Esjis::split(/ /) will give you as many null initial fields as there are
leading spaces. (Other than this special case, if you supply a string instead
of a regular expression, it'll be interpreted as a regular expression anyway.)
You can use this property to remove leading and trailing whitespace from a
string and to collapse intervening stretches of whitespace into a single
space:

  $string = join(" ", Esjis::split(" ", $string));
The following example splits an RFC822 message header into a hash containing
$head{'Date'}, $head{'Subject'}, and so on. It uses the trick of assigning a
list of pairs to a hash, because separators alternate with separated fields. It
uses parentheses to return part of each separator as part of the returned list
value. Since the split pattern is guaranteed to return things in pairs by virtue
of containing one set of parentheses, the hash assignment is guaranteed to
receive a list consisting of key/value pairs, where each key is the name of a
header field. (Unfortunately, this technique loses information for multiple
lines with the same key field, such as Received-By lines. Ah well.)

  $header =~ s/\n\s+/ /g;      # Merge continuation lines.
  %head = ("FRONTSTUFF", Esjis::split(/^(\S*?):\s*/m, $header));
The following example processes the entries in a Unix passwd(5) file. You could
leave out the chomp, in which case $shell would have a newline on the end of it.

  open(PASSWD, "/etc/passwd");
  while (<PASSWD>) {
      chomp;      # remove trailing newline.
      ($login, $passwd, $uid, $gid, $gcos, $home, $shell) =
          Esjis::split(/:/);
      ...
  }
Here's how to process each word of each line of each file of input to create a
word-frequency hash.

  while (<>) {
      for my $word (Esjis::split()) {
          $count{$word}++;
      }
  }
The inverse of Esjis::split is join, except that join can only join with the
same separator between all fields. To break apart a string with fixed-position
fields, use unpack.

Processing a long $string (over 32766 octets) requires Perl 5.010001 or later.
Transliteration
  $tr = Esjis::tr($variable, $bind_operator, $searchlist, $replacementlist, $modifier);
  $tr = Esjis::tr($variable, $bind_operator, $searchlist, $replacementlist);
This is the transliteration (sometimes erroneously called translation) operator,
which is like the y/// operator in the Unix sed program, only better, in
everybody's humble opinion.
This subroutine scans a ShiftJIS string character by character and replaces all
occurrences of the characters found in $searchlist with the corresponding
character in $replacementlist. It returns the number of characters replaced or
deleted. If no ShiftJIS string is specified via the =~ operator, the $_ variable
is transliterated. The values of $modifier are:

  ---------------------------------------------------------------------------
  Modifier   Meaning
  ---------------------------------------------------------------------------
  c          Complement $searchlist.
  d          Delete found but unreplaced characters.
  s          Squash duplicate replaced characters.
  r          Return transliteration and leave the original string untouched.
  ---------------------------------------------------------------------------

  print Esjis::tr('bookkeeper', '=~', 'boep', 'peob', 'r');   # prints 'peekkoobor'
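
The effect of each modifier in the table above can be sketched with Perl's core
tr/// operator on an ASCII string. This assumes Esjis::tr gives the modifiers
the same meaning as the core operator does; the variable names are illustrative
only, and a copy-then-modify idiom is used in place of the r modifier.

```perl
use strict;
use warnings;

my $text = 'bookkeeper';

# Plain transliteration on a copy, so $text itself stays untouched
# (the same effect the r modifier provides).
(my $swapped = $text) =~ tr/boep/peob/;

# d: delete characters that are found but have no replacement.
my $deleted       = $text;
my $deleted_count = ($deleted =~ tr/ek//d);

# s: squash runs of the same replaced character into one.
(my $squashed = $text) =~ tr/a-z//s;

# c: complement the search list; with an empty replacement list this
# only counts the characters that are NOT vowels.
my $non_vowels = ($text =~ tr/aeiou//c);

print "$swapped $deleted ($deleted_count) $squashed $non_vowels\n";
# peekkoobor boopr (5) bokeper 5
```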
Chop string
  $chop = Esjis::chop(@list);
  $chop = Esjis::chop();
  $chop = Esjis::chop;
This subroutine chops off the last character of a string variable and returns
the character chopped. The Esjis::chop subroutine is used primarily to remove
the newline from the end of an input record, and it is more efficient than using
a substitution. If that's all you're doing, then it would be safer to use chomp,
since Esjis::chop always shortens the string no matter what's there, and chomp
is more selective. If no argument is given, the subroutine chops the $_
variable.
You cannot Esjis::chop a literal, only a variable. If you Esjis::chop a list of
variables, each string in the list is chopped:

  @lines = `cat myfile`;
  Esjis::chop(@lines);

You can Esjis::chop anything that is an lvalue, including an assignment:

  Esjis::chop($cwd = `pwd`);
  Esjis::chop($answer = <STDIN>);
This is different from:

  $answer = Esjis::chop($tmp = <STDIN>);   # WRONG

which puts a newline into $answer because Esjis::chop returns the character
chopped, not the remaining string (which is in $tmp). One way to get the result
intended here is with substr:

  $answer = substr <STDIN>, 0, -1;

But this is more commonly written as:

  Esjis::chop($answer = <STDIN>);
In the most general case, Esjis::chop can be expressed using substr:

  $last_code = Esjis::chop($var);
  $last_code = substr($var, -1, 1, "");   # same thing

To Esjis::chop more than one character, use substr as an lvalue, assigning a
null string. The following removes the last five characters of $caravan:

  substr($caravan, -5) = '';

The negative subscript causes substr to count from the end of the string instead
of the beginning. To save the characters so removed, you could use the
four-argument form of substr, creating something of a quintuple Esjis::chop:

  $tail = substr($caravan, -5, 5, '');

This is all dangerous business dealing with characters instead of graphemes.
Perl doesn't really have a grapheme mode, so you have to deal with them
yourself.
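
The difference between chopping and chomping described in this section can be
seen in a small sketch. It uses the core chop and chomp on ASCII strings, on the
assumption that Esjis::chop behaves like core chop when the last character is a
single byte; all variable names are hypothetical.

```perl
use strict;
use warnings;

my $with_newline    = "record\n";
my $without_newline = "record";

# chop always removes the last character and returns it.
my $chopped_char = chop(my $chopped_a = $with_newline);     # removes "\n"
chop(my $chopped_b = $without_newline);                     # removes "d"

# chomp removes only a trailing newline, if there is one.
chomp(my $chomped_a = $with_newline);
chomp(my $chomped_b = $without_newline);

print "$chopped_a $chopped_b $chomped_a $chomped_b\n";
# record recor record record
```

When the input may or may not end with a newline, chomp is therefore the safer
choice.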
Index string
  $byte_pos = Esjis::index($string, $substr, $byte_offset);
  $byte_pos = Esjis::index($string, $substr);
This subroutine searches for one string within another. It returns the byte
position of the first occurrence of $substr in $string. The $byte_offset, if
specified, says how many bytes from the start to skip before beginning to look.
Positions are based at 0. If the substring is not found, the subroutine returns
one less than the base, ordinarily -1. To work your way through a string, you
might say:

  $byte_pos = -1;
  while (($byte_pos = Esjis::index($string, $lookfor, $byte_pos)) > -1) {
      print "Found at $byte_pos\n";
      $byte_pos++;
  }
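
The scanning loop above can be sketched as a runnable example that collects
every match position. It uses the core index on an ASCII string, where byte
positions and character positions coincide; treating that as equivalent to
Esjis::index is an assumption, and the sample strings are invented.

```perl
use strict;
use warnings;

my $string  = "abracadabra";
my $lookfor = "abra";

# Collect the position of every occurrence, overlapping ones included.
my @found;
my $pos = 0;
while (($pos = index($string, $lookfor, $pos)) > -1) {
    push @found, $pos;
    $pos++;                # step past this match before searching again
}

print "@found\n";          # 0 7
```

Advancing by one position, rather than by the length of $lookfor, is what lets
the loop find overlapping occurrences.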
Reverse index string
  $byte_pos = Esjis::rindex($string, $substr, $byte_offset);
  $byte_pos = Esjis::rindex($string, $substr);
This subroutine works just like Esjis::index except that it returns the byte
position of the last occurrence of $substr in $string (a reverse Esjis::index).
The subroutine returns -1 if $substr is not found. $byte_offset, if specified,
is the rightmost byte position that may be returned. To work your way through a
string backward, say:

  $byte_pos = length($string);
  while (($byte_pos = Esjis::rindex($string, $lookfor, $byte_pos)) >= 0) {
      print "Found at $byte_pos\n";
      $byte_pos--;
  }
Lower case string
  $lc = Esjis::lc($string);
  $lc = Esjis::lc_;

This subroutine returns a lowercased version of ShiftJIS $string (or $_, if
$string is omitted). This is the internal subroutine implementing the \L escape
in double-quoted strings.
Lower case first character of string
  $lcfirst = Esjis::lcfirst($string);
  $lcfirst = Esjis::lcfirst_;

This subroutine returns a version of ShiftJIS $string with the first character
lowercased (or $_, if $string is omitted). This is the internal subroutine
implementing the \l escape in double-quoted strings.
Upper case string
  $uc = Esjis::uc($string);
  $uc = Esjis::uc_;

This subroutine returns an uppercased version of ShiftJIS $string (or $_, if
$string is omitted). This is the internal subroutine implementing the \U escape
in double-quoted strings.
Upper case first character of string
  $ucfirst = Esjis::ucfirst($string);
  $ucfirst = Esjis::ucfirst_;

This subroutine returns a version of ShiftJIS $string with the first character
titlecased and other characters left alone (or $_, if $string is omitted).
Titlecase is "Camel" for an initial capital that has (or expects to have)
lowercase characters following it, not uppercase ones. Examples are the first
letter of a sentence, of a person's name, of a newspaper headline, or of most
words in a title. Characters with no titlecase mapping return the uppercase
mapping instead. This is the internal subroutine implementing the \u escape in
double-quoted strings.
To capitalize a string by mapping its first character to titlecase and the rest
to lowercase, use:

  $titlecase = Esjis::ucfirst(substr($word,0,1)) . Esjis::lc(substr($word,1));

or

  $string =~ s/(\w)((?>\w*))/\u$1\L$2/g;

Do not use:

  $do_not_use = Esjis::ucfirst(Esjis::lc($word));

or "\u\L$word", because that can produce a different and incorrect answer with
certain characters. The titlecase of something that's been lowercased doesn't
always produce the same thing titlecasing the original produces.

Because titlecasing only makes sense at the start of a string that's followed
by lowercase characters, we can't think of any reason you might want to
titlecase every character in a string.

See also P.287 A Case of Mistaken Identity in Chapter 6: Unicode of
ISBN 978-0-596-00492-7 Programming Perl 4th Edition.
Fold case string
P.860 fc in Chapter 27: Functions of ISBN 978-0-596-00492-7 Programming Perl
4th Edition.

  $fc = Esjis::fc($string);
  $fc = Esjis::fc_;

New to Sjis software, this subroutine returns the full Unicode-like casefold of
ShiftJIS $string (or $_, if omitted). This is the internal subroutine
implementing the \F escape in double-quoted strings.
Just as title-case is based on uppercase but different, foldcase is based on
lowercase but different. In ASCII there is a one-to-one mapping between only
two cases, but in other encodings the mapping is one-to-many and spans three
cases. Because that's too many combinations to check manually each time, a
fourth casemap called foldcase was invented as a common intermediary for the
other three. It is not a case itself, but it is a casemap.

To compare whether two strings are the same without regard to case, do this:

  Esjis::fc($a) eq Esjis::fc($b)

The reliable way to compare strings case-insensitively was with the /i pattern
modifier, because Sjis software has always used casefolding semantics for
case-insensitive pattern matches. Knowing this, you can emulate equality
comparisons like this:

  sub fc_eq ($$) {
      my($a, $b) = @_;
      return $a =~ /\A\Q$b\E\z/i;
  }
Make ignore case string
  @ignorecase = Esjis::ignorecase(@string);
Make capture number
  $capturenumber = Esjis::capture($string);
Make character
  $chr = Esjis::chr($code);
  $chr = Esjis::chr_;

This subroutine returns the programmer-visible character represented by $code
in the character set. For example, Esjis::chr(65) is "A" in either
File test subroutine Esjis::X
All of the following subroutines work even when the pathname ends with
chr(0x5C) on MSWin32.

A file test subroutine is a unary function that takes one argument, either a
filename or a filehandle, and tests the associated file to see whether something
is true about it. If the argument is omitted, it tests $_. Unless otherwise
documented, it returns 1 for true and "" for false, or the undefined value if
the file doesn't exist or is otherwise inaccessible. Currently implemented file
test subroutines are listed in the following tables:
  Available in MSWin32, MacOS, and UNIX-like systems
  ------------------------------------------------------------------------------
  Subroutine and Prototype   Meaning
  ------------------------------------------------------------------------------
  Esjis::r(*), Esjis::r_()   File or directory is readable by this (effective) user or group
  Esjis::w(*), Esjis::w_()   File or directory is writable by this (effective) user or group
  Esjis::e(*), Esjis::e_()   File or directory name exists
  Esjis::x(*), Esjis::x_()   File or directory is executable by this (effective) user or group
  Esjis::z(*), Esjis::z_()   File exists and has zero size (always false for directories)
  Esjis::f(*), Esjis::f_()   Entry is a plain file
  Esjis::d(*), Esjis::d_()   Entry is a directory
  ------------------------------------------------------------------------------

  Available in MacOS and UNIX-like systems
  ------------------------------------------------------------------------------
  Subroutine and Prototype   Meaning
  ------------------------------------------------------------------------------
  Esjis::R(*), Esjis::R_()   File or directory is readable by this real user or group
                             Same as Esjis::r(*), Esjis::r_() on MacOS
  Esjis::W(*), Esjis::W_()   File or directory is writable by this real user or group
                             Same as Esjis::w(*), Esjis::w_() on MacOS
  Esjis::X(*), Esjis::X_()   File or directory is executable by this real user or group
                             Same as Esjis::x(*), Esjis::x_() on MacOS
  Esjis::l(*), Esjis::l_()   Entry is a symbolic link
  Esjis::S(*), Esjis::S_()   Entry is a socket
  ------------------------------------------------------------------------------

  Not available in MSWin32 and MacOS
  ------------------------------------------------------------------------------
  Subroutine and Prototype   Meaning
  ------------------------------------------------------------------------------
  Esjis::o(*), Esjis::o_()   File or directory is owned by this (effective) user
  Esjis::O(*), Esjis::O_()   File or directory is owned by this real user
  Esjis::p(*), Esjis::p_()   Entry is a named pipe (a "fifo")
  Esjis::b(*), Esjis::b_()   Entry is a block-special file (like a mountable disk)
  Esjis::c(*), Esjis::c_()   Entry is a character-special file (like an I/O device)
  Esjis::u(*), Esjis::u_()   File or directory is setuid
  Esjis::g(*), Esjis::g_()   File or directory is setgid
  Esjis::k(*), Esjis::k_()   File or directory has the sticky bit set
  ------------------------------------------------------------------------------
The tests -T and -B take a try at telling whether a file is text or binary.
But people who know a lot about filesystems know that there's no bit (at least
in UNIX-like operating systems) to indicate that a file is a binary or text file
--- so how can Perl tell?
The answer is that Perl cheats. As you might guess, it sometimes guesses wrong.

This incomplete thinking behind the file test operators -T and -B gave birth to
the UTF8 flag of a later period.
Esjis::T, Esjis::T_, Esjis::B, and Esjis::B_ work as follows. The first block
or so of the file is examined for strange characters such as
[\000-\007\013\016-\032\034-\037\377] (that don't look like ShiftJIS). If more
than 10% of the bytes appear to be strange, it's a *maybe* binary file;
otherwise, it's a *maybe* text file. Also, any file containing ASCII NUL(\0) or
\377 in the first block is considered a binary file. If Esjis::T or Esjis::B is
used on a filehandle, the current input (standard I/O or "stdio") buffer is
examined rather than the first block of the file. Both Esjis::T and Esjis::B
return 1 as true on an empty file, or on a file at EOF (end-of-file) when
testing a filehandle. Neither Esjis::T nor Esjis::B works when given the special
filehandle consisting of a solitary underline. Because Esjis::T has to read the
file to do the test, you don't want to use it on special files that might hang
or give you other kinds of grief. So on most occasions you'll want to test with
an Esjis::f first, as in:

  next unless Esjis::f($file) && Esjis::T($file);
  Available in MSWin32, MacOS, and UNIX-like systems
  ------------------------------------------------------------------------------
  Subroutine and Prototype   Meaning
  ------------------------------------------------------------------------------
  Esjis::T(*), Esjis::T_()   File looks like a "text" file
  Esjis::B(*), Esjis::B_()   File looks like a "binary" file
  ------------------------------------------------------------------------------
File ages for Esjis::M, Esjis::M_, Esjis::A, Esjis::A_, Esjis::C, and Esjis::C_
are returned in days (including fractional days) since the script started
running. This start time is stored in the special variable $^T ($BASETIME).
Thus, if the file changed after the script started, you would get a negative
time. Note that most time values (86,399 out of 86,400, on average) are
fractional, so testing for equality with an integer without using the int
function is usually futile. Examples:

  next unless Esjis::M($file) > 0.5;      # files are older than 12 hours
  &newfile if Esjis::M($file) < 0;        # file is newer than process
  &mailwarning if int(Esjis::A_) == 90;   # file ($_) was accessed 90 days ago today
  Available in MSWin32, MacOS, and UNIX-like systems
  ------------------------------------------------------------------------------
  Subroutine and Prototype   Meaning
  ------------------------------------------------------------------------------
  Esjis::M(*), Esjis::M_()   Modification age (measured in days)
  Esjis::A(*), Esjis::A_()   Access age (measured in days)
                             Same as Esjis::M(*), Esjis::M_() on MacOS
  Esjis::C(*), Esjis::C_()   Inode-modification age (measured in days)
  ------------------------------------------------------------------------------
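
How a file age relates to $^T can be sketched with core Perl. The example below
uses the core -M operator and stat rather than Esjis::M, on the assumption that
Esjis::M reports the same value for an ordinary file; the temporary file and
variable names are made up for illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a scratch file and backdate its timestamps to exactly two days
# before this script started ($^T).
my ($fh, $path) = tempfile(UNLINK => 1);
close $fh;
my $two_days_ago = $^T - 2 * 86400;
utime($two_days_ago, $two_days_ago, $path) or die "utime failed: $!";

# The age in days is (script start time - modification time) / 86400.
my $mtime       = (stat($path))[9];
my $age_by_hand = ($^T - $mtime) / 86400;
my $age_by_test = -M $path;

printf "%.3f %.3f\n", $age_by_hand, $age_by_test;
```

Because the age is measured from $^T and not from the current time, a
long-running script sees the same age for an unchanged file however long it has
been running.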
Esjis::s and Esjis::s_ return the file size in bytes if successful, or undef
otherwise.

  Available in MSWin32, MacOS, and UNIX-like systems
  ------------------------------------------------------------------------------
  Subroutine and Prototype   Meaning
  ------------------------------------------------------------------------------
  Esjis::s(*), Esjis::s_()   File or directory exists and has nonzero size
                             (the value is the size in bytes)
  ------------------------------------------------------------------------------
Filename expansion (globbing)
  @glob = Esjis::glob($string);
  @glob = Esjis::glob_;
This subroutine returns the value of $string with filename expansions the way a
DOS-like shell would expand them, returning the next successive name on each
call. If $string is omitted, $_ is globbed instead. This is the internal
subroutine implementing the <*> and glob operators.

This subroutine works even when the pathname ends with chr(0x5C) on MSWin32.
For ease of use, the algorithm matches the DOS-like shell's style of expansion,
not the UNIX-like shell's. An asterisk ("*") matches any sequence of characters
(including none). A question mark ("?") matches any one character or none. A
tilde ("~") expands to a home directory, as in "~/.*rc" for all the current
user's "rc" files, or "~jane/Mail/*" for all of Jane's mail files.

Note that all path components are case-insensitive, and that backslashes and
forward slashes are both accepted, and preserved. You may have to double the
backslashes if you are putting them in literally, due to double-quotish parsing
of the pattern by perl.
The Esjis::glob subroutine uses whitespace to separate multiple patterns such
as <*.c *.h>. If you want to glob filenames that might contain whitespace,
you'll have to use extra quotes around the spacy filename to protect it. For
example, to glob filenames that have an "e" followed by a space followed by an
"f", use any of:

  @spacies = <"*e f*">;
  @spacies = Esjis::glob('"*e f*"');
  @spacies = Esjis::glob(q("*e f*"));

If you had to get a variable through, you could do this:

  @spacies = Esjis::glob("'*${var}e f*'");
  @spacies = Esjis::glob(qq("*${var}e f*"));
Another way on MSWin32:

  # relative path
  @relpath_file = split(/\n/,`dir /b wildcard\\here*.txt 2>NUL`);

  # absolute path
  @abspath_file = split(/\n/,`dir /s /b wildcard\\here*.txt 2>NUL`);

  # on COMMAND.COM
  @relpath_file = split(/\n/,`dir /b wildcard\\here*.txt`);
  @abspath_file = split(/\n/,`dir /s /b wildcard\\here*.txt`);
Statistics about link
  @lstat = Esjis::lstat($file);
  @lstat = Esjis::lstat_;

Like Esjis::stat, returns information on file, except that if file is a symbolic
link, Esjis::lstat returns information about the link; Esjis::stat returns
information about the file pointed to by the link. If symbolic links are
unimplemented on your system, a normal Esjis::stat is done instead. If file is
omitted, returns information on the file given in $_. Returned values
(especially device and inode) may be bogus.

This subroutine works even when the filename ends with chr(0x5C) on MSWin32.
Open directory handle
  $rc = Esjis::opendir(DIR, $dir);

This subroutine opens a directory named $dir for processing by readdir, telldir,
seekdir, rewinddir, and closedir. The subroutine returns true if successful.
Directory handles have their own namespace, separate from filehandles.

This subroutine works even when the directory name ends with chr(0x5C) on
MSWin32.
Statistics about file
  $stat = Esjis::stat(FILEHANDLE);
  $stat = Esjis::stat(DIRHANDLE);
  $stat = Esjis::stat($expr);
  $stat = Esjis::stat_;
  @stat = Esjis::stat(FILEHANDLE);
  @stat = Esjis::stat(DIRHANDLE);
  @stat = Esjis::stat($expr);
  @stat = Esjis::stat_;

In scalar context, this subroutine returns a Boolean value that indicates
whether the call succeeded. In list context, it returns a 13-element list giving
the statistics for a file, either the file opened via FILEHANDLE or DIRHANDLE,
or named by $expr. It's typically used as follows:

  ($dev,$ino,$mode,$nlink,$uid,$gid,$rdev,$size,
      $atime,$mtime,$ctime,$blksize,$blocks) = Esjis::stat($expr);
Not all fields are supported on all filesystem types; unsupported fields return
0. Here are the meanings of the fields:

  -------------------------------------------------------------------------
  Index  Field      Meaning
  -------------------------------------------------------------------------
    0    $dev       Device number of filesystem
                    drive number for MSWin32
                    vRefnum for MacOS
    1    $ino       Inode number
                    zero for MSWin32
                    fileID/dirID for MacOS
    2    $mode      File mode (type and permissions)
    3    $nlink     Number of (hard) links to the file
                    usually one for MSWin32 --- NTFS filesystems may
                    have a value greater than one
                    1 for MacOS
    4    $uid       Numeric user ID of file's owner
                    zero for MSWin32
                    zero for MacOS
    5    $gid       Numeric group ID of file's owner
                    zero for MSWin32
                    zero for MacOS
    6    $rdev      The device identifier (special files only)
                    drive number for MSWin32
                    NULL for MacOS
    7    $size      Total size of file, in bytes
    8    $atime     Last access time since the epoch
                    same as $mtime for MacOS
    9    $mtime     Last modification time since the epoch
                    since 1904-01-01 00:00:00 for MacOS
   10    $ctime     Inode change time (not creation time!) since the epoch
                    creation time instead of inode change time for MSWin32
                    since 1904-01-01 00:00:00 for MacOS
   11    $blksize   Preferred blocksize for file system I/O
                    zero for MSWin32
   12    $blocks    Actual number of blocks allocated
                    zero for MSWin32
                    int(($size + $blksize-1) / $blksize) for MacOS
  -------------------------------------------------------------------------
$dev and $ino, taken together, uniquely identify a file on the same system.
The $blksize and $blocks are likely defined only on BSD-derived filesystems.
The $blocks field (if defined) is reported in 512-byte blocks. The value of
$blocks * 512 can differ greatly from $size for files containing unallocated
blocks, or "holes", which aren't counted in $blocks.
If Esjis::stat is passed the special filehandle consisting of an underline, no
actual stat(2) is done, but the current contents of the stat structure from
the last Esjis::stat, Esjis::lstat, or Esjis::stat-based file test subroutine
(such as Esjis::r, Esjis::w, and Esjis::x) are returned.
Because the mode contains both the file type and its permissions, you should
mask off the file type portion and printf or sprintf using a "%o" if you want
to see the real permissions:

  $mode = (Esjis::stat($expr))[2];
  printf "Permissions are %04o\n", $mode & 07777;
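
The masking step above can be sketched end to end with core Perl. The example
uses the core stat on a temporary file and assumes a UNIX-like system where
chmod is honored; treating Esjis::stat as returning the same $mode value is an
assumption, and all names are illustrative.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a scratch file with known permissions (UNIX-like systems assumed).
my ($fh, $path) = tempfile(UNLINK => 1);
close $fh;
chmod 0640, $path or die "chmod failed: $!";

my $mode        = (stat($path))[2];
my $permissions = $mode & 07777;          # drop the file type bits
my $formatted   = sprintf("%04o", $permissions);

printf "raw mode %o, permissions %s\n", $mode, $formatted;
```

On a typical UNIX-like system the raw mode of a plain file also carries the file
type bits (for example 100640), which is why the mask is needed before printing.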
If $expr is omitted, returns information on the file given in $_.

This subroutine works even when the filename ends with chr(0x5C) on MSWin32.
Deletes a list of files.
  $unlink = Esjis::unlink(@list);
  $unlink = Esjis::unlink($file);
  $unlink = Esjis::unlink;

Delete a list of files. (Under Unix, it will remove a link to a file, but the
file may still exist if another link references it.) If list is omitted, it
unlinks the file given in $_. The subroutine returns the number of files
successfully deleted.

This subroutine works even when the filename ends with chr(0x5C) on MSWin32.
Changes the working directory.
  $chdir = Esjis::chdir($dirname);
  $chdir = Esjis::chdir;

This subroutine changes the current process's working directory to $dirname, if
possible. If $dirname is omitted, $ENV{'HOME'} is used if set, and
$ENV{'LOGDIR'} otherwise; these are usually the process's home directory. The
subroutine returns true on success, false otherwise (and puts the error code
into $!).

  chdir("$prefix/lib") || die "Can't cd to $prefix/lib: $!";

This subroutine has limitations on MSWin32. See also BUGS AND LIMITATIONS.
Do file
  $return = Esjis::do($file);

The do FILE form uses the value of FILE as a filename and executes the contents
of the file as a Perl script. Its primary use is (or rather was) to include
subroutines from a Perl subroutine library, so that:

  Esjis::do('stat.pl');

is rather like:

  scalar CORE::eval `cat stat.pl`;   # `type stat.pl` on Windows
except that Esjis::do is more efficient, more concise, keeps track of the
current filename for error messages, searches all the directories listed in the
@INC array, and updates %INC if the file is found.

It also differs in that code evaluated with Esjis::do FILE cannot see lexicals
in the enclosing scope, whereas code in CORE::eval FILE does. It's the same,
however, in that it reparses the file every time you call it -- so you might not
want to do this inside a loop unless the filename itself changes at each loop
iteration.

If Esjis::do can't read the file, it returns undef and sets $! to the error. If
Esjis::do can read the file but can't compile it, it returns undef and sets an
error message in $@. If the file is successfully compiled, do returns the value
of the last expression evaluated.

Inclusion of library modules (which have a mandatory .pm suffix) is better done
with the use and require operators, which also do error checking and raise an
exception if there's a problem. They also offer other benefits: they avoid
duplicate loading, help with object-oriented programming, and provide hints to
the compiler on function prototypes.
But Esjis::do FILE is still useful for such things as reading program
configuration files. Manual error checking can be done this way:

  # read in config files: system first, then user
  for $file ("/usr/share/proggie/defaults.rc", "$ENV{HOME}/.someprogrc") {
      unless ($return = Esjis::do($file)) {
          warn "couldn't parse $file: $@"         if $@;
          warn "couldn't Esjis::do($file): $!"    unless defined $return;
          warn "couldn't run $file"               unless $return;
      }
  }
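
The configuration-file use described above can be sketched as a self-contained
example. It uses the core do FILE rather than Esjis::do (assumed to behave the
same for an ASCII path), and the temporary file, its contents, and the key names
are all invented for illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Write a tiny config file whose last expression is a hash reference.
my ($fh, $config_path) = tempfile(SUFFIX => '.rc', UNLINK => 1);
print {$fh} "+{ name => 'proggie', verbose => 1 };\n";
close $fh or die "close failed: $!";

# do FILE returns the value of the last expression evaluated in the file.
my $config = do $config_path;
die "couldn't parse $config_path: $@" if $@;
die "couldn't do $config_path: $!"    unless defined $config;

print "$config->{name} verbose=$config->{verbose}\n";   # proggie verbose=1
```

The leading "+" in the file makes Perl treat the braces as an anonymous hash
rather than a block, so the file reliably returns a hash reference.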
A long-running daemon could periodically examine the timestamp on its
configuration file, and if the file has changed since it was last read in, the
daemon could use Esjis::do to reload that file. This is more tidily accomplished
with Esjis::do than with Esjis::require.
Require file
  Esjis::require($file);
  Esjis::require();

This subroutine asserts a dependency of some kind on its argument. If an
argument is not supplied, $_ is used.

Esjis::require loads and executes the Perl code found in the separate file whose
name is given by $file. This is similar to using an Esjis::do on a file, except
that Esjis::require checks to see whether the library file has been loaded
already and raises an exception if any difficulties are encountered. (It can
thus be used to express file dependencies without worrying about duplicate
compilation.) Like Esjis::do, Esjis::require knows how to search the include
path stored in the @INC array and to update %INC on success.

The file must return true as the last value to indicate successful execution of
any initialization code, so it's customary to end such a file with 1 unless
you're sure it'll return true otherwise.
Current position of the readdir
  $telldir = Esjis::telldir(DIRHANDLE);

This subroutine returns the current position of the readdir routines on
DIRHANDLE. This value may be given to seekdir to access a particular location in
a directory. The subroutine has the same caveats about possible directory
compaction as the corresponding system library routine. This subroutine might
not be implemented everywhere that readdir is. Even if it is, no calculation may
be done with the return value. It's just an opaque value, meaningful only to
seekdir.
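
The pairing of telldir with seekdir can be sketched as follows. The example uses
the core opendir, readdir, telldir, and seekdir on a temporary directory, and
assumes a platform where telldir is implemented; treating Esjis::telldir as
equivalent is an assumption, and the file names are made up.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Build a scratch directory with a few entries.
my $dir = tempdir(CLEANUP => 1);
for my $name (qw(a.txt b.txt c.txt)) {
    open my $fh, '>', "$dir/$name" or die "open failed: $!";
    close $fh;
}

opendir(my $dh, $dir) or die "opendir failed: $!";
my $first  = readdir($dh);
my $mark   = telldir($dh);      # opaque position after the first entry
my $second = readdir($dh);
seekdir($dh, $mark);            # go back to the remembered position
my $again  = readdir($dh);      # reads the same entry as $second
closedir($dh);

print(($second eq $again) ? "same entry\n" : "different entry\n");
```

The value in $mark is only ever handed back to seekdir; the sketch never prints
it or does arithmetic on it.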