NAME
BATsh::SH - Pure Perl bash/sh interpreter for BATsh
SYNOPSIS
# Used internally by BATsh; not normally called directly.
# BATsh::SH implements the SH-mode interpreter invoked when BATsh
# detects a bash/sh section in a .batsh script.
# Executed via BATsh:
use BATsh;
BATsh->run_string(<<'END');
x=hello
greet() {
echo "Hello, $1 -- ${#1} chars"
}
greet world
echo ${x^^}
for i in 1 2 3; do
echo item $i
done
ls /tmp | perl -ne "print"
echo out > /tmp/out.txt
END
DESCRIPTION
Executive Summary
BATsh::SH is the sh/bash interpreter component of BATsh. It handles any script section whose first token contains a lowercase letter, executing it entirely in Pure Perl -- no external shell required. It supports pipelines (|), I/O redirection (> >> 2>&1), functions, compound commands (&&/||/;), and rich parameter expansion: ${var%pat}, ${var^^}, ${var:N:L}, ${#var}.
Mixed-Mode Sample (via BATsh)
use BATsh;
BATsh->run_string(<<'SCRIPT');
:: CMD section: uppercase first token
SET CITY=Tokyo
# SH section: lowercase first token
greet() { echo "Hello from $1!"; }
greet $CITY
echo "lower: ${CITY,,}"
echo $CITY | perl -ne "print uc"
SCRIPT
FULL DESCRIPTION
BATsh::SH implements the POSIX sh / bash command set entirely in Perl. No external sh or bash is required.
Supported Features
VAR=value, export VAR=value, unset VAR
echo, printf
if/then/elif/else/fi
for VAR in list; do ... done
while condition; do ... done
until condition; do ... done
(for/while/until accept the loop body either on following lines or fully
inline on one line, e.g. "for i in 1 2 3; do echo $i; done")
case $var in pat) ... ;; pat1|pat2) ... ;; *) ... ;; esac
(case: |-separated patterns, * ? [abc] [a-z] [!abc] globs, quoted/literal
patterns, and the bash ;& / ;;& fall-through terminators)
test / [ ... ] (file tests, string, integer comparisons)
cd, pwd, exit, true, false, :, read, shift, local, set
trap 'cmd' SIG... / trap - SIG / trap '' SIG / trap [-p]
$(( arithmetic )) -- supports $1..$9 positional params
$( command substitution ), `backtick substitution`
$VAR, ${VAR}, $1..$9, $@, $*, $#, $?, $$, $0, $!
${VAR:-default}, ${VAR:=default}, ${VAR:+alt}
${VAR%pat}, ${VAR%%pat} -- suffix removal (shortest/longest)
${VAR#pat}, ${VAR##pat} -- prefix removal (shortest/longest)
${VAR/pat/rep}, ${VAR//pat/rep} -- substitution (first/all)
${VAR^^}, ${VAR^}, ${VAR,,}, ${VAR,} -- case conversion
${VAR:offset:length}, ${VAR:offset} -- substring
${#VAR} -- string length
arr=(a b c), arr+=(d e) -- indexed array assignment / append
arr[i]=v, arr[i]+=v -- indexed element assignment / append
declare -a arr, declare -A map, typeset ... -- array declaration
map=([k1]=v1 [k2]=v2), map[k]=v -- associative array assignment
${arr[i]}, ${map[key]}, $arr (== ${arr[0]}) -- element access
${arr[@]}, ${arr[*]} -- all elements
${#arr[@]}, ${#map[@]} -- element count
${#arr[i]} -- length of one element
${!arr[@]}, ${!map[@]} -- indices / keys
unset arr, unset arr[i] -- whole array / single element
source / . file
name() { ... }, function name { ... } -- function definition
cmd1 | cmd2 [| cmd3 ...] (pipeline via temporary file)
cmd1 && cmd2 (run cmd2 only if cmd1 succeeds)
cmd1 || cmd2 (run cmd2 only if cmd1 fails)
cmd1 ; cmd2 (sequential execution)
> file, >> file, < file (I/O redirection)
2> file, 2>> file (stderr redirect)
2>&1 (merge stderr into stdout)
cmd << DELIM ... DELIM (here-document on STDIN)
cmd <<-DELIM (here-document, strip leading tabs)
cmd <<'DELIM' (here-document, no expansion)
cmd & (background execution; external commands)
echo *.txt (filename glob expansion: *, ?, [abc])
for f in *.pl; do ... (glob expansion in for-loop word list)
Variable Expansion
$VAR, ${VAR}, and positional parameters $1..$9 are expanded before each line executes. $@ and $* expand to all positional parameters space-joined; $# gives their count. The special parameters $? (last exit status), $$ (process id), $0 (script name) and $! (process id of the most recent background job, empty before any) are also expanded.
The following parameter expansion forms are supported:
${VAR:-default} value if set, else default
${VAR:=default} set and use default if unset
${VAR:+alt} alt if set, else empty
${VAR%pat} remove shortest suffix matching pat
${VAR%%pat} remove longest suffix matching pat
${VAR#pat} remove shortest prefix matching pat
${VAR##pat} remove longest prefix matching pat
${VAR/pat/rep} replace first match of pat with rep
${VAR//pat/rep} replace all matches of pat with rep
${VAR^^} convert to uppercase
${VAR^} uppercase first character
${VAR,,} convert to lowercase
${VAR,} lowercase first character
${VAR:N:L} substring from offset N, length L
${VAR:N} substring from offset N to end
${#VAR} length of value
Patterns use shell glob syntax: * matches any string, ? matches any single character, [abc] matches a character class.
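The forms listed above can be tried side by side. The following is a minimal sketch written for stock bash 4 or later; the sample value and variable names are hypothetical, and the results in the comments are what bash itself produces, which BATsh::SH is documented to mirror.

```shell
#!/usr/bin/env bash
# Hypothetical sample value; any path-like string works.
path=/usr/local/lib/archive.tar.gz

base=${path##*/}           # longest prefix matching */ removed
stem=${base%%.*}           # longest suffix matching .* removed
ext=${base#*.}             # shortest prefix matching *. removed
short=${base%.*}           # shortest suffix matching .* removed
swapped=${base/tar/zip}    # first "tar" replaced with "zip"
capped=${base//a/A}        # every "a" replaced with "A"
upper=${stem^^}            # whole value uppercased
first=${stem^}             # first character uppercased
sub=${stem:2:3}            # 3 characters starting at offset 2
len=${#stem}               # length of the value
unset missing
fallback=${missing:-none}  # default used because "missing" is unset

echo "$base $stem $ext $short $swapped $capped"
echo "$upper $first $sub $len $fallback"
```

Running it prints "archive.tar.gz archive tar.gz archive.tar archive.zip.gz Archive.tAr.gz" followed by "ARCHIVE Archive chi 7 none".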
Arrays and Associative Arrays
Indexed arrays are created by a parenthesised list, by an explicit element assignment, or by declare -a:
arr=(alpha beta gamma) # arr[0]=alpha arr[1]=beta arr[2]=gamma
arr[3]=delta # element assignment
arr+=(epsilon) # append at the next index
arr[0]+=X # append to one element's string value
declare -a empty # declare an empty indexed array
Associative arrays must be declared with declare -A (or typeset -A) before use, then keyed by arbitrary strings:
declare -A color
color[red]=FF0000
color=([green]=00FF00 [blue]=0000FF) # whole-array (re)assignment
Element and bulk access mirror bash:
${arr[2]} one element (indexed subscript is evaluated arithmetically)
${color[red]} one element (associative subscript is a literal string)
$arr shorthand for ${arr[0]}
${arr[-1]} negative index counts back from the last set element
${arr[@]} all element values
${arr[*]} all element values (same as [@] here)
${#arr[@]} number of set elements
${#arr[2]} length of one element's value
${!arr[@]} list of indices (indexed) or keys (associative)
unset arr removes the whole array; unset arr[i] removes one element.
Element values that contain spaces survive a quoted whole-array reference: in a for list "${arr[@]}" and "${!arr[@]}" expand to one item per element or key. Elsewhere ${arr[@]} joins elements with a single space, consistent with the word-splitting model used throughout BATsh::SH. Array names are case-insensitive (like scalar variables); a name is either a scalar or an array, never both. Element order for ${arr[@]} is ascending numeric index for indexed arrays and sorted key order for associative arrays (bash leaves associative order unspecified, so a deterministic order is used for portable output).
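To make the quoting rule concrete, here is a small sketch for bash 4 or later. The array contents and counter names are hypothetical. The unquoted loop is shown only to illustrate why the quoted form matters in bash; how BATsh::SH treats an unquoted reference outside a for list is described above, not asserted by this example.

```shell
#!/usr/bin/env bash
arr=(alpha "beta gamma" delta)   # the second element contains a space
arr+=(epsilon)                   # append at the next index (3)
arr[0]+=X                        # append to element 0's string value

count=${#arr[@]}                 # number of set elements
first=$arr                       # shorthand for ${arr[0]}

quoted=0
for v in "${arr[@]}"; do quoted=$((quoted + 1)); done    # one item per element

unquoted=0
for v in ${arr[@]}; do unquoted=$((unquoted + 1)); done  # bash re-splits on spaces

unset "arr[1]"                   # quoted so bash does not glob the subscript
indices="${!arr[@]}"             # remaining indices, space-joined

declare -A color                 # associative arrays must be declared first
color[red]=FF0000
color[green]=00FF00
ncolors=${#color[@]}
red=${color[red]}

echo "$count $first $quoted $unquoted [$indices] $ncolors $red"
```

In bash this prints "4 alphaX 4 5 [0 2 3] 2 FF0000": the quoted loop sees four items, the unquoted one sees five.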
Case Statements
case WORD in ... esac selects a clause by matching WORD against shell glob patterns:
case $fruit in
apple) echo "an apple" ;;
pear|quince) echo "pome fruit" ;; # | separates alternatives
a*) echo "starts with a" ;; # * ? globbing
[0-9]) echo "a digit" ;; # character classes
[!aeiou]*) echo "not a vowel" ;; # [!...] negated class
*) echo "something else" ;; # default catch-all
esac
Each clause is pattern) commands TERMINATOR. Patterns are separated by |; a clause matches if any of its patterns matches the word. Pattern syntax is shell glob: * (any string), ? (any character), [abc], ranges [a-z], and negation [!abc] or [^abc]. Quoted or backslash-escaped metacharacters match literally (e.g. "*" matches a literal asterisk). *) is the conventional default clause.
Three clause terminators are supported, matching bash:
;; stop after this clause (the normal case)
;& fall through: run the NEXT clause's body unconditionally
;;& continue: keep testing the remaining patterns against the word
The construct may be written across lines or fully inline on one line (case $x in a) echo a ;; *) echo b ;; esac). A leading ( before the pattern list ((pattern)) is accepted.
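The difference between the three terminators is easiest to see in one function. This is a sketch for bash 4 or later; the function name classify and its patterns are hypothetical.

```shell
#!/usr/bin/env bash
classify() {
    case $1 in
        [0-9]) echo "digit" ;;
        a*)    echo "starts-with-a" ;;&   # keep testing the patterns below
        *e)    echo "ends-with-e" ;;
        x)     echo "x" ;&                # run the next body without testing
        y)     echo "x-or-y" ;;
        *)     echo "other" ;;
    esac
}

out_apple=$(classify apple)   # matches a*, then ;;& lets *e match as well
out_x=$(classify x)           # matches x, then ;& falls into the y body
out_y=$(classify y)
out_7=$(classify 7)
out_zebra=$(classify zebra)

printf '%s\n' "$out_apple" "$out_x" "$out_y" "$out_7" "$out_zebra"
```

In bash, classify apple prints two lines (starts-with-a, ends-with-e), classify x prints x then x-or-y, and the remaining calls print x-or-y, digit and other respectively.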
Traps and Signals
trap registers a handler to run on a signal or on the EXIT pseudo-signal:
trap 'COMMANDS' SIGSPEC... run COMMANDS when each SIGSPEC fires
trap - SIGSPEC... reset to the default action
trap '' SIGSPEC... ignore the signal
trap (or trap -p) list the current traps
A SIGSPEC is a signal name with or without a leading SIG (INT, SIGINT), a signal number (2), or the EXIT pseudo-signal (also spelled 0). Real signals are bridged to Perl's %SIG hash: trap 'cmd' INT installs a $SIG{INT} handler that runs cmd, trap '' INT sets $SIG{INT} to 'IGNORE', and trap - INT restores 'DEFAULT'. The EXIT trap is run internally when the script finishes or when exit is called.
The handler command is stored unexpanded and expanded when it fires, so
tmp=$(mktemp); trap 'rm -f $tmp' EXIT
removes the file named by $tmp as it stood at exit. Handlers run at the next safe point after a signal is delivered. EXIT / ERR / DEBUG / RETURN are treated as pseudo-signals and never touch %SIG; of these, only EXIT currently runs a handler. Signal names unsupported by the host (common on Windows) are accepted but degrade quietly.
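The late expansion of the handler can be observed directly. The sketch below is plain bash and wraps the trap in a subshell so that EXIT fires partway through the script; subshells are an assumption of this example and are not listed among the BATsh::SH features. The variable names are hypothetical.

```shell
#!/usr/bin/env bash
log=$(mktemp)
(
    msg=early
    # Single quotes keep $msg unexpanded until the handler actually runs.
    trap 'echo "exit sees: $msg" >> "$log"' EXIT
    msg=late
)   # the subshell ends here, so its EXIT trap fires now

result=$(cat "$log")
rm -f "$log"
echo "$result"
```

The handler reports "exit sees: late", the value at exit time rather than at the time trap was called.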
Function Definitions
Shell functions are defined with name() { ... } or function name { ... }. Inline single-line bodies are also supported: name() { cmd; }. Functions receive arguments as $1..$9 and $@. The caller's positional parameters are saved before the call and restored on return.
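As an illustration of the save-and-restore behaviour, the following sketch sets the caller's positional parameters, calls a function with different arguments, and checks the caller's view afterwards. The function and variable names are hypothetical; the results are those of bash.

```shell
#!/usr/bin/env bash
set -- outer1 outer2             # caller's positional parameters

show() {
    inner="$# args, first=$1"    # the function sees its own arguments
}

show a b c
after="$# args, first=$1"        # caller's parameters are intact again

echo "$inner"
echo "$after"
```

This prints "3 args, first=a" and then "2 args, first=outer1".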
Pipeline
The | operator is supported in SH mode. The left side's standard output is written to a temporary file (File::Spec->tmpdir()), which is then fed as standard input to the right side. Multiple pipes (cmd1 | cmd2 | cmd3) are handled by chaining temporary files. All temporary files are removed after use. This implementation is Pure Perl and Perl 5.005_03 compatible.
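The temporary-file strategy can be pictured with a shell analogue. This is only an illustration of the idea described above, not the module's Perl code: mktemp stands in for a file under File::Spec->tmpdir(), and the sample data is hypothetical.

```shell
#!/usr/bin/env bash
tmp=$(mktemp)                                  # stand-in for the pipeline temp file

printf 'pear\napple\nfig\n' > "$tmp"           # stage 1: left side writes its stdout
sorted=$(sort < "$tmp")                        # stage 2: right side reads it as stdin
rm -f "$tmp"                                   # temp file removed after use

direct=$(printf 'pear\napple\nfig\n' | sort)   # the same work as a real pipe

echo "$sorted"
```

Both approaches yield the same text. A consequence worth keeping in mind is that, with this design, the left side presumably has to finish before the right side starts reading.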
I/O Redirection
cmd > file stdout overwrite (create or truncate)
cmd >> file stdout append
cmd < file stdin from file
cmd 2> file stderr overwrite
cmd 2>> file stderr append
cmd 2>&1 merge stderr into stdout (current stdout target)
cmd 1>&2 merge stdout into stderr
Redirections are parsed after variable expansion, so filenames may contain variables (e.g. echo text > $outfile). All file handles use bareword globs for Perl 5.005_03 compatibility.
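A short sketch exercising most of the forms in the table, with the target filename taken from a variable. The helper function emit and the file names are hypothetical, and the targets are quoted as is customary in bash.

```shell
#!/usr/bin/env bash
dir=$(mktemp -d)
outfile=$dir/out.txt

emit() { echo to-stdout; echo to-stderr 1>&2; }   # writes one line to each stream

echo first > "$outfile"                           # create or truncate
echo second >> "$outfile"                         # append
emit > "$dir/both.txt" 2>&1                       # stderr follows stdout into the file
emit > "$dir/only-out.txt" 2> "$dir/only-err.txt" # streams kept apart
read -r top < "$outfile"                          # stdin from file

appended=$(cat "$outfile")
both=$(cat "$dir/both.txt")
only_err=$(cat "$dir/only-err.txt")
rm -rf "$dir"

echo "$top / $only_err"
```

Here out.txt ends up with two lines, both.txt holds the stdout and stderr lines together, and the final echo prints "first / to-stderr".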
Here-Documents
A here-document attaches the lines following the command, up to a line equal to a delimiter word, to the command's standard input:
cat <<EOF
line one
line two
EOF
Three forms are recognised:
cmd <<DELIM body is variable-expanded
cmd <<'DELIM' body is literal (no expansion); "DELIM" also works
cmd <<-DELIM leading tab characters are stripped from body and
from the line carrying the closing delimiter
When the delimiter is unquoted, each body line is expanded exactly like an ordinary SH line ($VAR, ${...}, $(...)). When the delimiter is quoted, the body is passed through verbatim.
The body is written to a uniquely named temporary file created with sysopen(...,O_CREAT|O_EXCL,...) to avoid symlink races, and that file is supplied as standard input through the same redirection path used by < file. Both built-ins (e.g. read) and external commands run via system() therefore see the body on STDIN. The temporary file is removed immediately after the command finishes, with an END block as a failsafe. This implementation is Pure Perl and Perl 5.005_03 compatible.
The closing delimiter must appear on a line by itself and match exactly (after tab stripping for <<-); trailing whitespace is not ignored. If no matching delimiter is found before end of input, a warning is issued and $? is set to a non-zero value.
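The effect of quoting the delimiter can be checked with the read built-in, which takes the here-document body on standard input. A minimal sketch with hypothetical variable names:

```shell
#!/usr/bin/env bash
name=world

# Unquoted delimiter: the body is expanded before read sees it.
read -r expanded <<EOF
hello $name
EOF

# Quoted delimiter: the body is passed through verbatim.
read -r literal <<'EOF'
hello $name
EOF

echo "$expanded"
echo "$literal"
```

The first echo prints "hello world"; the second prints the text hello $name unchanged.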
Here-Document Limitations
The following are not supported in this release and are documented as known limitations:
Here-documents are recognised only in SH mode. The << sequence has no special meaning in CMD mode (BATsh::CMD) and is left untouched there.
Only a single here-document per command line is handled. Multiple here-documents on one line (cmd <<A <<B) are not supported.
Here-strings (<<< word) are not supported; <<< is deliberately not treated as a here-document opener.
Combining a here-document with a pipeline or compound operator on the same line (e.g. cmd <<EOF | other) is best-effort only and not guaranteed; use a separate command for portable behaviour.
The delimiter word is matched literally; the <<"a b" form with an embedded space in the delimiter is not supported.
A here-document body line that looks like a BATsh subroutine marker (a line of the form :LABEL later followed by RET/RETURN) may be consumed by subroutine extraction, which runs before mode dispatch. Avoid such lines inside here-document bodies.
Background Execution
An unquoted & at the very end of an SH command line starts the command asynchronously and returns control immediately, in the style of POSIX shells:
longjob &
echo "next prompt"
Only the single & at the end of the line is consumed. An & that is part of &&, of an fd-duplication such as 2>&1 or 1>&2, inside single or double quotes, or backslash-escaped (\&) is not treated as a background operator and is left in place.
The launch is Pure Perl and Perl 5.005_03 compatible, with a portable split by platform:
On Win32, the command is spawned through the command shell with system(1, ...) (P_NOWAIT), which returns the process id directly.
On Unix-like systems the command is started by delegating to /bin/sh (no Perl fork is used), and the background job's process id is captured through the shell's own $! into a uniquely named temporary file created with sysopen(...,O_CREAT|O_EXCL,...). The temporary file is removed immediately, with an END block as a failsafe.
On a successful launch $? is set to 0; the exit status of the background job itself is not awaited. The process id of the most recently started background job is available through $!, which expands to the empty string before any background job has been started.
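The behaviour of $? and $! around a background launch can be sketched as follows. This is plain bash on a Unix-like system, with sleep standing in for a hypothetical long-running external command; the variable names are also hypothetical.

```shell
#!/usr/bin/env bash
before=$!              # empty: no background job has been started yet

sleep 1 &              # trailing & starts the external command asynchronously
launch_status=$?       # status of the launch itself, not of the job
pid=$!                 # process id of the most recent background job

echo "launched with status $launch_status"
```

Control returns immediately, so the echo runs while sleep is still in progress; launch_status is 0 and pid holds a numeric process id.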
Background Execution Limitations
The following are not supported in this release and are documented as known limitations:
Background execution applies only to external commands in SH mode. A trailing & on a built-in, a defined function, a variable assignment, or a control keyword is ignored and the command runs in the foreground. In CMD mode (BATsh::CMD) & keeps its cmd.exe meaning as a sequential command separator and is unchanged.
Only a trailing & is recognised. A mid-line & that backgrounds part of a line (e.g. a & b) is not supported; write a on its own line with a trailing & instead.
There is no job control: jobs, wait, wait %n, fg, bg and job-specification (%n) syntax are not implemented. Signals are not delivered to background jobs by BATsh.
A backgrounded pipeline or compound list is delegated as a unit to the underlying OS shell; BATsh expands variables and command substitutions first, then hands the resulting line to that shell, so redirections and operators inside a backgrounded line follow OS-shell rules rather than BATsh's own redirection engine.
When the command word is supplied by a variable (e.g. $CMD &), the foreground/background decision is made on the literal first token before expansion; such lines are treated as external and backgrounded.
Compound Commands
cmd1 && cmd2 run cmd2 only if cmd1 exits with status 0
cmd1 || cmd2 run cmd2 only if cmd1 exits with non-zero status
cmd1 ; cmd2 run cmd2 unconditionally after cmd1
These are detected before variable expansion to ensure short-circuit logic works correctly. Quoting (', ") and $(...) nesting are respected when splitting.
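The short-circuit rules in the table can be verified with a trace variable. In this sketch (hypothetical variable name log) each right-hand side appends a letter only if it actually runs.

```shell
#!/usr/bin/env bash
log=""
true  && log="${log}A"   # runs: left side succeeded
false && log="${log}B"   # skipped: left side failed
false || log="${log}C"   # runs: left side failed
true  || log="${log}D"   # skipped: left side succeeded
false ;  log="${log}E"   # runs unconditionally

echo "$log"
```

The result is ACE: B and D are never evaluated.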
Function Definitions
name() { body }
function name { body }
name() { cmd1; cmd2; } # inline single-line body
Functions are registered in a package-level hash %_SH_FUNCTIONS. The caller's positional parameters ($1..$9, $*) are saved before the call and restored on return. local VAR=value saves the existing value of VAR in the function's stack frame and restores it on return.
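The effect of local on a variable that already exists in the caller can be sketched like this; the function and variable names are hypothetical, and the results are those of bash.

```shell
#!/usr/bin/env bash
counter=global

bump() {
    local counter=inside   # shadows the caller's value for this call only
    seen=$counter          # not declared local, so it stays visible afterwards
}

bump
after=$counter             # caller's value is back once the function returns

echo "$seen $after"
```

This prints "inside global".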
AUTHOR
INABA Hitoshi <ina.cpan@gmail.com>
LICENSE
Same as Perl itself.