NAME

Emacs::Lisp - Support for writing Emacs extensions in Perl

SYNOPSIS

In Emacs

    M-x perl-eval-expression RET 2+2 RET
    M-x perl-eval-region RET
    M-x perl-eval-buffer RET
    ... and more ...

In Perl

    use Emacs::Lisp;

    &switch_to_buffer('*scratch*');
    &insert("Hello, world!\n");

    setq { $cperl_font_lock = t };

    &add_hook(\*find_file_hooks,
              sub { &message("found a file!") });

    use Emacs::Lisp qw($emacs_version $epl_version);
    save_excursion {
        &set_buffer(&get_buffer_create("*test*"));
        &insert("This is ");
        &insert(&featurep(\*::xemacs) ? "XEmacs" : "Emacs");
        &insert(" version $emacs_version,\n");
        &insert("EPL version $epl_version.\n");
        &insert("Emacs::Lisp version is $Emacs::Lisp::VERSION.\n");
    };

DESCRIPTION

Emacs allows you to customize your environment using Lisp. With EPL, you can use Perl, too. This module allows Perl code to call functions and access variables of Lisp.

You still need to learn some Lisp in order to understand The Elisp Manual, which is the definitive reference for Emacs programming. This document assumes a basic understanding of Emacs commands and Lisp data types. I also assume familiarity with Perl's complex data structures (described in perlref) and objects (see perlobj).

Quick Start

Run emacs -l perl and type:

    C-x p e &insert ("hello!\n") RET

The string "hello!" should appear in your scratch buffer. The Perl sub &insert has called the Emacs Lisp insert function, which inserts its string argument into the current buffer at point.

Paste this text into a buffer, select it, and type M-x perl-eval-region RET:

    sub doit { &message("Cool, huh?"); }
    defun (\*perltest, interactive, \&doit);

Type M-x perltest RET. The text will appear in the minibuffer. defun and interactive are used to create Emacs commands.

EPL AND PERLMACS

Perlmacs was (is?) a project that embedded a Perl interpreter into the Emacs binary so that it could run Lisp, Perl, or any combination of the two. It uses Perl's C interface, which requires patching and recompiling Emacs. As a result, each release is tied to a version of Emacs, it takes a lot of time and disk space to build, and it is not very portable.

EPL (Emacs Perl) accomplishes most of what Perlmacs can do, but it does not suffer from the same drawbacks. It uses unmodified Emacs and Perl and lets them work together through IPC (pipes). This may make some tasks much slower, but it is much more convenient to install and upgrade, and it works with XEmacs as well as Emacs 21 betas.

For the time being, this module attempts to support both Perlmacs and EPL. The user-visible APIs are almost identical, except for EPL's lack of Emacs::main().

LISP SUPPORT FOR PERL

Lisp code can check for Perl support using (require 'perl). In Perlmacs, some of the Perl functions are built in, and others are defined in perl.el. When you use EPL, epl.el substitutes for the built-in support, but the same perl.el is used.

Functions

The following Lisp functions do not rely on the Emacs::Lisp module. Use C-h f <function-name> RET within Emacs to see their doc strings.

    perl-eval-expression  EXPRESSION
    perl-eval-region      START END
    perl-eval-buffer
    perl-load-file        NAME
    perl-eval             STRING &optional CONTEXT
    perl-call             SUB &optional CONTEXT &rest ARGS
    perl-eval-and-call    STRING &optional CONTEXT &rest ARGS
    perl-to-lisp          OBJECT
    perl-wrap             OBJECT
    perl-value-p          OBJECT
    perl-eval-raw         STRING &optional CONTEXT
    perl-call-raw         SUB &optional CONTEXT &rest ARGS
    make-perl-interpreter &rest ARGV
    perl-destruct         &optional INTERPRETER
    perl-gc               &optional PURGE
    perl-free-refs        &rest REFS

The following Lisp variables affect the Perl interpreter and have doc strings accessible via C-h v <variable-name> RET. They are:

    perl-interpreter-program
    perl-interpreter-args
    perl-interpreter

Data Conversions

When Perl calls a Lisp function, its arguments are converted to Lisp objects, and the returned object is converted to a Perl value. Likewise, when Lisp calls Perl, the arguments are converted from Lisp to Perl and the return values are converted to Lisp.

  • Lisp has three scalar types.

    Lisp integers, floats, and strings all become Perl scalars. A simple Perl scalar becomes either an integer, a float, or a string.

    Interesting character encodings such as UTF-8 are not currently supported. I don't even know what happens to 8-bit characters during string conversion.

  • Lisp symbols correspond to globrefs.

    Glob references become symbols in Lisp. Underscores are swapped with hyphens in the name, since Perl prefers underscores and Lisp prefers hyphens. See "Symbols" for more information.

  • Lisp's `nil' is equivalent to Perl's `undef' or `()'.

    As an exception to the rule for symbols, nil in Lisp corresponds to undef in Perl.

    In Lisp, nil is really a symbol. However, it is typically used as the boolean value false. Glob references evaluate to true in boolean context. It is much more natural to convert nil to undef.

  • Arrayrefs correspond to lists.

    Lists are a central data structure in Lisp. To make it as easy as possible to pass lists to Lisp functions that require them, Perl array references are converted to Lisp lists. For example, a Perl expression such as

        ["x", ["y", 1]]

    is converted to

        '("x" ("y" 1))

    in Lisp.

  • Arrayref refs correspond to vectors.

    Adding \ to an arrayref makes it an arrayref ref, which becomes a vector in Lisp. For example, \[1, 2, undef] becomes [1 2 nil].

  • Conses that are not lists become Emacs::Lisp::Cons objects.

    Compatibility note: Perlmacs does not have this feature.

        $x = &cons("left", "right");
        print ref($x);                # "Emacs::Lisp::Cons"
        print $x->car;                # "left"
        print $x->cdr;                # "right"

    But:

        $x = &cons ("top", undef);    # a Lisp list
        print ref($x);                # "ARRAY"
        print $x->[0];                # "top"

  • Conversion uses "deep" copying by default.

    Conversion of lists and vectors to arrayrefs and arrayref refs is recursive by default. Changes made by Lisp to a list will not affect the Perl array of which it is a copy, nor will changes to a Perl array affect a Lisp list. See "BUGS" about converting cyclic structures.

  • There are ways to make "shallow" copies.

    A shallow copy simply wraps a Perl scalar in a Lisp object or vice versa. Wrapped Perl values appear as Lisp objects of type perl-value. Wrapped Lisp values appear in Perl as objects of class Emacs::Lisp::Object. See "CAVEATS" for issues relating to wrapped data.

    Where a data type has no natural equivalent in the other language, shallow copying is the default. Examples include Perl hashrefs and Lisp buffer objects.

    In Perl, the lisp function wraps its argument in a Lisp object. This allows Perl arrays to be passed by reference to Lisp functions. (Of course, the value returned by lisp is really a Perl value wrapped in a Lisp object wrapped in a Perl object.)

    An Emacs::Lisp::Object's to_perl method performs a deep copy (if the argument is Lisp data) or unwraps its argument (if it is Perl data).

    Lisp functions called through package Emacs::Lisp convert their return values using deep copying. The same functions are accessible through Emacs::Lisp::Object, which does shallow conversion and always returns an Emacs::Lisp::Object object.

    These examples show how the data wrapping functions work:

        $x = lisp [1, 2, 3];
        print ref($x);           # "Emacs::Lisp::Object"
        print ref($x->to_perl);  # "ARRAY"
        print @{&list(2, 3)};    # "23"
    
        $x = Emacs::Lisp::Object::list(2, 3);
        print ref($x);           # "Emacs::Lisp::Object"
        print @{$x->to_perl};    # "23"
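
The deep-conversion rules above can be modeled in plain Perl. The following is only a sketch: it does not load Emacs::Lisp or talk to Emacs, the helper name lisp_text is hypothetical, and the number-versus-string test (looks_like_number) is an assumption, since the module's actual choice is described under "CAVEATS" as unclear. It prints the Lisp notation that each Perl value would correspond to under those rules.

```perl
use strict;
use warnings;
no warnings 'once';
use Scalar::Util qw(looks_like_number);

# Hypothetical helper, not part of Emacs::Lisp: render a Perl value in
# the Lisp notation that the conversion rules above describe.
sub lisp_text {
    my ($v) = @_;
    return 'nil' unless defined $v;          # undef <-> nil
    my $type = ref $v;
    if ($type eq 'GLOB') {                   # globref <-> symbol
        my $name = *{$v}{NAME};
        $name =~ tr/_-/-_/;                  # swap underscores and hyphens
        return $name;
    }
    if ($type eq 'ARRAY') {                  # arrayref <-> list
        return 'nil' unless @$v;             # the empty list is nil
        return '(' . join(' ', map { lisp_text($_) } @$v) . ')';
    }
    if ($type eq 'REF' && ref $$v eq 'ARRAY') {   # arrayref ref <-> vector
        return '[' . join(' ', map { lisp_text($_) } @$$v) . ']';
    }
    die "no deep conversion modeled for $type references\n" if $type;
    return "$v" if looks_like_number($v);    # assumption: numeric-looking => number
    (my $s = $v) =~ s/(["\\])/\\$1/g;        # escape quotes and backslashes
    return qq("$s");
}

print lisp_text(["x", ["y", 1]]), "\n";      # ("x" ("y" 1))
print lisp_text(\[1, 2, undef]), "\n";       # [1 2 nil]
print lisp_text(\*find_file_hooks), "\n";    # find-file-hooks
```

Hashrefs, coderefs, and other reference types make the sketch die, which loosely mirrors the fact that such values have no deep conversion and are wrapped instead.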

Scripts

Perlmacs can run Perl programs. By default, Perlmacs is installed under two names, pmacs and perlmacs. Which name is used to invoke the program determines how it parses its command line.

If perlmacs is used (or, more precisely, any name containing "perl"), it behaves like Perl. For example,

    $ perlmacs script.pl

runs the Perl program script.pl.

When invoked as pmacs, it behaves like Emacs. Example:

    $ pmacs file.txt

This begins an editing session with file.txt in the current buffer.

The first command line argument can override the invocation name. If it is --emacs, Emacs takes control. If it is --perl, the program runs in Perl mode.
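
The selection rule can be summarized in a few lines of Perl. This is a sketch of the rule as stated here, not Perlmacs source code; the function name startup_mode is hypothetical, and two details are assumptions: only the base name of the program is examined, and any name not containing "perl" selects Emacs mode.

```perl
use strict;
use warnings;
use File::Basename qw(basename);

# Hypothetical helper: decide which personality the program adopts,
# given the name it was invoked as and its command line arguments.
sub startup_mode {
    my ($invoked_as, @args) = @_;
    if (@args) {
        # The first argument overrides the invocation name.
        return 'emacs' if $args[0] eq '--emacs';
        return 'perl'  if $args[0] eq '--perl';
    }
    # Assumption: only the base name matters, not the directory part.
    return basename($invoked_as) =~ /perl/ ? 'perl' : 'emacs';
}

print startup_mode('/usr/local/bin/perlmacs', 'script.pl'), "\n";  # perl
print startup_mode('/usr/local/bin/pmacs', 'file.txt'), "\n";      # emacs
print startup_mode('pmacs', '--perl', 'script.pl'), "\n";          # perl
```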

The Emacs module (that is, the Perl module named "Emacs") includes support for starting an editing session from within a Perlmacs script. See Emacs.

PERL SUPPORT FOR LISP

The Emacs::Lisp module allows Perl programs to invoke Lisp functions and handle Lisp variables as if they were Perl subs and variables.

The directive use Emacs::Lisp; causes any use of a function not defined in Perl to invoke the Lisp function of the same name (with hyphens in place of underscores). For example, this writes a message to the standard error stream (in Perl mode) or displays it in the minibuffer:

    &message ("this is a test");

Functions

This code calls the hypothetical Lisp function foo-bar with arguments 4 and t.

    &foo_bar(4, t);

The Lisp syntax for the same call would be

    (foo-bar 4 t)

The ampersand (&) in the Perl example is optional for most functions, but it is required for functions such as read, eval, and print, whose names are Perl keywords. Using it with Emacs::Lisp is a good habit, so the examples in this document include it.

If you don't want an AUTOLOAD sub to affect your namespace, you may either put parentheses after "use Emacs::Lisp" or import to a different package, and use qualified function names. For example:

    use Emacs::Lisp ();
    Emacs::Lisp::message("hello\n");

    {package L; use Emacs::Lisp;}
    L::message("goodbye\n");
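
To illustrate how an AUTOLOAD sub can turn calls to undefined Perl subs into Lisp function names, here is a minimal stand-alone sketch. It is not the Emacs::Lisp implementation: the package Demo::Lisp is hypothetical, and instead of calling Emacs it just returns the Lisp form it would invoke, with the arguments printed as-is.

```perl
use strict;
use warnings;

package Demo::Lisp;    # hypothetical stand-in, not Emacs::Lisp itself

our $AUTOLOAD;

# Called by Perl whenever an undefined sub in this package is invoked.
sub AUTOLOAD {
    my $name = $AUTOLOAD;
    $name =~ s/.*:://;               # strip the package qualifier
    return if $name eq 'DESTROY';    # never forward object destruction
    $name =~ tr/_/-/;                # foo_bar becomes foo-bar
    return '(' . join(' ', $name, @_) . ')';
}

package main;

print Demo::Lisp::foo_bar(4, 't'), "\n";        # (foo-bar 4 t)
print Demo::Lisp::beginning_of_line(), "\n";    # (beginning-of-line)
```

Because the AUTOLOAD sub lives in its own package, nothing is added to the caller's namespace, which is the effect of the qualified-name usage shown above.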

Symbols

Many Lisp functions take arguments that may be, or are required to be, symbols. In Lisp, a symbol is a kind of name, but does not have the same type as a string. Lisp programs typically use the quote operator to specify a symbol. For example, this Lisp code refers to the beep symbol:

    (run-at-time nil 1 'beep)

EPL uses glob references to specify symbols. A literal globref begins with a backslash followed by an asterisk, so the last example would be written as

    &run_at_time(undef, 1, \*beep);

in Perl. (You may want to do &cancel_function_timers(\*beep) soon after trying this example.)

When comparing the returned values of Lisp functions to each other and to symbols, it is best to use the Lisp eq function instead of Perl's equality operators.

    ### PREFERRED
    if (&eq(&type_of($x), \*::cons)) { ... }

    ### PROBABLY OK
    if (&type_of($x) eq \*cons) { ... }
    if (&type_of($x) == \*cons) { ... }

Variables

In Lisp, variables play a role akin to that of Perl scalar variables. A variable may hold a number, a string, or a reference to any type of complex Lisp data structure. (They are not called references in Lisp, but rather "objects".)

You can create a Perl alias for any reasonably named Lisp variable by saying use Emacs::Lisp qw($varname);. Thereafter, assignment to $varname will update the Lisp value. Changes made to the variable in Lisp will be reflected in Perl when $varname is used in expressions.

This example saves and replaces the value of the Lisp variable inhibit-eol-conversion:

    use Emacs::Lisp qw($inhibit_eol_conversion);
    $old_val = $inhibit_eol_conversion;
    $inhibit_eol_conversion = 1;

This sort of thing could be accomplished in Lisp as follows:

    (setq old-val inhibit-eol-conversion)
    (setq inhibit-eol-conversion 1)

(but you would probably rather use let instead, for which there is still no convenient Emacs::Lisp equivalent). See also the setq function below.

Property Lists

Lisp symbols all have an associated object called a plist, for "property list". The plist is an object just like any other, but it is typically used in a way vaguely resembling Perl's hashes.

Plists are not used nearly as often as Lisp functions and variables. If you are new to Lisp, you can probably skip this section.

A plist is different from a Perl hash. Lookups are not based on string equality as with Perl, but rather on Lisp object equality of the eq variety. For this reason, it is best to stick to the Lisp convention of using only symbols as keys. (See "Symbols".)

Emacs::Lisp provides a shorthand notation for getting and setting plist elements. If you say "use Emacs::Lisp qw(%any_name)", then subsequent access to the elements of %any_name will get or set the corresponding properties of the Lisp symbol any-name.

For example, the following Perl and Lisp fragments are more or less equivalent:

    # Perl fragment
    use Emacs::Lisp qw(%booboo %upcase_region);
    $booboo{\*error_conditions} = [\*booboo, \*error];
    $can_upcase = ! $upcase_region{\*disabled};

    ; Lisp fragment
    (put 'booboo 'error-conditions '(booboo error))
    (setq can-upcase (not (get 'upcase-region 'disabled)))

See also the setq function below.
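
One way such a shorthand could work is with a tied hash whose FETCH and STORE methods stand in for the Lisp get and put functions. The sketch below is an assumption about the general technique, not the module's code: the class Demo::Plist is hypothetical, properties are kept in an ordinary Perl hash rather than in Emacs, and each access is logged as the Lisp form it models. It relies on the fact that a tied hash receives reference keys intact rather than stringified.

```perl
use strict;
use warnings;
no warnings 'once';

package Demo::Plist;    # hypothetical; models the %any_name shorthand

our @log;               # Lisp forms that each access corresponds to
my %plists;             # symbol name => { property name => value }

# Turn a globref such as \*error_conditions into a Lisp-style name.
sub _lisp_name {
    my ($glob) = @_;
    my $name = *{$glob}{NAME};
    $name =~ tr/_/-/;
    return $name;
}

sub TIEHASH {
    my ($class, $symbol) = @_;
    return bless { symbol => $symbol }, $class;
}

sub FETCH {
    my ($self, $key) = @_;
    my $prop = _lisp_name($key);
    push @log, "(get '$self->{symbol} '$prop)";
    return $plists{ $self->{symbol} }{$prop};
}

sub STORE {
    my ($self, $key, $value) = @_;
    my $prop = _lisp_name($key);
    push @log, "(put '$self->{symbol} '$prop $value)";
    $plists{ $self->{symbol} }{$prop} = $value;
}

package main;

tie my %upcase_region, 'Demo::Plist', 'upcase-region';
$upcase_region{\*disabled} = 1;
my $can_upcase = !$upcase_region{\*disabled};

print "$_\n" for @Demo::Plist::log;
```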

Macros

Lisp macros, such as setq and defun, do not work the same way functions do, although they are invoked using the function syntax. (Here you see the vast philosophical chasm separating Perl from Lisp. While Perl might have five syntaxes to mean the same thing, Lisp has one syntax with two meanings!)

Some macros are equivalent to Perl operators, such as if and while. Others have meanings peculiar to Lisp. A few macros are implemented in Emacs::Lisp. They are described below. If you try to call a macro that has not been implemented, you will get an error message which may propose an alternative.

catch SYMBOL,CODE

Evaluate CODE in a Lisp catch construct. At any point during CODE's execution, the throw function may be used to return control to the end of the catch block. For example:

    $x = catch \*::out, sub {
        $y = 1;
        &throw(\*::out, 16);
        $y = 2;
    };
    print $x;  # prints 16
    print $y;  # prints 1

Some Perl constructs have functionality similar to throw; for example, return and last. However, they do not work with catches in Lisp code.

defun SYMBOL,DOCSTRING,SPEC,CODE
defun SYMBOL,DOCSTRING,CODE
defun SYMBOL,SPEC,CODE
defun SYMBOL,CODE

Make CODE callable as the Lisp function SYMBOL. This is Lisp's version of Perl's sub keyword. A function defined in this way becomes visible to Lisp code.

defun is useful for defining Emacs commands. Commands are functions that the user can invoke by typing M-x <function-name>. A command may be bound to a key or sequence of keystrokes. See the Emacs documentation for specifics.

When defining a command, you must specify the interactive nature of the command. There are various codes to indicate that the command acts on the current region, a file name to be read from the minibuffer, etc. Please see The Elisp Manual for details.

Emacs::Lisp's defun uses a SPEC returned by the interactive function to specify a command's interactivity. If no SPEC is given, the function will still be callable by Lisp, but it will not be available to the user via M-x <function-name> RET and cannot be bound to a sequence of keystrokes. Even commands that do not request information from the user need an interactive spec. See "interactive".

This example creates a command, reverse-region-words, that replaces a region of text with the same text after reversing the order of words. To be user-friendly, we'll provide a documentation string, which will be accessible through the Emacs help system (C-h f reverse-region-words RET).

    use Emacs::Lisp;
    defun (\*reverse_region_words,
           "Reverse the order of the words in the region.",
           interactive("r"),
           sub {
               my ($start, $end) = @_;
               my $text = &buffer_substring($start, $end);
               $text = join('', reverse split (/(\s+)/, $text));
               &delete_region($start, $end);
               &insert($text);
           });

If you try this example and invoke the help system, you may notice something not quite right in the message. It reads as follows:

    reverse-region-words is an interactive Lisp function.
    (reverse-region-words &optional START END &rest ARGS)

    Reverse the order of the words in the region.

Notice the part about "&optional" and "&rest". This means that Lisp thinks the function accepts any number of arguments. It knows the names of the first two because of the assignment "my ($start, $end) = @_".

But our function only works if it receives two args. Specifying a prototype documents this:

    sub ($$) {
        my ($start, $end) = @_;
        ...
    }

    reverse-region-words is an interactive Lisp function.
    (reverse-region-words START END)
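
Perl's built-in prototype function shows what information a prototype makes available. The sketch below is hypothetical and is not how Emacs::Lisp derives its argument lists: the helper lisp_arglist uses placeholder names (ARG1, ARG2) instead of the names taken from the sub body, and the mapping of ";" to &optional and "@" to &rest is an assumption made for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: build a Lisp-style argument list from a Perl
# prototype.  Only the characters '$', ';' and '@' are handled.
sub lisp_arglist {
    my ($code) = @_;
    my $proto = prototype($code);
    # No prototype: any number of arguments is acceptable.
    return '(&rest ARGS)' unless defined $proto;

    my @parts;
    my ($n, $seen_optional) = (0, 0);
    for my $ch (split //, $proto) {
        if ($ch eq ';') {
            push @parts, '&optional' unless $seen_optional++;
        }
        elsif ($ch eq '$') {
            push @parts, 'ARG' . ++$n;
        }
        elsif ($ch eq '@') {
            push @parts, '&rest', 'ARGS';
            last;
        }
        else {
            die "unsupported prototype character: $ch\n";
        }
    }
    return '(' . join(' ', @parts) . ')';
}

print lisp_arglist(sub ($$) { }), "\n";     # (ARG1 ARG2)
print lisp_arglist(sub ($;$) { }), "\n";    # (ARG1 &optional ARG2)
print lisp_arglist(sub { }), "\n";          # (&rest ARGS)
```
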

interactive SPEC
interactive

Used to generate the third (or, in the absence of a doc string, the second) argument to defun. This determines how a command's arguments are obtained.

What distinguishes a "command" from an ordinary function in Emacs is the presence of an interactive specifier in the defun expression.

SPEC may be a string, as described in The Elisp Manual, or a reference to code which returns the argument list. If no spec is given, the command runs without user input.

save_excursion BLOCK

Execute BLOCK within a Lisp save-excursion construct. This restores the current buffer and other settings to their original values after the code has completed. See The Elisp Manual for details.

setq BLOCK

BLOCK is searched for assignments of either of these forms:

    $var = EXPR;
    $hash{$key} = EXPR;

Every such $var and %hash is imported from the Emacs::Lisp module as if you had said, "use Emacs::Lisp qw($var %hash)". Afterwards, BLOCK is executed. This is a convenient way to assign to variables, for example in customization code.

This code

    use Emacs::Lisp;
    setq {
        $A = 2*$foo[5];
        $B{\*foo} = "more than $A";
    };

would have exactly the same effect as this:

    use Emacs::Lisp qw(:DEFAULT $A %B);
    $A = 2*$foo[5];
    $B{\*foo} = "more than $A";

The following, which does not tie or import any variables, has the same effect on Lisp as the above:

    use Emacs::Lisp ();
    Emacs::Lisp::set( \*A, 2*$foo[5] );
    Emacs::Lisp::put( \*B, \*foo, "more than "
      . &Emacs::Lisp::symbol_value( \*A ));

unwind_protect (BODY, HANDLER)

Execute coderef BODY, returning its result. Execute coderef HANDLER after BODY finishes, even if BODY exits nonlocally through die or the like.
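
The control flow resembles the following pure-Perl pattern. This is a sketch of the semantics only, under the assumption that BODY exits through die; it is not the module's implementation, the name run_protected is hypothetical, and it does not model nonlocal exits that originate in Lisp, such as throw.

```perl
use strict;
use warnings;

# Hypothetical model of unwind_protect: run $body, always run $handler
# afterwards, then either return the body's result or re-raise its error.
sub run_protected {
    my ($body, $handler) = @_;
    my @result;
    my $ok  = eval { @result = $body->(); 1 };
    my $err = $@;          # save the error before the handler can clobber it
    $handler->();
    die $err unless $ok;
    return wantarray ? @result : $result[-1];
}

my @log;
my $value = run_protected(
    sub { push @log, 'body'; 42 },
    sub { push @log, 'handler' },
);

my $survived = eval {
    run_protected(
        sub { push @log, 'body'; die "oops\n" },
        sub { push @log, 'handler' },
    );
    1;
};

print "value=$value log=@log error=$@";
# prints: value=42 log=body handler body handler error=oops
```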

BUGS

These are some of the known bugs in EPL and Emacs::Lisp. If you find other bugs, please check that you have the latest version, and email me.

  • Emacs::Lisp used outside of an Emacs process works only with GNU Emacs 21 beta.

    If a Perl program not under the control of an Emacs process uses Emacs::Lisp functions, Emacs::Lisp tries to run Emacs in batch mode. This only works with GNU Emacs 21 beta, not Emacs 20 or XEmacs. This can probably be fixed, but I don't know what the problem is yet.

    A real solution would involve talking to Emacs on a channel other than its standard input and output. This might allow one to run in interactive mode with arbitrary command line options. I don't know if Emacs can use arbitrary file descriptors or named pipes. I suspect not. If not, I guess I'll try inet sockets. Other possibilities would be ptys (Emacs loves them, I'm not overly fond) and an intermediary perl process that talks to the original process over a named pipe.

  • Non-robust with respect to subprocess Perl dying.

    Perl dies because of (e.g.) version mismatch between epl.el and EPL.pm. Then you can't exit Emacs, because it tries to tell Perl to exit and gives you an error "Process perl not running". Very unfriendly.

  • Within Lisp code, everything defaults to package `main'.

    It would perhaps be best to give the Lisp evaluation environment the notion of a "current package" such as Perl has.

  • Symbols whose names contain :: or '

    How can we convert them to and from Perl?

  • High IPC overhead

    Strings are copied more than they absolutely need to be. Even if they weren't, it's bound to be a lot slower than Perlmacs.

  • Lisp hash tables are not deep-copied.

    What to do? Produce tied hashes whose keys can be any Lisp object? Wrap hashes that contain non-string keys?

  • XEmacs package autoloads commands but not key bindings.

    I need to figure out how to do this.

CAVEATS

  • Conversion of scalar types is uncertain.

    A defined, non-reference Perl scalar converted to Lisp becomes either an integer, a float, or a string. The method of choice is unclear. This could be considered a bug, but it is somewhat inherent in the languages' semantics, as Perl has no really good way to distinguish a number from an equivalent string or an integer from a float.

  • Conversion is not always reversible.

    Information may be lost through the default ("deep") data conversion process. For example, the glob reference \*::nil and an empty arrayref both become undef when converted to Lisp and back. Perl and Emacs support different ranges for integer values. Integers that don't fit are upgraded to floats, so the distinction is lost.

  • Circular data structures are troublesome.

    See "Two-Phased Garbage Collection" in perlobj. Lisp data structures may be recursive (contain references to themselves) without the danger of a memory leak, because Lisp uses a periodic mark-and-sweep garbage collector.

    However, if a recursive structure involves any Perl references, it may never be destroyable.

    For best results, Perl code should handle mainly Perl data, and Lisp code should handle mainly Lisp data.

  • Cross-language references incur overhead.

    For the benefit of Lisp's garbage collection, all Perl data that is referenced by Lisp participates in mark-and-sweep. For the benefit of Perl's garbage collection, all Lisp objects that are referenced by Perl maintain a (kind of) reference count.

    A chain of Perl -> Lisp -> ... -> Perl references may take several garbage collection cycles to be freed. It is therefore probably best to keep the number and complexity of such references to a minimum.

    To make matters worse, if Emacs does not support weak hash tables, Lisp must explicitly free its references to Perl data. GNU Emacs 20 does not support weak hash tables, but Perlmacs solves this problem by adding necessary support. XEmacs 21 has weak hash tables, but EPL does not yet know how to use them.

TO DO

  • Finish texinfo doc

  • Delete/revise obsolete portions of POD

  • Figure out how to handle hash tables

  • Garbage collection for XEmacs

  • Debian package target

  • Overload Emacs::Lisp::Object in various ways

  • Formal rules for scalar type conversion

  • Regression-test multiple Emacses under Perl

  • Regression-test any Perls under Emacs

  • Steal from IPC::Open2

  • Optimized regex find and replace functions

  • Multibyte characters

    Emacs has had them for some time. Now Perl's UTF-8 support is stabilizing. It's time the two met.

  • Special forms: let, defmacro, defvar.

  • Make a way to get a tied filehandle that reads a buffer.

  • Improve perl-eval-buffer, perl-eval-and-call, et al.

COPYRIGHT

Copyright (C) 1998-2001 by John Tobey, jtobey@john-edwin-tobey.org. All rights reserved.

  This program is free software; you can redistribute it and/or modify
  it under the terms of the GNU General Public License as published by
  the Free Software Foundation; either version 2 of the License, or
  (at your option) any later version.

  This program is distributed in the hope that it will be useful, but
  WITHOUT ANY WARRANTY; without even the implied warranty of
  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
  General Public License for more details.

  You should have received a copy of the GNU General Public License
  along with this program; see the file COPYING.  If not, write to the
  Free Software Foundation, Inc., 59 Temple Place, Suite 330, Boston,
  MA 02111-1307  USA

SEE ALSO

perl, perlref, perlobj, Emacs, emacs, and The Elisp Manual (available where you got the Emacs source, or from ftp://ftp.gnu.org/pub/gnu/emacs/).