

MIDI::Trans - Perl extension for quick and easy text->midi conversion.


  use MIDI::Trans;
  my $TransObj = MIDI::Trans->new( {
        'Delimiter' => '\s+',
        'Note' => \&note,
        'Volume' => sub { return(127); },
        'Duration' => \&duration,
        'Tempo' => sub { ... }
        } );

  if($TransObj->trans( { 'File' => 'text.txt', 'Outfile' => 'out.mid' })) {
        # conversion succeeded
    } else {
        my $error = $TransObj->error();
        die("ERR: $error\n");
    }

  sub note {
    # do something
    # return a value between 0 and 127
    # or string 'rest' (sans quotes) for
    # a rest event
  }

  sub duration {
    # return some number of quarter notes
  }


MIDI::Trans serves as a quick development foundation for text-to-MIDI conversion algorithms, using MIDI::Simple for output. With MIDI::Trans, you create callbacks that generate note, volume, duration and tempo values. As your corpus is read, MIDI::Trans calls these callbacks to generate your MIDI score. MIDI::Trans is modelled after the text conversion aspects of TransMid, but is designed to be useful for a wider range of tasks, with less overhead.

If you're in a big hurry and don't need fine control over the process, simply read the 'Plug and Play Usage' and 'CallBacks' sections below to get a head start and have your converter implemented in just a few minutes, with just a few statements.

A corpus can be defined as either a string, or text file. MIDI::Trans will then split that corpus into elements based on an element delimiter, provided via argument, and determine some attributes of the corpus based on other data, which can be supplied by the developer. The corpus is then processed, element by element through the use of CallBacks you specify. The normal flow of development looks like this:

    Define Parameters For Conversion
    Define Functions To Generate Note, Duration, Volume, Tempo
        from parameters and element values.
    Specify and process corpus
    Create score
    Write Output

... this section not yet complete...


None by default.


A MIDI::Trans converter can be written in as few as three statements:

    use MIDI::Trans;
    my $TransObj = MIDI::Trans->new( {
        'Tempo' => 140,
        'VolumeCallBack' => sub { return(127); },
        'NoteCallBack' => sub { $cnt++; return $cnt % 4 ? 88 : 'rest'; },
        'DurationCallBack' => sub { return(16); },
        } );
    $TransObj->trans({ 'File' => './test.txt', 'Outfile' => 'test.mid' });

Obviously, this isn't very functional, and the more compact we make our code, the less functionality is available to us.

However, if your conversion process doesn't rely too heavily on controlling the act of conversion itself, this single method will do everything you need.

Let's discuss what we've done here:

    The new() method is called, which initializes the parameters
    for the converter.  The object returned makes itself
    available to the callbacks, assuming that you've created the
    variable referencing the object in the same namespace as, or at
    least scoped as visible to, the callbacks' function definitions.
    The trans() method acts as a wrapper around the step-by-step
    process of converting the document.  It doesn't give you the
    ability to control some of the information gathering aspects,
    nor does it let you handle more than one corpus with a single
    object, but what it lacks in functionality, it makes up for in
    simplicity.



    Has one argument, which is required: either a reference to a
    hash, or an anonymous hash.  This hash contains information
    required to perform the conversion.  If any values have already
    been defined via the new() method, they do not have to be
    re-defined here.  Some names, marked with an asterisk (*), are
    short-hand for the configuration keys of new(); they are
    otherwise the same.
    trans() spawns a new instance of MIDI::Trans, and this object
    must be used for attribute and configuration methods for
    the operation being performed with trans().  That is to
    say, if you are using trans(), you must also use the
    trans_obj() method (see below) to return the operating object.
    Returns true (1) on success, or sets the error() message and
    returns undef otherwise.
    The following keys are valid for the hash:
        File
            The file path to read as the corpus.  This
            key is required.
        Outfile
            The file to save MIDI output to.
            Default value is './out.midi'
        Delimiter
            The element delimiter in the corpus.
            Default value is '\s+'
        Tempo
            The tempo to use for the score.
            You can specify either the 'Tempo' or
            the 'TempoCallBack' key, but one
            must be specified.  Will override
            the value of 'TempoCallBack'.
        TempoCallBack
            Subroutine reference, or anonymous
            sub block that will return a tempo
            value.  See CallBacks section below.
        Volume *
            Subroutine reference, or anonymous
            sub block that will return a valid
            volume value.
        Note *
            Subroutine reference, or anonymous
            sub block that will return a valid
            note value.
        Duration *
            Subroutine reference, or anonymous
            sub block that will return a valid
            duration value.

        if( $TransObj->trans( {
                'File' => './test.txt',
                'Volume' => sub { ... },
                'Note' => sub { ... },
                'Duration' => \&some_sub,
                'Tempo' => 120 } )
                ) {
                # do something
                } else {
                    my $errmsg = $TransObj->error();
                }

        my %hash = (
                'File' => './test.txt',
                'Volume' => sub { ... },
                'Note' => sub { ... },
                'Duration' => \&some_sub,
                'Tempo' => 120
                );
        if( $TransObj->trans(\%hash) ) { ... }

    Both forms are equivalent.

    See the CallBacks section, below, for more information
    about the CallBacks.


Given that you need to have an active MIDI::Trans object to use the attribute and information methods, and that trans() creates a new MIDI::Trans object, the trans_obj() method has been provided to you for accessing the usually-needed methods.


    Returns the blessed object being utilized by the
    trans() method.  All methods are available to this
    object, but with data specific to the current
    trans() object.

        if( $TransObj->trans( { 'TempoCallBack' => \&tempo, ... } ) ) {
            # do something
        }

        sub tempo {
            # a callback called from trans()
         my $cur_obj = $TransObj->trans_obj();
         my $num_sent = $cur_obj->sentences();
         ...
        }



If MIDI::Trans is like the skeleton for your conversion application, then the CallBacks you define act as the nervous system. The real logic lies in the combination of information and statistics generated by the corpus, your use of configurable options, and the callbacks you define.


CallBacks must be passed as either references or anonymous sub blocks. The following forms are all valid: (examples use the callback configuration methods)

    $TransObj->volume_callback( sub { ... } );

    sub some_sub {
        ...
    }
    $TransObj->volume_callback(\&some_sub);

    $hashRef->{'key'} = sub { ... };

    my $sub = sub {
        ...
    };
    $TransObj->volume_callback($sub);

Most CallBacks are passed two arguments:

    The Current Element
    The Current Position

The Tempo CallBack is passed no arguments.
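
As a minimal sketch of the two-argument convention, the callback below receives the current element and its position and maps them to a note value. The name note_from_element and the mapping itself are hypothetical illustrations, not part of MIDI::Trans; only the argument order and the 0-127 or 'rest' return convention come from this document.

```perl
use strict;
use warnings;

# Hypothetical Note callback: the function name and the mapping are
# assumptions made for illustration only.
sub note_from_element {
    my ($elem, $pos) = @_;    # current element, current position

    # Treat a missing or empty element as a rest event.
    return 'rest' if !defined $elem || $elem eq '';

    # Sum the character codes and fold the result into 0-127.
    my $sum = 0;
    $sum += ord($_) for split //, $elem;
    return $sum % 128;
}

# Call it by hand, once per element, the way a converter would.
my @elements = ('Hello', 'MIDI', '');
for my $pos (0 .. $#elements) {
    printf "%d: %s\n", $pos, note_from_element($elements[$pos], $pos);
}
```

The position argument is unused here, but it is accepted so the sub matches the calling convention described above.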

Each CallBack is expected to return a specific type and range of data as a return value. Each type is discussed here. Please note that these are just examples and in no way reflect the complexity or interaction available to you.

Volume CallBacks

    Volume CallBacks return a numeric value to
    represent the absolute volume of the
    current element in a range of 0-127.

    Volume CallBacks are called once
    every element, after Note CallBacks.
    sub VolCallBack {

     my $elem = shift;
     my $enum = shift;

        # in this callback, volume is
        # determined by measuring the
        # length of the input, then
        # comparing that against a
        # constant value, and using that
        # comparison as a multiplier against
        # our maximum volume level.
     my $lpct = length($elem) / 24;
     $lpct = 1 if($lpct > 1);
        # here, we use the round() method supplied
        # by MIDI::Trans
     my $value = $TransObj->round(127 * $lpct);
     return($value);
    }

Note CallBacks

    Note CallBacks return a scalar value to
    represent a note or rest event.  The value
    of the event is either a number, in the case
    of a note, or a string - 'rest', in the case
    of a rest event.  For a note event, you
    must specify the absolute value of the
    note as an integer in the range of 0-127.
    For a rest event, simply return a string with
    the value 'rest'.

    Note CallBacks are called once every element.
    They are processed before Volume and Duration.
    sub NoteCallBack {
     my $elem = shift;
     my $enum = shift;
     my $return;
        # here, if the corpus contains
        # an element with the string 'Eighty-
        # Eight', then a note value of 88
        # will be returned, rest otherwise.
     if($elem =~ /Eighty-Eight/) {
        $return = 88;
        } else {
            $return = 'rest';
        }
     return($return);
    }


Duration CallBacks

    Duration CallBacks return a numeric value
    to represent the duration of the current event,
    in quarter notes.  So the actual duration of
    the event, in seconds, is determined by the value of
    the qn_len() configuration method and the tempo
    returned by your Tempo CallBack.
    This CallBack is called once every element, after all
    other CallBacks.

    sub DurCallBack {
     my $elem = shift;
     my $enum = shift;
        # return some number of quarter notes
     ...
    }

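To make the relationship between quarter notes, ticks and seconds concrete, here is a minimal sketch. Both helper names are hypothetical and neither is provided by MIDI::Trans; the sketch assumes that the tempo is expressed in quarter notes per minute and that qn_len() is the number of ticks per quarter note, as this document describes.

```perl
use strict;
use warnings;

# Hypothetical helpers for illustration; not part of MIDI::Trans.

# Ticks occupied by a duration given in quarter notes.
sub duration_ticks {
    my ($quarter_notes, $qn_len) = @_;
    return $quarter_notes * $qn_len;
}

# Wall-clock seconds for a duration given in quarter notes,
# assuming the tempo is in quarter notes per minute.
sub duration_seconds {
    my ($quarter_notes, $tempo) = @_;
    die "tempo must be greater than zero\n" unless $tempo > 0;
    return $quarter_notes * 60 / $tempo;
}

# Four quarter notes at the documented default of 96 ticks per
# quarter note and a tempo of 120.
printf "%d ticks, %.2f seconds\n",
    duration_ticks(4, 96), duration_seconds(4, 120);
```
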

Tempo CallBacks

    Tempo CallBacks return a numeric value
    to represent the number of quarter notes
    per minute to be used in the score.  The
    actual 'tempo' supplied to MIDI::Simple is
    the result of the following equation:
    round($base_ms / $tempo);
    Where round() is the included round()
    method, $base_ms is the value of the
    configuration attribute base_milliseconds(),
    and $tempo is the tempo returned by your
    Tempo CallBack.
    The Tempo CallBack is called once for each corpus
    processed, before all other CallBacks.

    Please note, that there are no arguments
    to this CallBack, as it is executed BEFORE
    any elements are processed.
    sub TempoCallBack {

     my $num_sents = $TransObj->sentences();
     my $num_words = $TransObj->words();
        # this CallBack utilizes Attribute
        # Retrieval methods to determine
        # num of words and sentences in the
        # corpus, then uses these values
        # to form a percentage of a constant
        # maximum tempo.
     my $w_to_s_pct = $num_sents / $num_words;
     my $max_tempo = 200;
     my $tempo = $TransObj->round($max_tempo * $w_to_s_pct);
     return($tempo);
    }
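
The equation round($base_ms / $tempo) can be worked through by hand. The sketch below is illustrative only: round_half_up is a hypothetical stand-in for the module's round() method (whose exact rounding rule is not stated here), and midi_tempo is a hypothetical helper name.

```perl
use strict;
use warnings;

# Hypothetical stand-in for MIDI::Trans's round(); assumes
# round-half-up behaviour for non-negative input.
sub round_half_up {
    my ($n) = @_;
    return int($n + 0.5);
}

# Hypothetical helper applying round($base_ms / $tempo).
sub midi_tempo {
    my ($base_ms, $tempo) = @_;
    die "tempo must be greater than zero\n" unless $tempo > 0;
    return round_half_up($base_ms / $tempo);
}

my $base_ms = 60_000_000;    # documented default base value
for my $tempo (60, 120, 140) {
    printf "%3d quarter notes per minute -> %d\n",
        $tempo, midi_tempo($base_ms, $tempo);
}
```

With the default base value, a higher tempo yields a smaller number, because the result is the time allotted to each quarter note.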


Several Attributes of your corpus may be gleaned when read. This is controlled by the 'AllAttributes' configuration value, set by either new() or the configuration method all_attributes(). Currently, those attributes are :

    # of Sentences *
    # of Words *
    # of Elements
    (Those marked by an asterisk can be
    turned off to reduce memory consumption)

The Sentence Delimiter can be defined as a configuration value. The word boundary may not.
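
A rough sketch of how such attributes could be gathered is shown below. It is not the module's implementation: corpus_attributes is a hypothetical helper, the two delimiters are the documented defaults, and the \w+ word pattern is an assumption, since this document does not say how words are detected.

```perl
use strict;
use warnings;

# Hypothetical helper; not part of MIDI::Trans.
sub corpus_attributes {
    my ($corpus, $delim, $sent_delim) = @_;

    # Split into elements on the element delimiter, dropping empties.
    my @elements = grep { length } split /(?:$delim)/, $corpus;

    # Count sentence delimiters and word-like runs.
    my $sentences = () = $corpus =~ /(?:$sent_delim)/g;
    my $words     = () = $corpus =~ /\w+/g;    # assumed word pattern

    return {
        elements  => scalar @elements,
        sentences => $sentences,
        words     => $words,
    };
}

my $attrs = corpus_attributes(
    "Hello world. How are you? Fine!",
    '\s+',          # documented default element delimiter
    '\.|\?|\!',     # documented default sentence delimiter
);
printf "%d elements, %d sentences, %d words\n",
    @{$attrs}{qw(elements sentences words)};
```
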


    The following methods retrieve attributes
    about the corpus being processed.  They
    can only be used inside of your CallBacks;
    they are not available elsewhere.
        Returns the number of sentences
        in the corpus.
        Returns the number of words in
        the corpus.
        Returns the number of total elements
        in the corpus.



    Creates a new instance of the class.  Returns a blessed object
    on success, undef on error.  One argument is allowed, a hash
    reference or anonymous hash, which contains configuration
    information for the object.
    The following keys are allowable in the hash, and their values:

                Boolean, die() on any error with message
                0 = false (default), 1 = true
                Default delimiter used to separate elements from
                the corpus.  Should be a valid regular expression
                as would fit in (?:).
                Default value is '\s+'
                Default delimiter for end of sentence.  Follows
                same rules as ElementDelimiter.
                Default value is '\.|\?|\!'
                Default callback for obtaining note values.
                Should be a reference to, or anonymous, sub
                routine.  See the section regarding CallBacks
                Default value is undef
                Default CallBack for obtaining volume values.
                Default CallBack for obtaining duration values.
                Default CallBack for obtaining tempo values.
                Default Channel for MIDI output.
                Default value is '1'
                Default number of ticks per quarter note.
                It is safe to leave this unmodified.
                See MIDI::Simple for more information.
                Default value is '96'
                Boolean, whether or not all attributes of
                the corpus should be measured when reading
                it.  Can be used to lessen memory usage.
                See the Attributes section for more
                information.
                0 = False, 1 = True (default)
                The base number of microseconds in a minute.
                This is used for timing and tempo purposes.
                It is safe to leave this value unmodified.
                Default value is '60000000'

        my $TransObj = MIDI::Trans->new( {
                        'VolumeCallBack' => \&vol,
                        'AllAttributes' => 0
                        } );

        my %attrs = ( 'AllAttributes' => 1, 'VolumeCallBack' => \&vol );
        my $TransObj = MIDI::Trans->new(\%attrs);


    Returns the last set error message, or undef if no error
    message has been set.

        my $errmsg = $TransObj->error();



    Removes the current error message.  Causes error() to return
    undef.  Always returns true (1).



    The 'wrapper' function for quick use of MIDI::Trans.  For more
    information, see the section entitled Plug and Play Usage above.


    See the section entitled Plug and Play Usage above.

read_corpus HASHREF

    Reads, parses, and collects attributes about a given corpus
    (your input data).  The corpus may be specified either as a
    file to read, or a string to parse.  Returns true (1) on success,
    or sets the error message and returns undef on error.
    More than one corpus may be open at a given time.
    A single argument must be specified, which is either a hash
    reference or an anonymous hash.  The hash contains information
    about the corpus.  Three keys are possible:
        Name
            The 'handy' name, or name you wish to
            specify for the corpus.  This is useful
            when opening more than one corpus that
            is a string type.
            If the corpus type is a string, the
            name will default to 'String'; otherwise
            the name will default to the file name.
        File
            When this key is provided, it specifies
            that the corpus type is a file.  This
            key will override the 'String' key,
            even if the value is undef -- resulting
            in an error.  The value for this key
            should be the path to the file you
            want to parse.
        String
            When this key is provided, it specifies
            that the corpus type is a string.  This
            key is overridden by the 'File' key.  The
            string should be passed as the value.
        if( $TransObj->read_corpus({ 'File' => './corpus.txt' }) ) {
            # do something
            } else {
                my $error = $TransObj->error();
            }

        my %corp_dat = ( 'File' => './corpus.txt', 'Name' => 'Corpus1' );
        if( $TransObj->read_corpus(\%corp_dat) ) { ... }
        NOTE: When read, the corpus is stored in an ordered list of
        available corpora, numbered from 0 in the order they were
        read.  This number is the preferred way to identify a corpus
        to other methods, but most also accept a Name value, which
        may be easier to track.
        TODO: Convert all methods to a naming convention.

process HASHREF

    Performs the actual processing of a given corpus.  Runs
    all of the callbacks, as needed, either on a per-corpus
    or per-element basis.  Generates the data that will be
    used to create a MIDI score later.  Only a single corpus
    may be specified.  Returns true (1) if successful, or
    sets the error message and returns undef otherwise.
    A single argument must be specified, which is either a hash
    reference or an anonymous hash.  The hash contains information
    identifying the corpus.  There are two keys possible:
        Name
            The name as given the corpus -- see read_corpus
            above.  Overrides the 'Num' key.  The value should
            be the name of the corpus.
        Num
            The number of the corpus.  See the NOTE section
            of read_corpus, above.  Is overridden by the 'Name'
            key.
        if( $TransObj->process( { 'Name' => 'Corpus1' }) ) {
            # do something
            } else {
                my $errmsg = $TransObj->error();
            }

        my %corp_dat = ( 'Num' => 1 );
        if( $TransObj->process(\%corp_dat) ) { ... }

create_score NUM

    Creates a score from the data generated by process(),
    suitable for writing to a file (see write_file() below).
    If the corpus hasn't been parsed, or the processing hasn't
    occurred yet, an error will occur.  Returns the
    MIDI::Simple object from the score on success, or
    sets the error message and returns undef on failure.
    One argument, the identifying number of the corpus,
    must be specified.  (See the NOTE section of read_corpus()
    above.)
        my $scoreObj;
        if( $scoreObj = $TransObj->create_score(0) ) {
            # do something
            } else {
                my $errmsg = $TransObj->error();
            }

write_file SCORE_OBJECT

    Writes a score to a file, given the score object
    returned from create_score.  Returns true (1) on
    success and sets the error message then returns
    undef on failure.
        if( $TransObj->write_file($scoreObj) ) {
            # do something
            } else {
                my $errmsg = $TransObj->error();
            }


    Returns the nearest rounded integer given
    decimal or integer input.

            # returns 2
        my $x = $TransObj->round(1.5);
            # returns 1
        my $y = $TransObj->round(1.38573927541);


    These methods allow you to configure the CallBacks
    currently in use by MIDI::Trans, they also allow
    you to retrieve a reference to the CallBack in
    question.  All methods accept an argument of either
    a subroutine reference or anonymous subroutine block.
    All methods return their subref.


    Sets / Returns the callback for volume
        my $vol_cb = $TransObj->volume_callback(\&somesub);


    Sets / Returns the callback for notes


        my $note_cb = $TransObj->note_callback(\&somesub);


    Sets / Returns the callback for duration


        my $dur_cb = $TransObj->duration_callback(\&somesub);


    Sets / Returns the callback for tempo


        my $tempo_cb = $TransObj->tempo_callback(\&somesub);


    These methods, in conjunction with methods such as
    round() (described under the main METHODS heading),
    are useful for modifying the way MIDI::Trans is
    operating, as well as assisting your callbacks in
    performing their operation, and retrieving operating
    values.  All of these (as well as others) may be
    utilized in your CallBacks.  Be careful, however, of
    calling process() within a CallBack, as this may
    result in an infinite loop.


    Returns the current element from the list of
    elements in the corpus.  This would typically
    be used by your callbacks for determining what
    to generate.
        my $element = $TransObj->current_elem();


    Returns the position of the current element
    in the document, starting at zero.  That is,
    on the 50th element of the corpus, current_pos()
    would return '49'.

            my $pos = $TransObj->current_pos();


    Sets, or returns the current value of the
    'Raise_Error' attribute.
        my $RE = $TransObj->raise_error(1);


    Sets, or returns the current value of the
    'Delimiter' attribute.


        my $Del = $TransObj->delimiter('\s+');


    Sets, or returns the current value of the
    'SentenceDelimiter' attribute.


        my $SD = $TransObj->sentence_delimiter('\!\?');


    Sets, or returns the current value of the
    'AllAttributes' attribute.


        my $AA = $TransObj->all_attributes(1);


    Sets, or returns the current value of the
    'Channel' attribute.


        my $Chan = $TransObj->channel(1);


    Sets, or returns the current value of the
    'qn_len' attribute.


        my $QL = $TransObj->qn_len(96);


    Sets, or returns the current value of the
    'Tempo' attribute.  This is usually set
    by your tempo CallBack, but can also
    be set with a default value using the new()
    attribute.  You can use this function to
    override tempo, but this may have little,
    if any, effect on create_events().


        my $Tempo = $TransObj->tempo(120);


C. Church