NAME

Speech::Recognizer::SPX::Server - Perl module for writing streaming audio speech recognition servers using Sphinx2

SYNOPSIS

  my $sock = new IO::Socket(... blah blah blah ...);
  my $log = new IO::File('server.log', 'w');
  my $audio_fh = new IO::File('speech.raw', 'w');
  my $srvr
      = Speech::Recognizer::SPX::Server->init({ -arg => val, ... }, $sock, $log, $verbose)
        or die "couldn't initialize sphinx2: $!";

  my $client = new IO::Socket;
  while (accept $client, $sock) {
      next if fork;    # parent goes back to accept; the child serves this client
      $srvr->sock($client);
      $srvr->calibrate or die "couldn't calibrate audio stream: $!";
      while (!$done && defined(my $txt
                        = $srvr->next_utterance(sub { print $log "listening\n" },
                                                sub { print $log "not listening\n" },
                                                $audio_fh))) {
          print "recognized text is $txt\n";
          ...
      }
      $srvr->fini or die "couldn't shut down server: $!";
      exit 0;
  }

DESCRIPTION

This module encapsulates a bunch of the stuff needed to write a Sphinx2 server which takes streaming audio as input on an arbitrary filehandle. It's not meant to be flexible or transparent - if you want that, then read the code and write your own server program using just the Speech::Recognizer::SPX module.

The interface is vaguely object-oriented, but unfortunately it is presently not possible to create multiple instances of Speech::Recognizer::SPX::Server within the same process, due to severe limitations of the underlying Sphinx-II library. You can, however, create multiple distinct servers with judicious use of fork, as shown in the example above.

It is possible that this will be fixed in a future release of Sphinx-II.

METHODS

init

  my $srvr = Speech::Recognizer::SPX::Server->init(\%args, $sock, $log, $verbose);

\%args is a reference to a hash of argument => value pairs, exactly like the arguments you would pass on the command line to one of the Sphinx example programs. Argument names can be given either with or without a leading dash.
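
As a rough illustration of the dash rule, the two hashes below are meant to describe the same configuration. The helper strip_dashes is hypothetical and not part of the module; it only shows what "with or without a leading dash" amounts to. The option names -hmmdir and -lmfn are assumed to be Sphinx-II options, and the paths are placeholders.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of the module: drop one leading dash
# from every key so that both spellings compare equal.
sub strip_dashes {
    my ($args) = @_;
    my %plain;
    for my $key (keys %$args) {
        (my $name = $key) =~ s/^-//;
        $plain{$name} = $args->{$key};
    }
    return \%plain;
}

# Assumed Sphinx-II option names; the paths are placeholders.
my $with_dash    = { -hmmdir => '/path/to/hmm', -lmfn => '/path/to/model.lm' };
my $without_dash = { hmmdir  => '/path/to/hmm', lmfn  => '/path/to/model.lm' };
```

Either hash reference could then be passed as the first argument to init.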

$sock is a socket or other filehandle (could be anything, really) on which the server will read audio data. This argument is optional and not needed to initialize the server - you can set it later with the sock accessor.

$log is a filehandle on which the server module will log messages. This argument is optional. Without a filehandle to log on, these messages (boring things like "started listening at $foo") will not be printed.

$verbose determines the verbosity level of the Sphinx library. Currently, due to limitations in the Sphinx-II library, there are only two options for this value, namely a true value for 'be insanely verbose', or a false value for 'say nothing at all'.

calibrate

  $srvr->calibrate;

Calibrates the noise threshold for the continuous audio stream (i.e. figures out when it should listen and when it shouldn't). This requires you to actually have a ready and willing source of input on the socket you set in init or with sock.

next_utterance

  my $text = $srvr->next_utterance($cb_listen, $cb_not_listen, $audio_fh);

Waits for and recognizes the next utterance in the data stream. All arguments are optional:

$cb_listen is a reference to (or name of, but I encourage you not to do that) a subroutine to be called when the recognizer has detected speech input.

$cb_not_listen is a reference to (or name of) a subroutine to be called when the recognizer has detected the end of speech input.

Obviously this presumes a request/response model for your application. If you need to be able to get partial results then you'll have to wait for me to support them (which will undoubtedly happen sooner or later), or write your own module. Sorry.

$audio_fh is a filehandle to which to save the speech data - this may come in handy for debugging, or if you would like to only record the user talking and not the hours and hours of silence in between.
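
A minimal sketch of a recognition loop that passes code references as callbacks. It assumes $srvr is an already initialized and calibrated server object and that $log is an open filehandle; the helper name recognize_all is made up for this example, and the loop relies on next_utterance returning undef when there is nothing more to recognize, as the synopsis does.

```perl
use strict;
use warnings;

# Hypothetical helper: collect every utterance until next_utterance
# returns undef. $srvr is assumed to be an initialized, calibrated
# Speech::Recognizer::SPX::Server object.
sub recognize_all {
    my ($srvr, $log, $audio_fh) = @_;
    my @texts;

    # Code references, rather than subroutine names, as callbacks.
    my $cb_listen     = sub { print {$log} "listening\n" };
    my $cb_not_listen = sub { print {$log} "not listening\n" };

    while (defined(my $text
                   = $srvr->next_utterance($cb_listen, $cb_not_listen, $audio_fh))) {
        push @texts, $text;
    }
    return @texts;
}
```

Since all three arguments are optional, $audio_fh can be left undefined when the raw speech does not need to be saved.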

fini

Shuts down the Sphinx-II recognizer. It does not close the socket or any other filehandle; you have to do that yourself.
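
One way to pair the shutdown with closing the socket is sketched below. The helper shut_down is hypothetical; it assumes $srvr is an initialized server object and that a false return value from fini means failure, as in the synopsis.

```perl
use strict;
use warnings;

# Hypothetical helper: shut the recognizer down, then close the audio
# handle, which fini leaves open.
sub shut_down {
    my ($srvr) = @_;
    my $sock = $srvr->sock;          # fetch the handle with the accessor
    my $ok   = $srvr->fini;          # false is treated as failure by the caller
    close $sock if defined $sock;    # fini does not do this for us
    return $ok;
}
```

A caller might write: shut_down($srvr) or die "couldn't shut down server: $!";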

ACCESSORS

sock

  my $sockfh = $srvr->sock;
  $srvr->sock(\*FOO);

Sets or gets the socket on which the server reads audio data.

logfh

  my $logfh = $srvr->logfh;
  $srvr->logfh(\*BAR);

Sets or gets the filehandle on which the server logs messages (if it's being verbose).

timeout

  $srvr->timeout(200);

Sets or gets the amount of time (in milliseconds) to wait after the end of speech-level input before processing an utterance. The default is one second (1000 milliseconds).
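
The accessor can be used to shorten the wait for a single request and then put the old value back. This is only a sketch: with_timeout is a hypothetical helper, and $srvr is assumed to be an initialized server object whose timeout method returns the current value when called without arguments.

```perl
use strict;
use warnings;

# Hypothetical helper: run $code with a different end-of-speech timeout,
# then restore the previous value.
sub with_timeout {
    my ($srvr, $ms, $code) = @_;
    my $old = $srvr->timeout;     # current value, in milliseconds
    $srvr->timeout($ms);          # temporary value
    my $result = $code->($srvr);
    $srvr->timeout($old);         # restore the previous value
    return $result;
}
```

For example: my $text = with_timeout($srvr, 200, sub { $_[0]->next_utterance });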

AUTHOR

David Huggins-Daines <dhuggins@cs.cmu.edu>
