Gherkin - a parser and compiler for the Gherkin language


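  use Gherkin;
  use Data::Dumper;

  # Called once for every message in the output stream
  sub sink {
     my $msg = shift;

     print Dumper($msg);
  }

  # Returns a new unique ID on every call
  my $id = 0;
  sub gen { return $id++ }

  Gherkin->from_paths( [ 'your.feature' ],
                       \&gen, \&sink );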


This is the Perl implementation of the Gherkin language parser and compiler as developed by the Cucumber project.

Gherkin is a simple language with a formal specification. The parser in this implementation is generated from the official language grammar.

NOTE: Versions 21 and lower of this library sent hashes to the $sink, whereas the current version sends Cucumber::Messages.


The Cucumber toolkit consists of a set of tools which form a pipeline: each consumes and produces protobuf messages. Messages use NDJSON formatting.

The start of the pipeline is the Gherkin language parser. Gherkin implements that functionality in Perl. It is the first building block in the pipeline and is intended as a foundation on which to build further tooling.




Accepted %options are:


Boolean. Indicates whether the text of the source document is to be included in the output stream using a Source message.


Boolean. Indicates whether the parsed source (AST or Abstract Syntax Tree) is to be included in the output stream using a GherkinDocument message.


Boolean. Indicates whether the expanded-and-interpolated (executable) scenarios are to be included in the output stream using Pickle messages.

from_paths($paths, $id_gen, $sink, %options)

Constructs a Gherkin instance and calls its from_source method for each of the paths in the arrayref $paths.

$id_gen is a coderef to a function generating unique IDs which messages in the output stream can use to refer to other content in the stream. $sink is a coderef to a function taking the next message in the stream as its argument. Each message is encapsulated in an Envelope message.
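The two coderefs described above can be built as closures. The following is a minimal sketch in plain Perl; make_id_gen and make_collecting_sink are hypothetical helper names used for illustration only and are not part of Gherkin. The sink here simply collects whatever it is given, so it can be tried without any feature file.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a coderef that yields 0, 1, 2, ...
# The counter is private to the closure, so two generators never clash.
sub make_id_gen {
    my $next = shift // 0;
    return sub { return $next++ };
}

# Hypothetical helper: returns a coderef that appends every message
# it receives to the array referenced by $store.
sub make_collecting_sink {
    my $store = shift;
    return sub { push @$store, shift; return };
}

my @messages;
my $id_gen = make_id_gen();
my $sink   = make_collecting_sink( \@messages );

# With Gherkin installed, the coderefs would be passed along as:
#   Gherkin->from_paths( [ 'your.feature' ], $id_gen, $sink );
```

Collecting messages in an array rather than printing them makes it straightforward to inspect or filter the stream after parsing has finished.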

%options are passed to new.


from_source($source_msg, $id_gen, $sink)

Generates a stream of AST and pickle messages sent to $sink. The source text in the message's data attribute is assumed to be utf8 or UTF-8 encoded. The document header is scanned for a # encoding: ... instruction. If one is found, the text is recoded from that encoding into Perl's internal Unicode representation.
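The header scan described above could look roughly like the sketch below. It is an illustration under stated assumptions, not the code Gherkin itself uses: decode_feature_bytes is a hypothetical helper, the default of UTF-8 is assumed, and only leading comment and blank lines are treated as the header. Encode is a core Perl module.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: turn raw feature-file bytes into Perl characters,
# honouring a '# encoding: NAME' comment among the leading lines.
sub decode_feature_bytes {
    my $bytes    = shift;
    my $encoding = 'UTF-8';    # assumed default when no header is present

    for my $line ( split /\n/, $bytes ) {
        next if $line =~ /^\s*$/;        # skip blank lines
        last unless $line =~ /^\s*#/;    # header ends at first non-comment
        if ( $line =~ /^\s*#\s*encoding\s*:\s*([\w-]+)/i ) {
            $encoding = $1;
            last;
        }
    }
    return decode( $encoding, $bytes );
}

# The byte \xE9 is 'e with acute accent' in ISO-8859-1.
my $text = decode_feature_bytes("# encoding: iso-8859-1\nFeature: caf\xE9\n");
```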

The Source message sent to the sink is wrapped in an envelope which has a to_json method to create UTF-8 encoded NDJSON output.
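Since each envelope offers a to_json method, a sink that produces NDJSON only needs to print one JSON document per line. The sketch below shows the idea; make_ndjson_sink is a hypothetical helper, and My::FakeEnvelope is a stand-in class invented here so that the example runs without Gherkin installed. Only its to_json method mirrors what the text above describes.

```perl
use strict;
use warnings;
use JSON::PP ();

# Stand-in for a real envelope object (hypothetical class name).
package My::FakeEnvelope {
    sub new {
        my ( $class, %args ) = @_;
        return bless {%args}, $class;
    }

    # Returns UTF-8 encoded JSON, with sorted keys for stable output.
    sub to_json {
        my $self = shift;
        return JSON::PP->new->utf8->canonical->encode( { %$self } );
    }
}

# Hypothetical helper: returns a sink writing one JSON document per line.
sub make_ndjson_sink {
    my $fh = shift;
    return sub {
        my $envelope = shift;
        print {$fh} $envelope->to_json, "\n";
        return;
    };
}

# Write to an in-memory buffer here; a real program would more likely
# pass \*STDOUT or a file handle opened without an encoding layer.
open my $fh, '>', \my $buffer or die "Cannot open in-memory handle: $!";
my $sink = make_ndjson_sink($fh);
$sink->( My::FakeEnvelope->new( source => { uri => 'your.feature' } ) );
close $fh;
```

Because to_json already returns encoded bytes, the output handle should not add its own :encoding(UTF-8) layer, or the text would be encoded twice.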

$id_gen and $sink are as documented in from_paths.



Please see the included LICENSE.txt for the canonical version. In summary:

  The MIT License (MIT)

  Copyright (c) 2020-2021 Erik Huelsmann
  Copyright (c) 2016      Peter Sergeant

This work is a derivative of work that is: Copyright (c) 2014-2016 Cucumber Ltd, Gaspar Nagy