NAME

CommonMark - Interface to the CommonMark C library

SYNOPSIS

    use CommonMark;

    my $doc = CommonMark->parse(
        file  => $file,
        smart => 1,
    );

    my $html = CommonMark->markdown_to_html($markdown);
    my $doc  = CommonMark->parse_file($file);
    my $doc  = CommonMark->parse_document($markdown);
    my $doc  = CommonMark->create_document;

DESCRIPTION

This module is a wrapper around the official CommonMark C library libcmark. It closely follows the original API.

The main module provides entry points for parsing documents and convenience functions for node creation. Most features are available through CommonMark::Node objects, which make up the parse tree. CommonMark::Iterator is a class for walking through the nodes in a tree. CommonMark::Parser provides a push parser interface.
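As a minimal sketch of how these classes fit together, the following parses a short string, walks the tree with an iterator to collect the text nodes, and then builds the same document with the push parser. The sample input and variable names are illustrative only. See CommonMark::Node, CommonMark::Iterator, and CommonMark::Parser for the authoritative interfaces.

```perl
use strict;
use warnings;
use CommonMark qw(:node :event);

# Parse a short document (sample input, for illustration only).
my $doc = CommonMark->parse_document("Hello *world*\n");

# Walk the tree and collect the literal content of every text node.
my @texts;
my $iter = $doc->iterator;
while (my ($ev_type, $node) = $iter->next) {
    next unless $ev_type == EVENT_ENTER;
    push @texts, $node->get_literal if $node->get_type == NODE_TEXT;
}
print join('|', @texts), "\n";

# Build the same document with the push parser, feeding it in chunks.
my $parser = CommonMark::Parser->new;
$parser->feed("Hello ");
$parser->feed("*world*\n");
my $pushed = $parser->finish;
print $pushed->render_html;
```

The iterator reports an enter and an exit event for container nodes, so the sketch only looks at enter events.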

Installation

Installation of libcmark

Please note that the libcmark API isn't stable yet. This version of the Perl bindings is known to work with all releases between 0.21.0 and 0.31.1, but there's no guarantee that it can be compiled with later versions. Also note that upgrading a dynamically linked version of libcmark may require recompilation of the Perl distribution.

It is recommended to use the libcmark packages provided by recent Linux distros. On Debian or Ubuntu, run:

    sudo apt-get install libcmark-dev

On Red Hat or CentOS, run:

    sudo yum install cmark-devel

On macOS using Homebrew, run:

    brew install cmark

To install libcmark from source:

    curl -LJO https://github.com/commonmark/cmark/archive/0.31.1.tar.gz
    tar xzf cmark-0.31.1.tar.gz
    cd cmark-0.31.1
    make [INSTALL_PREFIX=/prefix]
    make test
    make install

See the libcmark README for details.

Installation from a CPAN tarball

If libcmark is in a standard location:

    perl Makefile.PL
    make
    make test
    make install

On macOS using Homebrew, specify the include and library locations:

    perl Makefile.PL \
        INC="-I$(brew --prefix)/include" \
        LIBS="-L$(brew --prefix)/lib -lcmark"
    make
    make test
    make install

Otherwise, specify the include and library locations:

    perl Makefile.PL \
        INC="-I/prefix/include" \
        LIBS="-L/prefix/lib -lcmark"
    make
    make test
    make install

See the documentation of ExtUtils::MakeMaker for additional options. The PERL_MM_OPT environment variable is especially useful for passing these settings:

    export PERL_MM_OPT='INC="-I..." LIBS="-L... -lcmark"'

Build from a repository checkout

This distribution uses Dist::Zilla with the external plugins MakeMaker::Awesome and CopyFilesFromBuild. You can build and test with dzil:

    dzil test
    dzil build

The files generated by Dist::Zilla are included in the repository, so you can use the standard build process as well.

markdown_to_html

    my $html = CommonMark->markdown_to_html( $markdown, [$options] );

Converts a Markdown string to HTML. $options is a bit field containing the parse options and render options ORed together. It defaults to zero (OPT_DEFAULT).

This method is equivalent to calling parse_document and then render_html on the resulting document, or to calling parse and then render:

    my $html = CommonMark->markdown_to_html( $markdown );
    my $html = CommonMark->parse_document($markdown)->render_html;
    my $html = CommonMark->parse(string => $markdown)->render(format => 'html');

The following calls are also equivalent. markdown_to_html() accepts parser and render options together in a single bit field, whereas the parse and render methods each take only their own options:

    my $html = CommonMark->markdown_to_html(
        $markdown,
        OPT_UNSAFE | OPT_SMART,
    );

    my $html = CommonMark->parse_document(
        $markdown, OPT_SMART,
    )->render_html(OPT_UNSAFE);

    my $html = CommonMark->parse(
        string => $markdown,
        smart  => 1,
    )->render(
        format => 'html',
        unsafe => 1,
    );
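The equivalence can be checked with a short script. This is a sketch that assumes a libcmark version recent enough to support the unsafe render option. The sample input is made up for illustration.

```perl
use strict;
use warnings;
use CommonMark qw(:opt);

# Sample input containing quotes (affected by the smart option) and raw
# inline HTML (affected by the unsafe option).
my $markdown = qq{"Hello" <b>world</b>\n};

my $html_1 = CommonMark->markdown_to_html(
    $markdown,
    OPT_UNSAFE | OPT_SMART,
);

my $html_2 = CommonMark->parse_document(
    $markdown, OPT_SMART,
)->render_html(OPT_UNSAFE);

my $html_3 = CommonMark->parse(
    string => $markdown,
    smart  => 1,
)->render(
    format => 'html',
    unsafe => 1,
);

print $html_1 eq $html_2 && $html_2 eq $html_3
    ? "all three calls agree\n"
    : "results differ\n";
```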

parse

    my $doc = CommonMark->parse(
        string        => $string,
        normalize     => $bool,    # Optional
        smart         => $bool,    # Optional
        validate_utf8 => $bool,    # Optional
    );

    my $doc = CommonMark->parse(
        file          => $handle,
        normalize     => $bool,    # Optional
        smart         => $bool,    # Optional
        validate_utf8 => $bool,    # Optional
    );

Convenience function to parse documents. Exactly one of the string or file options must be provided. When given a string, calls "parse_document". When given a file, calls "parse_file". The remaining options enable the respective parser options:

  • smart

  • validate_utf8

  • normalize (no-op as of libcmark 0.28)

Returns the CommonMark::Node of the root document.
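A minimal sketch of both calling styles follows. It writes a temporary file with File::Temp purely so the example is self-contained. In practice the handle would come from an existing file.

```perl
use strict;
use warnings;
use CommonMark;
use File::Temp qw(tempfile);

my $markdown = qq{# Title\n\nSome "quoted" text.\n};

# Create a temporary Markdown file (illustration only).
my ($out, $path) = tempfile(SUFFIX => '.md', UNLINK => 1);
print {$out} $markdown;
close $out or die "Cannot close $path: $!";

# Parse from a file handle.
open my $fh, '<', $path or die "Cannot open $path: $!";
my $from_file = CommonMark->parse(file => $fh, smart => 1);
close $fh;

# Parse the same content from a string.
my $from_string = CommonMark->parse(string => $markdown, smart => 1);

print $from_file->render_html;
```

Both calls should yield the same tree, since the only difference is where the input comes from.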

parse_document

    my $doc = CommonMark->parse_document( $markdown, [$options] );

Parses a CommonMark document from a string and returns the CommonMark::Node of the document root. $options is a bit field containing the parser options. It defaults to zero (OPT_DEFAULT).

parse_file

    my $doc = CommonMark->parse_file( $file, [$options] );

Parses a CommonMark document from a file handle and returns the CommonMark::Node of the document root. $options is a bit field containing the parser options. It defaults to zero (OPT_DEFAULT).

Parser options

The parser options are a bit field created by ORing the following constants:

    CommonMark::OPT_DEFAULT => 0
    CommonMark::OPT_NORMALIZE
    CommonMark::OPT_VALIDATE_UTF8
    CommonMark::OPT_SMART

Parser options can be imported from CommonMark with the :opt tag:

    use CommonMark qw(:opt);
    my $doc = CommonMark->parse_document(
        $markdown,
        OPT_SMART | OPT_VALIDATE_UTF8,
    );

OPT_NORMALIZE

Makes sure that adjacent text nodes are merged in the parse tree. This option has no effect with libcmark 0.28 or higher, which always normalizes text nodes.

OPT_SMART

Enables the "smart quote" feature, which turns straight quotation marks into typographic ones, double and triple hyphens into en and em dashes, and triple periods into ellipses.
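The effect is easiest to see by rendering the same input with and without the option. This is a small sketch. The sample sentence is made up, and the output encoding layer on STDOUT is an assumption about how the result will be displayed.

```perl
use strict;
use warnings;
use CommonMark qw(:opt);

# The smart output contains non-ASCII punctuation, so encode it on output.
binmode STDOUT, ':encoding(UTF-8)';

my $markdown = qq{"Quoted" text -- with dashes --- and dots...\n};

my $plain = CommonMark->markdown_to_html($markdown);
my $smart = CommonMark->markdown_to_html($markdown, OPT_SMART);

print "default: $plain";
print "smart:   $smart";
```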

OPT_VALIDATE_UTF8

Turns on UTF-8 validation. Normally, this shouldn't be necessary because Perl strings should always contain valid UTF-8. But it is possible to create strings flagged as UTF-8 that contain invalid UTF-8, for example with XS. The option may be used if you don't trust the input data and want to make absolutely sure that the output is valid UTF-8. If invalid bytes are found, they are replaced with the Unicode replacement character U+FFFD.

Node creation

    my $document = CommonMark->create_document(
        children => \@children,
    );
    my $header = CommonMark->create_heading(
        level    => $level,
        children => \@children,
        text     => $literal,
    );
    my $paragraph = CommonMark->create_paragraph(
        children => \@children,
        text     => $literal,
    );
    my $block_quote = CommonMark->create_block_quote(
        children => \@children,
    );
    my $list = CommonMark->create_list(
        type     => $type,
        delim    => $delim,
        start    => $start,
        tight    => $tight,
        children => \@children,
    );
    my $item = CommonMark->create_item(
        children => \@children,
    );
    my $code_block = CommonMark->create_code_block(
        fence_info => $fence_info,
        literal    => $literal,
    );
    my $html_block = CommonMark->create_html_block(
        literal => $literal,
    );
    my $custom_block = CommonMark->create_custom_block(
        on_enter => $raw_prefix,
        on_exit  => $raw_suffix,
        children => \@children,
        text     => $literal,
    );
    my $thematic_break = CommonMark->create_thematic_break;
    my $text = CommonMark->create_text(
        literal => $literal,
    );
    my $code = CommonMark->create_code(
        literal => $literal,
    );
    my $html_inline = CommonMark->create_html_inline(
        literal => $literal,
    );
    my $emph = CommonMark->create_emph(
        children => \@children,
        text     => $literal,
    );
    my $strong = CommonMark->create_strong(
        children => \@children,
        text     => $literal,
    );
    my $link = CommonMark->create_link(
        url      => $url,
        title    => $title,
        children => \@children,
        text     => $literal,
    );
    my $image = CommonMark->create_image(
        url      => $url,
        title    => $title,
        children => \@children,
        text     => $literal,
    );
    my $custom_inline = CommonMark->create_custom_inline(
        on_enter => $raw_prefix,
        on_exit  => $raw_suffix,
        children => \@children,
        text     => $literal,
    );
    my $softbreak = CommonMark->create_softbreak;
    my $linebreak = CommonMark->create_linebreak;

These convenience functions can be used to create nodes, set properties, and add children in a single operation. All parameters are optional.

The children parameter expects an arrayref of nodes to be added as children. The special text parameter adds a single text child with literal $literal. It can't be used together with children. All other parameters correspond to a node property.
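As an illustration, a small document can be assembled from nested calls and then rendered. This is a sketch with made-up content. It uses the text shortcut where a node has a single text child and children where inline nodes are mixed.

```perl
use strict;
use warnings;
use CommonMark;

my $built = CommonMark->create_document(
    children => [
        # Heading with a single text child, using the "text" shortcut.
        CommonMark->create_heading(
            level => 1,
            text  => 'Title',
        ),
        # Paragraph with mixed inline children.
        CommonMark->create_paragraph(
            children => [
                CommonMark->create_text(literal => 'Some '),
                CommonMark->create_strong(text => 'bold'),
                CommonMark->create_text(literal => ' text.'),
            ],
        ),
    ],
);

print $built->render_html;
```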

libcmark version information

    my $version = CommonMark->version;
    my $string  = CommonMark->version_string;
    my $version = CommonMark->compile_time_version;
    my $string  = CommonMark->compile_time_version_string;

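The version and version_string methods return the version number and version string of the libcmark library linked at run time. The compile_time_version and compile_time_version_string methods return the version that the Perl bindings were compiled against.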
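One possible use is to detect a libcmark upgrade that may call for recompiling the Perl distribution. The sketch below compares the two version strings. It also decodes the numeric version on the assumption that it follows the layout documented in cmark.h (major version in bits 16-23, minor in bits 8-15, patch level in bits 0-7).

```perl
use strict;
use warnings;
use CommonMark;

my $runtime = CommonMark->version_string;
my $built   = CommonMark->compile_time_version_string;

if ($runtime ne $built) {
    warn "CommonMark was compiled against libcmark $built "
       . "but is running with $runtime; consider rebuilding.\n";
}

# Decode the numeric version, assuming the bit layout from cmark.h.
my $num     = CommonMark->version;
my $decoded = join '.', ($num >> 16) & 0xFF, ($num >> 8) & 0xFF, $num & 0xFF;

print "libcmark $runtime (decoded from number: $decoded)\n";
```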

COPYRIGHT

This software is copyright (C) by Nick Wellnhofer.

This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.