Name

Text::Markup::CommonMark - CommonMark Markdown parser for Text::Markup

Synopsis

  use Text::Markup::CommonMark;
  my $html = Text::Markup->new->parse(file => 'README.md');
  my $raw  = Text::Markup->new->parse(
      file    => 'README.md',
      options => [ raw => 1 ],
  );

Description

This is the CommonMark parser for Text::Markup. On load, it replaces the default Text::Markup::Markdown parser for parsing Markdown. Text::Markup does not load this module by default, but once you load it manually, it becomes the preferred Markdown parser.

Text::Markup::CommonMark reads in the file (relying on a BOM), hands it off to CommonMark for parsing, and then returns the generated HTML as an encoded UTF-8 string with an http-equiv="Content-Type" element identifying the encoding as UTF-8.
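Because the result is a string of UTF-8 encoded bytes rather than Perl characters, code that wants to work with the text itself would typically decode it first. The following is a minimal sketch of that step, not part of the module's documentation. The sample file and its content are made up for illustration, the file is written as UTF-8 without a BOM on the assumption that such input is handled, and the load is guarded so the script still runs where the modules are not installed.

```perl
use strict;
use warnings;
use Encode qw(decode);
use File::Temp qw(tempfile);

# Illustrative input file containing one non-ASCII character, written as UTF-8.
my ($fh, $file) = tempfile(SUFFIX => '.md', UNLINK => 1);
binmode $fh, ':encoding(UTF-8)';
print {$fh} "Caf\x{e9} *menu*\n";
close $fh or die "Cannot close $file: $!";

my ($bytes, $chars);

# Guard the load so the sketch degrades gracefully without the modules.
if (eval { require Text::Markup::CommonMark; Text::Markup::CommonMark->import; 1 }) {
    $bytes = Text::Markup->new->parse(file => $file);  # UTF-8 encoded bytes
    $chars = decode('UTF-8', $bytes);                  # Perl character string
    printf "%d bytes, %d characters\n", length $bytes, length $chars;
}
else {
    warn "Text::Markup::CommonMark is not available; skipping\n";
}
```

When writing the decoded string back out, encode it again or set an encoding layer on the output handle.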

It recognizes files with the following extensions as CommonMark Markdown:

.md
.mkd
.mkdn
.mdown
.markdown

To change the files it recognizes, load this module directly and pass a regular expression matching the desired extension(s), like so:

  use Text::Markup::CommonMark qr{markd?};
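To get a feel for which file names a custom pattern would pick up, the sketch below mimics the matching outside of Text::Markup. It assumes the pattern is applied to the text after the final dot and anchored at the end of the name. That anchoring and the helper name matches_extension are assumptions made for illustration, not something this documentation states.

```perl
use strict;
use warnings;

# Hypothetical helper: true when $file ends in a dot followed by text that
# matches $ext_re. Anchoring at the final dot and end of name is an assumption.
sub matches_extension {
    my ($file, $ext_re) = @_;
    return $file =~ /[.]$ext_re\z/ ? 1 : 0;
}

my $custom = qr{markd?};

for my $file (qw(README.mark notes.markd README.md guide.markdown)) {
    printf "%-15s %s\n", $file,
        matches_extension($file, $custom) ? 'matches' : 'does not match';
}
```

Under that assumption, a pattern such as qr{markd?} covers .mark and .markd only, so a custom pattern should list every extension you still want recognized.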

Normally this module returns the output wrapped in a minimal HTML document skeleton. If you would like the raw output without the skeleton, you can pass the raw option to parse.

In addition, Text::Markup::CommonMark supports all of the CommonMark parse options and render options, including:

smart

When true, convert straight quotes to curly, --- to em dashes, -- to en dashes. Enabled by default.

sourcepos

When true, include a data-sourcepos attribute on all block elements. Disabled by default.

hardbreaks

When true, render soft-break elements as hard line breaks. Disabled by default.

nobreaks

When true, render soft-break elements as spaces. Disabled by default.

validate_utf8

When true, validate UTF-8 in the input before parsing, replacing illegal sequences with the replacement character U+FFFD. Disabled by default.

unsafe

When true, render raw HTML and unsafe links (javascript:, vbscript:, file:, and data:, except for image/png, image/gif, image/jpeg, or image/webp MIME types). When false, raw HTML is replaced by a placeholder HTML comment and unsafe links are replaced by empty strings. Enabled by default.
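As a rough illustration of combining several of these options, the sketch below passes them as key/value pairs in the options array reference, in the same way the Synopsis passes raw. The sample file and its content are made up for illustration, and the load is guarded so the script still runs where the modules are not installed.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Illustrative input: a heading and a paragraph containing a soft line break.
my ($fh, $file) = tempfile(SUFFIX => '.md', UNLINK => 1);
print {$fh} "# Title\n\nFirst line\nsecond line\n";
close $fh or die "Cannot close $file: $!";

my $html;

# Guard the load so the sketch degrades gracefully without the modules.
if (eval { require Text::Markup::CommonMark; Text::Markup::CommonMark->import; 1 }) {
    $html = Text::Markup->new->parse(
        file    => $file,
        options => [
            raw        => 1,   # skip the HTML document skeleton
            sourcepos  => 1,   # add data-sourcepos attributes to block elements
            hardbreaks => 1,   # render soft breaks as hard line breaks
        ],
    );
    print $html;
}
else {
    warn "Text::Markup::CommonMark is not available; skipping\n";
}
```

Options left out of the list keep the defaults described above.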

Author

David E. Wheeler <david@justatheory.com>

Copyright and License

Copyright (c) 2011-2024 David E. Wheeler. Some Rights Reserved.

This module is free software; you can redistribute it and/or modify it under the same terms as Perl itself.