NAME

Syntax::Highlight::Engine::Kate - a port to Perl of the syntax highlight engine of the Kate text editor.

SYNOPSIS

 use Syntax::Highlight::Engine::Kate;
 
 # If you want to create a compiled executable, you may also want to do this:
 use Syntax::Highlight::Engine::Kate::All;
 
 my $hl = Syntax::Highlight::Engine::Kate->new(
    language => 'Perl',
    substitutions => {
       "<" => "&lt;",
       ">" => "&gt;",
       "&" => "&amp;",
       " " => "&nbsp;",
       "\t" => "&nbsp;&nbsp;&nbsp;",
       "\n" => "<BR>\n",
    },
    format_table => {
       Alert => ["<font color=\"#0000ff\">", "</font>"],
       BaseN => ["<font color=\"#007f00\">", "</font>"],
       BString => ["<font color=\"#c9a7ff\">", "</font>"],
       Char => ["<font color=\"#ff00ff\">", "</font>"],
       Comment => ["<font color=\"#7f7f7f\"><i>", "</i></font>"],
       DataType => ["<font color=\"#0000ff\">", "</font>"],
       DecVal => ["<font color=\"#00007f\">", "</font>"],
       Error => ["<font color=\"#ff0000\"><b><i>", "</i></b></font>"],
       Float => ["<font color=\"#00007f\">", "</font>"],
       Function => ["<font color=\"#007f00\">", "</font>"],
       IString => ["<font color=\"#ff0000\">", "</font>"],
       Keyword => ["<b>", "</b>"],
       Normal => ["", ""],
       Operator => ["<font color=\"#ffa500\">", "</font>"],
       Others => ["<font color=\"#b03060\">", "</font>"],
       RegionMarker => ["<font color=\"#96b9ff\"><i>", "</i></font>"],
       Reserved => ["<font color=\"#9b30ff\"><b>", "</b></font>"],
       String => ["<font color=\"#ff0000\">", "</font>"],
       Variable => ["<font color=\"#0000ff\"><b>", "</b></font>"],
       Warning => ["<font color=\"#0000ff\"><b><i>", "</i></b></font>"],
    },
 );
 
 # or, to use the Perl plugin directly:
 
 my $hl = Syntax::Highlight::Engine::Kate::Perl->new(
    substitutions => {
       "<" => "&lt;",
       ">" => "&gt;",
       "&" => "&amp;",
       " " => "&nbsp;",
       "\t" => "&nbsp;&nbsp;&nbsp;",
       "\n" => "<BR>\n",
    },
    format_table => {
       Alert => ["<font color=\"#0000ff\">", "</font>"],
       BaseN => ["<font color=\"#007f00\">", "</font>"],
       BString => ["<font color=\"#c9a7ff\">", "</font>"],
       Char => ["<font color=\"#ff00ff\">", "</font>"],
       Comment => ["<font color=\"#7f7f7f\"><i>", "</i></font>"],
       DataType => ["<font color=\"#0000ff\">", "</font>"],
       DecVal => ["<font color=\"#00007f\">", "</font>"],
       Error => ["<font color=\"#ff0000\"><b><i>", "</i></b></font>"],
       Float => ["<font color=\"#00007f\">", "</font>"],
       Function => ["<font color=\"#007f00\">", "</font>"],
       IString => ["<font color=\"#ff0000\">", "</font>"],
       Keyword => ["<b>", "</b>"],
       Normal => ["", ""],
       Operator => ["<font color=\"#ffa500\">", "</font>"],
       Others => ["<font color=\"#b03060\">", "</font>"],
       RegionMarker => ["<font color=\"#96b9ff\"><i>", "</i></font>"],
       Reserved => ["<font color=\"#9b30ff\"><b>", "</b></font>"],
       String => ["<font color=\"#ff0000\">", "</font>"],
       Variable => ["<font color=\"#0000ff\"><b>", "</b></font>"],
       Warning => ["<font color=\"#0000ff\"><b><i>", "</i></b></font>"],
    },
 );
 
 
 print "<html>\n<head>\n</head>\n<body>\n";
 while (my $in = <>) {
    print $hl->highlightText($in);
 }
 print "</body>\n</html>\n";

DESCRIPTION

Syntax::Highlight::Engine::Kate is a port to Perl of the syntax highlight engine of the Kate text editor.

The language XML files of Kate have been converted to Perl modules using a script. These modules function as plugins to this module.

Syntax::Highlight::Engine::Kate inherits Syntax::Highlight::Engine::Kate::Template.

OPTIONS

language

Specify the language you want highlighted. Look in the PLUGINS section for supported languages.

plugins

If you have created your own language plugins, you may specify a list of them with this option.

 plugins => [
   ["MyModuleName", "MyLanguageName", "*.ext1;*.ext2", "Section"],
   ....
 ]

format_table

This option must be specified if the highlightText method is to do anything useful for you. All keys mentioned in the synopsis must be specified.
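Because every key shown in the synopsis has to be present, a small check before constructing the highlighter can catch omissions early. The following is a minimal sketch: the helper name missing_format_keys is hypothetical and not part of the module, and the key list is simply copied from the synopsis.

```perl
use strict;
use warnings;

# Attribute names copied from the format_table in the synopsis.
my @REQUIRED_KEYS = qw(
    Alert BaseN BString Char Comment DataType DecVal Error Float Function
    IString Keyword Normal Operator Others RegionMarker Reserved String
    Variable Warning
);

# Hypothetical helper (not part of the module): returns the names of the
# required keys that are absent from the given format_table hash reference.
sub missing_format_keys {
    my ($table) = @_;
    return grep { !exists $table->{$_} } @REQUIRED_KEYS;
}

# A deliberately incomplete table, for illustration only.
my %format_table = (
    Keyword => ['<b>', '</b>'],
    Normal  => ['', ''],
);

my @missing = missing_format_keys(\%format_table);
print "Missing format_table keys: @missing\n" if @missing;
```

Running such a check on the table you intend to pass as format_table gives a readable list of what still needs a definition.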

substitutions

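With this option you can specify replacements for individual characters in the highlighted text. As in the synopsis, it is a hash that maps a character to the string that should replace it in the output, for example "<" to "&lt;" when generating HTML.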
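To illustrate how substitutions and format_table fit together, the sketch below replaces characters according to a substitutions hash and then wraps the result in the opening and closing strings from a format_table. It is a simplified model written for this document, not the module's actual code; the function name render_fragment and the idea that text arrives as (text, attribute) pairs are assumptions made for the illustration.

```perl
use strict;
use warnings;

my %substitutions = ('<' => '&lt;', '>' => '&gt;', '&' => '&amp;');
my %format_table  = (
    Keyword => ['<b>', '</b>'],
    Normal  => ['', ''],
);

# Hypothetical helper: render one fragment of text tagged with an attribute.
sub render_fragment {
    my ($text, $attribute) = @_;

    # Replace each character that has an entry in %substitutions.
    my $escaped = join '', map {
        exists $substitutions{$_} ? $substitutions{$_} : $_
    } split //, $text;

    # Wrap in the [open, close] pair for the attribute; fall back to Normal.
    my $pair = $format_table{$attribute} || $format_table{Normal};
    return $pair->[0] . $escaped . $pair->[1];
}

my $html = render_fragment('if', 'Keyword')
         . render_fragment(' ($a < $b)', 'Normal');
print "$html\n";
```

The same two hashes could just as well hold terminal escape sequences or LaTeX commands, which is why the options are kept separate from any particular output format.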

METHODS

extensions

Returns a reference to the extensions hash.

language(?$language?)

Sets and returns the language that is currently highlighted. Setting the language also performs a reset.

languageAutoSet($filename);

Suggests a language name for the given file $filename.

languageList

Returns a list of languages for which plugins have been defined.

languagePlug($language);

Returns the module name of the plugin for $language.

languagePropose($filename);

Suggests a language name for the given file $filename.

sections

Returns a reference to the sections hash.
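The methods above can be combined to pick a plugin from a file name. This sketch uses only the constructor option and methods documented here. The file name example.pl is an arbitrary illustration, and the check for an undefined return value from languagePropose is an assumption about what happens when no plugin matches.

```perl
use strict;
use warnings;
use Syntax::Highlight::Engine::Kate;

my $hl = Syntax::Highlight::Engine::Kate->new(language => 'Perl');

# Ask the engine which language it would suggest for a file name.
my $file     = 'example.pl';    # arbitrary file name for illustration
my $proposed = $hl->languagePropose($file);

if (defined $proposed) {
    # Setting the language also resets the highlighter.
    $hl->language($proposed);
    print "Using '$proposed' via plugin ", $hl->languagePlug($proposed), "\n";
}

# List every language for which a plugin has been defined.
my @languages = $hl->languageList;
print "$_\n" for sort @languages;
```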

ATTRIBUTES

In the Kate XML syntax files, the <itemDatas> section contains entries like <itemData name="Unknown Property" defStyleNum="dsError" italic="1"/>. Kate is an editor, so it is fine for it to have definitions for foreground and background colors and so on. However, since this module is supposed to be a more universal highlight engine, the attributes need to be fully abstract, and for that purpose Kate does not define enough default attributes to fulfill all needs.

Kate defines the following standard attributes: dsNormal, dsKeyword, dsDataType, dsDecVal, dsBaseN, dsFloat, dsChar, dsString, dsComment, dsOthers, dsAlert, dsFunction, dsRegionMarker, dsError. This module leaves out the "ds" part and uses the following additional attributes: BString, IString, Operator, Reserved, Variable.

I have modified the XML files so that each highlight mode would get its own attribute. In quite a few cases there were still not enough attributes defined, so in some languages different modes share the same attribute.
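As a small illustration of the naming rule, the sketch below takes the itemData entry quoted above, extracts its defStyleNum with a regular expression, and strips the "ds" prefix to obtain the attribute name used by this module. The regular expression is a simplification for this one line and is not how the plugins were actually generated.

```perl
use strict;
use warnings;

# The example entry from a Kate XML syntax file.
my $item = '<itemData name="Unknown Property" defStyleNum="dsError" italic="1"/>';

# Capture the style name without its "ds" prefix.
my $attribute;
if ($item =~ /defStyleNum="ds(\w+)"/) {
    $attribute = $1;    # "dsError" becomes "Error"
}

print "$attribute\n" if defined $attribute;
```

The resulting name is the key that has to exist in format_table for that kind of text to be formatted.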

PLUGINS

Below is an overview of the existing plugins. All of them load with use and can be instantiated. The ones for which no test file is available are marked untested. Those marked OK have highlighted the test file without apparent mistakes. This does not mean that all bugs have been shaken out.

 LANGUAGE            MODULE                   COMMENT
 ********            ******                   ******
 .desktop            Desktop                  OK
 4GL                 FourGL                   untested
 4GL-PER             FourGLminusPER           untested
 AHDL                AHDL                     OK
 ANSI C89            ANSI_C89                 untested
 ASP                 ASP                      OK
 AVR Assembler       AVR_Assembler            seems to have issues
 AWK                 AWK                      OK
 Ada                 Ada                      untested
                     Alerts                   OK hidden module
 Asm6502             Asm6502                  seems to have issues
 Bash                Bash                     seems to have issues
 BibTeX              BibTex                   OK
 C                   C                        untested
 C#                  Cdash                    untested
 C++                 Cplusplus                OK
 CGiS                CGiS                     untested
 CMake               CMake                    OK
 CSS                 CSS                      OK
 CUE Sheet           CUE_Sheet                untested
 Cg                  Cg                       untested
 ChangeLog           ChangeLog                untested
 Cisco               Cisco                    untested
 Clipper             Clipper                  OK
 ColdFusion          ColdFusion               untested
 Common Lisp         Common_Lisp              OK
 Component-Pascal    ComponentminusPascal     untested
 D                   D                        untested
 Debian Changelog    Debian_Changelog         untested
 Debian Control      Debian_Control           untested
 Diff                Diff                     untested
 Doxygen             Doxygen                  OK
 E Language          E_Language               OK
 Eiffel              Eiffel                   untested
 Euphoria            Euphoria                 OK
 Fortran             Fortran                  OK
 GDL                 GDL                      untested
 GLSL                GLSL                     OK
 GNU Assembler       GNU_Assembler            untested
 GNU Gettext         GNU_Gettext              untested
 HTML                HTML                     OK
 Haskell             Haskell                  OK
 IDL                 IDL                      untested
 ILERPG              ILERPG                   untested
 INI Files           INI_Files                untested
 Inform              Inform                   untested
 Intel x86 (NASM)    Intel_X86_NASM           seems to have issues
 JSP                 JSP                      OK
 Java                Java                     OK
 JavaScript          JavaScript               OK
 Javadoc             Javadoc                  untested
 KBasic              KBasic                   untested
 LDIF                LDIF                     untested
 LPC                 LPC                      untested
 LaTeX               LaTex                    seems to have issues
 Lex/Flex            Lex_Flex                 OK
 LilyPond            LilyPond                 OK
 Literate Haskell    Literate_Haskell         OK
 Lua                 Lua                      untested
 MAB-DB              MABminusDB               untested
 MIPS Assembler      MIPS_Assembler           untested
 Makefile            Makefile                 untested
 Mason               Mason                    untested
 Matlab              Matlab                   has issues
 Modula-2            Modulaminus2             untested
 Music Publisher     Music_Publisher          untested
 Octave              Octave                   OK
 PHP (HTML)          PHP_HTML                 OK
                     PHP_PHP                  OK hidden module
 POV-Ray             POV_Ray                  OK
 Pascal              Pascal                   untested
 Perl                Perl                     OK
 PicAsm              PicAsm                   seems to have issues
 Pike                Pike                     OK
 PostScript          PostScript               OK
 Prolog              Prolog                   untested
 PureBasic           PureBasic                OK
 Python              Python                   OK
 Quake Script        Quake_Script             untested
 R Script            R_Script                 untested
 REXX                REXX                     untested
 RPM Spec            RPM_Spec                 untested
 RSI IDL             RSI_IDL                  untested
 RenderMan RIB       RenderMan_RIB            OK
 Ruby                Ruby                     has issues
 SGML                SGML                     untested
 SML                 SML                      untested
 SQL                 SQL                      untested
 SQL (MySQL)         SQL_MySQL                untested
 SQL (PostgreSQL)    SQL_PostgreSQL           untested
 Sather              Sather                   untested
 Scheme              Scheme                   OK
 Sieve               Sieve                    untested
 Spice               Spice                    seems to have issues
 Stata               Stata                    OK
 TI Basic            TI_Basic                 untested
 Tcl/Tk              TCL_Tk                   OK
 UnrealScript        UnrealScript             OK
 VHDL                VHDL                     untested
 VRML                VRML                     seems to have issues
 Velocity            Velocity                 untested
 Verilog             Verilog                  untested
 WINE Config         WINE_Config              untested
 XML                 XML                      OK
 XML (Debug)         XML_Debug                untested
 Yacc/Bison          Yacc_Bison               OK
 ferite              Ferite                   untested
 progress            Progress                 untested
 scilab              Scilab                   untested
 txt2tags            Txt2tags                 untested
 xHarbour            XHarbour                 OK
 xslt                Xslt                     untested
 yacas               Yacas                    untested

BUGS

Float is detected differently than in the Kate editor.

The regular expression engine of the Kate editor, qregexp, appears to be more tolerant of mistakes in regular expressions than Perl. This might lead to error messages and differences in behaviour. Most of the problems were sorted out during development, because error messages appeared. As far as differences in behaviour are concerned, testing is the only way to find out, so I hope the users out there will be able to tell me more.

This module is mimicking the behaviour of the syntax highlight engine of the Kate editor. If you find a bug/mistake in the highlighting, please check if Kate behaves in the same way. If yes, the cause is likely to be found there.

TO DO

Rebuild the scripts I am using to generate the modules from the XML files so that they track flaws in the XML files, such as missing lists, more proactively. Regular expressions in the XML could also be tested better before they are used in plugins.

Refine the test methods in Syntax::Highlight::Engine::Kate::Template so that choices for case sensitivity, dynamic behaviour and lookahead can be determined when the plugin is generated; this might increase throughput.

Implement code folding.

ACKNOWLEDGEMENTS

All the people who wrote Kate and the syntax highlight xml files.

AUTHOR AND COPYRIGHT

This module is written and maintained by:

Hans Jeuken < haje at toneel dot demon dot nl >

Copyright (c) 2006 by Hans Jeuken, all rights reserved.

You may freely distribute and/or modify this module under the same terms as Perl itself.

SEE ALSO

Syntax::Highlight::Engine::Kate::Template, http://www.kate-editor.org