subst - Greple module for text search and substitution


Version 2.3301


greple -Msubst --dict dictionary [ options ]

    --dict      dictionary file
    --dictdata  dictionary data


  File Update:
    --diffcmd command


This greple module supports checking and substituting text in files based on dictionary data.

The dictionary file is given by the --dict option. Each line contains a pair: a matching pattern and the expected string.

    greple -Msubst --dict DICT

If the dictionary file contains the following data:

    colou?r      color
    cent(er|re)  center

the above command finds strings that match the first pattern but differ from the second string, that is, "colour" and "centre" in this case.
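The check described above can be pictured with a minimal Python sketch. It is an illustration only, not the module's implementation: the `DICT` list and the `find_unexpected` helper are hypothetical names, and Python's `re` stands in for Perl regular expressions.

```python
import re

# Hypothetical in-memory dictionary: (pattern, expected string) pairs.
DICT = [
    (r"colou?r", "color"),
    (r"cent(er|re)", "center"),
]

def find_unexpected(text, entries=DICT):
    """Return (matched, expected) pairs where the matched string
    differs from the expected string."""
    found = []
    for pattern, expected in entries:
        for m in re.finditer(pattern, text):
            if m.group(0) != expected:
                found.append((m.group(0), expected))
    return found

sample = "The colour of the centre differs from the color of the center."
print(find_unexpected(sample))
```

Matches that already equal the expected string ("color", "center") are skipped, so only the variants are reported.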

A // field in dictionary data is ignored, so this file can also be written like this:

    colou?r      //  color
    cent(er|re)  //  center

You can use the same file with greple's -f option. In that case, the string after // is ignored as a comment.

    greple -f DICT ...

Option --dictdata can be used to provide dictionary data on the command line.

    greple -Msubst --dictdata $'colou?r color\ncent(er|re) center\n'

A dictionary entry starting with a sharp sign (#) is treated as a comment and ignored.
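As a rough illustration of the dictionary format described so far (whitespace-separated pattern and expected string, an optional // field, and # comments), here is a small Python sketch. The `parse_dict` helper is hypothetical and simplified; the module's actual parser may treat edge cases differently.

```python
import re

def parse_dict(data):
    """Parse dictionary text into (pattern, expected) pairs.

    Assumptions of this sketch: fields are separated by whitespace,
    a leading "//" field before the expected string is dropped, and
    lines starting with "#" or empty lines are skipped.
    """
    entries = []
    for line in data.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)
        pattern = parts[0]
        rest = parts[1].strip() if len(parts) > 1 else ""
        # Drop the "//" field if it is present.
        rest = re.sub(r"^//(\s+|$)", "", rest)
        entries.append((pattern, rest or None))
    return entries

data = "# British spellings\ncolou?r      //  color\ncent(er|re)  center\n"
print(parse_dict(data))
```

An entry with no expected string is returned with `None` in the second position.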

Overlapped pattern

When a matched string is the same as, or shorter than, a string previously matched by another pattern, it is simply ignored (--no-warn-include by default). So, if you have to declare conflicting patterns, place the longer pattern earlier.

If a matched string overlaps with a previously matched string, a warning is issued (--warn-overlap by default) and the match is ignored.
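The two rules above can be sketched in Python as follows. This is a simplified model under stated assumptions, not the module's code: `collect_matches` is a hypothetical helper, patterns are tried in dictionary order, "included" is taken to mean that the new span lies inside an already accepted span, and warnings are collected in a list rather than printed.

```python
import re

def collect_matches(text, patterns):
    """Return (accepted, warnings) following the include/overlap rules."""
    accepted = []   # (start, end, pattern) of matches kept so far
    warnings = []
    for pattern in patterns:
        for m in re.finditer(pattern, text):
            s, e = m.span()
            included = any(a <= s and e <= b for a, b, _ in accepted)
            overlapped = any(s < b and a < e for a, b, _ in accepted)
            if included:
                continue    # silently ignored, like --no-warn-include
            if overlapped:
                warnings.append(f"overlap: {m.group(0)!r} by {pattern!r}")
                continue    # warned and ignored, like --warn-overlap
            accepted.append((s, e, pattern))
    return accepted, warnings

accepted, warnings = collect_matches(
    "the data centre",
    [r"data cent(er|re)", r"cent(er|re)", r"the data"],
)
print(accepted)
print(warnings)
```

Here "centre" is included in the earlier, longer match and is dropped silently, while "the data" only partly overlaps it and produces a warning. This is why the longer pattern should come first.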

Terminal color

This version uses the Getopt::EX::termcolor module. It sets the option --light-screen or --dark-screen depending on the terminal on which the command runs, or on the TERM_BGCOLOR environment variable.

Some terminals (e.g. "Apple_Terminal" or "iTerm") are detected automatically and no action is required. Otherwise, set the TERM_BGCOLOR environment variable to a value between #000000 (black) and #FFFFFF (white), according to the terminal background color.



Option --dict specifies the dictionary file.


Option --dictdata specifies the dictionary data as text.


Option --check takes one of the following arguments: ng, ok, any, outstand, all, or none.

With the default value outstand, the command shows information about both expected and unexpected words, but only when an unexpected word was found in the same file.

With the value ng, the command shows information about unexpected words. With the value ok, it shows information about expected words. With the value any, it shows both.

The values all and none make sense only when used with the --stat option, and display information about patterns that never matched.


Option --select selects the Nth entry from the dictionary. The argument is interpreted by the Getopt::EX::Numbers module. A range can be given like --select=1:3,7:9. You can get the entry numbers with the --stat option.
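To make the range notation concrete, here is a deliberately simplified Python sketch that expands a specification such as 1:3,7:9 into entry numbers. The `parse_select` helper is hypothetical; it handles only single numbers and start:end ranges, whereas Getopt::EX::Numbers accepts a richer syntax.

```python
def parse_select(spec):
    """Expand a spec like "1:3,7:9" into a list of entry numbers.

    Simplified assumption: each comma-separated part is either "N"
    or "START:END", with both ends inclusive.
    """
    numbers = []
    for part in spec.split(","):
        start, _, end = part.partition(":")
        lo = int(start)
        hi = int(end) if end else lo
        numbers.extend(range(lo, hi + 1))
    return numbers

print(parse_select("1:3,7:9"))
```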


If the target data is folded in the middle of text, use the --linefold option. It creates regex patterns that match strings spread across lines. The substituted text does not include the newline, though. Because it complicates regex behavior somewhat, avoid using it if possible.
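One way to picture a pattern that matches a string spread across lines is to allow an optional newline between characters. The Python sketch below does this for a literal word only. It is an assumption made for illustration, not a description of how --linefold builds its patterns, and `linefold_pattern` is a hypothetical helper.

```python
import re

def linefold_pattern(word):
    """Build a regex for a literal word that tolerates a newline
    between any two adjacent characters (illustrative only)."""
    return r"\n?".join(re.escape(ch) for ch in word)

pattern = linefold_pattern("colour")
folded = "the colo\nur red"
# The replacement string contains no newline, so the fold is lost.
print(re.sub(pattern, "color", folded))
```

The result shows the caveat mentioned above: the replaced text no longer contains the original line break.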


Option --stat prints statistical information. It works with the --check option.

Option --with-stat prints statistics after the normal output, while --stat prints only statistics.


Using the --stat-style=dict option with --stat and --check=any, you can get dictionary-style output for your working document.

--stat-item item=[0,1]

Specify which items are shown in the stat information. Default values are:


If you don't need to see the pattern field, use it like this:

    --stat-item match=0

Multiple parameters can be set at once:

    --stat-item match=number=0,ng=1,ok=1

Substitute unexpectedly matched strings with the expected string. Newline characters in the matched string are ignored. A pattern without a replacement string is left unchanged.


Option --warn-overlap warns about overlapped patterns. On by default.


Option --warn-include warns about included patterns. Off by default.



Option --diff produces a diff of the original and converted text.

Option --diffcmd specifies the diff command used by the --diff option. Default is "diff -u".


Create a new file and write the result to it. The suffix ".new" is appended to the original filename.


Replace the target file with the converted result. The original file is renamed to a backup name with the ".bak" suffix.


Overwrite the target file with the converted result, without a backup.


This module includes example dictionaries. They are installed in the share directory and accessed with the --exdict option.

    greple -Msubst --exdict jtca-katakana-guide-3.dict

--exdict dictionary

Use a dictionary file included in the distribution as the dictionary file.


Show dictionary directory.

--exdict jtca-katakana-guide-3.dict

Created from the following guideline document.

    外来語(カタカナ)表記ガイドライン 第3版
    Japan Technical Communicators Association

A customized version of --jtca-katakana-guide. The original dictionary is generated automatically from published data. This dictionary is customized for practical use.

--exdict jtf-style-guide-3.dict

Created from the following guideline document.

    一般社団法人 日本翻訳連盟(JTF)

A customized version of --jtf-style-guide. The original dictionary is generated automatically from published data. This dictionary is customized for practical use.

--exdict sccc2.dict

Dictionary used for "C/C++ セキュアコーディング 第2版" published in 2014.

--exdict ms-style-guide.dict

Dictionary generated from the Microsoft localization style guide.

Data is generated from this article:

A customized version of --ms-style-guide. The original dictionary is generated automatically from published data. This dictionary is customized for practical use.

An amendment dictionary can be found here. Please raise an issue or send a pull request if you have an update request.


This module was originally made to support Japanese text editing.


Japanese katakana words often have many variant spellings for the same word, so unifying them is important, but it is quite tiresome work. In the following example,

    イ[エー]ハトー?([ヴブボ]ォ?)  //  イーハトーヴォ

the pattern on the left matches all of the following words.


This module helps to detect and correct them.
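As a quick check of how such a pattern behaves, the Python sketch below tests the dictionary entry shown above against a few variant spellings and unifies them. The variant list and the `normalize` helper are illustrative choices made here, and Python's `re` is used in place of Perl regular expressions.

```python
import re

# Pattern and expected string taken from the dictionary entry above.
PATTERN = re.compile(r"イ[エー]ハトー?([ヴブボ]ォ?)")
EXPECTED = "イーハトーヴォ"

# Illustrative variants chosen for this sketch.
variants = ["イーハトーヴォ", "イーハトーボ", "イエハトブ"]
for word in variants:
    print(word, bool(PATTERN.fullmatch(word)))

def normalize(text):
    """Replace every matched variant with the expected spelling."""
    return PATTERN.sub(EXPECTED, text)

print(normalize("イエハトブの風"))
```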



    $ cpanm App::Greple::subst


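Agency for Cultural Affairs (文化庁): Japanese Language Policy and Japanese Language Education, Japanese Language Policy Information, Cabinet Notifications and Cabinet Directives, "外来語の表記" (Notation of Loanwords)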



Kazumasa Utashiro


Copyright 2017-2023 Kazumasa Utashiro.

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.