SYNOPSIS

CP_reporter [options] path1 path2 ...

DESCRIPTION

CP_reporter reads the specified files and analyses their content for repeated substrings, such as those produced by copying and pasting code.

When a duplication is found, the position and range of each copy are reported, and the duplicated content is printed once.

The findings are sorted by length in bytes.
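The behaviour described above can be illustrated with a minimal Python sketch. It is not CP_reporter's actual algorithm: it only compares fixed-size windows of min_lines consecutive lines, the function name find_duplicates is hypothetical, and sorting longest-first is an assumption (the text above only says findings are sorted by length in bytes).

```python
from collections import defaultdict

def find_duplicates(text, min_lines=5):
    """Return [(block, [start_line, ...]), ...] for every run of
    min_lines consecutive lines that occurs more than once."""
    lines = text.splitlines(keepends=True)
    seen = defaultdict(list)
    # Record the 1-based start line of every window of min_lines lines.
    for start in range(len(lines) - min_lines + 1):
        block = "".join(lines[start:start + min_lines])
        seen[block].append(start + 1)
    # Keep only blocks seen at least twice; each block is stored once.
    dups = [(block, starts) for block, starts in seen.items() if len(starts) > 1]
    # Assumption: longest first, measured in bytes.
    dups.sort(key=lambda item: len(item[0].encode("utf-8")), reverse=True)
    return dups
```

Each finding carries the duplicated content once together with the start line of every copy, which mirrors the report layout described above. A real tool would also extend matches beyond the minimum window and handle overlapping copies.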

The arguments are paths which can be directory paths or file paths or glob expressions.
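As a rough illustration of how such mixed arguments could be resolved into a file list, here is a Python sketch. The helper name expand_paths is hypothetical, and walking directory arguments recursively is an assumption; how CP_reporter itself expands its arguments is not specified here.

```python
import glob
import os

def expand_paths(args):
    """Resolve directory paths, file paths and glob expressions to files."""
    files = []
    for arg in args:
        if os.path.isdir(arg):
            # Assumption: a directory argument is walked recursively.
            for root, _dirs, names in os.walk(arg):
                files.extend(os.path.join(root, name) for name in names)
        elif os.path.isfile(arg):
            files.append(arg)
        else:
            # Anything else is treated as a glob expression.
            files.extend(p for p in glob.glob(arg) if os.path.isfile(p))
    # Drop files reached through more than one argument.
    return sorted(set(files))
```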

OPTIONS

--minimum AMOUNT (default is set to 5 lines)

Sets the minimum extent a duplication must have to be reported. Two units are supported: a positive number is a count of lines, while a negative number specifies a count of bytes.
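The sign convention can be sketched in Python as follows. The function names are hypothetical, rejecting zero is an assumption, and so is using the absolute value of a negative AMOUNT as the byte threshold.

```python
def parse_minimum(amount):
    """Map an AMOUNT value to a (unit, threshold) pair."""
    if amount > 0:
        return ("lines", amount)
    if amount < 0:
        # Assumption: the byte threshold is the absolute value.
        return ("bytes", -amount)
    raise ValueError("AMOUNT must be non-zero")  # assumption: zero is invalid

def meets_minimum(block, amount):
    """Return True if a duplicated block is large enough to report."""
    unit, threshold = parse_minimum(amount)
    if unit == "lines":
        size = len(block.splitlines())
    else:
        size = len(block.encode("utf-8"))
    return size >= threshold
```

Under these assumptions, --minimum 5 would require at least five lines, while --minimum -200 would require at least 200 bytes.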

--dir DIR (default is set to 'lib')

Specifies the directory that is scanned recursively for files.

--ignore PATTERN (no default, may be given multiple times)

Suppresses the report of all duplications that match the ignore filter.

For example, /\A(^#.+$)+\z/ would suppress a duplication in which every line is a comment.
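A comment-only filter of this kind might be sketched in Python as below. The pattern is an adaptation in Python regex syntax that consumes the line breaks explicitly, not the Perl pattern above verbatim, and the names COMMENT_ONLY and apply_ignore are hypothetical. How CP_reporter itself applies the pattern is not specified here.

```python
import re

# Matches text in which every line starts with '#'.
# Each repetition consumes one whole line, including its newline if present.
COMMENT_ONLY = re.compile(r"\A(?:#[^\n]*(?:\n|\Z))+\Z")

def apply_ignore(duplicates, patterns):
    """Drop every duplicated block matched by at least one ignore pattern."""
    return [
        block for block in duplicates
        if not any(pattern.search(block) for pattern in patterns)
    ]
```

Because the option may be given multiple times, the sketch takes a list of patterns and suppresses a block as soon as any one of them matches.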

--terse

Suppresses the output of the content of a duplication. Only the positions of duplications are printed.

BUGS & LIMITATIONS

No rc file yet.

Limited to 8-bit characters for now.

No text preprocessing yet (e.g. whitespace shrinking or comment elimination).

No real parsing of Perl code yet.

ACKNOWLEDGEMENTS

This program was partly inspired by Ovid's find_duplicate_code.pl.