NAME

srtalign - align movie subtitles based on time overlaps

USAGE

 srtalign [OPTIONS] source-file.xml target-file.xml > aligned.xml

OPTIONS

 -S source-lang . source language ID
 -T target-lang . target language ID
 -c score ....... use cognates with LCSR>=score
 -r score-range . use cognates in a certain range 1..score and take best
 -l length ...... set minimal length of cognates (if used)
 -i len ......... use identical strings with length>=len
 -w size ........ set size for sliding window
 -d dic ......... use dictionary in file 'dic'
 -u ............. cognates/identicals that start with upper case only
 -r char_set .... define a set of characters to be used for matching
 -q ............. normalize length scores with (current) word frequencies
 -b ............. use "best" alignment (least empty alignments)
 -p nr .......... stop after <nr> candidates (when using -b)
 -m MAX ......... in "best" alignment: use only MAX first & MAX last
                  (default = 10; 0 = all)
 -f uplug-conf .. use fallback aligner if necessary
 -P ............. use proportion of non-empty alignments as scoring function
 -v ............. verbose output
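
The -c option thresholds cognate candidates on LCSR, the longest common subsequence ratio. As an illustration only, the following Python sketch assumes the common definition (length of the longest common subsequence divided by the length of the longer string); the function names are hypothetical, and the script's own implementation may normalize or preprocess strings differently.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of strings a and b."""
    prev = [0] * (len(b) + 1)          # LCS lengths for the previous row
    for ch_a in a:
        curr = [0]                     # column 0: empty prefix of b
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def lcsr(a, b):
    """LCS ratio, assuming normalization by the longer string's length."""
    if not a or not b:
        return 0.0
    return lcs_length(a, b) / max(len(a), len(b))


def is_cognate(a, b, min_score, min_length=1):
    """Hypothetical filter mirroring the roles of -c (score) and -l (length)."""
    if min(len(a), len(b)) < min_length:
        return False
    return lcsr(a, b) >= min_score
```

Under this definition, "London" and "Londres" share the subsequence "Lond" (4 characters) and the longer string has 7 characters, so the pair passes a threshold of 0.5 but not one of 0.8.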

The aligner uses an installed dictionary if both the source language (-S) and the target language (-T) are given and a dictionary for that language pair is installed on the system (in the shared directory of the Text::SRT::Align package). If a dictionary is found, the aligner also switches to best-alignment mode (normally enabled with -b).

Cognates and identical strings are used to set the time ratio and the time offset. They define reference points (anchor points) that are used to compute the time scaling factor and the time offset between source and target subtitles. The script looks for these anchor points at the beginning and at the end of each subtitle file; the size of the windows defines how far from the start and the end it looks. The similarity score is normalized by the distance from the start or the end. Only two points are used: the best-scoring one from the beginning and the best-scoring one from the end.
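
Two anchor points are enough to determine a linear mapping between the two timelines. The Python sketch below shows the arithmetic under stated assumptions: times are in seconds, each anchor is a (source time, target time) pair, and target times are mapped onto the source timeline. The function names and the mapping direction are illustrative and not taken from the script.

```python
def fit_time_mapping(anchor_start, anchor_end):
    """Return (ratio, offset) such that source ~= ratio * target + offset.

    Each anchor is a (source_time, target_time) pair in seconds.
    """
    (s1, t1), (s2, t2) = anchor_start, anchor_end
    if t2 == t1:
        raise ValueError("anchor points must have distinct target times")
    ratio = (s2 - s1) / (t2 - t1)   # time scaling factor
    offset = s1 - ratio * t1        # time offset
    return ratio, offset


def rescale(target_time, ratio, offset):
    """Map a target-side time onto the source timeline."""
    return ratio * target_time + offset


# Example: one anchor near the start and one near the end of the film.
ratio, offset = fit_time_mapping((12.0, 10.0), (112.0, 90.0))
```

With these example anchors the ratio is 1.25 and the offset is -0.5, so a target time of 50.0 seconds maps to 62.0 seconds on the source timeline. Choosing anchors far apart (one near the start, one near the end) keeps the estimate of the ratio less sensitive to small timing errors at either anchor.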

AUTHOR

Jörg Tiedemann, https://bitbucket.org/tiedemann

BUGS AND SUPPORT

Please report any bugs or feature requests to https://bitbucket.org/tiedemann/subalign.

SEE ALSO

More information can be found in Text::SRT::Align.

LICENSE AND COPYRIGHT

Copyright 2013 Jörg Tiedemann.

This program is free software: you can redistribute it and/or modify it under the terms of the GNU Lesser General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License for more details.

You should have received a copy of the GNU Lesser General Public License along with this program. If not, see http://www.gnu.org/licenses/.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.