LaTeX::Authors - Perl extension to extract authors and laboratories from a LaTeX file
Extraction from a string with latex commands:
    use LaTeX::Authors;
    use strict;

    # Single quotes keep the backslash literal.
    my $tex_string = '\documentclass...';
    my @article = router($tex_string);
    my $string_xml = string_byauthors_xml(@article);
    print $string_xml;
Extraction from a latex file:
    use LaTeX::Authors;
    use strict;

    my $file = shift;
    my $tex_string = load_file_string($file);
    my @article = router($tex_string);
    my $string_xml = string_byauthors_xml(@article);
    print $string_xml;
Extraction from a directory with latex files:
    use LaTeX::Authors;
    use strict;

    my $directory = shift;
    #my $error = un_archive($directory);
    my $file = find_tex_file($directory);
    my $tex_string = load_file_string($file);
    my @article = router($tex_string);
    my $string_xml = string_byauthors_xml(@article);
    print $string_xml;
LaTeX::Authors tries to find the authors and laboratories in a LaTeX file. The output is an XML or HTML string. This is an example of the XML output:
    <article>
      <item>
        <author>author1</author>
        <labo>lab1</labo>
        <labo>lab2</labo>
      </item>
      <item>
        ...
      </item>
    </article>
The module looks for commands such as \author and \affiliation in the file. For physics articles it also tries to find a collaboration name, which helps it handle less conventional ways of presenting the author list. It is designed especially for physics articles, which can have hundreds of authors.
It can work on input from:

- an archive file (tar, zip...), which is useful for arXiv files (function un_archive)
- a directory with LaTeX files (function find_tex_file)
- a LaTeX file (function load_file_string)
- a string (function router)
For the output it can produce:

- an XML string
  - by author: author1 lab1 lab2 (string_byauthors_xml)
  - by laboratory: lab1 author1 author2 (string_bylabs_xml)
- an HTML string
  - by author (string_byauthors_html)
  - by laboratory (string_bylabs_html)
un_archive
Takes the archive file and uncompresses it (useful for arXiv files).
my $error = un_archive($directory);
find_tex_file
my $texfile = find_tex_file($directory);
load_file_string
my $string = load_file_string($file);
Also deletes the LaTeX comments (%...).
router
@article = router($string);
found_collaboration
Useful for physics articles, where there is often a collaboration name. The format of the author list can be determined from the collaboration name. Used by the router function.
delete_comment
my $string_out = delete_comment($string_in);
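The implementation of delete_comment is not shown in this documentation. As a rough illustration of what comment removal involves, the sketch below drops everything from an unescaped % to the end of the line and leaves \% untouched. The name strip_latex_comments is hypothetical and is not part of LaTeX::Authors.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of LaTeX::Authors): remove LaTeX comments.
# A comment starts at a '%' that is not preceded by a backslash and runs
# to the end of the line. Simplification: in '\\%' (a line break followed
# by a comment) the percent sign is treated as escaped and kept.
sub strip_latex_comments {
    my ($text) = @_;
    $text =~ s/(?<!\\)%[^\n]*//g;
    return $text;
}

my $in  = "50\\% of authors % internal note\n\\author{A. Smith}\n";
my $out = strip_latex_comments($in);
# $out is "50\\% of authors \n\\author{A. Smith}\n"
```

The newline at the end of a commented line is kept, so line numbers in the cleaned string still match the original file.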
bichop
With
    my $string_in = bichop("{aaa}");
$string_in then contains:
"aaa"
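The example suggests that bichop removes the outer pair of delimiters. A minimal sketch of that behaviour for braces, assuming the input is a single braced group, might look like this. The name strip_outer_braces is hypothetical, and the real function may behave differently on other input.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of LaTeX::Authors): remove one pair of
# outer braces when the string starts with '{' and ends with '}'.
# Assumption: the input is a single braced group; "{a}{b}" is not handled.
sub strip_outer_braces {
    my ($text) = @_;
    $text =~ s/\A\{(.*)\}\z/$1/s;
    return $text;
}

my $inner = strip_outer_braces("{aaa}");
# $inner is "aaa"
```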
greplatexcom
    my @l_section = greplatexcom("section", ["title"], $string);
    for my $s (@l_section) { print $s->{title} }
Optional arguments can be described with "[name]". See this example:
    @class = greplatexcom("documentclass", [["args"], "class"], $string);
    print $class[0]->{class};
With \documentclass[xyz]{abc}
    $class[0]->{args}  = xyz
    $class[0]->{class} = abc
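To make the optional and mandatory argument handling concrete, here is a minimal sketch of how such a command could be picked apart with Text::Balanced, which is listed among the module's related software. It is an illustration under assumptions, not the module's greplatexcom: the function name find_command and the fixed keys args and value are hypothetical.

```perl
use strict;
use warnings;
use Text::Balanced qw(extract_bracketed);

# Hypothetical helper (not the module's greplatexcom): find every
# \NAME[optional]{mandatory} in $text. Returns a list of hash references
# with the keys 'args' (undef when no optional argument is present)
# and 'value' (the mandatory argument, braces removed).
sub find_command {
    my ($name, $text) = @_;
    my @found;
    # The lookahead stops \section from also matching \sectionmark.
    while ($text =~ /\\\Q$name\E(?![A-Za-z])/g) {
        my $rest = substr($text, pos($text));
        my ($opt, $arg);
        # In list context extract_bracketed returns (extracted, remainder).
        # When nothing is extracted, the remainder is the unchanged input.
        ($opt, $rest) = extract_bracketed($rest, '[]');
        ($arg, $rest) = extract_bracketed($rest, '{}');
        next unless defined $arg && length $arg;
        push @found, {
            args  => (defined $opt && length $opt) ? substr($opt, 1, -1) : undef,
            value => substr($arg, 1, -1),
        };
    }
    return @found;
}

my @class = find_command('documentclass', '\documentclass[xyz]{abc}');
# $class[0]{args} is "xyz" and $class[0]{value} is "abc"
```

Using extract_bracketed rather than a plain regular expression lets the mandatory argument contain nested braces, as in \title{The {\it big} result}.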
theenv
$abstract_string = theenv("abstract",$string);
theenv returns the contents of the environment "abstract".
For example if:
    $string = '\begin{abstract} xyz... \end{abstract}';
after calling theenv, $abstract_string contains the string:
xyz...
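A minimal sketch of this kind of environment extraction, assuming environments of the same name are not nested, could be written with a non-greedy regular expression. The name env_content is hypothetical, and unlike the example above this sketch keeps the surrounding whitespace.

```perl
use strict;
use warnings;

# Hypothetical helper (not the module's theenv): return the contents of
# the first \begin{NAME} ... \end{NAME} block, or undef if there is none.
# Assumption: environments with the same name are not nested.
sub env_content {
    my ($name, $text) = @_;
    return $text =~ /\\begin\{\Q$name\E\}(.*?)\\end\{\Q$name\E\}/s ? $1 : undef;
}

# Single quotes keep \b and \e literal instead of turning them into
# control characters.
my $string   = '\begin{abstract} xyz... \end{abstract}';
my $abstract = env_content('abstract', $string);
# $abstract is " xyz... "
```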
theenvs
@array = theenvs("sloppypar",$string);
theenvs returns the contents of all the "sloppypar" environments.
greplatexenv
@a = greplatexenv("letter",["to"],$string) ;
greplatexenv returns a list of all the occurrences of the "letter" environment, storing its first argument in the "to" field and its content in the "env" field.
newcommand
%listnewcom = newcommand($string);
If you have
    $string = '\newcommand[xyz]{abc}';
then after newcommand:
$listnewcom{xyz} = "abc";
list_index
For example with:
    my $command_name = "command";
    %list = list_index($command_name, $string);
\command[index]{xyz...} -> $list{index} = "xyz...";
Generalizes the newcommand function to any command.
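The mapping from \command[index]{xyz...} to a hash can be sketched as follows. This is only an illustration, assuming the value contains no nested braces. The name index_command is hypothetical and is not part of LaTeX::Authors.

```perl
use strict;
use warnings;

# Hypothetical helper (not the module's list_index): collect every
# \NAME[index]{value} in $text into a hash keyed by index.
# Assumption: the value contains no nested braces.
sub index_command {
    my ($name, $text) = @_;
    my %by_index;
    while ($text =~ /\\\Q$name\E\[([^\]]*)\]\{([^{}]*)\}/g) {
        $by_index{$1} = $2;
    }
    return %by_index;
}

my %list = index_command('command', '\command[a]{first} \command[b]{second}');
# $list{a} is "first" and $list{b} is "second"
```

If the same index appears twice, the later value overwrites the earlier one.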
accent
my $string_out = accent($string_in);
string_byauthors_xml
    my $string = string_byauthors_xml(@article);

    <article>
      <item>
        <author>author1</author>
        <labo>lab1</labo>
        <labo>lab2</labo>
      </item>
      <item>
        ...
      </item>
    </article>
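The internal layout of @article is not documented here. Purely as an illustration of how the by-author XML above could be produced, the sketch below assumes each entry is an array reference of the form [author, lab1, lab2, ...]. The names by_authors_xml and xml_escape are hypothetical, and whether the module escapes special characters is not stated in this documentation.

```perl
use strict;
use warnings;

# Hypothetical helper: escape the characters that are special in XML text.
sub xml_escape {
    my ($s) = @_;
    $s =~ s/&/&amp;/g;
    $s =~ s/</&lt;/g;
    $s =~ s/>/&gt;/g;
    return $s;
}

# Hypothetical helper (not the module's string_byauthors_xml).
# Assumption: each entry is [ author, lab1, lab2, ... ].
sub by_authors_xml {
    my @by_author = @_;
    my $xml = "<article>\n";
    for my $entry (@by_author) {
        my ($author, @labs) = @$entry;
        $xml .= "  <item>\n";
        $xml .= "    <author>" . xml_escape($author) . "</author>\n";
        $xml .= "    <labo>" . xml_escape($_) . "</labo>\n" for @labs;
        $xml .= "  </item>\n";
    }
    return $xml . "</article>\n";
}

my $xml = by_authors_xml([ 'author1', 'lab1', 'lab2' ]);
```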
string_onlyauthors_xml
    my $string = string_onlyauthors_xml(@article);

    <article>
      <author>author1</author>
      <author>author2</author>
      ...
    </article>
author_to_lab
my @array_lab = author_to_lab(@array_author);
(author1, lab1, lab2)(author2, lab1, lab3) -> (lab1,author1,author2)(lab2,author1)(lab3,author2)
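The inversion shown above can be sketched as follows, assuming each group is an array reference whose first element is the author and whose remaining elements are that author's laboratories. The name invert_author_labs is hypothetical, and the real data structure used by author_to_lab may differ.

```perl
use strict;
use warnings;

# Hypothetical helper (not the module's author_to_lab).
# Input:  list of [ author, lab1, lab2, ... ]
# Output: list of [ lab, author1, author2, ... ], with laboratories in
#         order of first appearance.
sub invert_author_labs {
    my @by_author = @_;
    my (%authors_of, @lab_order);
    for my $entry (@by_author) {
        my ($author, @labs) = @$entry;
        for my $lab (@labs) {
            # Remember each laboratory the first time it is seen.
            push @lab_order, $lab unless exists $authors_of{$lab};
            push @{ $authors_of{$lab} }, $author;
        }
    }
    return map { [ $_, @{ $authors_of{$_} } ] } @lab_order;
}

my @by_lab = invert_author_labs(
    [ 'author1', 'lab1', 'lab2' ],
    [ 'author2', 'lab1', 'lab3' ],
);
# @by_lab is ([lab1, author1, author2], [lab2, author1], [lab3, author2])
```

Keeping a separate @lab_order array preserves the order in which laboratories first appear, which a plain hash would not.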
string_bylabs_xml
my $xml_string = string_bylabs_xml(@article);
    <article>
      <item>
        <labo>lab1</labo>
        <author>author1</author>
        <author>author2</author>
      </item>
      <item>
        ...
      </item>
    </article>
string_onlylabs_xml
my $string = string_onlylabs_xml(@article);
    <article>
      <labo>lab1</labo>
      <labo>lab2</labo>
      ...
    </article>
string_byauthors_html
    my $string_out = string_byauthors_html(@article);

    <hr> author1 <p>
    <ul>
      <li> lab1
      <li> lab2
    </ul>
    <p>
string_bylabs_html
    <hr> lab1 <p>
    <ul>
      <li> author1
      <li> author2
    </ul>
    <p>
Christian Rossi (<rossi@in2p3.fr> and <rossi@loria.fr>)
perl, latex, Text::Balanced.