NAME
Genealogy::Occupation - Normalise and translate genealogical occupation strings
VERSION
Version 0.01
SYNOPSIS
use Genealogy::Occupation;
my $normaliser = Genealogy::Occupation->new();
my @occupations = $normaliser->normalise(
occupation => 'Ag Lab',
sex => 'M',
);
# Returns ('Agricultural Labourer')
# Or pass an arrayref
my @more = $normaliser->normalise(
occupation => ['Ag Lab', 'Ag Lab', 'Retired'],
sex => 'M',
);
# Returns ('Agricultural Labourer') - deduplicated and filtered
DESCRIPTION
Normalises occupation strings found in genealogical records, handling common abbreviations, malformed entries, locale-specific spellings and translations into French and German.
Designed to handle poor-quality data from genealogy software imports where occupation strings may be abbreviated, inconsistent or use archaic terminology.
Processing steps applied in order:
- 1. Filter out non-occupations (Scholar, Retired, Domestic Duties etc)
- 2. Normalise abbreviations and malformed entries to canonical forms
- 3. Deduplicate consecutive identical or equivalent entries (compared on pre-translation normalised forms)
- 4. Apply locale-specific spellings via
Lingua::EN::ABC - 5. Translate to French or German if system locale requires it
METHODS
new
Purpose
Constructs a new normaliser object.
API Specification
Input
{
warn_on_error => {
type => 'boolean',
optional => 1,
default => 0,
},
}
Output
{ type => 'object', isa => 'Genealogy::Occupation' }
Arguments
warn_on_error- If true, unknown occupations that cannot be translated will emit a warning viacarprather than silently falling back to English. Optional, defaults to 0.
Returns
A blessed Genealogy::Occupation object.
Side Effects
None.
Notes
The system locale is detected once at construction time and cached for the lifetime of the object.
Example
my $normaliser = Genealogy::Occupation->new({
warn_on_error => 1,
});
normalise
Purpose
Normalises one or more occupation strings, applying filtering, deduplication, abbreviation expansion, locale spelling and translation in order.
API Specification
Input
{
occupation => {
type => ['string', 'arrayref'],
},
sex => {
type => 'string',
optional => 1,
memberof => ['M', 'F'],
},
}
Output
{
type => 'arrayref',
element_type => 'string',
}
Arguments
occupation- A single occupation string or an arrayref of occupation strings. Required.sex- The sex of the person,'M'or'F'. Optional but required for correct gendered translations in French and German. Defaults to'M'if not provided when a gendered translation is needed.
Returns
An arrayref of normalised occupation strings. May be empty if all occupations were filtered out.
Side Effects
If warn_on_error was set at construction and an occupation cannot be translated, emits a warning via carp.
Notes
Deduplication operates across the full list of occupations passed in. Processing a single occupation at a time will not deduplicate across multiple calls.
Example
my $result = $normaliser->normalise(
occupation => ['Ag Lab', 'Ag Lab', 'Retired'],
sex => 'M',
);
# Returns ['Agricultural Labourer']
my $result = $normaliser->normalise(
occupation => 'Platelayer Railway',
);
# Returns ['Railway Platelayer']
AUTHOR
Nigel Horne <njh@bandsman.co.uk>
BUGS
Please report bugs via the GitHub issue tracker: https://github.com/nigelhorne/Genealogy-Occupation/issues
TODO
Expand French and German translation tables
Add support for additional languages
Add
normalise_place()equivalent for occupation place strings
SEE ALSO
LICENSE AND COPYRIGHT
Copyright 2026 Nigel Horne.
Usage is subject to GPL2 licence terms. If you use it, please let me know.