file-to-elasticsearch.pl - A simple utility to tail a file and index each line as a document in ElasticSearch
version 0.015
To see available options, run:
file-to-elasticsearch.pl --help
Create a config file and run the utility:
file-to-elasticsearch.pl --config config.yaml --log4perl logging.conf --debug
This runs a single-threaded POE instance that tails the log files you've requested, performs the requested transformations, and sends the resulting documents to the ElasticSearch cluster and index you've specified.
The elasticsearch section of the config controls the settings passed to the POE::Component::ElasticSearch::Indexer.
---
elasticsearch:
  servers: [ "localhost:9200" ]
  flush_interval: 30
  flush_size: 1_000
  index: logstash-%Y.%m.%d
  type: log
The settings available are:

servers
  An array of servers used to send bulk data to ElasticSearch. The default is just localhost on port 9200.
flush_interval
  Every flush_interval seconds, the queued documents are sent to the Bulk API of the cluster.
flush_size
  If this many documents are received, regardless of the time since the last flush, force a flush of the queued documents to the Bulk API.
index
  A strftime-compatible string to use as the DefaultIndex parameter if a file doesn't pass one along.
type
  Mostly useless as Elastic is abandoning "types", but this will be set as the DefaultType for documents being indexed.
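How flush_interval, flush_size, and the strftime expansion of index fit together can be illustrated with a short sketch. It is written in Python and is not the Perl code that POE::Component::ElasticSearch::Indexer runs. The BulkBuffer class and its methods are hypothetical names invented for this illustration, and the real component is event-driven rather than polled through a tick call.

```python
import time
from datetime import datetime


class BulkBuffer:
    """Hypothetical stand-in for the indexer's queue; not the real implementation."""

    def __init__(self, flush_interval=30, flush_size=1000, index="logstash-%Y.%m.%d"):
        self.flush_interval = flush_interval
        self.flush_size = flush_size
        self.index = index
        self.queue = []
        self.flushed = []  # batches "sent" so far, kept only for inspection
        self.last_flush = time.monotonic()

    def index_name(self, when):
        # Expand the strftime pattern for the given timestamp.
        return when.strftime(self.index)

    def add(self, doc, now=None):
        now = time.monotonic() if now is None else now
        self.queue.append(doc)
        # Size-based flush: fires regardless of the time since the last flush.
        if len(self.queue) >= self.flush_size:
            self.flush(now)

    def tick(self, now=None):
        # Time-based flush: fires once flush_interval seconds have elapsed.
        now = time.monotonic() if now is None else now
        if self.queue and now - self.last_flush >= self.flush_interval:
            self.flush(now)

    def flush(self, now):
        # A real indexer would send the batch to the Bulk API here.
        self.flushed.append(self.queue)
        self.queue = []
        self.last_flush = now
```

Whichever condition is met first triggers the flush, so a quiet log still gets indexed within flush_interval seconds while a busy one is sent in batches of at most flush_size documents.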
The files section contains the list of files to tail and the rules to use to index them.
---
tail:
  - file: '/var/log/osquery/result.log'
    index: "osquery-result-%Y.%m.%d"
    decode: json
    extract:
      - by: split
        from: name
        when: '^pack'
        into: 'pack'
        split_on: '/'
        split_parts: [ null, "name", "report" ]
    mutate:
      prune: true
      remove: [ "calendarTime", "epoch", "counter", "_raw" ]
      rename:
        unixTime: _epoch
Each element is a hash containing the following information.

file
  Required: The path to the file on the filesystem.
decode
  This may be a single element, or an array, containing one or more of the implemented decoders.
  json
    Decode the discovered JSON in the document to a hash reference. This finds the first occurrence of a { in the string and assumes everything to the end of the string is JSON. Decoding is done by JSON::MaybeXS.
  syslog
    Parses each line as a standard UNIX syslog message. Parsing is provided by Parse::Syslog::Line, which isn't a hard requirement of this package but will be loaded if available.
index
  A strftime-compatible string to use as the index for documents created from this file. If not specified, the default from the elasticsearch section will be used, and failing that, the default specified in POE::Component::ElasticSearch::Indexer.
type
  The type to use for documents sourced from this file.
extract
  Extraction of fields from the document by one of the supported methods.
  by
    Can be 'split' or 'regex'.
    split supports:
      split_on
        Regex or string to split the string on.
      split_parts
        Name for each part of the split; undef positions in the list will be discarded.
    regex supports:
      regex
        The regex to use to extract, using capture groups to designate the parts to extract.
      regex_parts
        Name for each captured group; undef positions in the list will be discarded.
  from
    Name of the field to apply the extraction to.
  when
    Limits applying the extraction to values matching the regex.
  into
    Top-level namespace for the collected keys to wind up inside of, i.e.:
extract:
  - by: split
    from: name
    when: '^pack'
    into: 'pack'
    split_on: '/'
    split_parts: [ null, "name", "report" ]
This looks at the field name and, when its value matches ^pack, splits it on / and indexes the second element as name and the third as report, so:
name: pack/os/cpu_info
Becomes:
pack:
  name: os
  report: cpu_info
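The json decoder and the split extraction described above can be approximated as follows. This is a Python sketch of the documented behaviour, not the utility's own Perl code. The helper names decode_json_tail and extract_split are hypothetical, the sample log line is made up, and details such as skipping undecodable lines or leaving the source field in place are assumptions.

```python
import json
import re


def decode_json_tail(line):
    """Parse everything from the first '{' to the end of the line as JSON."""
    start = line.find("{")
    if start < 0:
        return None  # no JSON object found in this line
    try:
        return json.loads(line[start:])
    except ValueError:
        return None  # assumption: undecodable lines are skipped


def extract_split(doc, rule):
    """Apply one 'by: split' rule to doc and return the updated document."""
    value = doc.get(rule["from"])
    if not isinstance(value, str):
        return doc
    # 'when' limits the rule to values matching the regex.
    if "when" in rule and not re.search(rule["when"], value):
        return doc
    parts = re.split(rule["split_on"], value)
    # Pair each part with its name; None (YAML null) positions are discarded.
    found = {
        name: part
        for name, part in zip(rule["split_parts"], parts)
        if name is not None
    }
    if "into" in rule:
        # Collect the keys under the 'into' namespace.
        doc.setdefault(rule["into"], {}).update(found)
    else:
        doc.update(found)
    return doc


# The rule from the example configuration, as it would look once loaded from YAML.
rule = {
    "by": "split", "from": "name", "when": "^pack", "into": "pack",
    "split_on": "/", "split_parts": [None, "name", "report"],
}
# Made-up sample line: a text prefix followed by a JSON object.
line = 'Mar  5 12:00:00 host osqueryd: {"name": "pack/os/cpu_info", "unixTime": 1520251200}'
doc = extract_split(decode_json_tail(line), rule)
print(doc["pack"])  # {'name': 'os', 'report': 'cpu_info'}
```

Because the first split part is mapped to null, the literal "pack" prefix is dropped and only the two named parts end up under the pack namespace.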
Brad Lhotsky <brad@divisionbyzero.net>
This software is Copyright (c) 2018 by Brad Lhotsky.
This is free software, licensed under:
The (three-clause) BSD License