dbcolpercentile - compute percentiles or ranks for an existing numeric column
dbcolpercentile [-rplhS] [--mode MODE] [--value WEIGHT_COL] column
Compute a percentile, ranking, or weighted percentile of a column of numbers. The new column will be called percentile:d or rank:q or weighted:d depending on the mode.
Ordering is given by the specified column.
In weighted mode, by default the same column as ordering is used for weighting. Alternatively, give a different column for weighting with -v.
Non-numeric values are ignored.
If the data is pre-sorted and only a rank is requested, no extra storage is required. In all other cases, a full copy of data is buffered on disk. Output will be sorted by COLUMN.
Show percentile (default). Percentile is the fraction of values at or below the current value, relative to the total count.
Compute ranks instead of percentiles.
Compute the weighted percentile. Here values define not only the ordering but also each row's share of the total sum: the percentile is the cumulative sum of the weighting column over all rows whose ranking column is at or below the current row's, divided by the sum of the whole weighting column. If the weight column is not specified (with --mode weighted), it is the same as the ranking column.
Compute stats over all records (treat non-numeric records as zero rather than just ignoring them).
Assume data is already sorted. With one -S, we check and confirm this precondition. When repeated, we skip the check.
Give the NAME of the new column. (If no type is specified, a type will be assigned based on the mode.)
Specify a printf(3)-style format for output statistics. Defaults to %.5g.
Where to put temporary files. Uses the environment variable TMPDIR if -T is not specified. Default is /tmp.
Specify the value assigned to non-numeric rows in weighted mode.
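The weighted mode described above can be illustrated with a small Python sketch. This is not Fsdb's Perl implementation, only a reading of the definition: the function name `weighted_percentiles` is hypothetical, tied ordering values are assumed to share one percentile (as they do in the unweighted example), and non-numeric handling is left out.

```python
def weighted_percentiles(order_values, weights=None):
    """Cumulative weight at or below each ordering value, over total weight.

    Hypothetical helper: illustrates the weighted definition only.
    Assumes the total weight is non-zero.
    """
    if weights is None:
        # Default: the ordering column doubles as the weighting column.
        weights = order_values
    total = sum(weights)
    result = []
    for v in order_values:
        # Sum the weights of every row whose ordering value is <= v.
        cumulative = sum(w for o, w in zip(order_values, weights) if o <= v)
        result.append(cumulative / total)
    return result


# The test1 column from the sample data, weighted by itself.
test1 = [80, 70, 65, 90, 70, 90]
for score, pct in zip(test1, weighted_percentiles(test1)):
    print(score, format(pct, ".5g"))
```

The quadratic loop keeps the sketch short. A single pass over sorted data with a running sum gives the same numbers and is closer to what a streaming tool would do.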
This module also supports the standard fsdb options:
Enable debugging output.
Read from InputSource, typically a file name, or - for standard input, or (if in Perl) an IO::Handle, Fsdb::IO or Fsdb::BoundedQueue object.
Write to OutputDestination, typically a file name, or - for standard output, or (if in Perl) an IO::Handle, Fsdb::IO or Fsdb::BoundedQueue object.
By default, programs process automatically, but Fsdb::Filter objects in Perl do not run until you invoke the run() method. The --(no)autorun option controls that behavior within Perl.
Show help.
Show full manual.
#fsdb name id test1
a 1 80
b 2 70
c 3 65
d 4 90
e 5 70
f 6 90
cat DATA/grades.fsdb | dbcolpercentile test1
#fsdb name id test1 percentile
d 4 90 1
f 6 90 1
a 1 80 0.66667
b 2 70 0.5
e 5 70 0.5
c 3 65 0.16667
#  | dbsort -n test1
#  | dbcolpercentile test1
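The percentile values in this output follow from the definition (values at or below the current value, divided by the total count). A minimal Python sketch reproduces them. It is an illustration only, not Fsdb code, and the function name `percentiles` is hypothetical.

```python
from bisect import bisect_right


def percentiles(values):
    """Fraction of values at or below each value, relative to the total count."""
    ordered = sorted(values)
    total = len(ordered)
    # bisect_right gives the number of sorted entries <= v,
    # so tied values share the same percentile.
    return [bisect_right(ordered, v) / total for v in values]


# The test1 column from the sample input, in input order (rows a to f).
test1 = [80, 70, 65, 90, 70, 90]
for score, pct in zip(test1, percentiles(test1)):
    # ".5g" mirrors the default %.5g output format.
    print(score, format(pct, ".5g"))
```

For example, four of the six scores are at or below 80, which gives 4/6 and prints as 0.66667 under the default format.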
cat DATA/grades.fsdb | dbcolpercentile --rank test1
#fsdb name id test1 rank
d 4 90 1
f 6 90 1
a 1 80 3
b 2 70 4
e 5 70 4
c 3 65 6
#  | dbsort -n test1
#  | dbcolpercentile --rank test1
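The ranks in this output behave like competition ranking from the highest value down: tied values share a rank, and the next rank skips ahead by the number of ties. The Python sketch below reproduces that pattern. It is inferred from the example output rather than taken from the Fsdb source, and the function name `ranks` is hypothetical.

```python
from bisect import bisect_right


def ranks(values):
    """Rank 1 is the largest value; ties share a rank, later ranks skip."""
    ordered = sorted(values)
    total = len(ordered)
    # total - bisect_right(...) counts entries strictly greater than v;
    # adding 1 turns that count into a 1-based rank.
    return [total - bisect_right(ordered, v) + 1 for v in values]


# The test1 column from the sample input, in input order (rows a to f).
test1 = [80, 70, 65, 90, 70, 90]
for score, rank in zip(test1, ranks(test1)):
    print(score, rank)
```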
Fsdb. dbcolhisto.
$filter = new Fsdb::Filter::dbcolpercentile(@arguments);
Create a new dbcolpercentile object, taking command-line arguments.
$filter->set_defaults();
Internal: set up defaults.
$filter->parse_options(@ARGV);
Internal: parse command-line arguments.
$filter->setup();
Internal: setup, parse headers.
$n = $self->_determinte_total()
Interpose a filter on $self->{_in} that counts the rows (for rank or percentile) or sums the values (for weighted percentile).
$filter->run();
Internal: run over each row.
Copyright (C) 1997-2024 by John Heidemann <johnh@isi.edu>
This program is distributed under terms of the GNU general public license, version 2. See the file COPYING with the distribution for details.