bin_adaptive_snr

  %hash = bin_adaptive_snr( %options  );

Adaptively bin a data set to achieve a minimum signal to noise ratio in each bin.

This routine ignores data with bad values or with errors that have bad values.

bin_adaptive_snr groups data into bins such that each bin meets one or more conditions:

  • A minimum signal to noise ratio (S/N).

  • A minimum number of data elements (optional).

  • A maximum number of data elements (optional).

  • A maximum data width (see below) (optional).

  • A minimum data width (see below) (optional).

The data are typically dependent values (e.g. flux as a function of energy or counts as a function of radius). The data should be sorted by the independent variable (e.g. energy or radius).

Calculation of the S/N requires an estimate of the error associated with each datum. The error may be provided or may be estimated from the population using either the number of data elements in a bin (e.g. Poisson errors) or the standard deviation of the signal in a bin. If errors are provided, they may be used to weight the population standard deviation or may be added in quadrature.

Binning begins at the start of the signal vector. Data are accumulated into a bin until one or more of the possible criteria is met. If the final bin does not meet the required criteria, it may optionally be successively folded into preceding bins until the final bin passes the criteria or there are no bins left.
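The accumulation and folding steps described above can be sketched in plain Perl. This is an illustration only, not the module's implementation: greedy_snr_bins is a hypothetical helper, plain arrays stand in for piddles, errors are assumed to be positive and are added in quadrature, and only the minimum S/N criterion is applied.

```perl
use strict;
use warnings;

# Hypothetical sketch: greedy S/N binning with errors added in quadrature.
# Assumes every error is > 0. Returns a list of hashes, one per bin.
sub greedy_snr_bins {
    my ( $signal, $error, $min_snr, $fold ) = @_;
    my @bins;
    my ( $sum, $var, $first ) = ( 0, 0, 0 );

    for my $i ( 0 .. $#$signal ) {
        $sum += $signal->[$i];
        $var += $error->[$i]**2;

        # Close the bin once it reaches the requested S/N,
        # or when the data run out.
        if ( $sum / sqrt($var) >= $min_snr || $i == $#$signal ) {
            push @bins,
              { ifirst => $first, ilast => $i, signal => $sum, var => $var };
            ( $sum, $var, $first ) = ( 0, 0, $i + 1 );
        }
    }

    # Optionally fold a failing final bin into its predecessors.
    if ($fold) {
        while ( @bins > 1
            && $bins[-1]{signal} / sqrt( $bins[-1]{var} ) < $min_snr )
        {
            my $last = pop @bins;
            $bins[-1]{ilast} = $last->{ilast};
            $bins[-1]{signal} += $last->{signal};
            $bins[-1]{var}    += $last->{var};
        }
    }
    return @bins;
}

my @signal = ( 4, 4, 4, 4, 4, 4, 1 );
my @error  = (2) x @signal;
for my $fold ( 0, 1 ) {
    my @bins = greedy_snr_bins( \@signal, \@error, 3, $fold );
    printf "fold=%d: %s\n", $fold,
      join ' ', map { sprintf '[%d..%d]', $_->{ifirst}, $_->{ilast} } @bins;
}
```

With these made-up inputs the unfolded run leaves a final one-element bin with an S/N of 0.5 (elements 6..6); with folding enabled it is merged into the preceding bin, giving bins 0..2 and 3..6.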

Each datum may be assigned an extra parameter, its width, which is summed for each bin, and can be used as an additional constraint on bin membership.

Parameters

bin_adaptive_snr is passed a hash or a reference to a hash containing its parameters. The available parameters are:

signal

A piddle containing the signal data. This is required.

error

A piddle with the error for each signal datum. Optional.

width

A piddle with the width of each element of the signal. Optional.

error_algo

A string indicating how the error is to be handled or calculated. It may have one of the following values:

  • poisson

    Poisson errors will be calculated based upon the number of elements in a bin,

      error**2 = N

    Any input errors are ignored.

  • sdev

    The error is the population standard deviation of the signal in a bin.

      error**2 = Sum [ ( signal - mean ) **2 ] / ( N - 1 )

    If errors are provided, they are used to calculate the weighted population standard deviation.

      error**2 = ( Sum [ (signal/error)**2 ] / Sum [ 1/error**2 ] - mean**2 )
                 * N / ( N - 1 )

  • rss

    Errors must be provided; the errors of elements in a bin are added in quadrature.
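The formulas above can be checked with a short plain-Perl sketch for a single bin. The helper names are hypothetical, plain arrays stand in for piddles, and the weighted mean is assumed to be Sum [ signal/error**2 ] / Sum [ 1/error**2 ], which the text does not spell out. The sdev variants need at least two elements in the bin.

```perl
use strict;
use warnings;
use List::Util qw(sum);

# poisson: error**2 = N (input errors are ignored)
sub err_poisson {
    my ($signal) = @_;
    return sqrt( scalar @$signal );
}

# sdev without input errors:
#   error**2 = Sum [ ( signal - mean )**2 ] / ( N - 1 )
sub err_sdev {
    my ($signal) = @_;
    my $n    = @$signal;
    my $mean = sum(@$signal) / $n;
    return sqrt( sum( map { ( $_ - $mean )**2 } @$signal ) / ( $n - 1 ) );
}

# sdev with input errors, using weights 1/error**2
# (weighted mean definition is an assumption, see above)
sub err_sdev_weighted {
    my ( $signal, $error ) = @_;
    my $n    = @$signal;
    my @idx  = 0 .. $n - 1;
    my $sumw = sum( map { 1 / $error->[$_]**2 } @idx );
    my $mean = sum( map { $signal->[$_] / $error->[$_]**2 } @idx ) / $sumw;
    my $msq  = sum( map { ( $signal->[$_] / $error->[$_] )**2 } @idx ) / $sumw;
    return sqrt( ( $msq - $mean**2 ) * $n / ( $n - 1 ) );
}

# rss: input errors added in quadrature
sub err_rss {
    my ($error) = @_;
    return sqrt( sum( map { $_**2 } @$error ) );
}

my @signal = ( 2, 4, 4, 6 );
my @error  = ( 1, 1, 1, 1 );
printf "poisson:       %.4f\n", err_poisson( \@signal );
printf "sdev:          %.4f\n", err_sdev( \@signal );
printf "sdev weighted: %.4f\n", err_sdev_weighted( \@signal, \@error );
printf "rss:           %.4f\n", err_rss( \@error );
```

With unit errors the weighted and unweighted sdev results coincide, which is a quick consistency check on the two formulas.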

min_snr

The minimum signal to noise ratio to be achieved in each bin. Required.

min_nelem
max_nelem

The minimum and/or maximum number of elements to be achieved in each bin. Optional.

min_width
max_width

The minimum and/or maximum width of the elements to be achieved in each bin. Optional.

fold boolean

If true, the last bin may be folded into the preceding bin in order to ensure that the last bin meets one or more of the criteria. It defaults to false.

Results

bin_adaptive_snr returns a hashref with the following entries:

index

A piddle containing the bin indices for the elements in the input data piddle. Data which were skipped because of bad values will have their index set to the bad value.

nbins

A piddle containing the number of bins which spanned the range of the input data.

signal

A piddle containing the sum of the data values in each bin. Only indices 0 through nbins - 1 are valid.

nelem

A piddle containing the number of data elements in each bin. Only indices 0 through nbins - 1 are valid.

error

A piddle containing the errors in each bin, calculated using the algorithm specified via error_algo. Only indices 0 through nbins - 1 are valid.

mean

A piddle containing the weighted mean of the signal in each bin. Only indices 0 through nbins - 1 are valid.

ifirst

A piddle containing the index into the input data piddle of the first data value in a bin. Only indices 0 through nbins - 1 are valid.

ilast

A piddle containing the index into the input data piddle of the last data value in a bin. Only indices 0 through nbins - 1 are valid.
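The relationship between index, ifirst, ilast and nelem can be illustrated in plain Perl. This is a sketch under assumptions, not the module's code: summarise_index is a hypothetical helper, a plain array stands in for the index piddle, and undef stands in for the bad value assigned to skipped elements.

```perl
use strict;
use warnings;

# Hypothetical helper: recover per-bin first index, last index and
# element count from a per-element bin index array.
sub summarise_index {
    my ($index) = @_;
    my ( @ifirst, @ilast, @nelem );
    for my $i ( 0 .. $#$index ) {
        my $bin = $index->[$i];
        next unless defined $bin;    # skipped (bad) element
        $ifirst[$bin] = $i unless defined $ifirst[$bin];
        $ilast[$bin]  = $i;
        $nelem[$bin]++;
    }
    return ( \@ifirst, \@ilast, \@nelem );
}

# Seven input elements in three bins; element 3 was skipped.
my @index = ( 0, 0, 0, undef, 1, 1, 2 );
my ( $ifirst, $ilast, $nelem ) = summarise_index( \@index );
for my $bin ( 0 .. $#$nelem ) {
    printf "bin %d: first=%d last=%d nelem=%d\n",
      $bin, $ifirst->[$bin], $ilast->[$bin], $nelem->[$bin];
}
```

For this made-up index array the bins cover input elements 0..2, 4..5 and 6..6, with 3, 2 and 1 elements respectively.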

rc

A piddle containing a results code for each output bin. Only indices 0 through nbins - 1 are valid. The code is the bitwise "or" of the following constants (available in the PDLx::Bin1D namespace):

BIN_RC_OK

The bin met the minimum S/N, data element count and weight requirements.

BIN_RC_GEWMAX

The bin weight was greater than or equal to the requested maximum.

BIN_RC_GENMAX

The number of data elements was greater than or equal to the requested maximum.

BIN_RC_FOLDED

The bin is the result of folding bins at the end of the bin vector to achieve a minimum S/N.

BIN_RC_GTMINSN

The bin accumulated more data elements than were necessary to meet the S/N requirements. This results from constraints on the minimum number of data elements or bin weight.