API reference

The following is a list of all public methods in CherenkovDeconvolution.jl.

Deconvolution methods

All deconvolution methods implement the deconvolve function.

CherenkovDeconvolution.Methods.deconvolve — Function

deconvolve(m, X_obs, X_trn, y_trn)
deconvolve(prefit(m, X_trn, y_trn), X_obs)

Deconvolve the observed features in X_obs with the deconvolution method m trained on the features X_trn and the corresponding labels y_trn.

See also: prefit.

source

CherenkovDeconvolution.Methods.prefit — Function

prefit(m, X_trn, y_trn)

Return a copy of the deconvolution method m which is already trained on the features X_trn and the corresponding labels y_trn.

See also: deconvolve.

source

CherenkovDeconvolution.Methods.DSEA — Type

DSEA(classifier; kwargs...)

The DSEA/DSEA+ deconvolution method, embedding the given classifier.

Keyword arguments

f_0 = ones(m) ./ m defines the prior, which is uniform by default.
fixweighting = true sets whether or not the weight update fix is applied. This fix is proposed in my Master's thesis and in the corresponding paper.
stepsize = DEFAULT_STEPSIZE is the step size taken in every iteration.
smoothing = NoSmoothing() is an object that optionally applies smoothing in between iterations.
K = 1 is the maximum number of iterations.
epsilon = 0.0 is the minimum symmetric Chi Square distance between iterations. If the actual distance is below this threshold, convergence is assumed and the algorithm stops.
inspect = nothing is a function (f_k::Vector, k::Int, chi2s::Float64, alpha_k::Float64) -> Any optionally called in every iteration.
return_contributions = false sets whether or not the contributions of individual examples in X_obs are returned as a tuple together with the deconvolution result.

source
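
The epsilon criterion above can be illustrated with a small, self-contained sketch. It assumes one common definition of the symmetric Chi Square distance, namely twice the sum of (a - b)^2 / (a + b) over all bins; the package's own distance may be defined differently. The names chi2s_sketch and has_converged are hypothetical and not part of CherenkovDeconvolution.jl.

```julia
# Hypothetical helper, not the package's implementation: one common
# definition of the symmetric Chi Square distance between two densities.
function chi2s_sketch(a::AbstractVector{<:Real}, b::AbstractVector{<:Real})
    total = 0.0
    for (a_i, b_i) in zip(a, b)
        s = a_i + b_i
        if s > 0 # skip bins that are empty in both densities
            total += (a_i - b_i)^2 / s
        end
    end
    return 2 * total
end

# Convergence is assumed once the distance between two consecutive
# estimates falls below the threshold epsilon.
has_converged(f_prev, f_next, epsilon) = chi2s_sketch(f_prev, f_next) < epsilon

has_converged([0.25, 0.25, 0.5], [0.25, 0.25, 0.5], 1e-6) # identical estimates
has_converged([0.5, 0.5], [0.9, 0.1], 1e-6)               # clearly different estimates
```

Under this reading, the default epsilon = 0.0 never triggers early stopping, because a distance cannot be below zero; the method then always runs for K iterations.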

CherenkovDeconvolution.Methods.IBU — Type

IBU(binning; kwargs...)

The Iterative Bayesian Unfolding deconvolution method, using a binning to discretize the observable features.

Keyword arguments

f_0 = ones(m) ./ m defines the prior, which is uniform by default.
smoothing = NoSmoothing() is an object that optionally applies smoothing in between iterations. The operation is neither applied to the initial prior, nor to the final result. The function inspect is called before the smoothing is performed.
K = 3 is the maximum number of iterations.
epsilon = 0.0 is the minimum symmetric Chi Square distance between iterations. If the actual distance is below this threshold, convergence is assumed and the algorithm stops.
stepsize = DEFAULT_STEPSIZE is the step size taken in every iteration.
inspect = nothing is a function (f_k::Vector, k::Int, chi2s::Float64, alpha_k::Float64) -> Any optionally called in every iteration.
warn = true determines whether warnings about negative values are emitted during normalization.
fit_ratios = false (discouraged) determines if ratios are fitted (i.e. R has to contain counts so that the ratio f_est / f_train is estimated) or if the probability density f_est is fitted directly.

source
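
To make the roles of f_0, K, and the step size more concrete, here is a minimal sketch of the textbook Iterative Bayesian Unfolding update. It is not the package's code: it works directly on a transfer matrix R and an observed density g, uses a constant step size alpha, and omits smoothing, inspection, and the epsilon check. The function names ibu_step_sketch and ibu_sketch are hypothetical.

```julia
# One textbook iteration of Iterative Bayesian Unfolding (a sketch).
# R[j, i] = P(observed bin j | true bin i), so each column of R sums to one;
# g is the observed density and f is the current prior.
function ibu_step_sketch(R::AbstractMatrix{<:Real}, g::AbstractVector{<:Real},
                         f::AbstractVector{<:Real})
    g_pred = R * f                     # observed density predicted by f
    ratio = [g_pred[j] > 0 ? g[j] / g_pred[j] : 0.0 for j in eachindex(g)]
    f_new = f .* (R' * ratio)          # Bayes' theorem, averaged over g
    return f_new ./ sum(f_new)         # renormalize to a density
end

# Iterate K times, moving from f towards the new estimate by alpha.
function ibu_sketch(R, g; f_0 = fill(1 / size(R, 2), size(R, 2)), K = 3, alpha = 1.0)
    f = copy(f_0)
    for k in 1:K
        p = ibu_step_sketch(R, g, f) .- f   # search direction
        f = f .+ alpha .* p
    end
    return f
end

R = [0.8 0.3; 0.2 0.7]    # columns sum to one
f_true = [0.6, 0.4]
g = R * f_true            # noise-free observation
ibu_sketch(R, g; K = 200) # approaches f_true as K grows
```

With only a few iterations the estimate stays close to the prior f_0, which is why a small K acts as a form of regularization.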

CherenkovDeconvolution.Methods.PRUN — Type

PRUN(binning; kwargs...)

A version of the Regularized Unfolding method that is constrained to positive results. Like the original version, it uses a binning to discretize the observable features.

Keyword arguments

tau = 0.0 determines the regularisation strength.
K = 100 is the maximum number of iterations.
epsilon = 1e-6 is the minimum difference in the loss function between iterations. PRUN stops when the absolute loss difference drops below epsilon.
f_0 = ones(size(R, 2)) is the starting point for the interior-point Newton optimization.
acceptance_correction = nothing is a tuple of functions (ac(d), invac(d)) representing the acceptance correction ac and its inverse operation invac for a data set d.
ac_regularisation = true decides whether acceptance correction is taken into account for regularisation. Requires acceptance_correction != nothing.
log_constant = 1/18394 is a selectable constant used in log regularisation to prevent the undefined case log(0).
inspect = nothing is a function (f_k::Vector, k::Int, ldiff::Float64) -> Any called in each iteration.
warn = true determines whether warnings about negative values are emitted during normalization.
fit_ratios = false (discouraged) determines if ratios are fitted (i.e. R has to contain counts so that the ratio f_est / f_train is estimated) or if the probability density f_est is fitted directly.

source

CherenkovDeconvolution.Methods.RUN — Type

RUN(binning; kwargs...)

The Regularized Unfolding method, using a binning to discretize the observable features.

Keyword arguments

n_df = size(R, 2) is the effective number of degrees of freedom. The default n_df results in no regularization (there is one degree of freedom for each dimension in the result).
K = 100 is the maximum number of iterations.
epsilon = 1e-6 is the minimum difference in the loss function between iterations. RUN stops when the absolute loss difference drops below epsilon.
acceptance_correction = nothing is a tuple of functions (ac(d), invac(d)) representing the acceptance correction ac and its inverse operation invac for a data set d.
ac_regularisation = true decides whether acceptance correction is taken into account for regularisation. Requires acceptance_correction != nothing.
log_constant = 1/18394 is a selectable constant used in log regularisation to prevent the undefined case log(0).
inspect = nothing is a function (f_k::Vector, k::Int, ldiff::Float64, tau::Float64) -> Any optionally called in every iteration.
warn = true determines whether warnings about negative values are emitted during normalization.
fit_ratios = false (discouraged) determines if ratios are fitted (i.e. R has to contain counts so that the ratio f_est / f_train is estimated) or if the probability density f_est is fitted directly.

source

CherenkovDeconvolution.Methods.SVD — Type

SVD(binning; kwargs...)

The SVD-based deconvolution method, using a binning to discretize the observable features.

Keyword arguments

effective_rank = -1 is a regularization parameter which defines the effective rank of the solution. This rank must be <= dim(f). Any value smaller than one turns off regularization.
N = sum(g) is the number of observations.
B = DeconvUtil.cov_Poisson(g, N) is the variance-covariance matrix of the observed bins. The default value represents the assumption that each observed bin is Poisson-distributed with rate g[i]*N.
epsilon_C = 1e-3 is a small constant to be added to each diagonal entry of the regularization matrix C. Without this constant, C could not be inverted.
fit_ratios = true determines if ratios are fitted (i.e. R has to contain counts so that the ratio f_est / f_train is estimated) or if the probability density f_est is fitted directly.
warn = true determines whether warnings about negative values are emitted during normalization.

source

Binnings

Binnings are needed by the classical (discrete) deconvolution algorithms, e.g. IBU, PRUN, RUN, and SVD.

CherenkovDeconvolution.Binnings.TreeBinning — Type

TreeBinning(J, [preprocessor]; kwargs...)

A supervised tree binning strategy with an optional preprocessor and up to J clusters.

Keyword arguments

criterion = "gini" is the splitting criterion of the tree.
seed = rand(UInt32) is the random seed for tie breaking.

source

CherenkovDeconvolution.Binnings.KMeansBinning — Type

KMeansBinning(J, [preprocessor]; seed=rand(UInt32))

An unsupervised binning strategy with an optional preprocessor and up to J clusters.

source

CherenkovDeconvolution.Binnings.ClassificationPreprocessor — Type

ClassificationPreprocessor(classifier)

The output of a classifier is used as the input of the actual Binning.

source

CherenkovDeconvolution.Binnings.DefaultPreprocessor — Type

type DefaultPreprocessor <: BinningPreprocessor

A default preprocessor that does not transform the data.

source

Smoothings

Smoothings can regularize intermediate estimates, e.g. in IBU.

CherenkovDeconvolution.Smoothings.NoSmoothing — Type

NoSmoothing()

No smoothing; return the intermediate prior as it is.

source

CherenkovDeconvolution.Smoothings.PolynomialSmoothing — Type

PolynomialSmoothing(order)

Intermediate priors are smoothed with a polynomial of the given order.

Keyword arguments

impact = 1.0 linearly interpolates between the smoothed and the actual prior if 0 < impact < 1 (default: use the smoothed version).
avg_negative = true replaces negative values with the average of neighboring bins, as proposed in [dagostini2010improved].
warn = true specifies whether warnings about negative values are emitted.

source
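
The interplay of order and impact can be sketched without the package. The following is an illustration under stated assumptions, not the implementation of PolynomialSmoothing: the polynomial is fitted by ordinary least squares over the bin indices, the result is renormalized, and the handling of negative values (avg_negative) is left out. The name polynomial_smoothing_sketch is hypothetical.

```julia
using LinearAlgebra

# Hypothetical sketch: fit a polynomial of the given order to the bin values
# by least squares and blend it with the original prior via `impact`.
function polynomial_smoothing_sketch(f::AbstractVector{<:Real}, order::Integer;
                                     impact::Real = 1.0)
    x = collect(1.0:length(f))
    V = [x_i^p for x_i in x, p in 0:order]   # Vandermonde matrix
    coefficients = V \ f                      # least-squares fit
    smoothed = V * coefficients
    blended = impact .* smoothed .+ (1 - impact) .* f
    return blended ./ sum(blended)            # renormalize (an assumption)
end

f = [0.4, 0.1, 0.4, 0.1]
polynomial_smoothing_sketch(f, 1)               # fully smoothed
polynomial_smoothing_sketch(f, 1; impact = 0.5) # halfway between f and the fit
```

A prior that already follows a polynomial of the given order is left unchanged by the fit, and impact = 0 would return the prior as it is.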

Stepsizes

Stepsizes can be used in DSEA and IBU. Combining the RunStepsize with DSEA yields the DSEA+ version of the algorithm. More information on stepsizes is given in the Manual.

CherenkovDeconvolution.OptimizedStepsizes.RunStepsize — Type

RunStepsize(binning; kwargs...)

Adapt the step size by maximizing the likelihood of the next estimate in the search direction of the current iteration, much like in the RUN deconvolution method.

Keyword arguments:

decay = false specifies whether a_{k+1} <= a_k is enforced so that step sizes never increase.
tau = 0.0 determines the regularisation strength.
warn = false specifies whether warnings should be emitted for debugging purposes.

source

CherenkovDeconvolution.OptimizedStepsizes.LsqStepsize — Type

LsqStepsize(binning; kwargs...)

Adapt the step size by solving a least squares objective in the search direction of the current iteration.

Keyword arguments:

decay = false specifies whether a_{k+1} <= a_k is enforced so that step sizes never increase.
tau = 0.0 determines the regularisation strength.
warn = false specifies whether warnings should be emitted for debugging purposes.

source

CherenkovDeconvolution.Stepsizes.ConstantStepsize — Type

ConstantStepsize(alpha)

Choose the constant step size alpha in every iteration.

source

CherenkovDeconvolution.Stepsizes.MulDecayStepsize — Type

MulDecayStepsize(eta, a=1.0)

Reduce the first stepsize a by eta in each iteration:

value(MulDecayStepsize(eta, a), k, ...) == a * k^(eta-1)

source

CherenkovDeconvolution.Stepsizes.ExpDecayStepsize — Type

ExpDecayStepsize(eta, a=1.0)

Reduce the first stepsize a by eta in each iteration:

value(ExpDecayStepsize(eta, a), k, ...) == a * eta^(k-1)

source
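
The two decay rules differ only in their closed forms, which the following sketch evaluates side by side. The functions mul_decay and exp_decay are plain stand-ins written for this illustration; they are not the package's Stepsize types and do not go through value.

```julia
# Closed forms quoted in the docstrings above, written as plain functions.
# These are stand-ins for illustration, not the package's Stepsize types.
mul_decay(eta, k; a = 1.0) = a * k^(eta - 1)
exp_decay(eta, k; a = 1.0) = a * eta^(k - 1)

for k in 1:4
    println("k = $k: mul = ", mul_decay(0.5, k), ", exp = ", exp_decay(0.5, k))
end
```

Both sequences start at a for k = 1. For 0 < eta < 1 they decrease from there, the multiplicative rule polynomially in k and the exponential rule geometrically.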

CherenkovDeconvolution.Stepsizes.DEFAULT_STEPSIZE — Constant

const DEFAULT_STEPSIZE = ConstantStepsize(1.0)

The default stepsize in all deconvolution methods.

source

DeconvUtil

The module DeconvUtil provides a rich set of user-level utility functions. The members of this module are not exported directly, so you need to name the module when using its functions.

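using CherenkovDeconvolution
fit_pdf([1, 2, 2, 3]) # WILL BREAK: fit_pdf is not exported

# solution a) qualify the function with its module
DeconvUtil.fit_pdf([1, 2, 2, 3])

# solution b) import the function explicitly
import CherenkovDeconvolution.DeconvUtil: fit_pdf
fit_pdf([1, 2, 2, 3])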

CherenkovDeconvolution.DeconvUtil.fit_pdf — Function

fit_pdf(x[, bins]; normalize=true, laplace=false)

Obtain the discrete pdf of the integer array x, optionally specifying the array of bins.

The result is normalized by default. If it is not normalized now, you can do so later by calling DeconvUtil.normalizepdf.

Laplace correction means that at least one example is assumed in every bin, so that no bin has probability zero. This feature is disabled by default.

source
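
The effect of normalize and laplace can be seen in a small re-implementation. This is a sketch for illustration only: fit_pdf_sketch is a hypothetical name, and it reads the Laplace correction as raising every empty bin to a count of one, which follows the description above but may differ from the details of DeconvUtil.fit_pdf.

```julia
# Hypothetical re-implementation for illustration; the real function is
# DeconvUtil.fit_pdf and may differ in details.
function fit_pdf_sketch(x::AbstractVector{<:Integer}, bins = sort(unique(x));
                        normalize::Bool = true, laplace::Bool = false)
    counts = [count(==(b), x) for b in bins]
    if laplace
        counts = max.(counts, 1)   # assume at least one example per bin
    end
    return normalize ? counts ./ sum(counts) : float.(counts)
end

x = [1, 1, 2, 4]
fit_pdf_sketch(x, 1:4)                  # bin 3 gets probability zero
fit_pdf_sketch(x, 1:4; laplace = true)  # bin 3 gets a non-zero probability
```

Passing the bins explicitly matters here: without them, the empty bin 3 would not appear in the result at all.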

CherenkovDeconvolution.DeconvUtil.fit_R — Function

fit_R(y, x; bins_y, bins_x, normalize=true)

Estimate the detector response matrix R, which empirically captures the transfer from the integer array y to the integer array x.

R is normalized by default so that fit_pdf(x) == R * fit_pdf(y). If R is not normalized now, you can do so later by calling DeconvUtil.normalizetransfer(R).

source
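
The normalization property fit_pdf(x) == R * fit_pdf(y) can be checked on a toy data set. The sketch below counts co-occurrences of label bins and feature bins and normalizes each column. It is an assumption about how such an estimate can be computed, not the code of DeconvUtil.fit_R; the names fit_R_sketch and empirical_pdf are hypothetical.

```julia
# Hypothetical sketch of a response matrix estimate: R[i, j] is the fraction
# of examples with label bin j that were observed in feature bin i.
function fit_R_sketch(y::AbstractVector{<:Integer}, x::AbstractVector{<:Integer};
                      bins_y = sort(unique(y)), bins_x = sort(unique(x)),
                      normalize::Bool = true)
    R = zeros(length(bins_x), length(bins_y))
    for (x_n, y_n) in zip(x, y)
        i = findfirst(==(x_n), bins_x)
        j = findfirst(==(y_n), bins_y)
        R[i, j] += 1
    end
    return normalize ? R ./ sum(R, dims = 1) : R   # column-wise normalization
end

# Relative frequency of each bin in v.
empirical_pdf(v, bins) = [count(==(b), v) for b in bins] ./ length(v)

y = [1, 1, 1, 1, 2, 2]   # label bins
x = [1, 1, 1, 2, 2, 1]   # observed feature bins
R = fit_R_sketch(y, x)
R * empirical_pdf(y, 1:2) ≈ empirical_pdf(x, 1:2)
```

In this sketch, a label bin without any training example would produce a column of NaN values, because its column sum is zero.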

CherenkovDeconvolution.DeconvUtil.normalizetransfer — Function

normalizetransfer(R[; warn=true])

Normalize each column in R to make a probability density function.

source

CherenkovDeconvolution.DeconvUtil.normalizepdf — Function

normalizepdf(array...; warn=true)
normalizepdf!(array...; warn=true)

Normalize each array to a discrete probability density function.

By default, warn if coping with NaNs, Infs, or negative values.

source

CherenkovDeconvolution.DeconvUtil.normalizepdf! — Function

normalizepdf(array...; warn=true)
normalizepdf!(array...; warn=true)

Normalize each array to a discrete probability density function.

By default, warn if coping with NaNs, Infs, or negative values.

source

Developer interface

The following list of methods is primarily intended for developers who wish to implement their own deconvolution methods, binnings, stepsizes, etc. If you do so, please file a pull request so that others can benefit from your work! More information on how to develop for this package is given in the Developer manual.

CherenkovDeconvolution.Methods.DeconvolutionMethod — Type

abstract type DeconvolutionMethod

The supertype of all deconvolution methods.

source

CherenkovDeconvolution.Methods.DiscreteMethod — Type

abstract type DiscreteMethod <: DeconvolutionMethod

The supertype of all classical deconvolution methods which estimate the density function f from a transfer matrix R and an observed density g.

source

CherenkovDeconvolution.Binnings.Binning — Type

abstract type Binning

Supertype of all binning strategies for observable features.

source

CherenkovDeconvolution.Binnings.BinningDiscretizer — Type

abstract type BinningDiscretizer

Supertype of any clustering-based discretizer mapping from an n-dimensional space to a single cluster index dimension.

source

CherenkovDeconvolution.Binnings.bins — Function

bins(d::T) where T <: BinningDiscretizer

Return the bin indices of d.

source

Discretizers.encode — Function

encode(d::TreeDiscretizer, X_obs)

Discretize X_obs using the leaf indices in the decision tree of d as discrete values.

source

encode(d::KMeansDiscretizer, X_obs)

Discretize X_obs using the cluster indices of d as discrete values.

source

CherenkovDeconvolution.Stepsizes.Stepsize — Type

abstract type Stepsize end

Abstract supertype for step sizes in deconvolution.

See also: stepsize.

source

CherenkovDeconvolution.OptimizedStepsizes.OptimizedStepsize — Type

OptimizedStepsize(objective, decay)

A step size that is optimized over an objective function. If decay=true, then the step sizes never increase.

See also: RunStepsize, LsqStepsize.

source

CherenkovDeconvolution.Stepsizes.initialize_prefit! — Function

initialize_prefit!(s, X_trn, y_trn)

Prepare the stepsize strategy s with the training set (X_trn, y_trn).

See also: initialize_deconvolve!.

source

CherenkovDeconvolution.Stepsizes.initialize_deconvolve! — Function

initialize_deconvolve!(s, X_obs)

Prepare the stepsize strategy s with the observed features in X_obs.

See also: initialize_prefit!.

source

CherenkovDeconvolution.Stepsizes.value — Function

value(s, k, p, f, a)

Use the Stepsize object s to compute a step size for iteration number k with the search direction p, the previous estimate f, and the previous step size a.

See also: ConstantStepsize, RunStepsize, LsqStepsize, ExpDecayStepsize, MulDecayStepsize.

source

CherenkovDeconvolution.Methods.check_prior — Function

check_prior(f_0, n_bins)

Throw meaningful exceptions if the input prior of a deconvolution run is defective.

source

CherenkovDeconvolution.Methods.check_arguments — Function

check_arguments(X_trn, y_trn)

Throw meaningful exceptions if the input data of a deconvolution run is defective.

source

CherenkovDeconvolution.Methods.LoneClassException — Type

LoneClassException(label)

An exception thrown by check_arguments when only one class is in the training set.

See also: recover_estimate

source

CherenkovDeconvolution.Methods.recover_estimate — Function

recover_estimate(x::LoneClassException, n_bins=1)

Recover a trivial deconvolution result from x, in which all bins are zero, except for the one that occurred in the training set.

source
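
As an illustration of what such a trivial result looks like, the sketch below builds it by hand. It assumes that the single occupied bin receives the full probability mass of one, which the description above suggests but does not state. The name trivial_estimate_sketch is hypothetical.

```julia
# Hypothetical stand-in for the trivial estimate described above: all of the
# probability mass sits in the only bin that occurred in the training set.
function trivial_estimate_sketch(label::Integer, n_bins::Integer)
    f = zeros(n_bins)
    f[label] = 1.0
    return f
end

trivial_estimate_sketch(2, 3)   # a 3-bin density with all mass in bin 2
```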

CherenkovDeconvolution.Methods.LabelSanitizer — Type

LabelSanitizer(y_trn, n_bins=expected_n_bins_y(y_trn))

A sanitizer that

encodes labels and priors so that none of the resulting bins is empty.
decodes deconvolution results to recover the original (possibly empty) bins.

See also: encode_labels, encode_prior, decode_estimate.

source

CherenkovDeconvolution.Methods.encode_labels — Function

encode_labels(s::LabelSanitizer, y_trn)

Encode the labels y_trn so that all values from 1 to max(y_trn) occur.

See also: encode_prior, decode_estimate.

source

CherenkovDeconvolution.Methods.encode_prior — Function

encode_prior(s::LabelSanitizer, f_0)

Encode the prior f_0 to be consistent with the encoded labels.

See also: encode_labels, decode_estimate.

source

CherenkovDeconvolution.Methods.decode_estimate — Function

decode_estimate(s::LabelSanitizer, f)

Recover the original bins in a deconvolution result f after encoding the labels.

See also: encode_labels, encode_prior.

source
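
The encode and decode steps of a label sanitizer can be sketched as a round trip. The code below illustrates the idea only: SanitizerSketch, sanitizer_sketch, and the *_sketch functions are hypothetical names, and renormalizing the encoded prior is an assumption, not a documented behavior of encode_prior.

```julia
# Hypothetical sketch of the encode/decode idea; the names below are
# illustrative and not the package's LabelSanitizer API.
struct SanitizerSketch
    bins::Vector{Int}   # original labels that actually occur, sorted
    n_bins::Int         # number of bins in the original label space
end

sanitizer_sketch(y_trn; n_bins = maximum(y_trn)) =
    SanitizerSketch(sort(unique(y_trn)), n_bins)

# Map each original label to its rank among the occurring labels.
encode_labels_sketch(s::SanitizerSketch, y) =
    [searchsortedfirst(s.bins, y_i) for y_i in y]

# Keep only the prior entries of occurring bins and renormalize.
function encode_prior_sketch(s::SanitizerSketch, f_0)
    f = f_0[s.bins]
    return f ./ sum(f)
end

# Scatter the estimate back; bins without training examples stay at zero.
function decode_estimate_sketch(s::SanitizerSketch, f)
    f_full = zeros(s.n_bins)
    f_full[s.bins] = f
    return f_full
end

s = sanitizer_sketch([1, 3, 3, 5])
encode_labels_sketch(s, [1, 3, 3, 5])        # consecutive labels without gaps
decode_estimate_sketch(s, [0.2, 0.5, 0.3])   # zeros restored at bins 2 and 4
```

A deconvolution method would run on the encoded labels and prior, and decode its estimate only at the very end.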
