getdist.chains

class getdist.chains.Chains(root=None, jobItem=None, paramNamesFile=None, names=None, labels=None, **kwargs)[source]

Holds one or more sets of weighted samples, for example a set of MCMC chains. Inherits from WeightedSamples and adds parameter names and labels.

Variables:

paramNames – a ParamNames instance holding the parameter names and labels

Parameters:
  • root – optional root name for files
  • jobItem – optional jobItem for parameter grid item
  • paramNamesFile – optional filename of a .paramnames file that holds parameter names
  • names – optional list of names for the parameters
  • labels – optional list of latex labels for the parameters
  • kwargs – extra options for the WeightedSamples constructor
addDerived(paramVec, name, **kwargs)[source]

Adds a new parameter

Parameters:
  • paramVec – The vector of parameter values to add.
  • name – The name for the new parameter
  • kwargs – arguments for paramnames’ addDerived()
Returns:

The added parameter’s ParamInfo object

deleteFixedParams()[source]

Delete parameters that are fixed (the same value in all samples)

filter(where)[source]

Filter the stored samples to keep only samples matching filter

Parameters:where – list of sample indices to keep, or boolean array filter (e.g. x>5 to keep only samples where x>5)
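
The two accepted forms of where can be illustrated with plain Python lists. This is only a sketch of the selection semantics described above, not getdist's implementation; keep_rows is a hypothetical helper and the real method operates on the stored numpy arrays.

```python
def keep_rows(rows, where):
    """Return the rows selected by `where`.

    `where` is either a list of integer indices or a boolean mask with
    one entry per row (hypothetical helper, for illustration only).
    """
    if len(where) == len(rows) and all(isinstance(w, bool) for w in where):
        # Boolean mask: keep rows whose mask entry is True
        return [row for row, keep in zip(rows, where) if keep]
    # Otherwise treat `where` as a list of indices to keep
    return [rows[i] for i in where]


x = [3.0, 7.5, 5.0, 9.1]
mask = [value > 5 for value in x]  # analogous to the filter x>5
by_mask = keep_rows(x, mask)
by_index = keep_rows(x, [1, 3])
```

Both calls select the same two samples here, one through a mask and one through explicit indices.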
getGelmanRubin(nparam=None, chainlist=None)[source]

Assess the convergence using the maximum var(mean)/mean(var) of orthogonalized parameters c.f. Brooks and Gelman 1997.

Parameters:
  • nparam – The number of parameters, by default uses all
  • chainlist – list of WeightedSamples, the samples to use. Defaults to all the separate chains in this instance.
Returns:

The worst var(mean)/mean(var) for orthogonalized parameters. Should be <<1 for good convergence.
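
To make the var(mean)/mean(var) statistic concrete, here is a simplified sketch for a single parameter and unit-weight chains. It is not the library's code: the real method works on orthogonalized parameters and weighted samples, and the helper name and the use of population variances are assumptions made for illustration.

```python
from statistics import mean, pvariance


def var_mean_over_mean_var(chains):
    """Variance of the chain means divided by the mean of the chain variances.

    `chains` is a list of lists of unit-weight samples of one parameter
    (hypothetical helper; population variances are an assumed convention).
    """
    chain_means = [mean(c) for c in chains]
    chain_vars = [pvariance(c) for c in chains]
    return pvariance(chain_means) / mean(chain_vars)


# Two chains exploring the same region, and two that disagree strongly
agree = [[0.0, 1.0, 2.0, 1.0], [1.0, 0.0, 1.0, 2.0]]
disagree = [[0.0, 1.0, 2.0, 1.0], [10.0, 11.0, 12.0, 11.0]]
r_agree = var_mean_over_mean_var(agree)
r_disagree = var_mean_over_mean_var(disagree)
```

Chains with matching means give a ratio near zero, while chains stuck in different regions give a ratio well above 1.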

getGelmanRubinEigenvalues(nparam=None, chainlist=None)[source]

Assess convergence using var(mean)/mean(var) in the orthogonalized parameters c.f. Brooks and Gelman 1997.

Parameters:
  • nparam – The number of parameters (starting at first), by default uses all of them
  • chainlist – list of WeightedSamples, the samples to use. Defaults to all the separate chains in this instance.
Returns:

array of var(mean)/mean(var) for orthogonalized parameters

getParamNames()[source]

Get ParamNames object with names for the parameters

Returns:ParamNames object giving parameter names and labels
getParamSampleDict(ix)[source]

Returns a dictionary of parameter values for sample number ix

getParams()[source]

Creates a ParSamples object, with variables giving vectors for all the parameters, for example samples.getParams().name1 would be the vector of samples with name ‘name1’

Returns:A ParSamples object containing all the parameter vectors, with attributes given by the parameter names
getSeparateChains()[source]

Gets a list of samples for separate chains. If the chains have already been combined, uses the stored sample offsets to reconstruct the array (generally no array copying)

Returns:The list of WeightedSamples for each chain.
loadChains(root, files, ignore_lines=None)[source]

Loads chains from files.

Parameters:
  • root – Root name
  • files – list of file names
  • ignore_lines – Number of lines at the start of each file to ignore, or None to ignore no lines
Returns:

True if loaded successfully, False if none loaded

makeSingle()[source]

Combines separate chains into one samples array, so self.samples has all the samples and this instance can then be used as a general WeightedSamples instance.

Returns:self
removeBurnFraction(ignore_frac)[source]

Remove a fraction of the samples as burn in

Parameters:ignore_frac – fraction of sample points to remove from the start of the samples, or each chain if not combined
saveAsText(root, chain_index=None, make_dirs=False)[source]

Saves the samples as text files, including parameter names as .paramnames file.

Parameters:
  • root – The root name to use
  • chain_index – Optional index to be used for the filename, zero based, e.g. for saving one of multiple chains
  • make_dirs – True if this should (recursively) create the directory if it doesn’t exist
savePickle(filename)[source]

Save the current object to a file in pickle format

Parameters:filename – The file to write to
setParamNames(names=None)[source]

Sets the names of the params.

Parameters:names – A ParamNames object, the name of a .paramnames file to load, or a list of name strings. If None, default names are used (param1, param2...).
setParams(obj)[source]

Adds array variables obj.name1, obj.name2 etc, where obj.name1 is the vector of samples with name ‘name1’

If a parameter name is of the form aa.bb.cc, sub-objects are created so that you can reference obj.aa.bb.cc

Parameters:obj – The object instance to add the parameter vectors variables
Returns:The obj after alterations.
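
The dotted-name behaviour can be sketched as follows. This is an illustration of the idea only, assuming SimpleNamespace containers and a plain dict of named vectors; attach_params is a hypothetical helper, not part of getdist.

```python
from types import SimpleNamespace


def attach_params(obj, named_vectors):
    """Attach each vector to `obj`, splitting dotted names into sub-objects."""
    for name, vector in named_vectors.items():
        target = obj
        *parents, leaf = name.split(".")
        for part in parents:
            # Create the intermediate sub-object the first time it is needed
            if not hasattr(target, part):
                setattr(target, part, SimpleNamespace())
            target = getattr(target, part)
        setattr(target, leaf, vector)
    return obj


obj = attach_params(
    SimpleNamespace(),
    {"name1": [0.1, 0.2], "aa.bb.cc": [1.0, 2.0]},
)
```

After the call, obj.name1 and obj.aa.bb.cc both refer to the corresponding sample vectors.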
updateBaseStatistics()[source]

Updates basic computed statistics for this chain, e.g. after any changes to the samples or weights

Returns:self after updating statistics.
class getdist.chains.ParSamples[source]

An object used as a container for named parameter sample arrays

class getdist.chains.ParamConfidenceData[source]

a cache object for confidence interval data

exception getdist.chains.WeightedSampleError[source]

An exception that is raised when a WeightedSamples error occurs

class getdist.chains.WeightedSamples(filename=None, ignore_rows=0, samples=None, weights=None, loglikes=None, name_tag=None, label=None, files_are_chains=True)[source]

WeightedSamples is the base class for a set of weighted parameter samples

Variables:
  • weights – array of weights for each sample (default: array of 1)
  • loglikes – array of -log(Likelihoods) for each sample (default: array of 0)
  • samples – n_samples x n_parameters numpy array of parameter values
  • n – number of parameters
  • numrows – number of sample positions (rows in the samples array)
  • name_tag – name tag for the samples
Parameters:
  • filename – A filename of a plain text file to load from
  • ignore_rows
    • if int >=1: The number of rows to skip at the beginning of the file
    • if float <1: The fraction of rows to skip at the beginning of the file
  • samples – array of parameter values for each sample, passed to setSamples()
  • weights – array of weights
  • loglikes – array of -log(Likelihood)
  • name_tag – The name of this instance.
  • label – latex label for these samples
  • files_are_chains – use False if the samples file (filename) does not start with two columns giving weights and -log(Likelihoods)
changeSamples(samples)[source]

Sets the samples without changing weights and loglikes.

Parameters:samples – The samples to set
confidence(paramVec, limfrac, upper=False, start=0, end=None, weights=None)[source]

Calculate sample confidence limits by counting samples in the tails, without using kernel densities

Parameters:
  • paramVec – array of parameter values or int index of parameter to use
  • limfrac – fraction of samples in the tail, e.g. 0.05 for a 95% one-tail limit, or 0.025 for a 95% two-tail limit
  • upper – True to get upper limit, False for lower limit
  • start – Start index for the vector to use
  • end – The end index, use None to go all the way to the end of the vector.
  • weights – numpy array of weights for each sample, by default self.weights
Returns:

confidence limit (parameter value when limfrac of samples are further in the tail)
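
Counting weight in the tails can be sketched in a few lines. The helper below is hypothetical and deliberately crude (no interpolation between samples), so its results may differ in detail from what the library returns.

```python
def tail_limit(values, weights, limfrac, upper=False):
    """Value at which a fraction `limfrac` of the total weight lies
    further into the chosen tail (hypothetical helper, no interpolation)."""
    # Walk inwards from the chosen tail: descending for upper, ascending for lower
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=upper)
    target = limfrac * sum(weights)
    accumulated = 0.0
    for i in order:
        accumulated += weights[i]
        if accumulated > target:
            return values[i]
    return values[order[-1]]


values = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
weights = [1.0] * len(values)
# A two-tail interval at confidence 0.5 puts (1 - 0.5) / 2 = 0.25 in each tail
lower = tail_limit(values, weights, 0.25)
upper = tail_limit(values, weights, 0.25, upper=True)
```

With eight unit-weight samples, two samples lie below the lower limit and two above the upper limit.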

cool(cool)[source]

Cools the samples, i.e. multiplies log likelihoods by the cool factor and re-weights accordingly

Parameters:cool – cool factor
corr(pars=None)[source]

Get the correlation matrix

Parameters:pars – If specified, list of parameter vectors or int indices to use
Returns:The correlation matrix.
cov(pars=None, where=None)[source]

Get parameter covariance

Parameters:
  • pars – if specified, a list of parameter vectors or int indices to use
  • where – if specified, a filter for the samples to use (where x>=5 would mean only process samples with x>=5).
Returns:

The covariance matrix

deleteFixedParams()[source]

Removes parameters that do not vary (are the same in all samples)

Returns:list of fixed parameter indices that were removed
deleteZeros()[source]

Removes samples with zero weight

filter(where)[source]

Filter the stored samples to keep only samples matching filter

Parameters:where – list of sample indices to keep, or boolean array filter (e.g. x>5 to keep only samples where x>5)
getAutocorrelation(paramVec, maxOff=None, weight_units=True, normalized=True)[source]

Gets auto-correlation of an array of parameter values (e.g. for correlated samples from MCMC)

By default uses weight units (i.e. standard units for separate samples from original chain). If samples are made from multiple chains, neglects edge effects.

Parameters:
  • paramVec – an array of parameter values, or the int index of the parameter in stored samples to use
  • maxOff – maximum autocorrelation distance to return
  • weight_units – False to get result in sample point (row) units; weight_units=False gives standard definition for raw chains
  • normalized – Set to False to get covariance (note that even if normalized, corr[0] != 1 in general unless weights are unity).
Returns:

zero-based array giving auto-correlations

getCorrelationLength(j, weight_units=True, min_corr=0.05, corr=None)[source]

Gets the auto-correlation length for parameter j

Parameters:
  • j – The index of the parameter to use
  • weight_units – False to get result in sample point (row) units; weight_units=False gives standard definition for raw chains
  • min_corr – specifies a minimum value of the autocorrelation to use, e.g. where sampling noise is typically as large as the calculation
  • corr – The auto-correlation array to use, calculated internally by default using getAutocorrelation()
Returns:

the auto-correlation length
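
As a rough illustration of how an auto-correlation array turns into a correlation length, the sketch below handles a single unit-weight chain. The estimator and the truncation rule (stop at the first lag whose correlation falls below min_corr) are assumptions made for this example; the library's exact definition, including weight units, may differ.

```python
def autocorrelation(x, max_off):
    """Normalized auto-correlation at lags 0..max_off for a unit-weight chain."""
    n = len(x)
    mean = sum(x) / n
    diffs = [v - mean for v in x]
    var = sum(d * d for d in diffs) / n
    return [
        sum(diffs[i] * diffs[i + k] for i in range(n - k)) / ((n - k) * var)
        for k in range(max_off + 1)
    ]


def correlation_length(corr, min_corr=0.05):
    """1 + 2 * (sum of correlations at lags >= 1), truncated at the first
    lag whose correlation is below min_corr (assumed truncation rule)."""
    total = 1.0
    for c in corr[1:]:
        if c < min_corr:
            break
        total += 2.0 * c
    return total


alternating = [1.0, -1.0] * 8               # anti-correlated neighbours
sticky = ([0.0] * 4 + [1.0] * 4) * 2        # values repeat in blocks of four
alternating_len = correlation_length(autocorrelation(alternating, 4))
sticky_len = correlation_length(autocorrelation(sticky, 4))
```

The blocky chain has a correlation length above 2, while the alternating chain stays at 1 because its lag-1 correlation is already below the threshold.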

getCorrelationMatrix()[source]

Get the correlation matrix of all parameters

Returns:The correlation matrix
getCov(nparam=None, pars=None)[source]

Get covariance matrix of the parameters. By default uses all parameters, or can limit to max number or list.

Parameters:
  • nparam – if specified, only use the first nparam parameters
  • pars – if specified, a list of parameter indices (0,1,2..) to include
Returns:

covariance matrix.

getEffectiveSamples(j=0, min_corr=0.05)[source]

Gets the effective number of samples N_eff, defined so that the error on the mean of parameter j is sigma_j/sqrt(N_eff)

Parameters:
  • j – The index of the param to use.
  • min_corr – the minimum value of the auto-correlation to use when estimating the correlation length
getEffectiveSamplesGaussianKDE(paramVec, h=0.2, scale=None, maxoff=None, min_corr=0.05)[source]

Roughly estimate an effective sample number for use in the leading term for the MISE (mean integrated squared error) of a Gaussian-kernel KDE (Kernel Density Estimate). This is used for optimizing the kernel bandwidth, and though approximate should be better than entirely ignoring samples correlations, or only counting distinct samples.

Uses fiducial assumed kernel scale h; result does depend on this (typically by factors O(2))

For bias-corrected KDE only need very rough estimate to use in rule of thumb for bandwidth.

In the limit h -> 0 (but still >0) the answer should be correct (it then just includes MCMC rejection duplicates). In reality the correct result for practical h should depend on the shape of the correlation function

Parameters:
  • paramVec – parameter array, or int index of parameter to use
  • h – fiducial assumed kernel scale.
  • scale – a scale parameter to determine fiducial kernel width, by default the parameter standard deviation
  • maxoff – maximum value of auto-correlation length to use
  • min_corr – ignore correlations smaller than this auto-correlation
Returns:

A very rough effective sample number for leading term for the MISE of a Gaussian KDE.

getLabel()[source]

Return the latex label for the samples

Returns:the label
getMeans()[source]

Gets the parameter means, from saved array if previously calculated.

Returns:numpy array of parameter means
getName()[source]

Returns the name tag of these samples.

Returns:The name tag
getSignalToNoise(params, noise=None, R=None, eigs_only=False)[source]

Returns w, M, where w is the array of eigenvalues of the signal to noise (small means better constrained)

Parameters:
  • params – list of parameters indices to use
  • noise – noise matrix
  • R – rotation matrix, defaults to inverse of Cholesky root of the noise matrix
  • eigs_only – only return eigenvalues
Returns:

w, M, where w is the array of eigenvalues of the signal to noise (small means better constrained)

getVars()[source]

Get the parameter variances

Returns:A numpy array of variances.
get_norm(where=None)[source]

gets the normalization, the sum of the sample weights: sum_i w_i

Parameters:where – if specified, a filter for the samples to use (where x>=5 would mean only process samples with x>=5).
Returns:normalization
initParamConfidenceData(paramVec, start=0, end=None, weights=None)[source]

Initialize cache of data for calculating confidence intervals

Parameters:
  • paramVec – array of parameter values or int index of parameter to use
  • start – The sample start index to use
  • end – The sample end index to use, use None to go all the way to the end of the vector
  • weights – A numpy array of weights for each sample, defaults to self.weights
Returns:

ParamConfidenceData instance

mean(paramVec, where=None)[source]

Get the mean of the given parameter vector.

Parameters:
  • paramVec – array of parameter values or int index of parameter to use
  • where – if specified, a filter for the samples to use (where x>=5 would mean only process samples with x>=5).
Returns:

parameter mean

mean_diff(paramVec, where=None)[source]

Calculates an array of differences between a parameter vector and the mean parameter value

Parameters:
  • paramVec – array of parameter values or int index of parameter to use
  • where – if specified, a filter for the samples to use (where x>=5 would mean only process samples with x>=5).
Returns:

array of p_i - mean(p_i)

mean_diffs(pars=None, where=None)[source]

Calculates a list of parameter vectors giving distances from parameter means

Parameters:
  • pars – if specified, list of parameter vectors or int parameter indices to use
  • where – if specified, a filter for the samples to use (where x>=5 would mean only process samples with x>=5).
Returns:

list of arrays p_i - mean(p_i) for each parameter

randomSingleSamples_indices()[source]

Returns an array of sample indices that give a list of weight-one samples, by randomly selecting samples depending on the sample weights

Returns:array of sample indices
removeBurn(remove=0.3)[source]

removes burn in from the start of the samples

Parameters:remove – fraction of samples to remove, or if int >1, the number of sample rows to remove
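
The two interpretations of remove can be sketched as below. The helper name and the rounding of the fractional case are assumptions for illustration, not taken from the library.

```python
def rows_to_drop(remove, numrows):
    """Number of leading rows treated as burn-in (hypothetical helper)."""
    if remove > 1:
        # Values above 1 are interpreted as a row count
        return int(remove)
    # Otherwise `remove` is a fraction of the rows (rounding is an assumption)
    return int(round(remove * numrows))


samples = list(range(8))
after_fraction = samples[rows_to_drop(0.25, len(samples)):]
after_count = samples[rows_to_drop(5, len(samples)):]
```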
reweightAddingLogLikes(logLikes)[source]

Importance sample the samples, by adding logLikes (an array of -log(likelihood) values) to the currently stored likelihoods, and re-weighting accordingly, e.g. for adding a new data constraint

Parameters:logLikes – array of -log(likelihood) for each sample to adjust
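
The re-weighting step amounts to scaling each weight by the exponential of minus the added -log(likelihood). A minimal sketch, assuming plain lists and an offset by the smallest added value to keep the factors at or below 1 (the offset only changes the overall normalization); reweight is a hypothetical helper rather than the library's code:

```python
import math


def reweight(weights, loglikes, delta):
    """Importance-reweight samples given extra -log(likelihood) values `delta`.

    Each weight is scaled by exp(-(delta_i - min(delta))) and each stored
    -log(likelihood) is increased by delta_i.
    """
    offset = min(delta)
    new_weights = [w * math.exp(-(d - offset)) for w, d in zip(weights, delta)]
    new_loglikes = [l + d for l, d in zip(loglikes, delta)]
    return new_weights, new_loglikes


weights = [1.0, 1.0, 2.0]
loglikes = [0.5, 1.0, 1.5]
# The new constraint disfavours the second sample by a factor of 2
delta = [0.0, math.log(2.0), 0.0]
new_weights, new_loglikes = reweight(weights, loglikes, delta)
```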
saveAsText(root, chain_index=None, make_dirs=False)[source]

Saves the samples as text files

Parameters:
  • root – The root name to use
  • chain_index – Optional index to be used for the samples’ filename, zero based, e.g. for saving one of multiple chains
  • make_dirs – True if this should create the directories if necessary.
setColData(coldata, are_chains=True)[source]

Set the samples given an array loaded from file

Parameters:
  • coldata – The array with columns of [weights, -log(Likelihoods)] and sample parameter values
  • are_chains – True if coldata starts with two columns giving weight and -log(Likelihood)
setDiffs()[source]

saves self.diffs array of parameter differences from the means, e.g. to later calculate variances etc.

Returns:array of differences
setMeans()[source]

Calculates and saves the means for the samples

Returns:numpy array of parameter means
setSamples(samples, weights=None, loglikes=None)[source]

Sets the samples from numpy arrays

Parameters:
  • samples – The samples values, n_samples x n_parameters numpy array, or can be a list of parameter vectors
  • weights – Array of weights for each sample. Defaults to 1 for all samples if unspecified.
  • loglikes – Array of -log(Likelihood) values for each sample
std(paramVec, where=None)[source]

Get the standard deviation of the given parameter vector.

Parameters:
  • paramVec – array of parameter values or int index of parameter to use
  • where – if specified, a filter for the samples to use (where x>=5 would mean only process samples with x>=5).
Returns:

parameter standard deviation.

thin(factor)[source]

Thin the samples by the given factor, giving set of samples with unit weight

Parameters:factor – The factor to thin by
thin_indices(factor, weights=None)[source]

Indices to make single weight 1 samples. Assumes integer weights.

Parameters:
  • factor – The factor to thin by, should be int.
  • weights – The weights to thin, None if this should use the weights stored in the object.
Returns:

array of indices of samples to keep
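
One way to picture thinning with integer weights is to expand every row into as many copies as its weight and then keep every factor-th copy. The sketch below does exactly that; it is a conceptual illustration under that assumption and may not match the library's algorithm step for step.

```python
def thin_indices_sketch(factor, weights):
    """Row indices of weight-one samples after thinning by `factor`.

    Assumes non-negative integer weights (hypothetical helper).
    """
    # Repeat each row index as many times as its integer weight
    expanded = [i for i, w in enumerate(weights) for _ in range(w)]
    # Keep the last copy of every consecutive group of `factor` copies
    return expanded[factor - 1::factor]


kept = thin_indices_sketch(2, [3, 1, 2])
kept_heavy = thin_indices_sketch(2, [4, 1, 1])
```

A row whose weight is at least twice the factor can appear more than once in the result, as in the second call.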

twoTailLimits(paramVec, confidence)[source]

Calculates two-tail equal-area confidence limit by counting samples in the tails

Parameters:
  • paramVec – array of parameter values or int index of parameter to use
  • confidence – confidence limit to calculate, e.g. 0.95 for 95% confidence
Returns:

min, max values for the confidence interval

var(paramVec, where=None)[source]

Get the variance of the given parameter vector.

Parameters:
  • paramVec – array of parameter values or int index of parameter to use
  • where – if specified, a filter for the samples to use (where x>=5 would mean only process samples with x>=5).
Returns:

parameter variance

weighted_sum(paramVec, where=None)[source]

Calculates the weighted sum of a parameter vector, sum_i w_i p_i

Parameters:
  • paramVec – array of parameter values or int index of parameter to use
  • where – if specified, a filter for the samples to use (where x>=5 would mean only process samples with x>=5).
Returns:

weighted sum
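
The weighted sum and the normalization from get_norm() are the building blocks of the weighted statistics. A small sketch with plain lists, assuming the mean and variance are normalized by the sum of the weights (the helper names are hypothetical):

```python
def weighted_sum(p, w):
    """sum_i w_i p_i"""
    return sum(wi * pi for wi, pi in zip(w, p))


def weighted_mean(p, w):
    """Weighted sum divided by the normalization sum_i w_i."""
    return weighted_sum(p, w) / sum(w)


def weighted_var(p, w):
    """Weighted mean of squared differences from the weighted mean."""
    m = weighted_mean(p, w)
    return weighted_sum([(pi - m) ** 2 for pi in p], w) / sum(w)


p = [1.0, 2.0, 3.0]
w = [1.0, 2.0, 1.0]
total = weighted_sum(p, w)
mean_p = weighted_mean(p, w)
var_p = weighted_var(p, w)
```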

getdist.chains.chainFiles(root, chain_indices=None, ext='.txt', first_chain=0, last_chain=-1, chain_exclude=None)[source]

Creates a list of file names for samples given a root name and optional filters

Parameters:
  • root – Root name for files (no extension)
  • chain_indices – If given, a list of chain indices; only files with those indices are included. If None, all indices are included.
  • ext – extension for files
  • first_chain – The first index to include.
  • last_chain – The last index to include.
  • chain_exclude – A list of indexes to exclude, None to include all
Returns:

The list of file names

getdist.chains.covToCorr(cov, copy=True)[source]

Convert covariance matrix to correlation matrix

Parameters:
  • cov – The covariance matrix to work on
  • copy – True if we shouldn’t modify the input matrix, False otherwise.
Returns:

correlation matrix
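
The conversion divides each covariance by the product of the two standard deviations, corr_ij = cov_ij / sqrt(cov_ii cov_jj). A minimal sketch with nested lists (the real function works on numpy arrays; cov_to_corr_sketch is a hypothetical name and always returns a new matrix):

```python
import math


def cov_to_corr_sketch(cov):
    """Return the correlation matrix for a covariance matrix (list of lists)."""
    n = len(cov)
    # Standard deviations are the square roots of the diagonal entries
    sigmas = [math.sqrt(cov[i][i]) for i in range(n)]
    return [
        [cov[i][j] / (sigmas[i] * sigmas[j]) for j in range(n)]
        for i in range(n)
    ]


cov = [[4.0, 2.0], [2.0, 9.0]]
corr = cov_to_corr_sketch(cov)
```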

getdist.chains.getSignalToNoise(C, noise=None, R=None, eigs_only=False)[source]

Returns w, M, where w is the array of eigenvalues of the signal to noise (small means better constrained)

Parameters:
  • C – covariance matrix
  • noise – noise matrix
  • R – rotation matrix, defaults to inverse of Cholesky root of the noise matrix
  • eigs_only – only return eigenvalues
Returns:

eigenvalues and matrix

getdist.chains.lastModified(files)[source]

Returns the latest “last modified” time for the given list of files. Ignores files that do not exist.

Parameters:files – An iterable of file names.
Returns:The latest “last modified” time
getdist.chains.loadNumpyTxt(fname, skiprows=None)[source]

Utility routine to load a numpy array from file. Uses the faster pandas read routine if pandas is installed, and falls back to numpy’s loadtxt otherwise

Parameters:
  • fname – The file to load
  • skiprows – The number of rows to skip at the beginning of the file
Returns:

numpy array of the data values