getdist.chains¶
- class getdist.chains.Chains(root=None, jobItem=None, paramNamesFile=None, names=None, labels=None, renames=None, sampler=None, **kwargs)[source]¶
Holds one or more sets of weighted samples, for example a set of MCMC chains. Inherits from
WeightedSamples
, also adding parameter names and labels
- Variables
paramNames – a
ParamNames
instance holding the parameter names and labels
- Parameters
root – optional root name for files
jobItem – optional jobItem for parameter grid item. Should have jobItem.chainRoot and jobItem.batchPath
paramNamesFile – optional filename of a .paramnames file that holds parameter names
names – optional list of names for the parameters
labels – optional list of latex labels for the parameters
renames – optional dictionary of parameter aliases
sampler – string describing the type of samples (default: mcmc); if “nested” or “uncorrelated”, the effective number of samples is calculated using the uncorrelated approximation
kwargs – extra options for
WeightedSamples
’s constructor
- addDerived(paramVec, name, **kwargs)[source]¶
Adds a new parameter
- Parameters
paramVec – The vector of parameter values to add.
name – The name for the new parameter
kwargs – arguments for paramnames’
paramnames.ParamList.addDerived()
- Returns
The added parameter’s
ParamInfo
object
- filter(where)[source]¶
Filter the stored samples to keep only samples matching filter
- Parameters
where – list of sample indices to keep, or boolean array filter (e.g. x>5 to keep only samples where x>5)
- getGelmanRubin(nparam=None, chainlist=None)[source]¶
Assess the convergence using the maximum var(mean)/mean(var) of orthogonalized parameters c.f. Brooks and Gelman 1997.
- Parameters
nparam – The number of parameters, by default uses all
chainlist – list of
WeightedSamples
, the samples to use. Defaults to all the separate chains in this instance.
- Returns
The worst var(mean)/mean(var) for orthogonalized parameters. Should be <<1 for good convergence.
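As a rough illustration of the var(mean)/mean(var) statistic, the sketch below computes it for a single parameter across several unweighted chains. It deliberately skips the orthogonalization step and the sample weights, and the function name var_mean_over_mean_var is hypothetical rather than part of getdist.

```python
from statistics import fmean, pvariance


def var_mean_over_mean_var(chains):
    """Ratio of the variance of the chain means to the mean of the chain variances.

    `chains` is a list of lists, each holding the values of one parameter in one chain.
    Simplified sketch: unit weights, one parameter, no orthogonalization.
    """
    means = [fmean(chain) for chain in chains]
    variances = [pvariance(chain) for chain in chains]
    return pvariance(means) / fmean(variances)


# Two chains exploring the same region give a small ratio
well_mixed = [[0.9, 1.1, 1.0, 1.2, 0.8], [1.0, 0.9, 1.1, 0.8, 1.2]]
# Two chains stuck in different places give a large ratio
poorly_mixed = [[0.9, 1.1, 1.0, 1.2, 0.8], [4.9, 5.1, 5.0, 5.2, 4.8]]

print(var_mean_over_mean_var(well_mixed))
print(var_mean_over_mean_var(poorly_mixed))
```

A ratio far above 1, as in the second case, signals that the chains disagree about where the parameter lies.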
- getGelmanRubinEigenvalues(nparam=None, chainlist=None)[source]¶
Assess convergence using var(mean)/mean(var) in the orthogonalized parameters c.f. Brooks and Gelman 1997.
- Parameters
nparam – The number of parameters (starting at first), by default uses all of them
chainlist – list of
WeightedSamples
, the samples to use. Defaults to all the separate chains in this instance.
- Returns
array of var(mean)/mean(var) for orthogonalized parameters
- getParamNames()[source]¶
Get
ParamNames
object with names for the parameters
- Returns
ParamNames
object giving parameter names and labels
- getParamSampleDict(ix, want_derived=True)[source]¶
Returns a dictionary of parameter values for sample number ix
- Parameters
ix – sample index
want_derived – include derived parameters
- Returns
ordered dictionary of parameter values
- getParams()[source]¶
Creates a
ParSamples
object, with variables giving vectors for all the parameters, for example samples.getParams().name1 would be the vector of samples with name ‘name1’
- Returns
A
ParSamples
object containing all the parameter vectors, with attributes given by the parameter names
- getSeparateChains() List[WeightedSamples] [source]¶
Gets a list of samples for separate chains. If the chains have already been combined, uses the stored sample offsets to reconstruct the array (generally no array copying)
- Returns
The list of
WeightedSamples
for each chain.
- loadChains(root, files_or_samples: Sequence, weights=None, loglikes=None, ignore_lines=None)[source]¶
Loads chains from files.
- Parameters
root – Root name
files_or_samples – list of file names or list of arrays of samples, or single array of samples
weights – if loading from arrays of samples, corresponding list of arrays of weights
loglikes – if loading from arrays of samples, corresponding list of arrays of -log(likelihood)
ignore_lines – Number of lines at the start of the file to ignore, or None to ignore none
- Returns
True if loaded successfully, False if none loaded
- makeSingle()[source]¶
Combines separate chains into one samples array, so self.samples has all the samples and this instance can then be used as a general
WeightedSamples
instance.
- Returns
self
- removeBurnFraction(ignore_frac)[source]¶
Remove a fraction of the samples as burn in
- Parameters
ignore_frac – fraction of sample points to remove from the start of the samples, or each chain if not combined
- saveAsText(root, chain_index=None, make_dirs=False)[source]¶
Saves the samples as text files, including parameter names as .paramnames file.
- Parameters
root – The root name to use
chain_index – Optional index to be used for the filename, zero based, e.g. for saving one of multiple chains
make_dirs – True if this should (recursively) create the directory if it doesn’t exist
- savePickle(filename)[source]¶
Save the current object to a file in pickle format
- Parameters
filename – The file to write to
- saveTextMetadata(root)[source]¶
Saves metadata about the samples to text files with the given file root
- Parameters
root – root file name
- setParamNames(names=None)[source]¶
Sets the names of the params.
- Parameters
names – Either a
ParamNames
object, the name of a .paramnames file to load, or a list of name strings; otherwise default names are used (param1, param2…).
- setParams(obj)[source]¶
Adds array variables obj.name1, obj.name2 etc., where obj.name1 is the vector of samples with name ‘name1’
If a parameter name is of the form aa.bb.cc, it makes subobjects so that you can reference obj.aa.bb.cc. If aa.bb and aa are both parameter names, then aa becomes obj.aa.value.
- Parameters
obj – The object instance to add the parameter vectors variables
- Returns
The obj after alterations.
- updateBaseStatistics()[source]¶
Updates basic computed statistics for this chain, e.g. after any changes to the samples or weights
- Returns
self after updating statistics.
- class getdist.chains.ParSamples[source]¶
An object used as a container for named parameter sample arrays
- class getdist.chains.ParamConfidenceData(paramVec, norm, indexes, cumsum)¶
Create new instance of ParamConfidenceData(paramVec, norm, indexes, cumsum)
- property cumsum¶
Alias for field number 3
- property indexes¶
Alias for field number 2
- property norm¶
Alias for field number 1
- property paramVec¶
Alias for field number 0
- exception getdist.chains.WeightedSampleError[source]¶
An exception that is raised when a WeightedSamples error occurs
- class getdist.chains.WeightedSamples(filename=None, ignore_rows=0, samples=None, weights=None, loglikes=None, name_tag=None, label=None, files_are_chains=True, min_weight_ratio=1e-30)[source]¶
WeightedSamples is the base class for a set of weighted parameter samples
- Variables
weights – array of weights for each sample (default: array of 1)
loglikes – array of -log(Likelihoods) for each sample (default: array of 0)
samples – n_samples x n_parameters numpy array of parameter values
n – number of parameters
numrows – number of sample positions (rows in the samples array)
name_tag – name tag for the samples
- Parameters
filename – A filename of a plain text file to load from
ignore_rows –
if int >=1: The number of rows to skip at the beginning of the file
if float <1: The fraction of rows to skip at the beginning of the file
samples – array of parameter values for each sample, passed to
setSamples()
weights – array of weights
loglikes – array of -log(Likelihood)
name_tag – The name of this instance.
label – latex label for these samples
files_are_chains – use False if the samples file (filename) does not start with two columns giving weights and -log(Likelihoods)
min_weight_ratio – remove samples with weight less than min_weight_ratio times the maximum weight
- changeSamples(samples)[source]¶
Sets the samples without changing weights and loglikes.
- Parameters
samples – The samples to set
- confidence(paramVec, limfrac, upper=False, start=0, end=None, weights=None)[source]¶
Calculate sample confidence limits by counting samples in the tails, without using kernel densities
- Parameters
paramVec – array of parameter values or int index of parameter to use
limfrac – fraction of samples in the tail, e.g. 0.05 for a 95% one-tail limit, or 0.025 for a 95% two-tail limit
upper – True to get upper limit, False for lower limit
start – Start index for the vector to use
end – The end index, use None to go all the way to the end of the vector.
weights – numpy array of weights for each sample, by default self.weights
- Returns
confidence limit (parameter value when limfrac of samples are further in the tail)
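The tail-counting idea can be sketched in plain Python as follows. This is only an illustration under simplifying assumptions: it returns the first sample value at which the accumulated tail weight reaches the target and does not interpolate between samples, so results may differ slightly from the library's. The name tail_limit is hypothetical.

```python
def tail_limit(values, weights, limfrac, upper=False):
    """Parameter value beyond which a fraction `limfrac` of the total weight lies.

    Simplified sketch: sorts the samples and accumulates weights from one end,
    returning the first sample value at which the tail weight reaches the target.
    No interpolation between samples is attempted.
    """
    target = limfrac * sum(weights)
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=upper)
    cumulative = 0.0
    for i in order:
        cumulative += weights[i]
        if cumulative >= target:
            return values[i]
    return values[order[-1]]


values = [float(v) for v in range(1, 101)]  # 1.0 ... 100.0
weights = [1.0] * len(values)

# 0.025 in each tail corresponds to a 95% two-tail interval
lower = tail_limit(values, weights, 0.025)
upper = tail_limit(values, weights, 0.025, upper=True)
print(lower, upper)
```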
- cool(cool)[source]¶
Cools the samples, i.e. multiplies log likelihoods by cool factor and re-weights accordingly
- Parameters
cool – cool factor
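A minimal sketch of what cooling means for the weights, assuming the stored values are -log(likelihood) as elsewhere in this class. The function cool_samples is hypothetical and returns new lists rather than modifying an object in place; the constant offset is a stabilizing choice made here and only changes the overall normalization.

```python
import math


def cool_samples(weights, loglikes, cool):
    """Return (new_weights, new_loglikes) for samples cooled by factor `cool`.

    `loglikes` holds -log(likelihood). The target changes from L to L**cool,
    so each weight is multiplied by L**(cool - 1) = exp(-(cool - 1) * loglike).
    A constant offset is subtracted first so the exponentials stay well scaled;
    this only changes the overall normalization of the weights.
    """
    offset = min(loglikes)
    new_weights = [w * math.exp(-(cool - 1.0) * (ll - offset))
                   for w, ll in zip(weights, loglikes)]
    new_loglikes = [cool * ll for ll in loglikes]
    return new_weights, new_loglikes


weights = [1.0, 1.0, 1.0]
loglikes = [0.0, 0.5, 2.0]
new_weights, new_loglikes = cool_samples(weights, loglikes, cool=2.0)
print(new_weights)   # samples with worse likelihood are down-weighted
print(new_loglikes)  # stored -log(likelihood) values are doubled
```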
- corr(pars=None)[source]¶
Get the correlation matrix
- Parameters
pars – If specified, list of parameter vectors or int indices to use
- Returns
The correlation matrix.
- cov(pars=None, where=None)[source]¶
Get parameter covariance
- Parameters
pars – if specified, a list of parameter vectors or int indices to use
where – if specified, a filter for the samples to use (where x>=5 would mean only process samples with x>=5).
- Returns
The covariance matrix
- deleteFixedParams()[source]¶
Removes parameters that do not vary (are the same in all samples)
- Returns
tuple (list of fixed parameter indices that were removed, fixed values)
- filter(where)[source]¶
Filter the stored samples to keep only samples matching filter
- Parameters
where – list of sample indices to keep, or boolean array filter (e.g. x>5 to keep only samples where x>5)
- getAutocorrelation(paramVec, maxOff=None, weight_units=True, normalized=True)[source]¶
Gets auto-correlation of an array of parameter values (e.g. for correlated samples from MCMC)
By default, uses weight units (i.e. standard units for separate samples from original chain). If samples are made from multiple chains, neglects edge effects.
- Parameters
paramVec – an array of parameter values, or the int index of the parameter in stored samples to use
maxOff – maximum autocorrelation distance to return
weight_units – False to get result in sample point (row) units; weight_units=False gives standard definition for raw chains
normalized – Set to False to get covariance (note even if normalized, corr[0] != 1 in general unless weights are unity).
- Returns
zero-based array giving auto-correlations
- getCorrelationLength(j, weight_units=True, min_corr=0.05, corr=None)[source]¶
Gets the auto-correlation length for parameter j
- Parameters
j – The index of the parameter to use
weight_units – False to get result in sample point (row) units; weight_units=False gives standard definition for raw chains
min_corr – specifies a minimum value of the autocorrelation to use, e.g. where sampling noise is typically as large as the calculation
corr – The auto-correlation array to use, calculated internally by default using
getAutocorrelation()
- Returns
the auto-correlation length
- getCorrelationMatrix()[source]¶
Get the correlation matrix of all parameters
- Returns
The correlation matrix
- getCov(nparam=None, pars=None)[source]¶
Get covariance matrix of the parameters. By default, uses all parameters, or can limit to max number or list.
- Parameters
nparam – if specified, only use the first nparam parameters
pars – if specified, a list of parameter indices (0,1,2..) to include
- Returns
covariance matrix.
- getEffectiveSamples(j=0, min_corr=0.05)[source]¶
Gets effective number of samples N_eff so that the error on mean of parameter j is sigma_j/sqrt(N_eff)
- Parameters
j – The index of the param to use.
min_corr – the minimum value of the auto-correlation to use when estimating the correlation length
- getEffectiveSamplesGaussianKDE(paramVec, h=0.2, scale=None, maxoff=None, min_corr=0.05)[source]¶
Roughly estimate an effective sample number for use in the leading term for the MISE (mean integrated squared error) of a Gaussian-kernel KDE (Kernel Density Estimate). This is used for optimizing the kernel bandwidth, and though approximate should be better than entirely ignoring sample correlations, or only counting distinct samples.
Uses fiducial assumed kernel scale h; result does depend on this (typically by factors O(2))
For bias-corrected KDE only need very rough estimate to use in rule of thumb for bandwidth.
In the limit h-> 0 (but still >0) answer should be correct (then just includes MCMC rejection duplicates). In reality correct result for practical h should depend on shape of the correlation function.
If self.sampler is ‘nested’ or ‘uncorrelated’ return result for uncorrelated samples.
- Parameters
paramVec – parameter array, or int index of parameter to use
h – fiducial assumed kernel scale.
scale – a scale parameter to determine fiducial kernel width, by default the parameter standard deviation
maxoff – maximum value of auto-correlation length to use
min_corr – ignore correlations smaller than this auto-correlation
- Returns
A very rough effective sample number for leading term for the MISE of a Gaussian KDE.
- getEffectiveSamplesGaussianKDE_2d(i, j, h=0.3, maxoff=None, min_corr=0.05)[source]¶
Roughly estimate an effective sample number for use in the leading term for the 2D MISE. If self.sampler is ‘nested’ or ‘uncorrelated’ return result for uncorrelated samples.
- Parameters
i – parameter array, or int index of first parameter to use
j – parameter array, or int index of second parameter to use
h – fiducial assumed kernel scale.
maxoff – maximum value of auto-correlation length to use
min_corr – ignore correlations smaller than this auto-correlation
- Returns
A very rough effective sample number for leading term for the MISE of a Gaussian KDE.
- getMeans(pars=None)[source]¶
Gets the parameter means, from saved array if previously calculated.
- Parameters
pars – optional list of parameter indices to return means for
- Returns
numpy array of parameter means
- getSignalToNoise(params, noise=None, R=None, eigs_only=False)[source]¶
Returns w, M, where w is the eigenvalues of the signal to noise (small y better constrained)
- Parameters
params – list of parameters indices to use
noise – noise matrix
R – rotation matrix, defaults to inverse of Cholesky root of the noise matrix
eigs_only – only return eigenvalues
- Returns
w, M, where w is the eigenvalues of the signal to noise (small y better constrained)
- get_norm(where=None)[source]¶
gets the normalization, the sum of the sample weights: sum_i w_i
- Parameters
where – if specified, a filter for the samples to use (where x>=5 would mean only process samples with x>=5).
- Returns
normalization
- initParamConfidenceData(paramVec, start=0, end=None, weights=None)[source]¶
Initialize cache of data for calculating confidence intervals
- Parameters
paramVec – array of parameter values or int index of parameter to use
start – The sample start index to use
end – The sample end index to use, use None to go all the way to the end of the vector
weights – A numpy array of weights for each sample, defaults to self.weights
- Returns
ParamConfidenceData
instance
- mean(paramVec, where=None)[source]¶
Get the mean of the given parameter vector.
- Parameters
paramVec – array of parameter values or int index of parameter to use
where – if specified, a filter for the samples to use (where x>=5 would mean only process samples with x>=5).
- Returns
parameter mean
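For reference, the weighted mean and the effect of a where filter can be sketched in plain Python. The helper weighted_mean is hypothetical and takes plain lists; the filter is passed as a list of booleans, which stands in for the boolean array described above.

```python
def weighted_mean(values, weights, where=None):
    """Weighted mean sum_i w_i p_i / sum_i w_i, optionally over a boolean mask."""
    if where is None:
        where = [True] * len(values)
    norm = sum(w for w, keep in zip(weights, where) if keep)
    total = sum(w * v for v, w, keep in zip(values, weights, where) if keep)
    return total / norm


x = [1.0, 2.0, 6.0, 8.0]
w = [1.0, 3.0, 1.0, 1.0]

print(weighted_mean(x, w))                             # all samples
print(weighted_mean(x, w, where=[v >= 5 for v in x]))  # only samples with x >= 5
```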
- mean_diff(paramVec, where=None)[source]¶
Calculates an array of differences between a parameter vector and the mean parameter value
- Parameters
paramVec – array of parameter values or int index of parameter to use
where – if specified, a filter for the samples to use (where x>=5 would mean only process samples with x>=5).
- Returns
array of p_i - mean(p_i)
- mean_diffs(pars: Union[None, int, Sequence] = None, where=None) Sequence [source]¶
Calculates a list of parameter vectors giving distances from parameter means
- Parameters
pars – if specified, list of parameter vectors or int parameter indices to use
where – if specified, a filter for the samples to use (where x>=5 would mean only process samples with x>=5).
- Returns
list of arrays p_i - mean(p_i) for each parameter
- random_single_samples_indices(random_state=None, thin: Optional[float] = None, max_samples: Optional[int] = None)[source]¶
Returns an array of sample indices that give a list of weight-one samples, by randomly selecting samples depending on the sample weights
- Parameters
random_state – random seed or Generator
thin – additional thinning factor (>1 to get fewer samples)
max_samples – optional parameter to thin to get a specified mean maximum number of samples
- Returns
array of sample indices
- removeBurn(remove=0.3)[source]¶
removes burn in from the start of the samples
- Parameters
remove – fraction of samples to remove, or if int >1, the number of sample rows to remove
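The two interpretations of remove can be illustrated with a small helper. The name rows_to_skip is hypothetical, and both the rounding rule and the treatment of a value of exactly 1 as a row count are assumptions of this sketch.

```python
def rows_to_skip(numrows, remove=0.3):
    """Number of leading rows to drop.

    A value below 1 is read as a fraction of the rows, otherwise as an
    absolute row count. The rounding rule is an assumption of this sketch.
    """
    if remove >= 1:
        return int(remove)
    return int(round(numrows * remove))


samples = list(range(10))
print(samples[rows_to_skip(len(samples), 0.3):])  # drops the first 3 rows
print(samples[rows_to_skip(len(samples), 4):])    # drops the first 4 rows
```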
- reweightAddingLogLikes(logLikes)[source]¶
Importance sample the samples, by adding logLike (array of -log(likelihood values)) to the currently stored likelihoods, and re-weighting accordingly, e.g. for adding a new data constraint
- Parameters
logLikes – array of -log(likelihood) for each sample to adjust
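The importance-sampling step can be sketched as below. This is an illustration under stated assumptions, not the library code: the function name is hypothetical, it works on plain lists and returns new ones, and the subtraction of the minimum is a stabilizing choice that rescales every weight by the same constant.

```python
import math


def reweight_adding_loglikes(weights, loglikes, delta):
    """Importance-reweight samples by an extra -log(likelihood) term `delta`.

    Each weight is multiplied by exp(-delta_i) and delta_i is added to the
    stored -log(likelihood). The minimum of `delta` is subtracted inside the
    exponential to avoid underflow; this rescales all weights by a constant.
    """
    offset = min(delta)
    new_weights = [w * math.exp(-(d - offset)) for w, d in zip(weights, delta)]
    new_loglikes = [ll + d for ll, d in zip(loglikes, delta)]
    return new_weights, new_loglikes


weights = [2.0, 1.0, 1.0]
loglikes = [1.0, 1.5, 3.0]
delta = [0.5, 0.5, 2.5]  # -log(likelihood) of the new data for each sample

new_weights, new_loglikes = reweight_adding_loglikes(weights, loglikes, delta)
print(new_weights)
print(new_loglikes)
```

Samples that the new constraint disfavors (large delta) end up with smaller weights relative to the rest.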
- saveAsText(root, chain_index=None, make_dirs=False)[source]¶
Saves the samples as text files
- Parameters
root – The root name to use
chain_index – Optional index to be used for the samples’ filename, zero based, e.g. for saving one of multiple chains
make_dirs – True if this should create the directories if necessary.
- setColData(coldata, are_chains=True)[source]¶
Set the samples given an array loaded from file
- Parameters
coldata – The array with columns of [weights, -log(Likelihoods)] and sample parameter values
are_chains – True if coldata starts with two columns giving weight and -log(Likelihood)
- setDiffs()[source]¶
Saves self.diffs array of parameter differences from the means, e.g. to later calculate variances etc.
- Returns
array of differences
- setMeans()[source]¶
Calculates and saves the means of the samples
- Returns
numpy array of parameter means
- setMinWeightRatio(min_weight_ratio=1e-30)[source]¶
Removes samples with weight less than min_weight_ratio times the maximum weight
- Parameters
min_weight_ratio – minimum ratio to max to exclude
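The selection rule amounts to a simple threshold on the weights. In the sketch below the helper drop_low_weight is hypothetical and operates on plain lists, returning the surviving rows and weights.

```python
def drop_low_weight(samples, weights, min_weight_ratio=1e-30):
    """Keep only rows whose weight is at least min_weight_ratio * max(weights)."""
    threshold = min_weight_ratio * max(weights)
    kept = [(s, w) for s, w in zip(samples, weights) if w >= threshold]
    return [s for s, _ in kept], [w for _, w in kept]


samples = [[0.1], [0.2], [0.3]]
weights = [1.0, 1e-6, 0.5]

kept_samples, kept_weights = drop_low_weight(samples, weights, min_weight_ratio=1e-3)
print(kept_samples)  # the middle row is removed
print(kept_weights)
```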
- setSamples(samples, weights=None, loglikes=None, min_weight_ratio=None)[source]¶
Sets the samples from numpy arrays
- Parameters
samples – The sample values, n_samples x n_parameters numpy array, or can be a list of parameter vectors
weights – Array of weights for each sample. Defaults to 1 for all samples if unspecified.
loglikes – Array of -log(Likelihood) values for each sample
min_weight_ratio – remove samples with weight less than min_weight_ratio of the maximum
- std(paramVec, where=None)[source]¶
Get the standard deviation of the given parameter vector.
- Parameters
paramVec – array of parameter values or int index of parameter to use
where – if specified, a filter for the samples to use (where x>=5 would mean only process samples with x>=5).
- Returns
parameter standard deviation.
- thin(factor: int)[source]¶
Thin the samples by the given factor, giving set of samples with unit weight
- Parameters
factor – The factor to thin by
- thin_indices(factor, weights=None)[source]¶
Indices to make single weight 1 samples. Assumes integer weights.
- Parameters
factor – The factor to thin by, should be int.
weights – The weights to thin, None if this should use the weights stored in the object.
- Returns
array of indices of samples to keep
- static thin_indices_and_weights(factor, weights)[source]¶
Returns indices and new weights for use when thinning samples.
- Parameters
factor – thin factor
weights – initial weight (counts) per sample point
- Returns
(unique index, counts) tuple of sample index values to keep and new weights
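One way to picture thinning of integer-weighted samples is to expand each row into that many unit-weight copies and keep every factor-th copy. The sketch below does exactly that; it is an illustrative approach with a hypothetical function name and is not claimed to reproduce the library's algorithm or its choice of which copies to keep.

```python
from collections import Counter


def thin_indices_and_counts(factor, weights):
    """Thin integer-weighted samples by `factor`.

    Conceptually expands each row into `weight` identical unit-weight samples,
    keeps every `factor`-th one, and counts how often each original row is kept.
    Returns (indices, counts) with indices in increasing order.
    """
    kept = Counter()
    position = 0  # running count of unit-weight samples seen so far
    for index, weight in enumerate(weights):
        for _ in range(int(weight)):
            position += 1
            if position % factor == 0:
                kept[index] += 1
    indices = sorted(kept)
    return indices, [kept[i] for i in indices]


indices, counts = thin_indices_and_counts(3, [1, 5, 2, 1, 3])
print(indices, counts)
```

The total of the returned counts is the total input weight divided by the factor, rounded down.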
- twoTailLimits(paramVec, confidence)[source]¶
Calculates two-tail equal-area confidence limit by counting samples in the tails
- Parameters
paramVec – array of parameter values or int index of parameter to use
confidence – confidence limit to calculate, e.g. 0.95 for 95% confidence
- Returns
min, max values for the confidence interval
- var(paramVec, where=None)[source]¶
Get the variance of the given parameter vector.
- Parameters
paramVec – array of parameter values or int index of parameter to use
where – if specified, a filter for the samples to use (where x>=5 would mean only process samples with x>=5).
- Returns
parameter variance
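A plain-Python sketch of a weighted variance, assuming normalization by the sum of the weights (an assumption of this example, not something stated above). The helper weighted_var is hypothetical; its square root gives the corresponding standard deviation.

```python
def weighted_var(values, weights):
    """Weighted variance sum_i w_i (p_i - mean)**2 / sum_i w_i (assumed normalization)."""
    norm = sum(weights)
    mean = sum(w * v for v, w in zip(values, weights)) / norm
    return sum(w * (v - mean) ** 2 for v, w in zip(values, weights)) / norm


x = [1.0, 2.0, 3.0]
w = [1.0, 2.0, 1.0]

print(weighted_var(x, w))         # variance
print(weighted_var(x, w) ** 0.5)  # corresponding standard deviation
```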
- weighted_sum(paramVec, where=None)[source]¶
Calculates the weighted sum of a parameter vector, sum_i w_i p_i
- Parameters
paramVec – array of parameter values or int index of parameter to use
where – if specified, a filter for the samples to use (where x>=5 would mean only process samples with x>=5).
- Returns
weighted sum
- getdist.chains.chainFiles(root, chain_indices=None, ext='.txt', separator='_', first_chain=0, last_chain=-1, chain_exclude=None)[source]¶
Creates a list of file names for samples given a root name and optional filters
- Parameters
root – Root name for files (no extension)
chain_indices – If specified, only chains whose index is in this list are included; if None, all indexes are included.
ext – extension for files
separator – separator character used to indicate chain number (usually _ or .)
first_chain – The first index to include.
last_chain – The last index to include.
chain_exclude – A list of indexes to exclude, None to include all
- Returns
The list of file names
- getdist.chains.covToCorr(cov, copy=True)[source]¶
Convert covariance matrix to correlation matrix
- Parameters
cov – The covariance matrix to work on
copy – True if we shouldn’t modify the input matrix, False otherwise.
- Returns
correlation matrix
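The conversion itself is the standard rescaling corr_ij = cov_ij / sqrt(cov_ii cov_jj). The sketch below shows it on nested lists; cov_to_corr is a hypothetical stand-in that always returns a new matrix, analogous to copy=True.

```python
import math


def cov_to_corr(cov):
    """Return a new correlation matrix corr_ij = cov_ij / sqrt(cov_ii * cov_jj).

    `cov` is a square list of lists; the input is left unmodified.
    """
    sigmas = [math.sqrt(cov[i][i]) for i in range(len(cov))]
    return [[cov[i][j] / (sigmas[i] * sigmas[j]) for j in range(len(cov))]
            for i in range(len(cov))]


cov = [[4.0, 1.0], [1.0, 1.0]]
corr = cov_to_corr(cov)
print(corr)  # unit diagonal, off-diagonal entries scaled by the standard deviations
```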
- getdist.chains.findChainFileRoot(chain_dir, root, search_subdirectories=True)[source]¶
Finds chain files with name root somewhere under the chain_dir directory tree. root can also be a path relative to chain_dir, or have leading directories as needed to make it unique
- Parameters
chain_dir – root directory of hierarchy of directories to look in
root – root name for the chain
search_subdirectories – recursively look in subdirectories under chain_dir
- Returns
full path and root if found, otherwise None
- getdist.chains.getSignalToNoise(C, noise=None, R=None, eigs_only=False)[source]¶
Returns w, M, where w is the eigenvalues of the signal to noise (small y better constrained)
- Parameters
C – covariance matrix
noise – noise matrix
R – rotation matrix, defaults to inverse of Cholesky root of the noise matrix
eigs_only – only return eigenvalues
- Returns
eigenvalues and matrix