Statistical Analysis

from LLC_Membranes.llclib import stats

This library contains a small set of functions for calculating statistics. Most statistics are generated via bootstrapping, and because each bootstrapping implementation is specific to its problem, those routines are not generalized here.

Classes

class stats.Cdf(data)
__init__(data)

Generate an empirical cumulative distribution function from data

Parameters:data (list) – x-values of data in no particular order
cdf(x)

Callable empirical cumulative distribution function

Parameters:x – array of x-values at which to evaluate the empirical cumulative distribution function

random_sample(n=1)
Parameters:n – number of random samples to draw (default=1)
Returns:random samples
update_cdf(obs)

Add an observation to the CDF (currently only a single value at a time is supported)
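To make the behaviour of the three methods concrete, the following is a minimal standard-library sketch of an empirical CDF. The class name EmpiricalCdf and its internals are hypothetical and are not taken from the stats.Cdf source; only the method names mirror the ones documented above.

```python
import bisect
import random


class EmpiricalCdf:
    """Hypothetical stand-in for illustration; not the stats.Cdf implementation."""

    def __init__(self, data):
        # Sort once so that each evaluation is a binary search.
        self.xs = sorted(data)

    def cdf(self, x):
        # Fraction of observations less than or equal to each query point.
        n = len(self.xs)
        return [bisect.bisect_right(self.xs, xi) / n for xi in x]

    def random_sample(self, n=1):
        # Inverse-transform sampling on a step function: each draw maps a
        # uniform number in [0, 1) onto one of the sorted observations, which
        # is equivalent to picking an observation uniformly at random.
        size = len(self.xs)
        return [self.xs[int(random.random() * size)] for _ in range(n)]

    def update_cdf(self, obs):
        # Insert a single new observation while keeping the data sorted.
        bisect.insort(self.xs, obs)


ecdf = EmpiricalCdf([3, 1, 2, 4])
values = ecdf.cdf([0, 1, 2.5, 4, 5])
ecdf.update_cdf(0)
```

Keeping the data sorted means evaluation costs O(log n) per point and a single-value update costs one insertion, which may be why an update method that accepts one value at a time is a natural first step.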

Functions

stats.confidence_interval(data, confidence)

Calculate confidence interval of data.

Plot these error bars with plt.fill_between() as follows: plt.fill_between(x, mean + error[1, :], mean - error[0, :])

Parameters:
  • data – array of data trajectories [n_trajectories, n_data_points]
  • confidence – percent confidence
Returns:

Upper and lower bounds of the confidence intervals. Readily plotted with plt.errorbar
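One common way to obtain such bounds is a percentile interval taken across trajectories at each data point. The sketch below assumes that approach and uses only the standard library; the helper names percentile and percentile_interval are hypothetical, and the real stats.confidence_interval may compute its bounds differently. The returned error rows are offsets from the mean (row 0 below, row 1 above), matching the plotting call shown earlier.

```python
from statistics import mean


def percentile(values, q):
    """q-th percentile (0-100) with linear interpolation between order statistics."""
    s = sorted(values)
    pos = (len(s) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)


def percentile_interval(data, confidence):
    """data: list of trajectories, each a list of n_data_points values.

    Returns (means, error), where error[0][i] is the distance from the mean
    down to the lower bound and error[1][i] the distance up to the upper bound.
    """
    tail = (100.0 - confidence) / 2.0
    columns = list(zip(*data))  # one column per data point
    means = [mean(col) for col in columns]
    lower = [m - percentile(col, tail) for m, col in zip(means, columns)]
    upper = [percentile(col, 100.0 - tail) - m for m, col in zip(means, columns)]
    return means, [lower, upper]


trajectories = [[0, 10], [1, 11], [2, 12], [3, 13], [4, 14]]
means, error = percentile_interval(trajectories, 50)
```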

stats.outliers(data, alpha=0.01)

Check for outliers in the viscosity calculation using Grubbs’ test. Steps:

(1) Calculate the critical t-statistic: https://stackoverflow.com/questions/19339305/python-function-to-get-the-t-statistic?utm_medium=organic&utm_source=google_rich_qa&utm_campaign=google_rich_qa

(2) Calculate the critical G: https://www.itl.nist.gov/div898/handbook/eda/section3/eda35h1.htm

Parameters:
  • data – data to search for outliers
  • alpha – probability that a point is falsely rejected (default=0.01)

Returns:indices of outliers
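The two steps of Grubbs’ test can be sketched as follows. This is an illustration under stated assumptions, not the stats.outliers source: the function names are hypothetical, the critical t value is passed in rather than computed so that the example needs only the standard library, and only the single most extreme point is tested, whereas the documented function returns several indices. For the two-sided test described on the NIST page, the critical t value has n - 2 degrees of freedom at significance alpha / (2n); one way to obtain it is scipy.stats.t.ppf(1 - alpha / (2 * n), n - 2).

```python
import math
from statistics import mean, stdev


def grubbs_critical(n, t_crit):
    """Critical G for n points, given the critical t value (n - 2 degrees of freedom)."""
    return (n - 1) / math.sqrt(n) * math.sqrt(t_crit ** 2 / (n - 2 + t_crit ** 2))


def grubbs_outlier(data, t_crit):
    """Return the index of the most extreme point if it exceeds the critical G, else None."""
    m, s = mean(data), stdev(data)  # stdev is the sample standard deviation
    deviations = [abs(x - m) for x in data]
    idx = max(range(len(data)), key=deviations.__getitem__)
    g = deviations[idx] / s  # Grubbs' statistic for the most extreme point
    return idx if g > grubbs_critical(len(data), t_crit) else None


# Illustrative critical value only: roughly the two-sided alpha = 0.05 case for n = 6.
T_CRIT = 4.85
flagged = grubbs_outlier([1.0, 1.1, 0.9, 1.05, 0.95, 5.0], T_CRIT)
clean = grubbs_outlier([1.0, 1.1, 0.9, 1.05, 0.95, 1.2], T_CRIT)
```

To find more than one outlier with this sketch, one would remove the flagged point and repeat the test with the critical value recomputed for the smaller sample.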