Time Series Analysis
from LLC_Membranes.llclib import timeseries
Commonly used functions for working with time series.
Classes
-
class timeseries.VectorAutoRegression(timeseries, r)
-
__init__(timeseries, r) Fit a vector autoregressive (VAR) process to data using statsmodels.tsa.vector_ar. The output object is just a reduced and renamed set of the attributes produced by running the fit() method of the VAR class.
For more detailed docs, see: https://www.statsmodels.org/dev/vector_ar.html#module-statsmodels.tsa.vector_ar
For a multidimensional time series, one could write a system of dependent autoregressive equations:
\[Y_t = A_1 Y_{t-1} + \dots + A_p Y_{t-p} + u_t\]
where
\[\begin{split}Y_t = \begin{bmatrix} y_{1,t} \\ y_{2,t} \\ \vdots \\ y_{K,t} \end{bmatrix}, \quad Y_{t-1} = \begin{bmatrix} y_{1,t-1} \\ y_{2,t-1} \\ \vdots \\ y_{K,t-1} \end{bmatrix}, \quad \dots\end{split}\]
The matrices \(A_i\) are K x K matrices, where K is the number of dimensions of the trajectory. \(A_1\) contains the autoregressive coefficients for the first time lag. If
\[\begin{split}A_1 = \begin{bmatrix} 0.5 & 0 \\ 0 & 0.4 \end{bmatrix}\end{split}\]
the associated system of equations for a VAR(1) process would be:
\[\begin{aligned}y_{1,t} &= 0.5y_{1,t-1} + u_{1,t}\\y_{2,t} &= 0.4y_{2,t-1} + u_{2,t}\end{aligned}\]
Of course, adding cross-terms (off-diagonal entries) to \(A_1\) would create more complex dynamical behavior.
\(u_t\) is a K-dimensional vector of multivariate Gaussian noise generated from the covariance matrix of the data.
Parameters: - timeseries (numpy.ndarray) – a T x K matrix, where T is the number of observations and K is the number of variables (dimensions)
- r (int) – autoregressive order: the number of past points on which the current point depends (p in the equation above)
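To make the VAR(1) example concrete, the sketch below simulates the two-dimensional process with the diagonal \(A_1\) shown above and recovers the coefficients by ordinary least squares. It uses plain NumPy rather than VectorAutoRegression or statsmodels, and the identity noise covariance, the series length, and the variable names are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Coefficient matrix from the VAR(1) example (no cross-terms)
A1 = np.array([[0.5, 0.0],
               [0.0, 0.4]])

T, K = 5000, 2                    # observations, dimensions
u = rng.standard_normal((T, K))   # assumed noise: identity covariance

# Simulate Y_t = A_1 Y_{t-1} + u_t
Y = np.zeros((T, K))
for t in range(1, T):
    Y[t] = A1 @ Y[t - 1] + u[t]

# Least squares on the lagged series: Y[1:] ~ Y[:-1] @ A1.T
B, *_ = np.linalg.lstsq(Y[:-1], Y[1:], rcond=None)
A1_hat = B.T                      # estimated K x K coefficient matrix
```

With a series this long, A1_hat should land close to A1; an array shaped like Y (T x K) is the layout the timeseries parameter expects.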
-
Functions
-
timeseries.acf_slow(d) Calculate the autocorrelation function of a time series. The time complexity of this method is O(n^2).
Parameters: d – numpy array of length n, with time series values {x1, x2, …, xn}
Returns: autocorrelation function
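The O(n^2) cost comes from computing each of the n lags as a separate sum over the series. A minimal sketch of that direct approach follows; acf_direct is a hypothetical helper, and the mean subtraction and normalization by the lag-0 value are assumptions, not necessarily what acf_slow() does internally.

```python
import numpy as np

def acf_direct(d):
    """Direct O(n^2) autocorrelation (hypothetical helper, not the library code)."""
    d = np.asarray(d, dtype=float)
    n = d.size
    dev = d - d.mean()                  # deviations from the mean
    acov = np.empty(n)
    for lag in range(n):                # n lags ...
        # ... each summing up to n products, hence O(n^2) overall
        acov[lag] = np.sum(dev[:n - lag] * dev[lag:])
    return acov / acov[0]               # normalize so the lag-0 value is 1

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
rho = acf_direct(x)                     # [1.0, 0.4, -0.1, -0.4, -0.4]
```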
-
timeseries.acf(t, largest_prime=500, autocov=False) Quickly calculate the autocorrelation function of a time series, t. This gives the same results as acf_slow() but uses FFTs. This method is faster than numpy.correlate.
Parameters: - t – time series array (npoints, nseries)
- largest_prime – the largest prime factor of the array length allowed. The smaller, the faster. 1.6M points takes about 5 seconds with largest_prime=1000. Be aware that you lose data by truncating, but 5-6 data points isn't a big deal for large arrays.
- autocov – return the autocovariance function instead (which is just the unnormalized autocorrelation)
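The sketch below illustrates the two ideas described here: trimming the series until its length has only small prime factors, then computing the autocorrelation with zero-padded FFTs. It handles a single 1-D series only, and acf_fft and largest_prime_factor are hypothetical names. The trimming rule is an assumption about how the truncation might work, not the library's confirmed implementation.

```python
import numpy as np

def largest_prime_factor(n):
    """Largest prime factor of n by trial division (returns 1 for n = 1)."""
    factor, largest = 2, 1
    while factor * factor <= n:
        while n % factor == 0:
            largest = factor
            n //= factor
        factor += 1
    return max(largest, n) if n > 1 else largest

def acf_fft(t, largest_prime=500):
    """FFT-based autocorrelation of a 1-D series (hypothetical helper)."""
    t = np.asarray(t, dtype=float)
    n = t.size
    # Assumed truncation rule: drop trailing points until the length
    # factors into primes no larger than largest_prime
    while largest_prime_factor(n) > largest_prime:
        n -= 1
    dev = t[:n] - t[:n].mean()
    # Zero-pad to 2n so the circular correlation does not wrap around
    f = np.fft.rfft(dev, n=2 * n)
    acov = np.fft.irfft(f * np.conj(f), n=2 * n)[:n]
    return acov / acov[0]

rng = np.random.default_rng(1)
x = rng.standard_normal(1009)         # 1009 is prime
rho = acf_fft(x, largest_prime=7)     # trimmed to 1008 = 2^4 * 3^2 * 7
```

Only one point is discarded in this example, which is the kind of small loss the largest_prime description refers to.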
-
timeseries.autocov(joint_distribution, varied_length=False) Calculate the autocovariance function of the joint distribution of multiple realizations of a time series model
See pages 45-46 of Time Series Analysis (1st edition?) by James Hamilton
y_t : timeseries values at time t
y_t-j : timeseries values at time t - j
covariance_j = E[(y_t - mu)(y_t-j - mu)]
In words: the covariance at lag j equals the expected value of the product of the deviations of y_t and y_t-j from the mean. y_t and y_t-j are not necessarily independent, so you can't assume the expectation of the product equals the product of the expectations.
Parameters: joint_distribution – n x m numpy array with n independent realizations of a time series consisting of m data points (observations) per realization.
Returns: autocovariance of the joint distribution as a function of lag j
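As a rough illustration of the formula, the sketch below estimates the autocovariance at each lag by averaging the product of deviations over all realizations and all time points. autocov_ensemble and its max_lag argument are hypothetical, and using a single mean for every time point is an assumption that only holds for a stationary series.

```python
import numpy as np

def autocov_ensemble(joint_distribution, max_lag):
    """Autocovariance over realizations (hypothetical helper)."""
    y = np.asarray(joint_distribution, dtype=float)   # (n realizations, m points)
    mu = y.mean()                                     # assumed: one mean for all t
    dev = y - mu
    m = y.shape[1]
    gamma = np.empty(max_lag + 1)
    for j in range(max_lag + 1):
        # average (y_t - mu)(y_{t-j} - mu) over realizations and over t
        gamma[j] = np.mean(dev[:, j:] * dev[:, :m - j])
    return gamma

rng = np.random.default_rng(2)
white_noise = rng.standard_normal((200, 100))   # 200 realizations, 100 points each
gamma = autocov_ensemble(white_noise, max_lag=5)
```

For unit-variance white noise, gamma[0] should be near 1 and the other lags near 0, since the points are independent.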
-
timeseries.msd_straightforward(x, axis) Straightforward way to calculate the MSD. Gives the same answer as msd().
Parameters: - x – positions of centers of mass of all particles for each frame, numpy array [nframes, natoms, dim]
- axis – list of indices to include in the MSD calculation (x = 0, y = 1, z = 2)
Returns: Average MSD and individual particle MSDs
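A minimal sketch of a straightforward time-averaged MSD follows: for every lag, it averages the squared displacement over all pairs of frames separated by that lag. msd_loop is a hypothetical helper written for illustration; it returns the same kind of pair (average and per-particle MSDs) but is not the library's code.

```python
import numpy as np

def msd_loop(x, axis):
    """Time-averaged MSD by looping over lags (hypothetical helper)."""
    # Keep only the requested dimension(s); axis may be an int or a list
    x = np.asarray(x, dtype=float)[:, :, np.atleast_1d(axis)]
    nframes, nparticles, _ = x.shape
    msds = np.zeros((nframes, nparticles))
    for lag in range(1, nframes):
        disp = x[lag:] - x[:-lag]                 # all displacements over this lag
        msds[lag] = (disp ** 2).sum(axis=2).mean(axis=0)
    return msds.mean(axis=1), msds                # average MSD, per-particle MSDs

# Two particles moving along z at constant speeds 1 and 2 per frame
frames = np.arange(10.0)
x = np.zeros((10, 2, 3))
x[:, 0, 2] = 1.0 * frames
x[:, 1, 2] = 2.0 * frames
avg, per_particle = msd_loop(x, axis=2)
```

For constant-velocity motion the MSD grows as (speed x lag)^2, which makes the result easy to check by hand.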
-
timeseries.msd(x, axis, ensemble=False, nt=1) Calculate mean square displacement based on particle positions
Parameters: - x (ndarray (n_frames, n_particles, 3)) – particle positions
- axis (int or list of ints) – axis along which you want MSD (0, 1, 2, [0, 1], [0, 2], [1, 2], [0, 1, 2])
- ensemble (bool) – if True, calculate the ensemble MSD instead of the time-averaged MSD
Returns: MSD of each particle
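The ensemble option can be contrasted with the time-averaged MSD using the sketch below. It assumes one common definition of the ensemble MSD: each particle's squared displacement from its position in the first frame, with no averaging over time origins. Whether msd() uses exactly this convention is not stated here, and ensemble_msd is a hypothetical helper.

```python
import numpy as np

def ensemble_msd(x, axis):
    """Squared displacement from the first frame (assumed ensemble definition)."""
    x = np.asarray(x, dtype=float)[:, :, np.atleast_1d(axis)]
    disp = x - x[0]                     # displacement relative to frame 0
    return (disp ** 2).sum(axis=2)      # shape (n_frames, n_particles)

# Random walks with unit-variance steps in each dimension
rng = np.random.default_rng(3)
steps = rng.standard_normal((200, 500, 3))
x = np.cumsum(steps, axis=0)

sq = ensemble_msd(x, axis=[0, 1])       # MSD in the xy plane
curve = sq.mean(axis=1)                 # average over particles
```

For this random walk the particle-averaged curve should grow roughly as 2 x (number of frames elapsed), one unit per dimension per step.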
-
timeseries.bootstrap_msd(msds, N, confidence=68, median=False) Estimate error at each point in the MSD curve using bootstrapping
Parameters: - msds (np.ndarray) – mean squared displacements to sample
- N (int) – number of bootstrap trials
- confidence (float) – percentile for error calculation
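The bootstrapping idea can be sketched as follows: resample particles with replacement N times, average each resample into an MSD curve, and take percentiles across the trials. The (n_lags, n_particles) layout of msds, the symmetric percentile bounds, and the seed argument are assumptions; bootstrap_msd_sketch is a hypothetical helper and the median option is not covered.

```python
import numpy as np

def bootstrap_msd_sketch(msds, N, confidence=68, seed=None):
    """Percentile bootstrap of the mean MSD curve (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    msds = np.asarray(msds, dtype=float)       # assumed shape (n_lags, n_particles)
    n_lags, n_particles = msds.shape
    trials = np.empty((N, n_lags))
    for i in range(N):
        # Resample particle indices with replacement
        pick = rng.integers(0, n_particles, size=n_particles)
        trials[i] = msds[:, pick].mean(axis=1)
    tail = (100 - confidence) / 2              # e.g. 16 for confidence=68
    lower, upper = np.percentile(trials, [tail, 100 - tail], axis=0)
    return lower, upper

rng = np.random.default_rng(4)
msds = rng.gamma(2.0, size=(50, 30))           # synthetic stand-in data
lower, upper = bootstrap_msd_sketch(msds, N=200, seed=0)
```

With confidence=68 the bounds correspond to the 16th and 84th percentiles of the bootstrap means, roughly one standard error on either side.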
-
timeseries.step_autocorrelation(trajectories, axis=0) Calculate autocorrelation of step length and direction
Parameters: - trajectories (numpy.ndarray) – array of position vs time (n_frames, n_particles, n_dimensions)
- axis (int or list) – axis along which to calculate step lengths ({x:0, y:1, z:2})
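A step along a single axis is the signed difference between consecutive positions, so it carries both length and direction. The sketch below computes the autocorrelation of those signed steps for an integer axis only. step_autocorr_sketch and max_lag are hypothetical, and skipping mean subtraction and pooling all particles together are assumptions.

```python
import numpy as np

def step_autocorr_sketch(trajectories, axis=0, max_lag=3):
    """Autocorrelation of signed steps along one axis (hypothetical helper)."""
    z = np.asarray(trajectories, dtype=float)[:, :, axis]   # (n_frames, n_particles)
    steps = np.diff(z, axis=0)                              # signed step per frame
    norm = np.mean(steps * steps)                           # lag-0 value
    acf = np.ones(max_lag + 1)
    for lag in range(1, max_lag + 1):
        # Average the product of steps separated by `lag` frames
        acf[lag] = np.mean(steps[lag:] * steps[:-lag]) / norm
    return acf

# One particle zigzagging between z = 0 and z = 1
traj = np.zeros((11, 1, 3))
traj[1::2, 0, 2] = 1.0
acf = step_autocorr_sketch(traj, axis=2, max_lag=2)   # [1, -1, 1]
```

The perfectly anticorrelated lag-1 value reflects that every step reverses the previous one.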
-
timeseries.correlograms(zt) Plot correlograms of (z - zmean), (z - zmean)^2, (z - zmean)^3, (z - zmean)^4
Parameters: zt – timeseries of probability integral transforms
-
timeseries.switch_points(sequence) Determine the points in a discrete-state time series where switches between states occur. NOTE: includes the first and last points of the time series
Parameters: sequence (list or numpy.ndarray) – series of discrete states
Returns: list of indices where switches between states occur
Return type: numpy.ndarray
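One way to locate switches is to compare each element with its predecessor, as sketched below. switch_points_sketch is a hypothetical helper, and the index convention (the first frame of the new state marks the switch) is an assumption. The library may report the index on the other side of the switch.

```python
import numpy as np

def switch_points_sketch(sequence):
    """Indices where the state changes, plus both end points (hypothetical helper)."""
    sequence = np.asarray(sequence)
    # Index i is a switch when sequence[i] differs from sequence[i - 1]
    changes = np.nonzero(sequence[1:] != sequence[:-1])[0] + 1
    ends = [0, sequence.size - 1]          # first and last points are always included
    return np.unique(np.concatenate((ends, changes)))

states = [0, 0, 1, 1, 1, 0, 0]
idx = switch_points_sketch(states)         # array([0, 2, 5, 6])
```

Consecutive entries of idx then bracket each uninterrupted stretch in a single state.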
-
timeseries.calculate_moving_average(series, n) Calculate moving average of a time series
Parameters: n (int) – Number of previous points to average
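A moving average over n points can be written as a convolution with a flat window, as in the sketch below. moving_average_sketch is a hypothetical helper. Returning only the fully overlapping windows (so the output is n - 1 points shorter than the input) is an assumption about edge handling.

```python
import numpy as np

def moving_average_sketch(series, n):
    """Moving average over n consecutive points (hypothetical helper)."""
    series = np.asarray(series, dtype=float)
    window = np.ones(n) / n                          # equal weights summing to 1
    # mode="valid" keeps only positions where the window fits entirely
    return np.convolve(series, window, mode="valid")

smoothed = moving_average_sketch([1, 2, 3, 4, 5], n=2)   # [1.5, 2.5, 3.5, 4.5]
```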