Calculate Mean Squared Displacement¶
A particle’s mean squared displacement (MSD) tells you the squared distance it moves over time. By analyzing the shape of the MSD curve, we can gain mechanistic insight. The general form of the MSD curve is:
where \(K_{\alpha}\) is the generalized diffusion coefficient. In cases were \(\alpha=1\), the MSD is linear and \(K_{\alpha}\) is equal to the standard diffusion coefficient measured for particles undergoing Brownian motion. Albert Einstein was the first to prove that the MSD of a Brownian particle is linear and that its slope is proportional to the diffusion coefficient \(D\):
where \(N\) is the number of spatial dimensions over which the MSD was measured. When \(\alpha > 1\) diffusion occurs faster than expected and is termed superdiffusion. When \(\alpha < 1\), diffusion is slower than expected and called subdiffusion.
There are two ways one can calculate MSD. The first and simplest way is the ensemble MSD:
The ensemble MSD measures the distance traveled from a particle’s initial position. The second way measures and averages the particle’s displacement over all possible time lags, \(\tau\), and is called the time-averaged MSD:
where \(T\) is the length of the trajectory. The time-averaged MSD is more statistically robust than the ensemble average. For ergodic systems, both methods of calculation will yield the same result. However, with less data, the time-averaged MSD will give tighter error bars.
msd.py can calculate both types of MSDs for a particle. If you believe
the particle undergoes Brownian motion, then you can also measure the diffusion
constant. The script can be run from the command line
or its classes can be imported for use in other analysis
scripts. Both usages are explained below.
Command Line Interface¶
Calculate mean squared displacement (MSD) and diffusion coefficientfor a specific residue or set of atoms in a trajectory
usage: msd.py [-h] [-t TRAJECTORY] [-g GRO] [-r RESIDUE]
[-atoms ATOMS [ATOMS ...]] [-b NBOOT] [-f FRONTFRAC]
[-F FRACSHOW] [-a AXIS] [-nboot NBOOT] [-ensemble] [-compare]
[-power_law] [-tails] [-pores] [-pr PORE_RADIUS] [-acf] [-acov]
[-nofit] [--update] [-wt WT_WATER] [-s SAVENAME] [-begin BEGIN]
[-end END]
Named Arguments¶
| -t, --trajectory | |
Path to input file Default: “wiggle.trr” | |
| -g, --gro | Name of .gro coordinate file Default: “wiggle.gro” |
| -r, --residue | Name of residue whose diffusivity we want |
| -atoms | Name of atoms whose collective diffusivity is desired |
| -b, --nboot | Number of bootstrap trials to be run Default: 200 |
| -f, --frontfrac | |
Where to start fitting line on msd curve Default: 0 | |
| -F, --fracshow | Percent of graph to show, also where to stop fitting line during diffusivity calculation Default: 0.2 |
| -a, --axis | Which axis to compute msd along Default: “z” |
| -nboot | Number of bootstrap trials for error estimation Default: 200 |
| -ensemble, --ensemble | |
Calculate MSD as ensemble average Default: False | |
| -compare, --compare | |
Compare time-averaged and ensemble-averaged time series Default: False | |
| -power_law, --power_law | |
Fit MSD to a power law Default: False | |
| -tails, --tails | |
Only look at transport within tails Default: False | |
| -pores, --pores | |
Only look at transport within pores Default: False | |
| -pr, --pore_radius | |
Radius of pores. Anything greater than this distance from the pore center will not be included in calculation Default: 1.5 | |
| -acf, --autocorrelation | |
Plot step autocorrelation function Default: False | |
| -acov, --autocovariance | |
Plot step autocovariance function Default: False | |
| -nofit, --nofit | |
Do not attempt to fit any curve to MSD Default: False | |
| --update | update database with MD MSD values Default: False |
| -wt, --wt_water | |
Weight percent water in system. Only needto adjust this if updating database. Default: 10 | |
| -s, --savename | If specified, save the MSD curve in a pickled .pl file of this name. Default: False |
| -begin, --begin | |
First frame to analyze Default: 0 | |
| -end, --end | Last frame to analyze Default: -1 |
Classes¶
-
class
msd.Diffusivity(traj, gro, axis, begin=0, end=-1, startfit=0.01, endfit=1, residue=False, atoms=(), restrict=(), solute_indices=())¶ -
__init__(traj, gro, axis, begin=0, end=-1, startfit=0.01, endfit=1, residue=False, atoms=(), restrict=(), solute_indices=())¶ Calculate the mean squared displacement of a selection from an unwrapped coordinate trajectory
Parameters: - traj – unwrapped trajectory (i.e. gmx trjconv with -pbc nojump)
- gro – representative coordinate file
- axis – axis along which to compute MSD (choices: x, y, z, xy, yz, xz, xyz)
- startfit – start linear fit to MSD startfit % into trajectory
- endfit – end linear fit to MSD endfit % into trajectory
- residue – if specified, the residue whose center of mass MSD will be measured
- atoms – if specified, group of atoms whose center of mass MSD will be measured
- restrict – restrict selection to certain indices. For example, if you want to calculate MSD of a certain residue, but only include a fraction of the total residues in the system.
- solute_indices – different from restrict, this specifies the indices of the solute centers of mass of
which to calculate the MSDs. For example, if there are 24 solutes and you only want the MSD of the first two solutes, you would pass [0, 1].
-
bootstrap(N, fit_line=True, confidence=68)¶ Estimate error at each point in the MSD curve using bootstrapping
Parameters: - N (int) – number of bootstrap trials
- fit_line (bool) – Fit a line to the MSD curve
- confidence (float) – percent confidence interval (out of 100)
-
bootstrap_power_law(N)¶ Bootstrap fits of a power law to the MSD
Parameters: N (int) – number of bootstrap trials
-
calculate(ensemble=False)¶ Calculate mean squared displacement of trajectory
Parameters: ensemble (bool) – Calculate the ensemble MSD instead of the time-averaged MSD
-
fit_linear()¶ Fit a line to the MSD curve. This function will prompt the user to define the linear region as there is no rule of thumb for determining exactly where to fit the MSD curve. Make sure to report what you chose as the linear region.
-
fit_power_law(y)¶ Fit power law to MSD curve.
Parameters: y (numpy.ndarray) – y values of MSD curve Returns: Coefficient and exponent in power low of form [coefficient, power]
-
plot(fracshow=0.5, save=False, savename='diffusivity.pdf', savedata=False, show=False)¶ Plot the MSD curve
Parameters: - fracshow (float) – fraction of MSD curve to show on plot
- save (bool) – save plot
- savename (str) – name under which to save plot (including file extension)
- savedata (bool) – save MSD curve data in a numpy compressed .npz file
- show (bool) – show plot when done
-
plot_autocorrelation(show=True)¶ Plot autocorrelation function
Parameters: show (bool) – show plot Returns:
-
restrict_to_pore(r, dwell_fraction=0.95, tails=False, build_monomer='NAcarb11V', spline=False, buffer=0, npores=4)¶ Restrict calculations to center of masses (COMs) that primarily stay in the pore OR tail region
Parameters: - r (float) – radius of pore. Anything greater than r from the pore center is considered the tail region
- dwell_fraction – Fraction of time spent in region of interest required in order to keep trajectory
- tails (bool) – if True, then restrict calculations to COMs primarily in the tail region
- build_monomer (str) – monomer coordinate file of which liquid crystal membrane is mode
- spline (bool) – track pore centers with a 3D spline
- buffer (float) – Do not count molecules below _buffer_ or above z-box-vector - _buffer_
-
stationarity(axis=2)¶ Determine if increments (resulting from first-order differencing) are stationary using the Augmented Dickey Fuller test. The null hypothesis of the ADF test is that the time series has a unit root, i.e. it is non-stationary. If the returned p-value is below some threshold (often 0.05), then one can reject the null hypothesis and claim stationarity. The p-value is calculated from the ADF statistic. A more negative ADF statistic indicates a greater chance of rejecting the null hypothesis.
Parameters: axis (int) – axis along which to measure increments. (x = 0, y = 1, z = 2) Returns: p value for each trajectory Return type: numpy.ndarray
-
step_autocorrelation()¶ Calculate autocorrelation of step length in the direction of axis.
-