-
Notifications
You must be signed in to change notification settings - Fork 815
Description
Summary
Dear MDAnalysis community, I would like to propose a new analysis routine that I have developed, ResidenceTime, for MDAnalysis. The routine would allow to analyze how long specific groups of molecules (called "probe groups") stay within a defined distance from another group of molecules (called "reference groups"). By calculating these "residence times," the tool helps quantify solvation dynamics, binding interactions, and local environment stability over time.
I couldn't find this routine among the analysis available in MDAnalysis (hopefully I didn't miss it). The main metric it computes is:
- Residence Time Distribution:
Histograms (both raw and normalized) of residence times with customizable bin sizes or counts.
Additionally, the routine computes the Average Coordination Number as a function of time as a secondary analysis. This metric is a natural byproduct of the residence time calculations, as it leverages the same underlying data (probes within the cutoff distance). It tracks how many probe groups stay near the reference groups over time, offering additional insight into coordination dynamics.
Motivation
I developed this routine to assess the strength of coordination of solvent molecules around ions, but it can also be used to study other dynamic molecular interactions, such as ligand binding or protein-ligand residence times. This work is linked to an upcoming publication where I demonstrate its utility in understanding solvation dynamics.
Key Features
-
Inputs:
reference_group: Atoms to calculate residence times around.probe_group: Atoms whose residence times are measured.radius: Cutoff distance in Ångstroms.- Optional binning: Choose either
bin_sizeorbin_countfor the histogram.
-
Outputs:
results.histogram: Dictionary containing raw and normalized histograms, and bin edges.results.coordination: Time series data for average coordination numbers.
-
Example Use Cases:
- Analyze how long water sticks to ions.
- Track ion coordination in solvation shells or binding sites.
Parallelization Potential
The routine is designed to be parallelization-friendly. Two approaches could be used to compute the residence time distribution, but I've implemented the second version described here:
-
Step-by-Step Approach (Not Used Here): This method involves analyzing one frame at a time, where the computation of a frame depends on the results of the previous frame to check if a probe molecule is still within the cutoff from the reference. While straightforward in concept, this dependency makes it challenging to divide the workload across multiple processors, limiting its scalability for parallel computation.
-
Global Monitoring (Implemented Here):
This implementation collects residence data for all probes in one pass and processes it later. This setup could be chunked for trajectory-based parallelization, making it scalable for large systems. While speculative for now, this design leaves the door open for parallel implementation.
Bonus Ideas: Fitting Residence Time Histograms, Probabilistic Criteria for Coordination and Grace time
Two additional features could enhance the utility of this routine in the future:
-
Fitting Residence Time Histograms:
Residence time histograms can be fitted with exponential or biexponential decay functions to extract solvation/desolvation timescales. For example:
P(t) = A e^{-t/\tau}
or
P(t) = A_1 e^{-t/\tau_1} + A_2 e^{-t/\tau_2}
where ( \tau ) represents the characteristic residence time for single processes, and ( \tau_1 ) and ( \tau_2 ) represent distinct timescales for systems with multiple processes. The biexponential fit helps identify fast and slow dynamics in complex systems. While not implemented yet, this feature could serve as a valuable post-processing guide for users. -
Probabilistic Criteria for Coordination:
Instead of using a fixed cutoff to determine coordination, a probabilistic approach could be implemented. This might involve using radial distribution functions (RDFs) or Boltzmann distributions to account for temperature and system-specific variations. Such an approach could provide a more nuanced assessment of coordination and may align better with experimental observations. -
Grace time:
Alternatively or complementary, in the fixed cutoff version a grace time can be included to account for probes exiting the solvation shell for very short times.
Example Code
Example Script:
You can find a script for analyzing residence times and coordination numbers:
import MDAnalysis as mda
from MDAnalysis.analysis.rdt import ResidenceTime
import numpy as np
import matplotlib.pyplot as plt
u = mda.Universe('nvt.tpr', 'nvt.trr')
reference = u.select_atoms("resname NA")
probe = u.select_atoms("resname WAT")
rdt_analysis = ResidenceTime(reference, probe, 4, bin_count = 20)
rdt_analysis.run()
# Generate and save residence time distribution
results = rdt_analysis.results.histogram
raw_hist = results["raw_hist"]
bin_edges = results["bin_edges"]
normalized_hist = results["normalized_hist"]
# Compute bin midpoints
bin_midpoints = (bin_edges[:-1] + bin_edges[1:]) / 2
# Save histogram data to a single CSV file
data = np.column_stack((bin_edges[:-1], bin_edges[1:], bin_midpoints, raw_hist, normalized_hist))
np.savetxt(
"residence_times_histogram.csv",
data,
delimiter=",",
header="#Bin_Start (ps), Bin_End (ps), Bin_Midpoint (ps), Raw_Count, Normalized_Value",
fmt="%.3f,%.3f,%.3f,%d,%.6f",
comments=""
)
# Generate and save histogram plot
plt.figure(figsize=(8, 6))
plt.bar(bin_midpoints, normalized_hist, width=np.diff(bin_edges), align="center", edgecolor="black")
plt.xlabel("Residence Time (ps)")
plt.ylabel("Normalized Frequency")
plt.title("Residence Time Distribution")
plt.grid(True, linestyle="--", alpha=0.7)
plt.savefig("residence_times_histogram.png", dpi=300)
plt.show()Sample Outputs:
Let me know if you think this analysis routine could be a valuable addition to MDAnalysis. I have already developed the code and can launch a pull request for your review. I’d love to hear your feedback, suggestions, or ideas for improving it, and whether this aligns with the needs of the community.
Looking forward to your thoughts!
Mattia
