The sensitivity of molecular dynamics on changes in the potential energy function plays an important role in understanding the dynamics and function of complex molecules. We present a method to obtain path ensemble averages of a perturbed dynamics from a set of paths generated by a reference dynamics. It is based on the concept of path probability measure and the Girsanov theorem, a result from stochastic analysis to estimate a change of measure of a path ensemble. Since Markov state models (MSMs) of the molecular dynamics can be formulated as a combined phase-space and path ensemble average, the method can be extended to reweight MSMs by combining it with a reweighting of the Boltzmann distribution. We demonstrate how to efficiently implement the Girsanov reweighting in a molecular dynamics simulation program by calculating parts of the reweighting factor “on the fly” during the simulation, and we benchmark the method on test systems ranging from a two-dimensional diffusion process and an artificial many-body system to alanine dipeptide and valine dipeptide in implicit and explicit water. The method can be used to study the sensitivity of molecular dynamics on external perturbations as well as to reweight trajectories generated by enhanced sampling schemes to the original dynamics.
The broad substrate tolerance of tubulin tyrosine ligase is the basic
rationale behind its wide applicability for chemoenzymatic protein
functionalization. In this context, we report that the wild-type enzyme
enables ligation of various unnatural amino acids that are substantially
bigger than and structurally unrelated to the natural substrate, tyrosine,
without the need for extensive protein engineering. This unusual substrate
flexibility is due to the fact that the enzyme's catalytic pocket forms an
extended cavity during ligation, as confirmed by docking experiments and
all-atom molecular dynamics simulations. This feature enabled one-step
C-terminal biotinylation and fluorescent coumarin labeling of various
functional proteins as demonstrated with ubiquitin, an antigen binding
nanobody, and the apoptosis marker Annexin V. Its broad substrate
tolerance establishes tubulin tyrosine ligase as a powerful tool for in
vitro enzyme-mediated protein modification with single functional amino
acids in a specific structural context.
In this article we propose an adaptive importance sampling scheme for dynamical quantities of high dimensional complex systems which are metastable. The main idea of this article is to combine a method coming from Molecular Dynamics Simulation, Metadynamics, with a theorem from stochastic analysis, Girsanov’s theorem. The proposed algorithm has two advantages compared to a standard estimator of dynamic quantities: firstly, it is possible to produce estimators with a lower variance and, secondly, we can speed up the sampling. One of the main problems for building importance sampling schemes for metastable systems is to find the metastable region in order to manipulate the potential accordingly. Our method circumvents this problem by using an assimilated version of the Metadynamics algorithm and thus creates a non equilibrium dynamics which is used to sample the equilibrium quantities.
S-adenosyl-L-methionine (SAM) ligand binding induces major structural changes in SAM-I riboswitches, through which gene expression is regulated via transcription termination. Little is known about the conformations and motions governing the function of the full-length Bacillus subtilis yitJ SAM-I riboswitch. Therefore, we have explored its conformational energy landscape as a function of Mg2+ and SAM ligand concentrations using single-molecule Förster resonance energy transfer (smFRET) microscopy and hidden Markov modeling analysis. We resolved four conformational states both in the presence and the absence of SAM and determined their Mg2+-dependent fractional populations and conformational dynamics, including state lifetimes, interconversion rate coefficients and equilibration timescales. Riboswitches with terminator and antiterminator folds coexist, and SAM binding only gradually shifts the populations toward terminator states. We observed a pronounced acceleration of conformational transitions upon SAM binding, which may be crucial for off-switching during the brief decision window before expression of the downstream gene.
The core-set approach is a discretization method for Markov state models of complex molecular dynamics. Core sets are disjoint metastable regions in the conformational space, which need to be known prior to the construction of the core-set model. We propose to use density-based cluster algorithms to identify the cores. We compare three different density-based cluster algorithms: the CNN,the DBSCAN, and the Jarvis-Patrick algorithm. While the core-set models based on the CNN and DBSCAN clustering are well-converged, constructing core-set models based on the Jarvis-Patrick clustering cannot be recommended. In a well-converged core-set model, the number of core sets is up to an order of magnitude smaller than the number of states in a conventional Markov state model with comparable approximation error. Moreover, using the density-based clustering one can extend the core-set method to systems which are not strongly metastable. This is important for the practical application of the core-set method because most biologically interesting systems are only marginally metastable. The key point is to perform a hierarchical density-based clustering while monitoring the structure of the metric matrix which appears in the core-set method. We test this approach on a molecular-dynamics simulation of a highly flexible 14-residue peptide. The resulting core-set models have a high spatial resolution and can distinguish between conformationally similar yet chemically different structures, such as register-shifted hairpin structures.
Antigen uptake and processing by innate immune cells is crucial to initiate the immune response. Therein, the endocytic C-type lectin receptors recognize pathogens by their glycan structures enabling decision-making in innate immunity. Herein, we studied the carbohydrate recognition domain of Langerin, a C-type lectin receptor involved in the host defense against viruses such as HIV and influenza as well as bacteria and fungi. Using a combination of nuclear magnetic resonance and molecular dynamics simulations, we unraveled the molecular determinants underlying cargo capture and release encoded in the receptor architecture. Our findings revealed receptor dynamics over several timescale associated with binding and release of the essential cofactor Ca2+ controlled by the coupled motions of two loops. Moreover, mutual information theory and site-directed mutagenesis exposed an allosteric intra-domain network that modulates the Ca2+ affinity depending on the pH thereby promoting fast ligand release.
The membrane permeability of cyclic peptides and peptidomimetics, which are generally larger and more complex than typical drug molecules, is likely strongly influenced by the conformational behavior of these compounds in polar and apolar environments. The size and complexity of peptides often limit their bioavailability, but there are known examples of peptide natural products such as cyclosporin A (CsA) that can cross cell membranes by passive diffusion. CsA is an undecapeptide with seven methylated backbone amides. Its crystal structure shows a “closed” twisted β-pleated sheet conformation with four intramolecular hydrogen bonds that is also observed in NMR measurements of CsA in chloroform. When binding to its target cyclophilin, on the other hand, CsA adopts an “open” conformation without intramolecular hydrogen bonds. In this study, we attempted to sample the complete conformational space of CsA in chloroform and in water by molecular dynamics simulations in order to better understand its conformational behavior in these two environments and to rationalize the good membrane permeability of CsA observed experimentally. From 10 μs molecular dynamics simulations in each solvent, Markov state models were constructed to characterize the metastable conformational states. The model in chloroform is compared to nuclear Overhauser effect NMR spectroscopy data reported in this study and taken from the literature. The conformational landscapes in the two solvents show significant overlap but also clearly distinct features.
We present extensive all-atom Molecular Dynamics (MD) simulation data of the twenty encoded amino acids in explicit water, simulated with different force fields. The termini of the amino acids have been capped to ensure that the dynamics of the Φ and ψ torsion angles are analogues to the dynamics within a peptide chain. We use representatives of each of the four major force field families: AMBER ff-99SBILDN, AMBER ff-03, OPLS-AA/L, CHARMM27 and GROMOS43a1. Our data represents a library and test bed for method development for MD simulations and for force fields development. Part of the data set has been previously used for comparison of the dynamic properties of force fields (Vitalini et al., 2015)and for the construction of peptide basis functions for the variational approach to molecular kinetics.
Molecular-dynamics simulations are increasingly used to study dynamic properties of biological systems. With this development, the ability of force fields to successfully predict relaxation timescales and the associated conformational exchange processes moves into focus. We assess to what extent the dynamic properties of model peptides (Ac-A-NHMe, Ac-V-NHMe, AVAVA, A10) differ when simulated with different force fields (AMBER ff99SB-ILDN, AMBER ff03, OPLS-AA/L, CHARMM27, and GROMOS43a1). The dynamic properties are extracted using Markov state models. For single-residue models (Ac-A-NHMe, Ac-V-NHMe), the slow conformational exchange processes are similar in all force fields, but the associated relaxation timescales differ by up to an order of magnitude. For the peptide systems, not only the relaxation timescales, but also the conformational exchange processes differ considerably across force fields. This finding calls the significance of dynamic interpretations of molecular-dynamics simulations into question.
As an essential part of many biological processes, protein-protein interactions (PPIs) offer exciting
and promising opportunities for drug discovery by extension of the druggable target space. Over the
last decade, studies on protein networks have significantly increased the number of identified PPIs.
However, despite steadily-growing data on PPIs, detailed understanding of the interaction surfaces
and their dynamics remains limited. Furthermore, the development of small-molecule inhibitors of
PPIs faces technological challenges, leaving the question about the “druggability” of PPIs open.
Molecular dynamics (MD) simulations may facilitate the prediction of druggable binding sites on
protein-protein interfaces by detecting binding hot spots and transient pockets. MD allows for a
detailed analysis of structural and functional aspects of PPIs and thus provides valuable insights into
PPI mechanisms and supports the design of PPI modulators. We provide an overview on the main
areas of MD applications to PPIs including structural investigations and the design of PPI disruptors.
Emphasizing the beneficial synergies between computational and experimental techniques, MD
techniques are also frequently applied to low-resolution structural data and have been used to
elucidate structure and movements of complex macromolecular structures relevant for biological
Although Markov state models have proven to be powerful tools in resolving the complex features of biomolecular kinetics, the discretization of the conformational space has been a bottleneck since the advent of the method. A recently introduced variational approach, which uses basis functions instead of crisp conformational states, opened up a route to construct kinetic models in which the discretization error can be controlled systematically. Here, we develop and test a basis set for peptides to be used in the variational approach. The basis set is constructed by combining local residue-centered kinetic modes which are obtained from kinetic models of terminally blocked amino acids. Using this basis set, we model the conformational kinetics of two hexapeptides with sequences VGLAPG and VGVAPG. Six basis functions are sufficient to represent the slow kinetic modes of these peptides. The basis set also allows for a direct interpretation of the slow kinetic modes without an additional clustering in the space of the dominant eigenvectors. Morever, changes in the conformational kinetics due to the exchange of leucine in VGLAPG to valine in VGVAPG can be directly quantified by comparing histograms of the basis set expansion coefficients.
The eigenvalues and eigenvectors of the molecular dynamics propagator (or transfer operator) contain the essential information about the molecular thermodynamics and kinetics. This includes the stationary distribution, the metastable states, and state-to-state transition rates. Here we present a variational approach for computing these dominant eigenvalues and eigenvectors. This approach is analogous the variational approach used for computing stationary states in quantum mechanics. A corresponding method of linear variation is formulated. It is shown that the matrices needed for the linear variation method are correlation matrices that can be estimated from simple MD simulations for a given basis set. The method proposed here is thus, to first define a basis set able to capture the relevant conformational transitions, then compute the respective correlation matrices, and then to compute their dominant eigenvalues and eigenvectors, thus obtaining the key ingredients of the slow kinetics.
We have developed a hidden Markov model and optimization procedure for photon-based single-molecule FRET data, which takes into account the trace-dependent background intensities. This analysis technique reveals an unprecedented amount of detail in the folding kinetics of the Diels-Alderase ribozyme. We find a multitude of extended (low-FRET) and compact (high-FRET) states. Five states were consistently and independently found in two FRET constructs and three Mg2+ concentrations. Structures generally tend to become more compact upon addition of Mg2+. Some compact structures are found to significantly depend on Mg2+ concentration, suggesting a tertiary fold stabilized by Mg2+ ions. One compact structure was found to be Mg2+-independent, consistent with stabilization by tertiary Watson-Crick base pairing found in the folded Diels-Alderase structure. A hierarchy of timescales was found, including dynamics of 10 ms or faster, likely due to tertiary structure fluctuations, and slow dynamics on the seconds timescale, presumably associated with significant changes in secondary structure. The folding pathways proceed through a series of intermediate secondary structures. There exist both, compact pathways and more complex ones, which display tertiary unfolding, then secondary refolding and, subsequently, again tertiary refolding.
Understanding how the chemical environment modulates the predominant conformations and kinetics of flexible molecules is a core interest of biochemistry and a prerequisite for the rational design of synthetic catalysts. This study combines molecular dynamics simulation and Markov state models (MSMs) to a systematic computational strategy for investigating the effect of the chemical environment of a molecule on its conformations and kinetics. MSMs allow quantities to be computed that are otherwise difficult to access, such as the metastable sets, their free energies, and the relaxation time scales related to the rare transitions between metastable states. Additionally, MSMs are useful to identify observables that may act as sensors for the conformational or binding state of the molecule, thus guiding the design of experiments. In the present study, the conformation dynamics of UDP-GlcNAc are studied in vacuum, water, water + Mg2+, and in the protein UDP-GlcNAc 2-epimerase. It is found that addition of Mg2+ significantly affects the conformational stability, thermodynamics, and kinetics of UDP-GlcNAc. In particular, the slowest structural process, puckering of the GlcNAc sugar, depends on the overall conformation of UDP-GlcNAc and may thus act as a sensor of whether Mg2+ is bound or not. Interestingly, transferring the molecule from vacuum to water makes the protein-binding conformations UDP-GlcNAc first accessible, while adding Mg2+ further stabilizes them by specifically associating to binding-competent conformations. While Mg2+ is not cocrystallized in the UDP-GlcNAc 2-epimerase complex, the selectively stabilized Mg2+/UDP-GlcNAc complex may be a template for the bound state, and Mg2+ may accompany the binding-competent ligand conformation to the binding pocket. This serves as a possible explanation of the enhanced epimerization rate in the presence of Mg2+. This role of Mg2+ has previously not been described and opens the question whether “binding co-factors” may be a concept of general relevance for protein–ligand binding.
Molecular simulations of biomolecules often reveal a complex picture of the their kinetics, whereas kinetic experiments typically seem to indicate considerably simpler two- or three-state kinetics. Markov state models (MSM) provide a tool to link between simulation and experi- ment, and to resolve this apparent contradiction.
The equilibrium kinetics of biomolecules can be probed by techniques such as temperature-jump or fluorescence correlation spectroscopy. These measurements can be described by dynamical fingerprints, i.e., densities of relaxation timescales where each peak corresponds to an exponential relaxation process. In many cases, single- or double-peaked fingerprints are found, suggesting that a two- or three-state model may provide a satisfactory description of the biomolecule studied, while simulations often reveal a more complex picture with many kinetically relevant states. Here we sketch an approach combining Markov models of the simulated dynamics with dynamical fingerprints to link between simulation and experiment. This link sheds light on the relation between experimental setup and sensitivity of the experiment to particular kinetic processes. Furthermore, our approach can be used to design experiments such that specific processes appear with large amplitudes.This is illustrated by reviewing recent results from the analysis of the fluorescent 18-mer peptide MR121-(GS)9-W.
Markov state models of molecular kinetics (MSMs), in which the long-time statistical dynamics of a molecule is approximated by a Markov chain on a discrete partition of configuration space, have seen widespread use in recent years. This approach has many appealing characteristics compared to straightforward molecular dynamics simulation and analysis, including the potential to mitigate the sampling problem by extracting long-time kinetic information from short trajectories and the ability to straightforwardly calculate expectation values and statistical uncertainties of various stationary and dynamical molecular observables. In this paper, we summarize the current state of the art in generation and validation of MSMs and give some important new results. We describe an upper bound for the approximation error made by modelingmolecular dynamics with a MSM and we show that this error can be made arbitrarily small with surprisingly little effort. In contrast to previous practice, it becomes clear that the best MSM is not obtained by the most metastable discretization, but the MSM can be much improved if non-metastable states are introduced near the transition states. Moreover, we show that it is not necessary to resolve all slow processes by the state space partitioning, but individual dynamical processes of interest can be resolved separately. We also present an efficient estimator for reversible transition matrices and a robust test to validate that a MSM reproduces the kinetics of the molecular dynamics data.
Markov (state) models (MSMs) have attracted a lot of interest recently as they (1) can probe long-term molecular kinetics based on short-time simulations, (2) offer a way to analyze great amounts of simulation data with relatively little subjectivity of the analyst, (3) provide insight into microscopic quantities such as the ensemble of transition pathways, and (4) allow simulation data to be reconciled with measurement data in a rigorous and explicit way. Here we sketch our current perspective of Markov models and explain in short their theoretical basis and assumptions. We describe transition path theory which allows the entire ensemble of protein folding pathways to be investigated and that combines naturally with Markov models. Experimental observations can be naturally linked to Markov models with the dynamical fingerprint theory, by which experimentally observable timescales can be equipped with an understanding of the structural rearrangement processes that take place at these timescales. The concepts of this paper are illustrated by a simple kinetic model of protein folding.
Markov state models parametrized using molecular simulation data are powerful tools for the investigation of conformational changes in biomolecules and in recent years have gained increasing popularity. However, a Markov state model is an approximation to the true dynamics of the complete system. We show how Markov state models are derived from the generalized Liouville equation identifying the assumptions and approximations involved and review the mathematical properties of transition matrices. Using two model systems, a two-bit flipping model consisting of only four states, and molecular dynamics simulations of liquid butane, we subsequently assess the influence of the assumptions, for example, of the marginal degrees of freedom, used in the derivation on the validity of the Markov state model.
Single-molecule force spectroscopy has proven to be a powerful tool for studying the kinetic behavior of biomolecules. Through application of an external force, conformational states with small or transient populations can be stabilized, allowing them to be characterized and the statistics of individual trajectories studied to provide insight into biomolecular folding and function. Because the observed quantity (force or extension) is not necessarily an ideal reaction coordinate, individual observations cannot be uniquely associated with kinetically distinct conformations. While maximum-likelihood schemes such as hidden Markov models have solved this problem for other classes of single-molecule experiments by using temporal information to aid in the inference of a sequence of distinct conformational states, these methods do not give a clear picture of how precisely the model parameters are determined by the data due to instrument noise and finite-sample statistics, both significant problems in force spectroscopy. We solve this problem through a Bayesian extension that allows the experimental uncertainties to be directly quantified, and build in detailed balance to further reduce uncertainty through physical constraints. We illustrate the utility of this approach in characterizing the three-state kinetic behavior of an RNA hairpin in a stationary optical trap.
The identification of metastable states of a molecule plays an important role in the interpretation of molecular simulation data because the free-energysurface, the relative populations in this landscape, and ultimately also the dynamics of the molecule under study can be described in terms of these states. We compare the results of three different geometriccluster algorithms (neighbor algorithm, K-medoids algorithm, and common-nearest-neighbor algorithm) among each other and to the results of a kinetic cluster algorithm. First, we demonstrate the characteristics of each of the geometriccluster algorithms using five two-dimensional data sets. Second, we analyze the molecular dynamics data of a β-heptapeptide in methanol—a molecule that exhibits a distinct folded state, a structurally diverse unfolded state, and a fast folding/unfolding equilibrium—using both geometric and kinetic cluster algorithms. We find that geometricclustering strongly depends on the algorithm used and that the density based common-nearest-neighbor algorithm is the most robust of the three geometriccluster algorithms with respect to variations in the input parameters and the distance metric. When comparing the geometriccluster results to the metastable states of the β-heptapeptide as identified by kinetic clustering, we find that in most cases the folded state is identified correctly but the overlap of geometricclusters with further metastable states is often at best approximate.
β-Peptides are analogs of natural α-peptides and form a variety of remarkably stable structures. Having an additional carbon atom in the backbone of each residue, their folded conformation is not only influenced by the side-chain sequence but also and foremost by their substitution pattern. The precise mechanism by which the side chains interact with the backbone is, however, hitherto not completely known. To unravel the various effects by which the side chains influence the backbone conformation, we quantify to which extent the dihedral angles of a β3-substited peptide with an additional methyl group on the central Cα-atom can be regarded as independent degrees of freedom and analyze the distributions of these dihedral angles. We also selectively capture the steric effect of substituents on the Cα- and Cβ-atoms of the central residue by alchemically changing them into dummy atoms, which have no nonbonded interactions. We find that the folded state of the β3-peptide is primarily stabilized by a steric exclusion of large parts of the unfolded state (entropic effect) and only subsequently by mutual dependence of the ψ-dihedral angles (enthalpic effect). The folded state of β-peptides is stabilized by a different mechanism than that of α-peptides
Introducing experimental values as restraints into molecular dynamics (MD) simulation to bias the values of particular molecular properties, such as nuclear Overhauser effect intensities or distances, dipolar couplings, 3 J-coupling constants, chemical shifts or crystallographic structure factors, towards experimental values is a widely used structure refinement method. Because multiple torsion angle values ϕ correspond to the same 3 J-coupling constant and high-energy barriers are separating those, restraining 3 J-coupling constants remains difficult. A method to adaptively enforce restraints using a local elevation (LE) potential energy function is presented and applied to 3 J-coupling constant restraining in an MD simulation of hen egg-white lysozyme (HEWL). The method succesfully enhances sampling of the restrained torsion angles until the 37 experimental 3 J-coupling constant values are reached, thereby also improving the agreement with the 1,630 experimental NOE atom–atom distance upper bounds. Afterwards the torsional angles ϕ are kept restrained by the built-up local-elevation potential energies.
The use of time-dependent restraints in molecular simulation in order to generate a conformational ensemble for molecules that is in accordance with measured ensemble averages for particular observable quantities is investigated. Using a model system consisting of liquid butane and the cyclic peptide antamanide the reproduction of particular average 3 J-coupling constant values in a molecular dynamics simulation is analysed. It is shown that the multiple-valuedness and the sizeable gradients of the Karplus curve relating 3 J-coupling constants measured in NMR experiments to the corresponding torsional-angle values cause severe problems when trying to restrain a 3 J-coupling constant to a value close to the extrema of the Karplus curve. The introduction of a factor oscillating with time into the restraining penalty function alleviates this problem and enhances the restrained conformational sampling.