An abbreviated history of coarse graining methods:
WHEN, WHO AND WHY?
In the late 1960s, with the formulation of Levinthal’s paradox, understanding the mechanism of protein folding became a critical area of study. The need to capture the entire protein folding process was likely the primary driving force behind the development of coarse grained methods. Levitt and Warshel were the first to accomplish this successfully, during their work in the early 1970s. Their coarse graining method, published in 1975, modeled the small protein bovine pancreatic trypsin inhibitor using pseudoatoms at the alpha carbon positions; the model reduced the degrees of freedom to one, the rotation about the central pseudobond defined by consecutive alpha carbon atoms; the force field was described by the Lennard-Jones potential; and Brownian dynamics was used as the sampling scheme. This pioneering work demonstrated the approach but fell short of describing the packing and pairwise interactions of side chains, which are integral driving forces for obtaining specific folded structures. Just a year later, Levitt proposed an improved model that accounted for side chain orientation. Since then, coarse graining has continued to improve in accuracy and has expanded to other biomolecules and components such as nucleic acids, carbohydrates, water and lipids. This inaugural coarse grained model led to the many advancements that make up the current state of the art, and coarse graining has become a crucial computational modeling tool for simulating meso-scale biological phenomena.2,6,10
Coarse grained simulation basics:
WHY COARSE GRAIN?
Capturing the time and length scales of biological processes remains a challenge for molecular modeling, because simulations in all-atom detail are limited to the order of 100 ns and 10 nm or less. Coarse graining reduces the degrees of freedom of the system, trading molecular detail for greater size and time scales, in order to simulate biological processes currently inaccessible to all-atom models. In addition to extending the spatiotemporal limits of all-atom molecular dynamics simulations, coarse graining is advantageous for three reasons: it enables high throughput studies that systematically explore thousands of state conditions in parallel; it reveals which details are essential to reproduce higher resolution results, which provides insight into the biological driving forces at play; and it smooths the potential energy landscape of the system, allowing computationally inexpensive testing of novel biophysical pathways.3
Figure 1: Spatial and temporal scale for computational methods1
HOW TO COARSE GRAIN?
A complete coarse grained (CG) model has two components: (1) a representation of the atomistic model as simplified CG “beads”, each encapsulating a desired approximation of a group of atoms, and (2) a set of potentials, called a force field, that describes how the beads interact with one another.
Figure 2: All atom to coarse grained description. In this simplified representation a 4:1 atom:bead ratio is used with gray beads for hydrocarbons, pink beads for glycerol, gold beads for phosphate and blue beads for choline-type.1
(1) Atom mapping:
Mapping is the first step in coarse graining and largely determines the accuracy, efficiency and transferability of the model.11 Because coarse graining retains only the essential degrees of freedom, it is important to assess which of them are necessary to obtain the desired resolution for a biologically relevant representation of the system or phenomenon. Two methods of atom mapping are commonly used to coarse grain a system: shape based and “residue” based.
Shape based: uses the highest level of simplification, and therefore as few beads as possible, in order to model large scale motions. In the shape-based CG lipid model, two CG beads represent each lipid bilayer leaflet, one for the head group and one for the tail group; the two are connected by a harmonic potential, and the top and bottom leaflets interact via Lennard-Jones and Coulomb potentials. The model allows for molecular dynamics simulations on the order of hundreds of microseconds and beyond.19
Residue based: treats a cluster of connected heavy atoms as the unit particle, mapping approximately 10-20 covalently bound atoms onto one CG bead. Although detail is inevitably lost, this permits a longer timestep and thereby higher computational efficiency than united atom models. This family of methods is typically used when a speedup of 1-2 orders of magnitude over an all-atom simulation is desired.21
In addition to the resolution (the number of CG beads) used to represent a group of atoms, it is important to consider the placement of the CG beads with respect to the atomic structure. Small molecules like lipids are typically mapped using chemical intuition, with one CG site for each functional group in the molecule.5
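As a minimal sketch of the mapping step, the Python snippet below groups atoms into beads and places each bead at the center of mass of its group. The center-of-mass placement, the function name map_to_beads and the 8-atom toy chain are assumptions made for illustration; they are not taken from any of the models discussed here.

```python
import numpy as np

def map_to_beads(positions, masses, groups):
    """Map atoms onto CG beads, one bead per group of atom indices.

    Each bead is placed at the center of mass of its atoms and carries
    their total mass (an assumed, commonly used placement rule).
    """
    positions = np.asarray(positions, dtype=float)
    masses = np.asarray(masses, dtype=float)
    bead_positions = []
    bead_masses = []
    for atom_indices in groups:
        idx = np.asarray(atom_indices)
        m = masses[idx]
        bead_masses.append(m.sum())
        # Mass-weighted average of the atomic coordinates in this group.
        bead_positions.append((m[:, None] * positions[idx]).sum(axis=0) / m.sum())
    return np.array(bead_positions), np.array(bead_masses)

# Hypothetical 8-atom linear chain mapped 4:1 onto two beads.
positions = np.array([[float(i), 0.0, 0.0] for i in range(8)])
masses = np.full(8, 12.0)
groups = [[0, 1, 2, 3], [4, 5, 6, 7]]
bead_pos, bead_mass = map_to_beads(positions, masses, groups)
print(bead_pos)
print(bead_mass)
```

Changing the groups list is all that is needed to move between resolutions, for example from a 4:1 mapping to a coarser shape-based one.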
(2) Force field building:
It is a continual challenge to develop accurate and transferable CG force fields. A CG simulation force field is typically constructed using either the bottom-up (structure based coarse graining) force matching approach or the top-down (thermodynamic based coarse graining) free energy based approach.
Bottom-up/Structure-based: uses effective interactions derived from reference all-atom simulations. Common systematic methods include inverse Monte Carlo (IMC), iterative Boltzmann inversion (IBI) and force matching (FM), also known as variational fitting.3 The IBI method takes a radial distribution function (RDF) from atomistic simulations as its target, computes the RDF produced by the current CG model, and iteratively corrects the CG pair potential until the two agree. The update applied to the CG pair potential at each iteration is6:
$$u_{i+1}(r) = u_i(r) + k_B T \ln\left[\frac{g_i(r)}{g_{\mathrm{target}}(r)}\right]$$
Here, $g$ is the RDF, $u$ is the effective pair potential derived from the potential of mean force calculation, $k_B T$ is the thermal energy, and $i$ indexes the iteration. Essentially, an initial guess for the pair potential is refined over numerous iterations until the CG RDF converges on the target RDF. Iteration continues until a unique pair potential for the given radial distribution is reached within an acceptable error. Similarly, the IMC method uses the RDF as the target property to approximate a coarse grained Hamiltonian for the system, but unlike the IBI method it handles the CG force field parameters explicitly, solving a series of linear equations at each iterative step to reach convergence.6 Voth’s multi-scale coarse grained (MS-CG) method, described only briefly here, is representative of force matching: it solves a system of linear equations to find the least squares fit to configurations sampled from an atomistic simulation, thereby optimizing the CG force field.1,4 Inverse Monte Carlo differs from IBI in that it is an exact Newton method that fits the reference value itself rather than its natural logarithm, whereas force matching extracts an optimal potential from large sets of conformational data produced by ab initio calculations.
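A single IBI iteration can be sketched in a few lines of Python. The snippet assumes the standard update rule, in which the potential at each distance is corrected by k_B T times the logarithm of the ratio of the current CG RDF to the target RDF, and it starts from the potential of mean force of the target RDF. The function names, the three-bin toy RDFs and the choice of kJ/mol units are illustrative assumptions; in practice each iteration also requires a full CG simulation to obtain the current RDF, which is not shown.

```python
import numpy as np

KB = 0.0083144626  # Boltzmann constant in kJ/(mol K)

def initial_guess(g_target, temperature, eps=1e-12):
    """Potential of mean force, -kB*T*ln(g_target), used as the starting potential."""
    g = np.clip(np.asarray(g_target, dtype=float), eps, None)
    return -KB * temperature * np.log(g)

def ibi_update(u_current, g_current, g_target, temperature, eps=1e-12):
    """One IBI step: u_next(r) = u(r) + kB*T*ln(g_current(r) / g_target(r))."""
    u_current = np.asarray(u_current, dtype=float)
    g_current = np.asarray(g_current, dtype=float)
    g_target = np.asarray(g_target, dtype=float)
    # Only correct bins where both RDFs are sampled; elsewhere keep the potential.
    sampled = (g_current > eps) & (g_target > eps)
    correction = np.zeros_like(u_current)
    correction[sampled] = KB * temperature * np.log(g_current[sampled] / g_target[sampled])
    return u_current + correction

# Toy RDFs on three distance bins (hypothetical numbers).
g_target = np.array([0.5, 1.5, 1.0])
g_current = np.array([1.0, 1.0, 1.0])  # RDF that the current CG potential produced
u_old = np.zeros(3)
u_new = ibi_update(u_old, g_current, g_target, temperature=300.0)
print(u_new)
```

Where the CG model overpopulates a distance relative to the target, the potential is raised there; where it underpopulates, the potential is lowered; where the two RDFs already agree, the potential is left unchanged.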
Top-down/Thermodynamics-based: uses analytical interaction potentials parameterized in an iterative procedure that aims to reproduce thermodynamic properties from experimental data, which typically results in a more easily transferable force field.3 The Martini force field is a commonly used example of the top-down method: its bonded interactions are specified by potential energy functions, and it is distinctive in its description of non-bonded interactions.1 The intermolecular structure and thermodynamics of the target system depend on the non-bonded interactions, which are modeled using the Lennard-Jones potential with parameters chosen to reproduce specific experimental measurements, e.g. bulk density, free energy of hydration and vapor pressure.12
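For reference, the 12-6 Lennard-Jones form used for such non-bonded interactions can be written as a short Python function. The values of epsilon and sigma below are placeholders chosen for illustration, not parameters quoted from any force field; in a top-down scheme they are the quantities that would be tuned against experimental targets.

```python
import numpy as np

def lennard_jones(r, epsilon, sigma):
    """12-6 Lennard-Jones energy: 4*epsilon*((sigma/r)**12 - (sigma/r)**6)."""
    sr6 = (sigma / np.asarray(r, dtype=float)) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)

# Placeholder parameters (assumed units: kJ/mol and nm).
epsilon, sigma = 3.5, 0.47

# The minimum sits at r = 2**(1/6) * sigma, where the energy equals -epsilon.
r_min = 2.0 ** (1.0 / 6.0) * sigma
u_min = float(lennard_jones(r_min, epsilon, sigma))
u_at_sigma = float(lennard_jones(sigma, epsilon, sigma))
print(r_min, u_min, u_at_sigma)
```

Epsilon sets the depth of the attractive well and sigma sets the effective bead size, which is why these two parameters control the thermodynamic properties the model can reproduce.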
Most CG force fields rely on a combination of the two methods to optimize accuracy and transferability. Though not exhaustive, the table below from Bradley et al. provides an excellent summary of force field construction types and their applicability.1
Table 1: Common force fields used in coarse grain simulations
Key | Target Data
Structure matching, energy matching, Boltzmann inversion, reverse Monte Carlo | Density distributions, interfacial tension, area per lipid, bending modulus, area compressibility modulus, lipid order parameters
Bottom up force matching, variational optimization, hybrid analytic-systematic CG, screened electrostatics | Atomistic site to site radial distribution functions, density distributions, bending and area compressibility moduli, lipid diffusion rates
Top down energy matching, potential of mean force between phases, bilayer stress profile, free energy of lipid desorption or flip-flop, short range electrostatics | Free energy of hydration and vaporization, partitioning free energies, surface and interfacial tension, density distributions, bending modulus, area per lipid
HOW TO SAMPLE A COARSE GRAINED MODEL?
Force fields transform the biological system’s flat conformational space into an extremely rugged surface. Sampling schemes allow the surface to be traversed in search of desired system conformations. The two schemes most commonly used in CG modeling are molecular dynamics (MD) and Monte Carlo (MC). MD schemes generate new conformations using classical mechanics, as described by Newton’s equations of motion. This produces a trajectory that describes the time evolution of the system and allows time dependent observables to be assessed. MC schemes traverse the conformational space in a completely time independent manner, providing a random sample of conformations drawn from the desired distribution, typically the Boltzmann distribution (the linked page Monte Carlo for Biomembranes goes into more depth on this sampling method). This method uses unphysical jumps in space to allow escape from local energy minima and is an entirely statistics based algorithm.6
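The contrast between the two schemes can be sketched on a one-dimensional toy problem. The Python snippet below assumes a double-well potential U(x) = (x^2 - 1)^2 in reduced units, a velocity Verlet integrator for MD and a Metropolis acceptance rule for MC; the potential, step sizes and function names are all illustrative choices rather than anything prescribed by a particular CG package.

```python
import math
import random

def potential(x):
    """Toy double-well energy surface with minima at x = -1 and x = +1."""
    return (x * x - 1.0) ** 2

def force(x):
    """Negative derivative of the potential."""
    return -4.0 * x * (x * x - 1.0)

def velocity_verlet(x, v, dt, n_steps, mass=1.0):
    """MD: integrate Newton's equations, returning a time-ordered trajectory."""
    trajectory = [x]
    f = force(x)
    for _ in range(n_steps):
        v += 0.5 * dt * f / mass   # half-step velocity update
        x += dt * v                # full-step position update
        f = force(x)
        v += 0.5 * dt * f / mass   # second half-step velocity update
        trajectory.append(x)
    return trajectory, v

def metropolis(x, kT, n_steps, max_step=0.5, seed=0):
    """MC: random trial moves accepted with probability min(1, exp(-dU/kT))."""
    rng = random.Random(seed)
    samples = [x]
    for _ in range(n_steps):
        trial = x + rng.uniform(-max_step, max_step)
        delta = potential(trial) - potential(x)
        if delta <= 0.0 or rng.random() < math.exp(-delta / kT):
            x = trial
        samples.append(x)
    return samples

md_traj, v_end = velocity_verlet(x=0.5, v=0.0, dt=0.01, n_steps=1000)
mc_samples = metropolis(x=1.0, kT=0.5, n_steps=5000)

e_start = potential(0.5)
e_end = potential(md_traj[-1]) + 0.5 * v_end ** 2
print(e_start, e_end)
```

The MD trajectory conserves total energy and, because its energy lies below the barrier at x = 0, stays in the well where it started. The MC chain has no notion of time, and its random moves can carry it over the barrier.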
Coarse graining for membranes:
WHY COARSE GRAIN MEMBRANES?
While a single lipid molecule is relatively small, only about 100 atoms, the bulk properties of a lipid bilayer depend on the collective behavior of hundreds to thousands of individual lipids, which renders atomistic models extremely computationally expensive. Real biological membranes are heterogeneous and have multiscale properties, with the typical membrane being on the macroscopic scale (microns and microseconds), which is currently inaccessible to atomic resolution molecular dynamics simulations.8
AND WHAT ABOUT WATER?
Given that water is an essential component of biological systems and typically comprises more than 80% of the system’s total particles, it is necessary to reproduce both its structure and its phase transitions to create a relevant and accurate coarse-grained model.3,11 The dynamic effects of water on biological systems arise from its hydrogen bonding, polarity and geometry, which cannot be fully captured using the CG method; the model must therefore be parameterized to reproduce the desired macroscopic property.18 CG methods employ several different mapping and parameterization techniques that produce the following three categories of models: (1) implicit, (2) explicit and (3) polarizable.1
(1) The two typical strategies for accounting for water implicitly are adjusting the non-bonded interactions between non-water molecules, or adding a force field term that accounts for the hydrophobic effect using Debye-Hückel theory or generalized Born models, which model the solute as a set of spheres whose internal dielectric constant differs from that of the external solvent.3
(2) The explicit representation of a water molecule as a van der Waals particle is parameterized using either structure based or thermodynamics based methods. Structure based methods typically use a 1:1 molecule:bead ratio and are constructed from atomistic water simulations using IBI or FM. Thermodynamics based water models are parameterized by fitting to macroscopic experimental results (such as density, diffusion rate and surface tension3) and use various mapping ratios.3,18
(3) Explicit water models lack charges, which prevents them from screening electrostatics; polarizable models remedy this. They incorporate charged pseudo particles controlled by an angle potential, which mimics water’s electrostatic screening due to its polarizability.17 The molecule:bead mapping varies, for instance, from a 4:1 rigid V-shape, to a 1:1 induced point dipole, to the 11:4 tetrahedrally coordinated water model.3
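To illustrate the implicit category (1), the Python sketch below evaluates a Debye-Hückel screened Coulomb interaction between two charged beads, in which the solvent appears only through a relative permittivity and a screening length. The numerical constants are standard approximations for water near 25 °C with a 1:1 salt; the function names and the choice of kJ/mol, nm and elementary-charge units are assumptions made for this example.

```python
import math

COULOMB_CONSTANT = 138.935458  # 1/(4*pi*eps0) in kJ mol^-1 nm e^-2

def debye_length_nm(ionic_strength_molar):
    """Approximate Debye length for a 1:1 electrolyte in water near 25 C."""
    return 0.304 / math.sqrt(ionic_strength_molar)

def screened_coulomb(r_nm, q1, q2, ionic_strength_molar, relative_permittivity=78.5):
    """Debye-Hueckel pair energy: Coulomb term damped by exp(-r / Debye length)."""
    lam = debye_length_nm(ionic_strength_molar)
    bare = COULOMB_CONSTANT * q1 * q2 / (relative_permittivity * r_nm)
    return bare * math.exp(-r_nm / lam)

# Two oppositely charged beads 1 nm apart at 0.15 M ionic strength.
lam = debye_length_nm(0.15)
u_screened = screened_coulomb(1.0, +1.0, -1.0, 0.15)
u_bare = COULOMB_CONSTANT * (+1.0) * (-1.0) / (78.5 * 1.0)
print(lam, u_screened, u_bare)
```

No water particles are simulated at all: the dielectric constant weakens the interaction uniformly, and the exponential factor cuts it off beyond roughly one Debye length.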
WHAT COARSE GRAINED MEMBRANE MODELS ARE COMMONLY USED?
In coarse grained models, observables are often defined by analogy to their atomistic counterparts, which gives rise to what is known as the representability problem.20 Practically, this means that the question being asked dictates which CG model is appropriate. To choose the right model for a given problem, it is necessary to understand the applicability and limitations of the model.3 Numerous coarse grained lipid models exist; successful, commonly used ones include3: (1) Klein9 (2) Martini12 (3) ELBA7 (4) Voth8 (5) Smit’s DPD13
(1) A pioneering model that constructs a CG force field from all atom simulation data for a dimyristoylphosphatidylcholine (DMPC) membrane. While structurally accurate, it has limited transferability, which has since been improved by incorporating both top down and bottom up parameterization. The original model linked each of the 13 CG beads using harmonic bond and quartic angle potentials fitted to the atomistic simulation, while non-bonded interactions were derived from radial distribution functions and refined with IBI. Recent applications have extended the model to different molecular structures and environmental conditions.16
(2) This model was originally developed for lipids and is purposefully simple, using minimal parameters and a few standard interaction potentials in an effort to be versatile while maintaining accuracy. It uses a 4:1 atom:bead mapping with a fixed bead size and 18 different polarity/charge group types. Bonds and angles are described by harmonic potentials, and van der Waals and electrostatic interactions are described by shifted potentials. The model parameters are tuned to match structural and thermodynamic data from experimental and simulated systems. Applications are vast and range from raft domain formation to membrane tethers.
(3) An electrostatics based CG lipid force field that focuses on lipid-water interactions and is designed for multiscale applicability. Each water molecule is represented by a Lennard-Jones soft sphere with a point dipole, and the CG lipid beads incorporate electrostatics as explicit point charges or point dipoles with a relative dielectric constant of 1. The model was parameterized for bulk, liquid phase water and is particularly applicable to modeling lipid phase behavior and the movement of drugs and other molecules across lipid bilayers.
(4) Voth and colleagues have developed multiple CG lipid models that, like the Klein model, use a force field derived from all atom simulations. Termed a multiscale CG (MS-CG) force field, it differs in that it contains analytical and systematic components and relies on force matching rather than average structural properties. Their methods employ a range of atom mappings, from an aggressive CG in which a single bead represents an entire lipid, used to simulate very large systems,8 to the typical 13-15 CG beads per lipid, with model dependent treatments of electrostatic interactions and water representations that are either explicit or implicit. Their models have been used for membranes of several lipids, such as DMPC, DOPC, DOPE and cholesterol, and have been applied to a variety of biological phenomena, including liposome (~200 nm) self assembly and the phase behavior of binary mixed membranes.15
(5) This lipid model replaces the Lennard-Jones potential with the soft repulsive forces used in dissipative particle dynamics (DPD) to study the DMPC bilayer. The DPD repulsion parameter was determined from prior related DPD studies; the atom mapping consists of a lipid head group of 3 hydrophilic beads and 10 remaining beads forming 2 hydrophobic tails; and the model has been shown to describe multicomponent bilayer behavior with accuracy and speed.
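The soft repulsion mentioned for model (5) can be sketched in Python using the conventional linear DPD conservative force, which stays finite even at zero separation, unlike the diverging Lennard-Jones core. The repulsion parameter a = 25 and the cutoff of 1 in reduced units are illustrative values only, and the function names are hypothetical; the dissipative and random forces of a full DPD scheme are omitted.

```python
def dpd_conservative_force(r, a, r_cut=1.0):
    """Soft repulsive DPD force magnitude: a*(1 - r/r_cut) inside the cutoff, else 0."""
    if r >= r_cut:
        return 0.0
    return a * (1.0 - r / r_cut)

def dpd_potential(r, a, r_cut=1.0):
    """Energy whose negative derivative is the force above: 0.5*a*r_cut*(1 - r/r_cut)**2."""
    if r >= r_cut:
        return 0.0
    return 0.5 * a * r_cut * (1.0 - r / r_cut) ** 2

a = 25.0  # illustrative repulsion parameter in reduced DPD units
for r in (0.0, 0.5, 1.0):
    print(r, dpd_conservative_force(r, a), dpd_potential(r, a))
```

Because the force is bounded by a at full overlap, beads can pass through one another, which is what allows the large time steps that make DPD fast.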
1. Bradley, Ryan, and Ravi Radhakrishnan. “Coarse-Grained Models for Protein-Cell Membrane Interactions.” Polymers vol. 5,3 (2013): 890-936.
2. Schindler, Tanja, Dietmar Kröner, and Martin O. Steinhauser. "On The Dynamics Of Molecular Self-Assembly And The Structural Analysis Of Bilayer Membranes Using Coarse-Grained Molecular Dynamics Simulations." Biochimica et Biophysica Acta (BBA) - Biomembranes 1858.9 (2016): 1955-1963.
3. Ingólfsson, Helgi I. et al. "The Power Of Coarse Graining In Biomolecular Simulations." Wiley Interdisciplinary Reviews: Computational Molecular Science 4.3 (2013): 225-248.
4. Izvekov, Sergei, and Gregory A. Voth. "A Multiscale Coarse-Graining Method For Biomolecular Systems." The Journal of Physical Chemistry B 109.7 (2005): 2469-2473.
5. Zhang, Zhiyong et al. "A Systematic Methodology For Defining Coarse-Grained Sites In Large Biomolecules." Biophysical Journal 95.11 (2008): 5073-5083.
6. Kmiecik, Sebastian et al. "Coarse-Grained Protein Models And Their Applications." Chemical Reviews 116.14 (2016): 7898-7936.
7. Orsi, Mario, and Jonathan W. Essex. "The ELBA Force Field For Coarse-Grain Modeling Of Lipid Membranes." PLoS ONE 6.12 (2011): e28637.
8. Ayton, Gary S., and Gregory A. Voth. "Hybrid Coarse-Graining Approach For Lipid Bilayers At Large Length And Time Scales." The Journal of Physical Chemistry B 113.13 (2009): 4413-4424.
9. Shelley, John C. et al. "A Coarse Grain Model For Phospholipid Simulations." The Journal of Physical Chemistry B 105.19 (2001): 4464-4470.
10. Kamerlin, S. L. et al. “Coarse-Grained (Multiscale) Simulations in Studies of Biophysical and Chemical Systems.” Annu. Rev. Phys. Chem. (2011): 62:41-64.
11. Noid, W. G. "Perspective: Coarse-Grained Models For Biomolecular Systems." The Journal of Chemical Physics 139.9 (2013): 090901.
12. Marrink S.J. et al. “The MARTINI force field: coarse grained model for biomolecular simulations.” J Phys Chem B. 2007; 111:7812–7824.
13. Kranenburg M, Nicolas J-P, Smit B. “Comparison of mesoscopic phospholipid-water models.” Phys Chem Chem Phys. 2004;6:4142–4151.
14. J. Baschnagel, et al. “Monte Carlo Simulation of Polymers: Coarse-Grained Models” Computational Soft Matter: From Synthetic Polymers to Proteins. (2004): 23; 83-140.
15. Lu, Lanyuan, and Gregory A. Voth. "Systematic Coarse-Graining Of A Multicomponent Lipid Bilayer." The Journal of Physical Chemistry B 113.5 (2009): 1501-1510.
16. Shinoda W, DeVane R, Klein ML. “Zwitterionic lipid assemblies: molecular dynamics studies of monolayers, bilayers, and vesicles using a new coarse grain force field.” J Phys Chem B. 2010;114:6836–6849.
17. Yesylevskyy, Semen O et al. “Polarizable water model for the coarse-grained MARTINI force field.” PLoS computational biology vol. 6,6 e1000810
18. Hadley, Kevin R, and Clare McCabe. “Coarse-Grained Molecular Models of Water: A Review.” Molecular simulation vol. 38,8-9 (2012): 671-681.
19. Arkhipov, Anton et al. “Four-scale description of membrane sculpting by BAR domains.” Biophysical journal vol. 95,6 (2008): 2806-21.
20. Wagner, Jacob W. et al. "On The Representability Problem And The Physical Meaning Of Coarse-Grained Models." The Journal of Chemical Physics 145.4 (2016): 044108.
21. Freddolino PL, Arkhipov A, Shih AY, Yin Y, Chen Z, et al. (2008) Application of residue-based and shape-based coarse graining to biomolecular simulations. In: Voth GA, editor, Coarse-Graining of Condensed Phase and Biomolecular Systems, Chapman and Hall/CRC Press, Taylor and Francis Group, chapter 20. pp. 299–315.