MD TUTORIAL
MD TUTORIAL
MD TUTORIAL
Setting up a simulation
(DL_POLY, NAMD, LAMMPS)
and analyzing the results (nMoldyn 4)
DL_POLY (http://www.stfc.ac.uk/CSE/randd/ccg/software/DL_POLY/25526.aspx):
This is a free parallel code that allows performing many different kind of simulations with a
large set of potential functions, so the user has a large freedom to do what he really wants. It
contains a GUI, but it is not particularly well developed, and in most cases the user will need to
write the input files manually (although some utilities exist to help with this).
The input file containing the potential parameters is quite clear and the manual is very well
written, so with some small effort a novice user should be able to write the needed input files to
simulate a relatively small molecule. However preparing the input files for a complex system
(e.g. a protein) may be quite hard.
At the ILL, another drawback is that the version compiled in the cluster is much less efficiently
parallelized than for NAMD or LAMMPS.
NAMD (http://www.ks.uiuc.edu/Research/namd/):
This is a parallel code designed to simulate large biomolecular systems. It has been developed by
the Theoretical and Computational Biophysics Group of the University of Illinois, so it is very
well integrated with the graphical package VMD, written by the same group.
Some very good tutorials are available, so using those together with the distributed force fields
for biological simulations (e.g. Charmm) and VMD, it is relatively easy to set-up a simulation of
a complex biophysical system, such as a protein or a membrane.
On the negative side, being a “bio” oriented software, the only non-bonded potential available is
the 12-6 Lennard-Jones potential.
Note that the program is free, but it requires registration. Therefore if you plan to continue
using NAMD, please register in the NAMD website.
LAMMPS (http://lammps.sandia.gov/):
This parallel code is very popular in materials science. It has many different options, so apart
from standard MD simulations it allows also to simulate coarse-grained or mesoscopic systems
and even do some very different tasks, such as non-equilibrium MD, Monte Carlo or QM/MM
coupling.
It contains an impressive set of different functional forms for the potential, integrators, and
options to apply different thermostats, barostats or constraints. Therefore it is possibly the code
offering the larger set of choices and more freedom to do “special” calculations.
While all these options are explained in a very complete manual (1254 pages), it is not always
easy to find the right options. This code is not associated to a given GUI or molecular builder, so
preparing the input files is not easy and requires possibly the use of other software or some
familiarity to write scripts or programs to help with this.
Note also that the order of the atoms is not preserved in the output file containing the final
configuration, which can be a little bit confusing.
In the following, you will find some examples showing how to prepare the input files and run the
simulations in some simple cases. There is not a unique procedure, so several possibilities using
different pieces of software or hand-written scripts will be shown. They can serve as starting
points to learn how to proceed when you want to perform a new simulation, but be aware that
you will probably need to spend some time working and modifying those programs or scripts in
order to simulate a more complex system.
In order to have a common environment, during this tutorial we will work with a virtual machine
(VM) where all the needed software has been installed. The OS of the VM is Linux Ubuntu.
Normally you need only a few input files (e.g. the PDB files containing the initial structures) to
follow this tutorial, which are given in the ~/mdanse/Files directory of the VM. However in
order to save you time in typing all the scripts or the different control files explained in the
tutorial, they have also been included in the ~/mdanse/Examples directory. In the same directory
you will find also all the examples discussed in the tutorial, including the output files. Thus in
case of problems or doubt, you can refer to these files.
2. Input files and helpful software
Each MD program requires several input files providing the initial atomic positions and box size,
the force field definition and parameters, and the commands controlling the simulation to be
performed (e.g. number of steps, temperature, output, etc.). Here we present a extremely brief
summary of the input files that you will need for using DL_POLY, NAMD or LAMMPS, but
you should read the manual of the program that you want to use in order to understand the
contents of each file and how they are organized.
DL_POLY:
Requires three files that must be compulsory named CONFIG, CONTROL, and FIELD.
CONFIG: Contains the dimensions of the unit cell, a key to indicate the type of boundary
conditions applied, and the coordinates of each atom (and eventually also velocities and forces).
FIELD: Contains all the force field information defining the nature of the molecular forces.
CONTROL: Contains all the directives controlling the simulation.
NAMD:
File.pdb: Contains the initial coordinates of all the atoms in the form of a PDB file.
File.inp: The parameter file containing the force field in either X-PLOR or CHARMM format.
File.psf: The topology file describing completely the system (a list of all bonds, angles,
dihedrals, etc. in the system).
File.conf: Contains all the options and values to be used in the simulation.
LAMMPS:
File.data: Contains the initial coordinates of the atoms and …
File.in: Simulation instructions and force field …
As said before, the big advantage of using Materials Studio is that it provides a single and unified
modeling environment that makes possible to build your system from scratch, optimize it,
simulate it and then doing some analysis on the trajectory and writing a table or a figure. In the
other cases (DL_POLY, NAMD and LAMMPS), the program is just an MD motor, so you will
need a combination of different programs to do all those operations. Exploring the web, you may
find different tools to build a system, convert between formats, create automatically (or help to
create) input files for one (or several) MD code(s). It can take some time to learn how to use
them and how to create a workflow allowing to perform the simulation and analysis that you
need. But once that you have done this a first time for a system of your interest it will be
relatively easy to apply the same workflow to simulate similar systems.
This is a list of programs and scripts that are mentioned in different parts of this tutorial. They
are all freely available, so you can install them in your own computer and play with them to learn
how to use them and exploit all their capacities:
- PDB (http://www.rcsb.org/pdb/static.do?p=file_formats/pdb/index.html) file format: This
is not a program, but a file format standing for Protein Data Bank. As the name indicates, the
goal of this format is to provide a convenient representation for macromolecular structures.
However it has become one of the most used format for all kind of systems, so this will be the
initial input that we will use in all our examples. Almost all (if not all) of the molecular builders
that you may use to build your system will allow exporting in PDB format.
- lammps2pdb.pl (http://redmine.scorec.rpi.edu/anonsvn/lammps-cuda/tools/ch2lmp/) is a
perl script to write a PDB file from a LAMMPS data file.
- Vega ZZ (http://www.vegazz.net/): We have not used this program in this tutorial, but Vega
ZZ is a molecular modeling suite that has many interesting features, as it provides good
visualization tools, a molecular builder, supports many different formats and has several
calculation tools, including a graphic interface to NAMD. The program is free for non-profit
academic users, but it requires registration.
3. Analysis
Each program has its own set of utilities allowing doing some basic analysis on the output files.
Some examples will be shown later. Further programs can be found in the web. However the
most useful MD analysis code (for a neutron scatterer) is possibly nMoldyn, so most of the
analysis presented here are done using the last version of this program, nMoldyn 4
(https://forge.epn-campus.eu/projects/nmoldyn/). Please note that this is still a beta version that
has just been released, so it has not been yet extensively tested. As the main developer is one of
us (E. Pellegrini), please report to him any bug or suggestions for improvement.
4. Examples
Here we present some examples showing how to set up different systems to simulate them using
either DL_POLY, NAMD or LAMMPS. They are all relatively simple systems, so the input files
are easy enough to write and understand. But at the same time, they are increasingly complex
(from a monatomic liquid to a rigid molecular liquid and then to flexible molecular liquids of
larger sizes), so hopefully they will provide a good basis to simulate the system you are
interested in.
In order to allow you to run the simulations in your own laptop, most of the examples shown
here are quite small (for today standards). However, for some of them we will show also how to
build a much larger system. If you work at the ILL, you should build and test your simulation in
your local computer, but then you can use our cluster (master) to perform long simulations. In
the cluster web page http://master.ill.fr you can find some help documents about how to use the
cluster and in the next section you will find some useful scripts and benchmarks concerning the
three MD codes installed in the cluster.
The examples shown here include a monatomic system (argon), a small fully rigid molecule
(water), and a slightly larger fully flexible molecule (ethanol). Some references (both
experimental and theoretical) are given, so that you can compare your results with the available
literature. Note also that even if those systems are quite well known, they still remain attractive
from a scientific point of view. For example, binary mixtures of LJ particles have been
extensively studied in the mid-90‟s – early 2000s as a good model to test the predictions of the
Mode Coupling Theory for supercooled liquids [W. Kob and H. C. Andersen, Phys. Rev. Lett. ,
73, 1376 (1994)]. And the water model presented here is quite recent [J.L. Abascal and C. Vega,
J. Chem. Phys. 123, 234505 (2005) ]and, for example, has been employed in the highly
controversial discussion about the possible existence of a liquid-liquid phase transition in water
(e.g. see D. T. Limmer and D. Chandler, J. Chem. Phys. 138, 214504 (2013) and references
therein). If you read those references, you can see that quite often the most challenging aspect is
not to perform the simulation, but to find new ways of looking and analyzing all the information
conveyed in the trajectory generated by the simulation.
In all the cases shown here, the starting point is simply a PDB file. As said above, in most cases
it should be relatively easy to find or prepare a PDB file containing the initial coordinates of all
the atoms of the system that you want to study. But in any case, we will also show for some
particular cases how to prepare the needed files from scratch.
Caution! All the commands and instructions given in this tutorial correspond to our VM
installation. If you try to follow this tutorial in your own system, you will have to take care of
installing all the programs and scripts mentioned and, naturally, the commands to give may vary
slightly depending on your particular installation .
4.1 A Lennard-Jones system (argon)
This is the simplest system that we can study. We just have N atoms interacting between them by
means of a LJ pair potential. You can find many papers in the literature concerning simulations
of LJ particles, but here we will use as reference the paper by Levesque, Verlet and Kürkijarvi
(Phys. Rev. A 7, 1690 (1973). They studied a LJ liquid close to the triple point using 864 atoms
in the NVE ensemble. The state point that they simulated is * = 0.8442, T* = 0.722 (in reduced
units, see e.g. http://www.matdl.org/matdlwiki/index.php/softmatter:Reduced_units for an
explanation). The results of the simulation can be compared to experimental data for liquid Ar by
using the following LJ parameters: = 3.405 Å, = 119.8kB. Using those values and the molar
mass of argon, M = 39.948, we find that the corresponding state point for liquid Ar is n =
0.02138 Å 3 ( = 1.4185 g/cm3), T = 86.5 K. As there are no fast intramolecular degrees of
motion, they could employ a large time step to integrate the equations of motion, t 10 fs.
The starting point is the PDB file “argon_864.pdb”, containing the positions of 864 atoms in an
ordered structure (see Rasmol figure below):
4.1.1 DL_POLY
CONFIG: We will use OpenBabel to convert the PDB file to the format required by DL_POLY.
Create a working directory (e.g. TutoMDANSE/Argon/DLPOLY/), copy the pdb file there and
type:
>> obabel –i pdb argon_864.pdb –o CONFIG –O argon_864.cfg
You can see that the conversion fails and that there is a series of messages like this one:
==============================
*** Open Babel Warning in OpenBabel::parseAtomRecord
WARNING: Problems reading a PDB file
Problems reading a HETATM or ATOM record.
According to the PDB specification,
columns 77-78 should contain the element symbol of an atom.
but OpenBabel found ' ' (atom 1)
==============================
The file “argon_864_v2.pdb” contains the atom element in columns 77-78, as requested by
OpenBabel, so try now:
>> obabel –i pdb argon_864_v2.pdb –o CONFIG argon_864.cfg
This creates the file “argon_864.cfg”, which has the format required by DL_POLY. But we will
have to edit the file and modify the header to add some missing information.
The first line is just a comment or title, so you can write the information you want. E.g. “Argon
test, 864 atoms”.
The second line contains two important keys that indicate the information included in the file
(coordinates / velocities / forces) and the periodic boundaries to use (see DL_POLY manual for
details). The PDB file contains only the atomic coordinates, so the first key is 0. But our file did
not contain any information on the size of the box, so the second key has also been set to 0,
which corresponds to non periodic boundaries. Change this value to 1, to tell the program that
we want to use cubic boundary conditions.
The next three records give the x, y and z components of the vectors a, b, and c defining the
simulation box. This information appears can appear in the PDB file (for example, in our
example you can see the definition of the box sizethe CRYST1 line), but quite often is not given
and in any case OpenBabel does not interpret it. Thus we have to add it to our CONFIG file. The
ordered structure used as a starting point corresponds to a cube of side L = 34.739 Å, so we
could use this value. However as we want to simulate the system at a different density, we can
already give the box side that we need, i. e. L = 34.31415 Å. So copy the following lines in the
lines 3-5 of the CONFIG file:
34.3141500000 0.0000000000 0.0000000000
0.0000000000 34.3141500000 0.0000000000
0.0000000000 0.0000000000 34.3141500000
Caution! Modifying in this way the box limits can be dangerous. In the present case, the change
is small and we have a monatomic system, so we can do this safely. But in most cases changing
L can result in very close contacts between pairs of atoms and wrong intramolecular distances
because of the application of the periodic boundary conditions with a new box size.
Caution! The format of the CONFIG file is not free, and all the real numbers have the format
„f20.0‟, indicating that they are written using up to 20 spaces. In a similar way, all the integers
are written with the format „i10‟. So be careful to respect this format when adding those lines.
This file should be fine to use with DL_POLY, but unfortunately there is a bug in OpenBabel
that puts an extra space between the X and Y and the Y and Z coordinates. As DL_POLY
requires a fixed format for CONFIG, if you try running the program with this file you will get
immediately the following error message:
At line 1195 of file setup_module.f (Unit 10 "CONFIG")
Traceback: not available, compile with -ftrace=frame or -ftrace=full
Fortran runtime error: Bad value during floating point read
import sys
try:
output = str(sys.argv[2])
except:
output = 'CONFIG'
filename = str(sys.argv[1])
f = open(filename, 'r')
g = open(output, 'w')
lines = f.readlines()
FIELD: Here we have to give the definition of our system and all the necessary force field
parameters. In this case we have only one type of atom and they interact through a simple LJ
potential, so the file is very simple and can be easily written by hand. Use your preferred editor,
open a new file and copy the following lines:
Lennard-Jones system: 864 Ar atoms
units kJ
molecules 1
Argon
nummols 864
atoms 1
Ar 39.948 0.0
finish
vdw 1
Ar Ar lj 0.996073 3.405
close
As you can see, the file is very simple and hopefully almost self-explanatory. In any case, you
can (should) check the manual to see the exact meaning of each line and keyword and which are
all the available options.
Here the first line is just a title or comment, and the second tells to the program which are the
energy units that we want to employ in our simulation (both for the input and the output). In our
case, we are going to work in kJ/mol (other possibilities in DL_POLY are kcal/mol, eV, K, and
internal units).
In the following lines we define our system as containing a single molecular type named
“Argon”, that we have 864 of such molecules and that each molecule is constituted of a single
atom. The next line gives the atom name, and its mass and charge and the directive “finish”
indicate that we have completed the definition of our molecule. Here the atom names must be the
same as the atom names employed in the CONFIG file.
Finally we have to indicate the number of non-bonded interactions, which in our case is just one
(between 2 Ar atoms), and define them. There are many different pair potential functions
available in DL_POLY, and for the LJ potential we can choose between the standard or the 12-6
representations. We can choose the standard one (key “lj”) and then we have to give the
corresponding values for the parameters (in the selected energy units, i.e. kJ/mol in this case)
and (in Å).
CONTROL: Here we are going to give the simulation conditions (T, P, ensemble, etc.) and all the
directives controlling the simulation (number of steps, time-step, cut-off, …). In this example we
just need a few instructions, so open a new file in your editor and copy the following lines:
Simulation of a LJ system: 864 Ar atoms
integrator leapfrog verlet
temperature 86.5 K
steps 1000
timestep 0.005 ps
ensemble nve
no elec
cutoff 12.0
delr 2.0
print 1
stack 1
stats 1
# total time allowed for the job: 10000 secs = 2.7 hours
job time 10000
# time to write and close all data files at end of processing
close time 1000
finish
We will not describe all the commands here (again you should read the manual to see the
meaning of each command and all the available possibilities), but it is easy to see that we are
going to simulate our system at 86.5 K for 1000 steps, using a time step t = 5 fs (0.005 ps) in
the microcanonical ensemble (constant NVE) and applying a cut-off Rc = 12 Å for the pair
interactions. The directive “delr” gives the additional width employed to create the neighbor list,
so for each atom the program has a list of all the other atoms that are situated at a distance < R c +
r and uses this list (and not the full system) to determine which pair of atoms are separated by a
distance r < Rc and therefore contribute to the computation of the energy and the forces. This is
just a “trick” to accelerate the simulation. Naturally those lists are updated whenever an atom has
moved over a distance > r.
So now we are ready to do our simulation. Copy the three files that we have just created
(CONFIG, FIELD, CONTROL) into a new directory (e.g. Run1), move to that directory and
type: >> DLPOLY.X
After a few seconds, the run will be finished and you will have four new output files:
REVIVE is a binary file that contains all the information needed to continue the simulation.
REVCON contains the last configuration in the DL_POLY format. As you can check, this final
file contains not only the atomic positions, but also the velocities and forces.
STATIS contains a series blocks giving the instantaneous values of several quantities
(temperature, pressure, volume, energies, etc.) They are saved every N simulation steps
(determined by the print directive in CONTROL).
OUTPUT is a user-readable file summarizing all the important information concerning the
simulation: initial conditions, warnings, thermodynamic values along the simulation, final
averages, etc. It is a good practice to always read this file to check quickly that everything has
proceeded as expected and that there are no relevant warning messages or strange values.
In this first test, we did not save the trajectory. This can be done using the trajectory directive in
CONTROL and giving the desired values for the starting time, saving frequency and information
to be stored (positions / velocities / forces). In this case, we would also have in the output a file
named HISTORY and containing the trajectory of our system.
It is always advisable to start checking that the last configuration looks right and that nothing
strange has happened during the simulation. In order to do this we can convert the REVCON file
to the PDB format using OpenBabel:
>> obabel –i CONFIG REVCON –o pdb -O REVCON.pdb
And now we can use for example Rasmol to visualize it: >> rasmol REVCON.pdb
Caution! When you load the file, the Rasmol window will remain black. This is because the
default displaying in Rasmol is “Wireframe”. Go to the Display menu (right click on the screen
to make appear the menu) and select “Spacefill” or “Ball & Stick” to see the argon atoms.
You can see that the atoms have moved slightly from the initial positions, but the system is still
too ordered and certainly not a liquid, as we could have expected. We can try to make a larger
simulation, but it is not certain that this will be enough to “melt” our crystal. Why?
Check the OUTPUT file and try to answer this question before continuing reading …
If you look to the final averages written in the OUTPUT file, you will see that the average
temperature is ~51 K. This is much lower than the desired temperature for our simulation, 86.5
K. Can you think of a possible reason?
The problem is that we are working in the NVE ensemble, so we do not have any way of
controlling the temperature. Initially the program will assign random velocities to each atom,
according to a Maxwellian distribution corresponding to T = 86.5 K, but then kinetic and
potential energy can be interchanged, so we are not assured of maintaining this temperature.
At this point is useful to check how the thermodynamic variables have evolved during the
simulation. The required values are stored in the file STATIS. You can try to write a program or
script to extract one value from each block (e.g. the temperature) and plot it as a function of the
simulation time. Or as an alternative you can use the Java GUI provided with DL_POLY. Type:
>> java –jar PATH_TO_DL_POLY_DIRECTORY/java/GUI.jar
and a new window should appear.
In the DL_POLY GUI select “Analysis Statistics”. Then choose a property and click on Run.
For example, here we show the evolution of T during the 5 ps of the simulation. As we expected
from the information in the OUTPUT file, the system is much colder that what we initially
wanted:
The internal energy exhibits the opposite behavior:
And the variation of the total energy is much smaller, but nevertheless is too high, as in the NVE
ensemble the total energy should be conserved:
So we need first to try to equilibrate our system maintaining the temperature at 86.5 K. Create
another directory (e.g. Run2) and copy the files CONFIG, CONTROL, and FIELD to it. Edit the
CONTROL file and increase the number of steps to 5000 and change the ensemble directive to
“ensemble nvt ber 0.1”. In this way we will apply a thermostat (in this case Berendsen‟s
thermostat with a relaxation time of 0.1 ps) to our system in order to keep the temperature close
to the desired value. Now run again the simulation and check the results.
You can verify that the average temperature now is close to 86 K and that the final configuration
looks much more disordered.
We can now use the final configuration obtained in this NVT run as our next starting point.
Create another directory (e.g. Run3), copy the last REVCON file there, and rename it as
CONFIG. Copy the file FIELD as well. And edit a new CONTROL file with the following
content:
Simulation of a LJ system: 864 Ar atoms
integrator leapfrog verlet
temperature 86.5 K
steps 100000
equilibrate 50000
scale 10
timestep 0.005 ps
ensemble nve
no elec
cutoff 12.0
delr 2.0
print 100
stack 100
stats 100
traj 50100 100 0
# total time allowed for the job: 10000 secs = 2.7 hours
job time 10000
# time to write and close all data files at end of processing
close time 1000
finish
This corresponds to a much larger run (105 steps, i.e. 500 ps), where the first half is just to
equilibrate the system. During this first half the atom velocities will be rescaled every 10 steps in
order to maintain the desired temperature. After the first 50000 steps corresponding to the
equilibration period, the velocity scaling will not be active, so the rest of the simulation will be
performed in the true NVE ensemble. We will write the thermodynamic values every 100 steps
and now we will also save the trajectory, starting on the step 50100 (once the equilibration is
over) and then saving only the atomic positions every 100 steps. Thus we will have a trajectory
HISTORY file containing 500 atomic configurations saved every 0.5 ps, for a total simulation
time of 250 ps. This run will last several minutes, so launch it in the background:
>> DLPOLY.X &
Once the job is finished, open the OUTPUT file and check the average thermodynamic values of
our simulation. They are written in the last part of the file, and as you can see, they have been
computed using only the 2nd half of the simulation. The average temperature now is very close to
the temperature that we wanted to simulate, T = 86.3 K 1.6 K (if we use the r.m.s. fluctuation
of the instantaneous T as an estimate of the error bar). The total energy of the system is well
conserved now and the fluctuations in Etot are sufficiently small, (Etot Etot )2 ½/Etot 3 10 5
As a rule of thumb, a value < 10 4 indicates a reasonable energy conservation. You can compare
this with the fluctuations of the potential and kinetic energy, which are not conserved quantities.
Use again the java GUI to plot the evolution of Etot, U, P and T during the simulation. Here all
the values are shown, including those corresponding to the equilibration period, so you can see
very clearly the drastic reductions in the fluctuations of the total energy once the equilibration
period is over. This is due to the application of the velocity scaling during the equilibration,
which makes that we are not in the NVE ensemble and therefore Etot is not a conserved quantity.
When the velocity scaling is not applied, the total energy is well conserved, as we deduced
before from the values for the average total energy and the r.m.s. fluctuations given in OUTPUT.
From the final averages given in the OUTPUT file we have:
T = 86.3 1.6 K T* = kBT/ = 0.720 0.013
3 * 3
= 0.021384 Å = = 0.8442
P = 0.061 0.038 kbar P* = P 3/ = 0.15 0.09 and P /( kBT) = 0.24 0.15
U = 5236 17 kJ/mol (this is for 1 mol of simulation boxes) U*/N = U/ /N = 6.08 0.02
We observe that those values are in very good agreement with the results obtained by Levesque
et al. It is possible to compute other thermodynamic properties, such as the specific heat, from
the fluctuations in the kinetic or potential energy. But remember that the exact expressions to use
will depend on the particular mechanical statistical ensemble that we have simulated (they will
not be the same for the microcanonical ensemble where the total energy is conserved or for the
canonical one). It is important to check these thermodynamic quantities and compare them with
experimental values in order to validate a simulation. But most of the quantities of interest to us
(notably S(Q), F(Q,t), and S(Q, ) for neutron scattering) depend on the atomic positions and
their evolution in time, so we need to analyze the trajectory generated by our MD motor. In
section 4.1.4 we will show how nMoldyn 4 can help with this.
4.1.2 NAMD
Configuration file: NAMD uses a PDB file to read the atomic coordinates, so we can use directly
our file “argon_864.pdb”.
Parameter file: This case is simple enough to allow us to write this file manually. Open your
favorite editor, copy the lines below and save it as “argon_par.inp”:
*>>>>>> LJ potential for Argon <<<<<<<<<<
! Monatomic system --> No intramolecular terms
NONBONDED
! V(LJ)=Eps,i,j[(Rmin,i,j/ri,j)**12-2(Rmin,i,j/ri,j)**6]
!
! epsilon: kcal/mole, Eps,i,j = sqrt(eps,i * eps,j)
! Rmin/2: A, Rmin,i,j = Rmin/2,i + Rmin/2,j
!
! atom ignored epsilon Rmin/2 ignored eps,1-4 Rmin/2,1-4
!
Ar 0.000000 -0.238067 1.91099
END
In NAMD we cannot choose between different potential functions. The LJ potential is written as:
and we must provide the well-depth of the potential in kcal/mol and Rmin/2 in Å. The well depth
must be given with a negative sign and that makes that the first term is ignored (see CHARMM
documentation, e.g. in http://www.charmm.org/documentation/c32b2/parmfile.html for more
details). In our example we have that Emin = = 0.238067 kcal/mol and Rmin = 21/6 = 3.82198 Å.
Topology file: We also need to write a topology file for a single argon “molecule”:
*>>>>>> CHARMM topology for Argon <<<<<<<<<<
27 1
MASS 1 Ar 39.948 Ar
RESI Ar 0.00
GROUP
ATOM Ar Ar 0.00
END
Here we need to start declaring the masses of all the atoms in the system (in our case only
argon). In the MASS line we need to give a unique integer for each different type of atom,
followed by the atom type name, its mass and the element. Then we define a residue (remember
that NAMD is a “bio”-oriented package) named „Ar‟ and neutral (total charge = 0.0) and in this
case composed of a single group and a single atom. In each ATOM entry we have to give its
name and its type name. Each atom in the residue must have a different name and the type names
must agree with the types declared in MASS.
We need now to create the PSF (protein structure file) required by NAMD. This file contains all
of the molecule-specific information needed to apply a given force field to a molecular system.
We can use VMD to generate it from the topology file. Launch VMD by typing >> vmd on a
terminal window, then open the Tk Console (from the extensions menu) and type the following
commands in the console:
% cd PATH_TO_DIRECTORY_CONTAINING_PDB_AND_TOPOLOGY_FILES
% package require psfgen
% resetpsf
% topology argon_top.inp
% segment SOL {pdb argon_864.pdb}
% coordpdb argon_864.pdb SOL
% writepsf argon_864.psf
This will generate the PSF that we need. Note that you can also write all this commands into a
text file and then run it in the VMD console typing: % source file_name
Configuration file: Now we only need to create the configuration file specifying what dynamics
options and values NAMD should use. Edit a file named “argon_864.conf” and copy the
following text:
Those commands will essentially perform the same simulation that we did before with
DL_POLY. The main difference is that NAMD applies a switching function to the non-bonded
potential to shift it smoothly to zero at the cut-off distance. This behavior is controlled by the
switching directive (on / off) and the switchdist value that tells NAMD the starting distance at
which the switching function must be applied.
Create a folder (e.g. Run1) and copy there the four files that we will need to run NAMD:
argon_864.pdb, argon_864.psf, argon_par.inp, and argon_864.conf. Then type:
>> namd2 argon_864.conf > argon_864.log
The program output will be redirected to the file “argon_864.log”. Once the program has
finished, you should open this file and give it a quick look to understand what NAMD has done
and check that everything worked fine. You can go to the end of the file and check the last
temperature printed. As before, you will find that the system that we just simulated is too cold.
The final positions have been saved in the file argon_864_final.coor. This is a PDB file, so you
can visualize it directly using Rasmol or VMD. Again you will find that the system has remained
too ordered. NAMD also writes the final velocities with the same format (argon_864_final.vel)
and the cell vectors a, b, and c (argon_864_final.xsc).
The thermodynamic values along the simulation have been printed in the log file. They can be
plotted with VMD using Extensions Analysis NAMD plot. In the NAMD plot window
select a log file (in the File menu), then check the boxes corresponding to the properties you
want to see, and select the plot option (in the File menu). As you can see, we observe the same
behavior as before (as it should be).
So now we will make a NVT simulation to equilibrate our system at 86.5 K. Create a new
directory (e.g. Run2) and copy the same files there input files there. Edit the configuration file
change the number of steps to 5000, the output name to argon_864_run2 and add the following
lines:
# Constant Temperature Control
langevin on ;# do langevin dynamics
langevinDamping 1 ;# damping coefficient (gamma) in ps^-1
langevinTemp $temperature
Run this simulation and then check the last configuration and the evolution of T, U, and Etotal.
We can see that now the temperature remains close to the desired value, so we will continue the
run from here, but equilibrating for a longer time. Copy the final configuration
argon_864_run2.coor to a new folder, e.g. Run3, and modify the configuration file to read the
initial coordinates from this file and to perform a 250 ps simulation. We could continue the
previous NVT simulation with the Langevin thermostat, but we can also use the simple velocity
rescaling scheme. To do this edit the configuration file, remove the lines describing the Langevin
thermostat and add:
# velocity rescaling
rescaleFreq 10
rescaleTemp $temperature
Add also the command „restartfreq 10000‟ in order to write a restarting file. We will need this
file in the following step to make our production run. Change also the number of steps to 50000
and outputEnergies and outputPressure to 100 to avoid writing too many values in the log file.
Run the simulation and then check the results, as usual.
Now create a new directory (e.g. Run4) and copy there the common files „argon_864.psf‟ and
„argon_par.inp‟, the final PDB file obtained in the previous run (argon_864_run3.coor) and also
the binary files “argon_864_run3.restart.coor” and „argon_864_run3.restart.vel‟ also generated in
the previous run. The new configuration file „argon_864_run4.conf‟ could be like this one:
import sys
import numpy
filename = str(sys.argv[1])
f = open(filename, 'r')
line = f.readline()
data = f.readlines()
ndata = len(data)
print "Number of points = ", ndata
t = []
for element in data:
a = element.split()
t.append(float(a[1]))
u = []
for element in t:
delta = (element - average)**2
u.append(delta)
This will compute the average value of the quantity stored in the input file, as well as the root
mean square deviation from the average and the ratio between both. Type:
>> namd_rms.py e_total.dat
You will see that the average value of the total energy is 986.73 kcal/mol and that (Etot
Etot )2 ½ / Etot 3 10 6. As you can see, the total energy is very well conserved, as it should be
for a NVE simulation. If you compare with the previous simulation, you can see that now Etot is
even better conserved than before, which is due to the use of the switching function. For the
potential energy, instead, we have (U U )2 ½ / U 3 10 3. Similarly
>> namd_rms.py temp.dat gives T = 86.2 K with T 1.5 K, and
>> namd_rms.py pressure.dat gives P = 137 bar with T 39 bar.
4.1.3 LAMMPS
Data file: This file must contain the atomic positions and the full list of bonds, angles, dihedrals
and improper dihedrals that are present in our system. Some information (such as the potential
parameters) can also be given here or in the input file, as the user prefers. We can generate this
file from a PDB file using VMD and the TopoTools package. Open VMD and start the Tk
Console (from the extensions menu) and type the following commands in the console:
% cd PATH_TO_DIRECTORY_CONTAINING_PDB_ FILE
% package require topotools
% mol new argon_864.pdb
% set sel [atomselect top all]
% $sel set mass 39.948
% pbc set {34.31415 34.31415 34.31415 90.0 90.0 90.0}
% topo writelammpsdata argon_864.data full
Now you can then edit the file “argon_864.data” and explore its contents in order to get familiar
with it.
Input file: This file will contain the potential definition and parameters and the directives to
control the simulation. Again we will create it manually. In your editor, create a new file and
name it “argon_864_run1.in”. Copy the following lines:
units real
atom_style full
boundary p p p
variable temp equal 86.5
read_data argon_864.data
pair_style lj/smooth 10.0 12.0
pair_coeff 1 1 0.238067 3.405
timestep 5.0
velocity all create ${temp} 102486 mom yes rot yes dist gaussian
fix NVE all nve
thermo 1
thermo_style custom step temp press etotal ke pe
run 1000
write_data argon_864_run1.data
write_restart argon_864_run1.restart
The initial coordinates are read with the command read_data followed by a correct LAMMPS
data file. The potential is defined here with the pair_style command. In this case we will use a LJ
potential with a force smoothing applied between 10 and 12 Å. The potential parameters are
given with pair_coeff. Here we have just one type of interaction between two Ar atoms (type 1)
and the parameters to use are = 0.238067 kcal/mol (as we have selected real units) and =
3.405 Å. The time step is 5 fs and every step we will write to the output file the number of the
step, T, P, Etot, Ekin, U, EVDW, Elong, and Etail.
As usual, you should consult the manual to get a better idea of what each command does. This is
particular true for the case of LAMMPS, as most of the directives allow a large number of
different options.
Copy those two files to a new directory (e.g. Run1) and run LAMMPS:
>> lammps-daily < argon_864_run1.in > argon_864_run1.log
Check the output file „argon_864_run1.log‟ (LAMMPS also writes an almost identical file
named log.lammps). As we could expect, we observe the same behavior than with the other two
codes, i.e. T has decreased and the system remains ordered. In order to check the final
configuration, we need to transform the data file containing the final configuration. This can be
done with a perl script. Type:
>> perl lammps2pdb.pl argon_864_run1
This will write a file named argon_864_run1_trj.pdb (and also a psf file) that can be visualized
with VMD or Rasmol.
If you edit and look the file „argon_864_run1.data‟ you will see that the atoms are not written in
consecutive order (look to the first number corresponding to the atom ID). This is a characteristic
feature of LAMMPS.
As before, we can make a NVT run in order to ensure that T is maintained close to the desired
value. You can do this by including the following lines in the input file:
# NVT ensemble (Berendsen)
fix fxtemp all temp/berendsen ${temp} ${temp} 100.0
fix fxnve all nve
The first fix (as they are called in LAMMPS) will apply Berendsen‟s thermostat. The two
temperature values required correspond to the desired temperatures at the start and end of the
run, while the last value is the relaxation constant (in this example 100 time units, i.e. 100 fs for
units real). A very important point is that this fix modifies the velocities to effectuate the
thermostating, but it does not perform the time integration. This is why we need to keep the
“fix nve” to update the positions.
We will use this last configuration to perform now a larger equilibration using the simple
velocity scaling and followed by a production run. Copy the file argon_864_run2.data to a new
directory (e.g. Run3) and prepare the following input file:
units real
atom_style full
boundary p p p
variable temp equal 86.5
pair_style lj/smooth 10.0 12.0
#pair_coeff 1 1 0.238067 3.405 # LJ params: epsilon
(kcal/mol), sigma (Angstroms)
read_data argon_864_run2.data
timestep 5.0
velocity all create ${temp} 102486 mom yes rot yes dist gaussian
# velocity rescaling
fix fxvel all temp/rescale 10 ${temp} ${temp} 0.05 1.0
fix fxnve all nve
thermo 100
thermo_style custom step temp press etotal ke pe
run 50000
write_data argon_864_run3_equil.data
write_restart argon_864_run3_equil.restart
# remove velocity scaling
unfix fxvel
# write trajectory
dump dumpTrj all custom 100 argon.trj id mol type x y z ix iy iz
dump_modify dumpTrj sort id
run 50000
write_data argon_864_run3.data
write_restart argon_864_run3.restart
Note that the pair_coeff directive has been commented. This is because LAMMPS has written
this directive in the data file. Note also that the read_data command has to appear now after the
pair_style one, because the potential function needs to be declared before giving the parameters.
The temperature is maintained by rescaling the velocities (fix temp/rescale) and, as with the
Berendsen thermostat, the fix nve is needed to integrate the equations of motion. 50000 steps are
done with this conditions and the final configuration saved as “argon_equil”. Then the velocity
scaling is disabled using the “unfix ID” command and another 50000 steps are done. During this
phase, the trajectory is saved every 100 steps (dump line). The dump_modify command allows
the keep the atom order in the written trajectory by using the “sort id” option. This trajectory file
can be directly visualized with VMD.
One possibility to analyze the log file generated by LAMMPS is to install and use the Pizza.py
toolkit (http://pizza.sandia.gov/index.html). Otherwise a simple to way to visualize the
thermodynamic data is to edit the log file, copy the desired block to a new file, and then use your
preferred program to plot or manipulate them. For example, copy the block corresponding to the
production run to the file “argon_864_run3_thermo.dat”, in such a way that the first and last
lines are:
Step Temp Press TotEng KinEng PotEng
50000 86.5 176.20892 -1001.4665 222.51598 -1223.9825
50100 86.821615 157.07836 -1001.3659 223.34331 -1224.7092
…
99900 87.325654 178.30418 -1001.4629 224.63992 -1226.1029
100000 84.228541 223.66563 -1001.4037 216.67279 -1218.0765
And then run this python script by typing >> lammps_rms.py argon_864_run3_thermo.dat
"""
Script to analyze the data block extracted from a LAMMPS log file
and compute the average and RMS deviations of the data.
import sys
import numpy
filename = str(sys.argv[1])
f = open(filename, 'r')
header = f.readline()
h = header.split()[1:]
data = f.readlines()
ndata = len(data)
print "Number of points = ", ndata
# First line
a = data[0].split()[1:]
t = numpy.array(a).astype(float)
Do the same thing for the static structure factor (a reasonable choice for the Q values is from 1 to
250 nm 1 by step of 1 nm 1) and compare to the experimental S(Q) (file sq_yarnell_ND.dat).
Now go to the “Dynamics” category and compute the Mean Square Displacement. You can see
that r2(t) exhibits a linear dependence with time, as expected for a simple liquid exhibiting
Brownian translational diffusion. From the slope of the curve we can obtain the self-diffusion
coefficient, D, as r2(t) = A + 6Dt. Fit the computed m.s.d. in order to get it and compare to
experimental values.
If you have done the simulations with the three different codes you will notice that the dynamic
properties are more sensible to the exact simulation conditions than the structural ones. From the
slope of those curves we get a self-diffusion coefficient in the range D = 1.60 – 1.80 x 10 3
nm2/ps = 1.60 – 1.80 x 10 9 m2/ps. The differences between the three codes can be due to a
combination of statistical variance (you would need to perform several independent simulations
with the same code and conditions to determine how much the m.s.d. can change) and the effect
of the cut-off conditions (use of switching function or not and exact switching function
employed). Note also that the deviation between the curves increases with time. You have to
keep in mind that the statistical noise of each point increases with time, as the number of time
origins that we can employ to calculate decreases from M (where M is the number of frames that
we have saved) at t = 0, M 1 at t = t (where t corresponds now to the saving interval, not the
simulation time-step), M 2 at t = 2 t, …, until 1 at t = (M 1) t (where we can only use the first
and last saved frames to compute r2(t)).
In the “Scattering” category you can find the relevant analysis to compare to neutron scattering,
such as the calculation of the incoherent and coherent intermediate scattering functions, and the
corresponding dynamic structure factors.
Intermediate incoherent scattering function at several Q values. The continuous lines show the
theoretical curve for a Brownian diffusion motion with D = 1.7 10 3 nm2/ps:
You can see that this simple model reproduces reasonably well the simulated curves at the lower
Q (here 5 nm 1), but deviations appear at larger Q values.
Direct Fourier Transform of the intermediate incoherent scattering function without applying any
resolution function (selecting “ideal” resolution in the Dynamic Incoherent Structure Factor
interface). The lines show the fits obtained with a single lorentzian function plus a background:
The fits give with D = 2.8 10 4 THz nm2. Note that nMoldyn uses the frequency ( or f given in
THz) for the energy axis and not the angular frequency . To get the self-diffusion coefficient
we need to convert to = 2 , obtaining D = 2 2.8 10 4 nm2/ps = 1.76 10 9 m2/s.
And you can also do other analysis based on the atom velocities, instead of the positions, such as
the velocity autocorrelation function and the density of states:
4.2 A small rigid molecule (water)
Water is probably the most widely studied system both experimentally and computationally. But
nevertheless many questions related to their anomalous properties remain. From a simulation
perspective, many different models have been proposed in order to reproduce the properties of
liquid water and ice from MD simulations. Here we will explore a recent model proposed
recently as an improvement of the older TIP4P potential [J.L.F. Abascal and C. Vega, J. Chem.
Phys. 123, 234505 (2005)]. This TIP4P/2005 potential is a very simple model that represents
H2O as a rigid molecule having four sites (the three atoms + an additional site to place the
negative charge associated to the lone pair (LP) of the oxygen). The interaction between a pair of
water molecules is given by the sum of the electrostatic interactions between the partial charges
placed in the H atoms and the LP site plus a Lennard-Jones potential acting between the O sites.
The geometry of the molecule and the model parameters are:
d(O-H) = 0.9572 Å, (H-O-H) = 104.52 , d(O-LP) = 0.1546 Å;
q(LP) = 1.1128 e, q(H) = +0.5564 e
= 3.1589 Å, = 93.2 kB
In spite of its simplicity, the TIP4P/2005 has proved very successful in reproducing many
properties of water [C. Vega and J.L.F. Abascal, Phys. Chem. Chem. Phys. 13, 19663 (2011)].
Here we will use it to study liquid water at ambient conditions by simulating a system containing
360 water molecules.
As in the case of argon, our starting point will be a PDB file containing the positions of the 1080
atoms (360 O + 720 H). The first problem to solve is that our PDB file (water_360.pdb) does not
contain the site corresponding to the lone pair, so we have to generate a new file with this
additional site. A Python script will do the job. Run:
>> pdb_water_to_tip4p.py water_360.pdb waterTIP4P2005_360.pdb tip4p2005
You can edit the new file generated and check that in each water molecule there is a fourth site
corresponding to the long pair. The python script writes „Du‟ as the element name (columns 77-
78 of the PDB file). This stands for „dummy‟ and is required by nMoldyn for a successful
conversion of the trajectory. If you want you can visualize the configuration using Rasmol or
VMD and check that the script rebuilds the water molecule with the correct geometry
corresponding to the TIP4P/2005 model.
4.2.1 NAMD
Configuration file: waterTIP4P2005_360.pdb.
Topology and parameter files As for the case of argon, we can write the parameter and topology
files manually. However it is usually a good idea to start checking if our molecule is included in
one of the standard CHARMM force fields used by NAMD (which are available at
http://mackerell.umaryland.edu/charmm_ff.shtml#charmm). For example, if we explore the
CHARMM36 files, we see that the TIP4P/2005 water model is not inside, but we will find a
TIP3P water. This model puts the negative charge on the oxygen, so it does not have a 4th site for
the lone pair, but it can be a reasonable starting point.
In the topology file we have:
RESI TIP3 0.000 ! tip3p water model
GROUP
ATOM OH2 OCTIP3 -0.834
ATOM H1 HCTIP3 0.417
ATOM H2 HCTIP3 0.417
BOND OH2 H1 OH2 H2 H1 H2 ! the last bond is needed for shake
ANGLE H1 OH2 H2 ! required
PATCHING FIRS NONE LAST NONE
And from the parameter file we can extract the following information for this water model:
ATOMS
! tip3p water
MASS 1 HCTIP3 1.00800 ! TIP3P water hydrogen
MASS 2 OCTIP3 15.99940 ! TIP3P water oxygen
BONDS
!V(bond) = Kb(b - b0)**2
!Kb: kcal/mole/A**2
!b0: A
!
! atom types Kb b0
!
! Original TIP3P water parameters
HCTIP3 HCTIP3 0.00 1.5139 ! TIPS3P GEOMETRY (FOR SHAKE/W PARAM)
OCTIP3 HCTIP3 450.00 0.9572 ! TIPS3P GEOMETRY
ANGLES
!V(angle) = Ktheta(Theta - Theta0)**2
!V(Urey-Bradley) = Kub(S - S0)**2
!Ktheta: kcal/mole/rad**2
!Theta0: degrees
!Kub: kcal/mole/A**2 (Urey-Bradley)
!S0: A
!
! atom types Ktheta Theta0 Kub S0
!
! tip3p water
HCTIP3 OCTIP3 HCTIP3 55.00 104.52 ! TIP3P GEOMETRY
END
As you can see, we have added the lone pair site and modified the partial charges. We also added
new BOND and ANGLE terms for the new site.
And the parameter file “tip4p2005.par” can look like this:
* TIP4P2005 WATER PARAMETER FILE
*
BONDS
!V(bond) = Kb(b - b0)**2
!Kb: kcal/mole/A**2
!b0: A
!
! atom types Kb b0
!
HTIP4P HTIP4P 0.00 1.5139 ! TIP4P GEOMETRY (FOR SHAKE/W PARAM)
OTIP4P HTIP4P 450.00 0.9572 ! TIP4P GEOMETRY
OTIP4P LP 450.00 0.1546 ! TIP4P/2005 GEOMETRY
ANGLES
!V(angle) = Ktheta(Theta - Theta0)**2
!V(Urey-Bradley) = Kub(S - S0)**2
!Ktheta: kcal/mole/rad**2
!Theta0: degrees
!Kub: kcal/mole/A**2 (Urey-Bradley)
!S0: A
!
! atom types Ktheta Theta0 Kub S0
!
HTIP4P OTIP4P HTIP4P 55.00 104.52 ! TIP4P GEOMETRY
HTIP4P OTIP4P LP 55.00 52.26
The force constants for bonds and angles have to be given, but their exact value is not important,
as this is a rigid model. Only the bond lengths and angles are important. The Lennard-Jones
parameters are given in the NONBONDED section, as explained before for the argon case. Note
that we have removed all the variables that followed the NONBONDED keyword in the TIP3P
parameter file. This is because in the CHARMM parameter file this statement includes a list of
parameters which are used as default by the program CHARMM, but are ignored by NAMD.
We need now to generate the PSF file from the topology and PDB files. Again we can use the
TkConsole in VMD. Type the following commands:
% cd PATH_TO_DIRECTORY_CONTAINING_PDB_AND_TOPOLOGY_FILES
% package require psfgen
% resetpsf
% topology tip4p2005.top
% segment H2O {pdb waterTIP4P2005_360.pdb}
% coordpdb waterTIP4P2005_360.pdb H2O
% writepsf waterTIP4P2005_360.psf
Configuration file: We will perform a 100 ps NPT simulation at 298 K and 1 bar. Edit a file
named „waterTIP4P2005_360_run1.conf‟ and copy the following contents:
# Conditions
set temperature 298.0
set pressure 1.0
set steps 50000
seed 12345
# Initial config
coordinates waterTIP4P2005_360.pdb
temperature $temperature
# Box
cellBasisVector1 22.083 0 0
cellBasisVector2 0 22.083 0
cellBasisVector3 0 0 22.083
cellOrigin 0.0 0.0 0.0
wrapAll off
# Integrator params
timestep 2.0
nonbondedFreq 1
fullElectFrequency 1
# Output params
outputname waterTIP4P2005_360_run1
restartname waterTIP4P2005_360_run1_restart
binaryoutput no
DCDfreq 1000
restartfreq 1000
outputEnergies 10
outputPressure 10
# PME electrostatics
PME yes
PMEGridSizeX 25
PMEGridSizeY 25
PMEGridSizeZ 25
# Run simulation
reinitvels $temperature
run $steps
Note that in the force field section we have included the directives waterModel tip4p and
rigidBonds all and added a block with the necessary commands to maintain a constant pressure
using a Langevin piston. And we have also added a block to tell NAMD how to treat the
electrostatic interactions, as now we have some charged sites in our system. Here we employ the
Particle Mesh Ewald method and we have to give the number of grid points for the mesh in each
direction. A higher grid density increases the accuracy of the PME, but normally one does not
need to go beyond 1 grid point per Å3. Due to the implementation of the algorithm, the most
efficient grid sizes are of the form 2a3b5c, so here we have chosen a value of 25.
We can copy now the files „waterTIP4P2005_360.pdb‟, „waterTIP4P2005_360.psf‟,
„tip4p2005.par‟, and „waterTIP4P2005_360_run1.conf‟ to a working directory (e.g. Run1) and
run >> namd2 waterTIP4P2005_360_run1.conf > waterTIP4P2005_360_run1.log &
(attention: the simulation will probably last more than 30 minutes).
When the run is over, use VMD and the NAMD plot to see the evolution of T, P, V and the
energies. As you can see, initially the system is far from equilibrium, but after ~5000 steps the
system seems to have reached equilibrium and all the variables start to fluctuate around average
values. The volume increases abruptly at the beginning, passing from the initial cube length L
=22.083 Å to L ~ 24 Å, but then it decreases again to a value close to the initial one.
We can calculate easily the properties over the second half of the simulation using the script
namdstats. Type: >> namdstats from 25000 waterTIP4P2005_360_run1.log
You can see that the pressure is close to 1 bar and that the average volume is ≈ 10900 Å3. If we
use namddat to extract the volume data from the log file and then we select the second half of
those data to run the namd_rms.py script into them, we obtain V = 10933 ± 130 Å3. Thus we
can estimate the density for our model of liquid water as = 0.985 ± 0.012 g/cm3. This is in
reasonable agreement with experiment and with the values given of 0.998 and 0.993 given by
Abascal and Vega in their papers. In order to get a more precise value we should study a larger
system and above all make a much longer simulation
Now use the final configuration from the NPT run as starting point for a NVT simulation that we
will use to calculate the properties of liquid water at ambient conditions. Prepare the
configuration file to read the positions and velocities from the previous NPT (see the argon
example) and launch a 100 ps simulation. Once that it is finished, you can use VMD to load the
DCD trajectory and check how the molecules move. Check also if the temperature and pressure
are close to 298 K and 1 bar. You can use the tools that we already know (namdstats, namddat,
namd_rms.py) to obtain the average thermodynamic quantities. Normally you should get values
similar to those:
≈ 0.986 g/cm3 (but you could have any density in the range 0.97-1.00, depending on the
instantaneous volume of the final configuration of the NPT run)
T ≈ 296 ± 8 K
P ≈ 122 ± 655 bar (the value of the pressure is a little bit low, but still in the range of ambient
pressure if we account for the r.m.s. fluctuations)
U ≈ 4100 ± 25 kcal/mol_box = 11.39 ± 0.07 kcal/mol
Perhaps you will be surprised by the large „error bar‟ in the calculated pressure. This is due to the
small system size compared to a macroscopic system. A nice explanation of this is given in the
documentation of Gromacs (http://www.gromacs.org/Documentation/Terminology/Pressure),
another MD package:
“Whether or not pressure coupling is used within a simulation, the pressure value for the simulation box
will oscillate significantly. Instantaneous pressure is meaningless, and not well-defined. Over a
picosecond time scale it usually will not be a good indicator of the true pressure. This variation is
entirely normal due to the fact that pressure is a macroscopic property and can only be measured
properly as time average, while it is being measured and/or adjusted with pressure coupling on the
microscopic scale. How much it varies and the speed at which it does depends on the number of atoms
in the system, the type of pressure coupling used and the value of the coupling constants. Fluctuations
of the order of hundreds of bar are typical. For a box of 216 waters, fluctuations of 500-600 bar are
standard. Since the fluctuations go down with the square root of the number of particles, a system of
21600 water molecules (100 times larger) will still have pressure fluctuations of 50-60 bar.”
The internal energy can be used to compute the vaporization enthalpy, which is a well-known
experimental quantity for most compounds. In first approximation, Hv ≈ U + RT, so from our
results we have Hv ≈ 12 kcal/mol. This is somewhat too high compared to the experimental
enthalpy of vaporization (10.52 kcal/mol), but it is in very good agreement with the result given
by Abascal and Vega for the TIP4P/2005 model (see J.L.F. Abascal and C. Vega, J. Chem. Phys.
123, 234505 (2005) for additional details in the necessity of including a self-polarization
correction when comparing liquid and gas phase properties and how the introduction of this
correction improves the prediction of the TIP4P/2005 model for the enthalpy of vaporization).
4.2.2 LAMMPS
In this example we will introduce a new program, Moltemplate, that it can be ver helpful in the
task of creating the input files for LAMMPS. We need to create a LT file (LT stands for
LAMMPS-template format) that will contain all the information relevant for a particular
molecule. One of the examples distributed with Moltemplate shows how to create the LAMMPS
files for SPC/E water and using a PDB file. This is basically what we want to do, although with a
different model, so it is an excellent starting point for us, in particular because in LAMMPS we
do not need to add the site for the lone pair because there is already a tip4p model foreseen by
the program, as we will see later.
So you can start looking to the „spce.lt‟ file in order to get a first idea of the information needed
by Moltemplate in order to work. This file defines a SPCE molecule and contains 6 separate
blocks. In the first one the units and potential functions to use are defined. Then the atoms
composing a molecule, their partial charges and the coordinates reflecting the molecular
geometry are given, followed by several blocks defining the masses, bonds, and angles in the
molecule. And finally the parameters for the potentials are given in the last block.
So we can modify this file to put the corresponding parameters for the TIP4P/2005 model. Edit a
file named „waterTIP4P2005_360.lt‟ and copy the following contents into it:
# file "tip4p2005.lt"
#
# H1 H2
# \ /
# O
#
# + lone pair charge (not needed by LAMMPS because included in tip4p model)
TIP4P2005 {
write_once("In Init") {
# -- Default styles (for solo "TIP4P2005" water) --
units real
atom_style full
# (Hybrid force fields were not necessary but are used for portability.)
pair_style hybrid lj/cut/tip4p/long 1 2 1 1 0.1546 11.0
bond_style hybrid harmonic
angle_style hybrid harmonic
kspace_style pppm 0.0001
pair_modify mix arithmetic
}
write("Data Atoms") {
$atom:O $mol:. @atom:O -1.1128 1.5500000 1.55000 1.50000
$atom:H1 $mol:. @atom:H 0.5564 1.5500000 2.30695 2.08588
$atom:H2 $mol:. @atom:H 0.5564 1.5500000 0.79305 2.08588
}
write_once("Data Masses") {
@atom:O 15.9994
@atom:H 1.008
}
write("Data Bonds") {
$bond:OH1 @bond:OH $atom:O $atom:H1
$bond:OH2 @bond:OH $atom:O $atom:H2
}
write("Data Angles") {
$angle:HOH @angle:HOH $atom:H1 $atom:O $atom:H2
}
write_once("In Settings") {
bond_coeff @bond:OH harmonic 0.0 0.9572
angle_coeff @angle:HOH harmonic 0.0 104.52
pair_coeff @atom:O @atom:O lj/cut/tip4p/long 0.185207 3.1589
pair_coeff @atom:H @atom:H lj/cut/tip4p/long 0.0 2.058
group tip4p type @atom:O @atom:H
fix fShake tip4p shake 0.0001 10 100 b @bond:OH a @angle:HOH
# (Remember to "unfix" fShakeSPCE during minimization.)
}
You can see that we define a pair potential using the keyword lj/cut/tip4p/long. The parameters
given after this statement correspond to the atom types for O and H (here 1 and 2, respectively,
as there are no other atoms), then the bond and angle types, followed by the O-M distance
(0.1546 Å for TIP4P/2005) and finally the cut-off to be employed. You can consult the
LAMMPS manual for more details, as it contains a section describing the details of the TIP4P
model inside LAMMPS.
The next series of commands are not too different from those for SPC/E water. In the last block
we give the corresponding bond length and angle for our water model and the LJ parameters.
LAMMPS will maintain the rigid molecule using the SHAKE algorithm, so we have to define a
group that contains the O and the two H atoms and then we apply the fix shake to it.
After the molecular definition, the last command simply tells the program that we will build a
system containing 360 TIP4P2005 molecules.
We can run now Moltemplate by typing:
>> moltemplate.sh –pdb water_360.pdb waterTIP4P2005_360.lt
and this will create the data and input files for LAMMPS. Note that we have used the original
PDB file, not the one that we generated with the 4th site corresponding to the lone pair. This is
because this site is added automatically by LAMMPS when the pair_style tip4p is selected. Of
course, this means that the initial geometry of the water molecules will be slightly incorrect, but
as the bond lengths and angles in the PDB file are quite close to the TIP4P/2005 values, the
SHAKE algorithm will be able to restore the right geometry for our model. We need now to edit
the input file and add the conditions that we want for our simulation. Our final input file
„waterTIP4P2005_360.in‟ could look like this:
read_data "waterTIP4P2005_360.data"
# Output params
thermo 10
thermo_style custom step temp press vol etotal ke pe ebond eangle evdwl ecoul
elong etail
dump 1 all custom 1000 water.trj id mol type x y z ix iy iz
# Run simulation
velocity all create ${temp} 102486 mom yes rot yes dist gaussian
run ${steps}
write_data waterTIP4P2005_360_run1.data
write_restart waterTIP4P2005_360_run1.restart
We have directly copied to the file the text contained in the .init and .settings generated by
Moltemplate instead of using the include command. But you can work directly with the files
generated by Moltemplate, if you prefer. So now we are ready to run our NPT simulation:
>> lammps-daily < waterTIP4P2005_360_run1.in > waterTIP4P2005_360_run1.log &
The thermodynamic variables show the same trends observed and discussed in the simulation
with NAMD (as it should!). The analysis of the volume fluctuations during the second half of
the simulation gives V = 10906 ± 125 Å3, giving = 0.987 ± 0.012 g/cm3. Now that the
system is reasonably well equilibrated, we will continue the simulation from the last
configuration that we just obtained, but now in the NVT ensemble. Run a 100 ps simulation and
check the results. Using the script lammps_rmsd.py you should obtain values close to those:
≈ 0.994 g/cm3 (or in range 0.97 – 1.00)
T ≈ 298 ± 9 K
P ≈ 68 ± 680 bar
U ≈ 4096 ± 32 kcal/mol_box = 11.38 ± 0.09 kcal/mol
4.2.3 DL_POLY
CONFIG: We can build this file from the PDB one following the same procedure as before:
>> obabel –i pdb waterTIP4P2005_360.pdb –o CONFIG –O waterTIP4P2005_360.cfg
OpenBabel will complain because it does not understand the element label „du‟ that we used for
the lone pair, but it will convert the trajectory writing Xx for the name of the „unknown element‟.
So we can proceed and use our python script to correct the problem with the format of the float
numbers: >> dlpoly_correct_bug_obabel.py waterTIP4P2005_360
And finally we edit the resulting CONFIG file to set the box key = 1 (for a cubic box), add the
vectors defining the box (L = 22.083 Å), and replace the „Xx‟ symbols by „Du‟ again.
After those operations your CONFIG file will loke like this:
waterTIP4P2005_360.pdb
0 1
22.0830000000 0.0000000000 0.0000000000
0.0000000000 22.0830000000 0.0000000000
0.0000000000 0.0000000000 22.0830000000
O 1 8
12.6270000000 21.6820000000 8.5790000000
H 2 1
13.1190000000 21.2840000000 9.2970000000
H 3 1
11.8270000000 22.0110000000 8.9900000000
Du 4 0
12.5860000000 21.6730000000 8.7280000000
O 5 8
19.8000000000 10.4730000000 13.2470000000
...
MOLECULES 1
TIP4P2005 water
NUMMOLS 360
ATOMS 4
O 15.9994 0.0000 1 0 1
H 1.0080 0.5564 1 0 1
H 1.0080 0.5564 1 0 1
Du 0.0000 -1.1128 1 0 1
RIGID BODIES 1
4 1 2 3 4
FINISH
VDW 1
O O lj 0.7749 3.1589
CLOSE
We simply need to tell DL_POLY that we have a single molecular type in our system named
„TIP4P2005 water‟, that there are 360 of those molecules, and that each molecule is composed of
4 atoms named, O, H, H, and Du (the names and the order must agree with the names used in
CONFIG). For each atom we give then its mass and charge.
As in DL_POLY we can work with rigid bodies we do not need to worry about defining any
intramolecular parameter. We can simply use the keyword RIGID followed by the number of
different rigid units (here only 1) and in the following line we tell the program how many atoms
the rigid unit contains and which are their IDs. This ends the molecular definition and it only
remains to give the LJ parameters for the single Van der Waals interaction that we have in this
model.
CONTROL: And these are the instructions to run the simulation:
TIP4P/2005 water NPT, T=298 K
temperature 298.0 K
pressure 0.001 kbar
steps 50000
print 10
stack 10
stats 10
timestep 0.002
cutoff 11.0
delr 2.0
finish
Here we are using the Nose-Hoover thermostat and barostat with the same relaxation constant (1
ps) to drive the temperature and pressure to the requested values. And as now we have
electrostatic charges in our system, we need to select a method to treat the Coulombic potential.
Here we use the Smooth Particle Mesh Ewald (SPME) method and we let the program to choose
the correct parameters in order to get a relative error in the energy < 10 6.
We are ready to run the simulation. Type: >> DLPOLY.X
What happens? The program ends almost immediately and only the OUTPUT file has been
generated. Error messages are sent to this file, so open it in an editor to find what is the problem.
You will see that all the information has been correctly read, but that even before starting the
simulation there is the message „error – quaternion setup failed‟. Look for a description of the
error and possible causes in DL_POLY‟s manual.
Can you find out why the program fails and what should we do to solve this problem?
…
The reason is that we do not have enough precision in the positions of our atoms in the
CONFIG file! Initially we generated a PDB file with the correct geometry for TIP4P/2005 water
and then we converted into our CONFIG, but the PDB format only allows for 3 decimal places
for the positions. This is usually OK and for example we did not have this problem when using
LAMMPS or NAMD, as we used the SHAKE algorithm in those runs. The difference is that
with the SHAKE algorithm the atomic positions are propagated normally and then corrected to
bring any intramolecular distance to the required value. Instead when we work with quaternions
we compute the total force and torque in the rigid body, so we have a single entity. If we have
more than one molecule, then they all must have exactly the same geometry. Note that this
possibility is not available in NAMD, but it does exist in LAMMPS (fix rigid).
Therefore we have to modify the script that generated the PDB file for TIP4P2005 water in order
to write directly a CONFIG file, where we can use enough decimals for the positions.
Use >> pdb_water_to_tip4p_dlpoly.py water_360.pdb CONFIG_NEW tip4p2005 and then
edit the new CONFIG file to modify the box key and add the box vectors, as before. We can try
to run again our simulation using this CONFIG …
The simulation now starts, but after several thousands of steps it fails with a new error message:
„error – potential cutoff exceeds half-cell width‟. We are using a cutoff radius of 11 Å, so this
means that due to the volume fluctuations the side of our simulation box has become smaller
than 22 Å, making DL_POLY to stop immediately. In order to solve this issue we can try to start
equilibrating first in the NVT ensemble in order to avoid the large volume changes that take
place during the initial phase equilibrium. Then once that the system has reached a more
reasonable configuration we can switch to the NPT ensemble. Or otherwise we can simply
reduce the cutoff to a smaller value, e.g. Rc = 10Å (which is small, but still reasonable), and hope
that the box side will not go below 20 Å.
In this case the second option should work, so just set „cut 10.0‟ in the CONTROL file. Add also
the directive „equilibrate 25000‟ in order to make the program to compute directly the
thermodynamic averages only over the second half of the simulation. Try again : >> DLPOLY.X
Check the variation of volume and energies using the Java GUI. From the OUTPUT file we get
directly V = 10822 ± 146 Å3, corresponding to an average density = 0.995 ± 0.014 g/cm3.
Now copy the file REVCON to a different folder and run a 100 ps on the NVT ensemble. As the
box side will not change now, you can set back the cut-off to 11 Å. Once the run is over (note
that it could take more than 2 hours to complete), check the OUTPUT file. The values there
should agree (within the rms fluctuations) with those:
≈ 0.988 g/cm3 (or in range 0.97 – 1.00)
T ≈ 298 ± 9 K
P ≈ 200 ± 660 bar
U ≈ 17170 ± 143 kJ/mol_box = 11.40 ± 0.09 kcal/mol
4.2.4 DL_POLY
As in the previous example, use nMoldyn to compute the main structural and dynamical
properties. As now we have a molecular system and two different atoms in our system, most of
the analysis will give you several output files corresponding to the total quantity, the intra-/inter-
molecular contributions, and the partial contributions. Note also that now you can be interested
in computing a given property using different weighting schemes and transmutations. For
example, you will need to select weight = „b_coherent‟ and transmutate H atoms into D in order
to compute the static structure factor expected from a neutron diffraction experiment. But you
will use weight = „atomic number‟ to compute the X-ray equivalent. Note also that for X-rays
you can use the X-ray structure factor calculation if you want to take into account the x-ray form
factor. Finally, do not forget that we also have a „dummy‟ atom in our water model, so you
should do an atom selection and exclude the lone pair from it before computing any property.
4.3 A fully flexible molecule (ethanol)
Our following example concerns again a molecular liquid formed by a small molecule, ethanol
(CH3CH2OH), but now we will treat the molecule as fully flexible. Note that even if this
molecule is quite small, it contains almost all the intramolecular terms that you can find (bond
stretching, angle bending, and dihedral torsions). Thus this example can serve as a basis to build
much more complex systems, where the only important additional terms that you may need to
add in some cases is the improper dihedral to maintain a given geometry (e.g. a planar aromatic
ring). In this example we will use the OPLS All-Atom force field, originally developed by
Jorgensen‟s group [W.L. Jorgensen et al., J. Am. Chem. Soc. 118, 11225 (1996)]. OPLS stands
for Optimized Potential for Liquid Simulations, but this is a very popular force field often used
to study also biomolecules. You can extract the parameters that we need to simulate ethanol from
the original 1996 paper, but they are nicely summarized in Table 1 in [J. T. Fern et al., J. Phys.
Chem. B 111, 13278 (2007)].
In this example we will assume that we do not have any available PDB, and we will use
Moltemplate to generate from scratch all the files needed for LAMMPS. We will use then the
configuration constructed by Moltemplate to create a PDB file for NAMD and DL_POLY.
4.3.1 LAMMPS
As for the case of water, we need to generate a LT file for our molecule. Ethanol is still small
enough to allow one to do this by hand, so you can try to write such file using the water example
and the parameters given in the references above. However it is clear that increasingly large
molecules, this task will become quickly impractical.
But fortunately Moltemplate can help us with this, as it already contains some force fields that
have been converted into LT format. Thanks to this you can build your system by giving only the
list of atoms and bonds composing your molecule. You can check the program distribution to
find out which are force fields that have been converted. Luckily OPLS-AA is one of those, so
we can use it to build easily our system.
In the case of the OPLS-AA potential the procedure is not fully automatic, as the LT file for the
full OPLS force field would be too big to manage by Moltemplate. Therefore you have to apply
the following steps (read the README.TXT file under common/oplsaa/ in the moltemplate
distribution for additional details):
1. Copy the text file oplsaa.txt and rename it to „oplsaa_ethanol.txt‟. Then edit this file and from
the list of atoms (there are 906 atom types under the header Atom Type Definitions), remove all
those that are not relevant for your system. In our case we will just keep 5 of them, so our file
will look like this:
##############################
## ##
## Force Field Definition ##
## ##
##############################
#forcefield OPLS-AA
#vdwindex TYPE
#vdwtype LENNARD-JONES
#radiusrule GEOMETRIC
#radiustype SIGMA
#radiussize DIAMETER
#epsilonrule GEOMETRIC
#torsionunit 0.5
#imptorunit 0.5
#vdw-14-scale 2.0
#chg-14-scale 2.0
#electric 332.06
#dielectric 1.0
#############################
## ##
## Literature References ##
## ##
#############################
#The parameters supplied with TINKER are from "OPLS All-Atom Parameters
#for Organic Molecules, Ions, Peptides & Nucleic Acids, July 2008" as
#provided by W. L. Jorgensen, Yale University during June 2009. These
#parameters are taken from those distributed with BOSS Version 4.8.
#Note that "atom type" numbers and not "atom class" numbers are used
#to index van der Waals parameters, see the "vdwindex" keyword above
#The atom types with (UA) in the description are "united atom" values,
#ie, OPLS-UA, where any nonpolar hydrogen atoms are combined onto their
#attached atoms. All other parameters are "all-atom", OPLS-AA, including
#explicit hydrogen atoms.
#############################
## ##
## Atom Type Definitions ##
## ##
#############################
################################
## ##
## Van der Waals Parameters ##
## ##
################################
2. Then run the python script oplsaa_moltemplate_v015_corr provided in the Scripts folder by
typing >> oplsaa_moltemplate_v015_corr.py oplsaa_ethanol.txt and this will create the
file „oplsaa.lt‟.
3. Now we need to create another LT file named „ethanol.lt‟ that should look like this:
# file "ethanol.lt"
#
# H1 H4
# | |
# H3 -- C1 -- C2 -- OH -- HO
# | |
# H2 H5
import "oplsaa.lt"
write("Data Atoms") {
$atom:C1 $mol:. @atom:77 0.0 0.995 0.329 -0.000
$atom:C2 $mol:. @atom:96 0.0 -0.340 -0.404 0.000
$atom:OH $mol:. @atom:93 0.0 -1.394 0.538 -0.000
$atom:HO $mol:. @atom:94 0.0 -2.235 0.073 0.000
$atom:H1 $mol:. @atom:82 0.0 1.844 -0.392 0.000
$atom:H2 $mol:. @atom:82 0.0 1.096 0.975 -0.902
$atom:H3 $mol:. @atom:82 0.0 1.096 0.976 0.901
$atom:H4 $mol:. @atom:82 0.0 -0.456 -1.038 0.907
$atom:H5 $mol:. @atom:82 0.0 -0.457 -1.039 -0.907
}
write("Data Bonds") {
$bond:CH1 @bond:77-82 $atom:C1 $atom:H1
$bond:CH2 @bond:77-82 $atom:C1 $atom:H2
$bond:CH3 @bond:77-82 $atom:C1 $atom:H3
$bond:CH4 @bond:96-82 $atom:C2 $atom:H4
$bond:CH5 @bond:96-82 $atom:C2 $atom:H5
$bond:CC @bond:77-96 $atom:C1 $atom:C2
$bond:CO @bond:93-96 $atom:OH $atom:C2
$bond:OH @bond:93-94 $atom:OH $atom:HO
}
The first line makes that our ethanol definition „inherits‟ of all the information contained in the
„oplsaa.lt‟ file. Thus here we only need to give the list of the 9 atoms composing the molecule,
taking care of giving the correct atom types corresponding to each atom (naturally they have to
agree with the types that we selected in „oplsaa_ethanol.txt‟). After each atom type we have to
give the partial charge and coordinates for each atom. Note that here the values given for the
partial charges are just zeroes, as they will be set by Moltemplate. The coordinates however need
to correspond to those of one ethanol molecule having a reasonable geometry, as they will be
used to construct the simulation box.
And then we also need to give the list of all the bonds present in the molecule, their type
(@bond:) and the identifiers of the two atoms making the bond.
Here we import the „ethanol.lt‟ molecule and then generate a cubic lattice of 6 6 6 ethanol
molecules with a separation of 5 4.Å. We have to give also the corresponding limits for the full
box (6 4.5 Å in each direction).
Now we are ready to generate the LAMMPS files. Type >> moltemplate.sh system.lt and
you will see that four files are generated: system.data (containing the atomic coordinates) + three
“in” files.
We can check the structure produced by Moltemplate using VMD. Start VMD, launch the
TkConsole and type the following command:
% topo readlammpsdata system.data full
You will see the ordered structure prepared by Moltemplate. This is certainly not the structure of
liquid ethanol, but we can make a NPT simulation in order to “melt” the artificial crystal that we
just created.
Copy and rename the files into the directory Run1 as follows:
system.data ethanol_216.data (System description and atomic coordinates)
system.in.settings ethanol_oplsaa.coeffs (Set of potential parameters)
system.in.init ethanol_oplsaa.init (Definitions: units, potential styles, etc.)
system.in ethanol_216_run1.in (Simulation setup)
Now we have to edit the last file, „ethanol_216_run1.in‟, in order to modify the file names
accordingly and, above all, to define the conditions for the run that we want to perform. The file
could look like this:
read_data "ethanol_216.data"
include "ethanol_oplsaa.coeffs"
# Output params
thermo 10
thermo_style custom step temp press vol etotal ke pe ebond eangle edihed emol evdwl ecoul elong
dump 1 all custom 1000 ethanol.trj id mol type x y z ix iy iz
# Run simulation
velocity all create ${temp} 102486 mom yes rot yes dist gaussian
run ${steps}
write_data ethanol_216_run1.data
write_restart ethanol_216_run1.restart
This will run a 50 ps NPT simulation at ambient conditions. We can start the run in the usual
way: >> lammps-daily < ethanol_216_run1.in ethanol_216_run1.log &
Once the run is over you can plot T, P, and V vs time-step in order to check how the system
evolves to equilibrium. Use VMD and the TkConsole with the command:
% topo readlammpsdata ethanol_216_run1.data full
to load and visualize the last configuration and confirm that the structure is disordered now.
Copy the data file to a new directory Run2 and edit it to change the box sides to 0.0 27.60 and to
remove the „Improper Coeffs‟ line. Copy to this directory the „ethanol_oplsaa.init‟ and
„ethanol_oplsaa.coeffs‟ files and change the input file in order to perform a 50 ps NVT run,
saving the trajectory every 100 steps. Run this simulation and check the results. How do they
compare with those given in Jorgensen‟s paper for ethanol?
4.3.2 NAMD
We can use the existing topology and parameter CHARMM files as our starting point. For
example, in the file „top_all36_cgenff.rtf‟ we find the definition for the ethanol residue:
* --------------------------------------------------------------------------
*
* CGenFF: Topology for the Charmm General Force Field v. 2b8
*
* for Small Molecule Drug Design
*
* --------------------------------------------------------------------------
*
*
36 1
...
RESI ETOH 0.00 ! C2H6O Ethanol, adm jr.
GROUP
ATOM C1 CG321 0.05 ! H21 H11 H12
ATOM O1 OG311 -0.65 ! \ \ /
ATOM HO1 HGP1 0.42 ! H22--C2--C1
ATOM H11 HGA2 0.09 ! / \
ATOM H12 HGA2 0.09 ! H23 O1--HO1
GROUP
ATOM C2 CG331 -0.27
ATOM H21 HGA3 0.09
ATOM H22 HGA3 0.09
ATOM H23 HGA3 0.09
BOND C1 C2 C1 O1 C1 H11 C1 H12 O1 HO1
BOND C2 H21 C2 H22 C2 H23
DONO HO1 O1
ACCE O1
! for ic build
IC O1 C1 C2 H21 0.0000 0.0000 180.0000 0.0000 0.0000
IC C1 H21 *C2 H22 0.0000 0.0000 120.0000 0.0000 0.0000
IC C1 H21 *C2 H23 0.0000 0.0000 240.0000 0.0000 0.0000
IC O1 C2 *C1 H11 0.0000 0.0000 240.0000 0.0000 0.0000
IC O1 C2 *C1 H12 0.0000 0.0000 120.0000 0.0000 0.0000
IC C2 C1 O1 HO1 0.0000 0.0000 180.0000 0.0000 0.0000
...
And the corresponding parameters are given in the file „par_all36_cgenff.prm‟. We could use
directly those files, but they correspond to the CHARMM general force field, so if we want to
simulate ethanol with the OPLS-AA potential we will need to adapt them. Thus we can create a
new topology file for ethanol, „ethanol_oplsaa.top‟, where we replace the partial charges by
those corresponding to the OPLS-AA model:
* ----------------------------------------------------------------------- *
* Ethanol residue with OPLS-AA charges *
* ----------------------------------------------------------------------- *
*
36 1
And we have to give the correct potential parameters as well, so write the following file
„ethanol_oplsaa.par‟:
* ----------------------------------------------------------------------- *
* CGenFF: Parameters for OPLS-All Atom model *
* ----------------------------------------------------------------------- *
*
BONDS
!
!V(bond) = Kb * (b - b0)**2
!
!Kb: kcal/mole/A**2
!b0: A
!
!atom types Kb b0
CG321 CG331 268.00 1.5290
CG321 OG311 320.00 1.4100
CG321 HGA2 340.00 1.0900
CG331 HGA3 340.00 1.0900
OG311 HGP1 553.00 0.9450
ANGLES
!
!V(angle) = Ktheta(Theta - Theta0)**2
!
!V(Urey-Bradley) = Kub(S - S0)**2
!
!Ktheta: kcal/mole/rad**2
!Theta0: degrees
!Kub: kcal/mole/A**2 (Urey-Bradley)
!S0: A
!
!atom types Ktheta Theta0 Kub S0
!
CG331 CG321 OG311 50.00 109.50
CG331 CG321 HGA2 37.50 110.70
HGA2 CG321 HGA2 33.00 107.80
OG311 CG321 HGA2 35.00 109.50
CG321 CG331 HGA3 37.50 110.70
HGA3 CG331 HGA3 33.00 107.80
CG321 OG311 HGP1 55.00 108.50
DIHEDRALS
!
!V(dihedral) = Kchi(1 + cos(n(chi) - delta))
!
!Kchi: kcal/mole
!n: multiplicity
!delta: degrees
!
!atom types Kchi n delta
HGA2 CG321 CG331 HGA3 0.159 3 0.00
OG311 CG321 CG331 HGA3 0.234 3 0.00
HGA2 CG321 OG311 HGP1 0.225 3 0.00
CG331 CG321 OG311 HGP1 -0.178 1 0.00
CG331 CG321 OG311 HGP1 -0.087 2 180.00
CG331 CG321 OG311 HGP1 0.246 3 0.00
!
! V(Lennard-Jones) = Eps,i,j[(Rmin,i,j/ri,j)**12 - 2(Rmin,i,j/ri,j)**6]
!
! epsilon: kcal/mole, Eps,i,j = sqrt(eps,i * eps,j)
! Rmin/2: A, Rmin,i,j = Rmin/2,i + Rmin/2,j
!
! atom ignored epsilon Rmin/2 ignored eps,1-4 Rmin/2,1-4
!
HGA2 0.0 -0.030 1.4031 ! alkane H
HGA3 0.0 -0.030 1.4031 ! alkane H
HGP1 0.0 -0.000 0.0000 ! polar H
CG321 0.0 -0.066 1.9643 ! RCH2OH
CG331 0.0 -0.066 1.9643 ! RCH3
OG311 0.0 -0.170 1.7510 ! hydroxyl O
END
And now we will use VMD to read the configuration produced by LAMMPS and write the PDB
and PSF files for NAMD. So copy the file „ethanol_216_run2.data‟ to the NAMD folder, start
VMD, launch the TkConsole and run the script „ethanol_data_to_pdb.tcl‟:
% source ethanol_data_to_pdb.tcl
Basically it reads the LAMMPS data file, gives the correct name and type to each atom (in
accordance with the topology and parameter files that we have defined), and writes the PDB and
PSF files that we need. So now we just need to prepare the control file for the simulation. Name
it „ethanol_216_run1.conf‟ and copy the following input:
# Conditions
set temperature 298.0
set pressure 1.0
set steps 50000
seed 12345
# Initial config
coordinates ethanol_216.pdb
temperature $temperature
# Box
cellBasisVector1 27.6 0 0
cellBasisVector2 0 27.6 0
cellBasisVector3 0 0 27.6
cellOrigin 0.0 0.0 0.0
wrapAll off
# Integrator params
timestep 1.0
nonbondedFreq 1
fullElectFrequency 1
# Force field params
paraTypeCharmm on
structure ethanol_216.psf
parameters ethanol_oplsaa.par
rigidBonds none
exclude scaled1-4
1-4scaling 0.5
switching on
switchdist 10.0
cutoff 12.0
pairlistdist 14.
stepspercycle 10
vdwGeometricSigma yes
# Output params
outputname ethanol_216_run1
restartname ethanol_216_run1_restart
binaryoutput no
DCDfreq 100
restartfreq 1000
outputEnergies 10
outputPressure 10
# PME electrostatics
PME yes
PMEGridSizeX 30
PMEGridSizeY 30
PMEGridSizeZ 30
# Run simulation
reinitvels $temperature
run $steps
As usual, once the run is over verify the final structure and the thermodynamic quantities. If you
have done the simulation with LAMMPS as well, a good exercise is to compare the different
energy components obtained with each code and check that they agree within the error bars.
4.3.3 DL_POLY
To prepare the DL_POLY files needed to simulate ethanol we will introduce a new useful
program, DL_FIELD. The DL_FIELD distribution includes an example with ethanol (+ 3 water
molecules), so you can go to the directory where the program is installed
(~/MDANSE2014/Software/dl_field_2.3/), edit the „dl_field.control‟ file to select the correct file
(Examples/ethanol.pdb in the 7th line) and type: >> dl_field_2.3
Three files named dl_field.CONFIG, dl_field.CONTROL and dl_field.FIELD will be created in
the output/ directory. The last file can be used as a reasonable template to write the FIELD that
we need, but we would be forced to change all the parameters of the potential, as it has been
created using the CHARMM force field. The force field library included with DL_POLY
contains also the OPLS-AA force field (and AMBER, Dreiding, and PCFF) as well, so we could
try to set „oplsaa‟ in the second line of „dl_field.control‟ and generate the right FIELD.
Unfortunately, if you try this, you will get an error message:
Error: The residue ETOH from user configuration is not available in the
Molecule database.
So we need to add our residue to the database in a way similar to the example given in the file
„dl_field.user‟. We will see now how to do this.
In a working directory (~/Examples/Ethanol/DL_POLY/ in the VM) we need to copy the lib/
directory containing the force field library and the files „dl_field.atom_type‟ and
„dl_field.control‟. We will copy also the PDB file obtained before and containing the positions of
216 ethanol molecules, „ethanol_216.pdb‟. Now we can create a new file „ethanol_oplsaa.udff‟
where we will define a new residue for ethanol. Copy the following contents:
POTENTIAL OPLSAA
As you can see, we simply need to define the ethanol molecule, paying attention to give the
correct atom types (CH3_aliphatic, C_alcohol3, etc.) which must correspond the the atom types
defined in the file DLPOLY_OPLSAA.sf (in the force field library). Then we need to give the
list of bonded atoms for each atom in the molecule and define the molecule type and key. Once
you are done, copy this file to a file named „dl_field.udff‟ (this name is compulsory for
DL_FIELD to find it). We need also to edit the file „dl_field.control‟ in order to tell the program
that it needs to include user-defined information (5th line of the file). It should look like this:
Ethanol
oplsaa * Type of force field require (see list below for choices).
1 * Bond type (0=default, 1=harmonic , 2=Morse)
1 * Angle type (0=default, 1=harmonic, 2=harmonic cos)
1 * Include user-defined information (dl_field.udff): 1=yes 0=no
1 * Verbosity mode: 1 = on, 0 = off
ethanol_216.pdb * Configuration file.
none * Output file in PDB. Put 'none' if not needed.
1 * Optimise FIELD output size, if possible? 1=yes 2=no
1 * Atom display: 1 = DL_FIELD format. 2 = Standard format
2 * Vdw display format: 1 = 12-6 format 2 = LJ format
0 * Display additional info. for protein 1=Yes 0=No
0 * Freeze atoms? 1 = Yes (see below) 0 = No
0 * Tether atoms? 1 = Yes (see below) 0 = No
0 * Constrain bonds? 1 = Yes (see below) 0 = No
0 * Apply rigid body? 1 = Yes (see below) 0 = No
1 * Periodic condition ? 0=no, other number = type of box (see below)
27.60 0.00 0.00 * Cell vector a (x, y, z)
0.00 27.60 0.00 * Cell vector b (x, y, z)
0.00 0.00 27.60 * Cell vector c (x, y, z)
default * 1-4 scaling for coulombic (put default or x for scaling=x)
default * 1-4 scaling for vdw (put default or x for scaling=x)
1 * MM energy calculation. 1=Yes, 0=No
10.0 * Cut off for electrostatic energy calculation (angstrom)
10.0 * Cut off for vdw energy calculation (angstrom)
Now you are ready to try to run DL_FIELD in order to generate the input files that we need.
Type c and see what happens. You will find out that the program complains because some
parameters are missing. A typical error message will look like this:
Assigning potential parameters...
Error: hc c3 ca3 angle parameter not found.
of angle array index = 3 out of total angle = 13
Atom index: 5-1-2
The reason for this is that the file „DLPOLY_OPLSAA.par‟ does not contain those parameters,
as they are not present in the set of molecules defined in „DLPOLY_OPLSAA. This means that
we will have to look for appropriate values (e.g. using Jorgensen‟s paper) and complete the
database. You can edit the .par file and add the new parameters there, but here we will added
directly to our ethanol definition. Here is the list of missing interactions:
BOND k b0 remark
C3 CA3 268.00 1.529 missing bond in DLPOLY_OPLSAA.par
END BOND
ANGLE 0.5K theta0 Remark
CA3 C3 HC 37.50 110.70 missing angle
C3 CA3 HC 37.50 110.70 missing angle
C3 CA3 OA 50.00 109.50 missing angle
END ANGLE
DIHEDRAL V1 V2 V3 Remark
HC C3 CA3 HC 0.000 0.000 0.318 missing dihedral
HC C3 CA3 OA 0.000 0.000 0.468 missing dihedral
C3 CA3 OA HA -0.356 -0.174 0.492 missing dihedral
END DIHEDRAL
Copy them to the file „ethanol_oplsaa.udf‟ and the later to „dl_field.udff‟. With this we have
added our first molecule to our personal database and we are ready to run DL_FIELD. Type
again >> dl_field_2.3 and if everything goes as expected you should get a message saying that
the program was executed successfully and three new files in the output directory. Copy the
latter files to your working directory (e.g. Run1) with the names CONFIG, FIELD, and
CONTROL and edit the last one in order to give the commands required for our simulation. As
before, we want to simulate ethanol at 298 K and the correct experimental density (0.785 g/cm3)
in the NVT ensemble during 50 ps. Thus the CONTROL file could be like this one:
Ethanol 216 molecules
temperature 298.0
ensemble nvt berendsen 1.0
timestep 0.001
steps 60000
equilibration 10000
scale every 10
cutoff 12.0
delr 2.0
finish
This run will last several hours. As usual, once the run is over verify the final structure and the
thermodynamic quantities. If you have done the simulation with LAMMPS as well, a good
exercise is to compare the different energy components obtained with each code and check that
they agree within the error bars.
4.3.4 nMoldyn
Repeat the same set of analysis that you did for Ar and water.