(Note: These tutorials are meant to provide
illustrative examples of how to use the AMBER software suite to carry out
simulations that can be run on a simple workstation in a reasonable period of
time. They do not necessarily provide the optimal choice of parameters or
methods for the particular application area.)
Copyright Ross Walker 2004
6) A Practical Example (A-DNA)
In the final stage of this tutorial we will use the skills we have learnt to run a simulation on a more practical example. Up to this point we have been using canonical B-DNA as our starting structure. An alternative structure for DNA is canonical A-DNA, compared below:
|
|
Which structure is the more stable, however? In this section we will try running a long MD simulation starting from canonical A form poly(A)-poly(T). We will create the input structure, charge neutralise and solvate it in the same method as we learned earlier. We will then minimise and equilibrate the structure using the same approach as we did for the solvated B-DNA simulation and then finally we shall analyse the results and see what happened. For a more in-depth discussion of A-form and B-form DNA and MD simulations see Cheatham, T.E., Kollman, P.A. J. Mol. Biol., 1996, 259, 434-444.
Note, the simulations in this tutorial may take a considerable amount of time to run - the production MD simulations take about 57,000 s (15.8 hours) [31.6 s per ps] on eight 1.3GHz processors of a SGI ALTIX. While I urge you to set up and run these calculations the output files at each stage are provided so that you can continue with the remainder of the tutorial without waiting for the simulation to finish running.
The first stage of setting up our simulation is to obtain a starting structure. As before we shall use nucgen for this. Previously we instructed nucgen to construct our DNA 10-mer in right handed B-form geometry with the keyword $ABDNA. Changing this to $ADNA will give us what we want. (See the AMBER manual for more details on nucgen usage). Here is the input file for nucgen:
a-dna.nucgen.in |
NUC 1 D A5 A A A A A A A A A3 NUC 2 D T5 T T T T T T T T T3 END $ADNA |
Run nucgen to create our pdb file:
$AMBEROME/exe/nucgen -O -i a-dna.nucgen.in -o a-dna.nucgen.out -d $AMBERHOME/dat/leap/parm/nucgen.dat -p a-dna.nucgen.pdb
This should produce the following two files: a-dna.nucgen.out, a-dna.nucgen.pdb
Check the output file to make sure there are no error messages. Particularly check for the line that says:
GENERATING RIGHT HANDED A-DNA
We should also take a quick look at the pdb structure nucgen created for us. Is it what we want?
vmd a-dna.nucgen.pdb
Looks good - Right handed A-form, exactly what we want.
6.2) Creating the sander input files
The next step is to create the prmtop and inpcrd files for sander. As before we will use xleap for this. The first step is to add the missing TER card between residues 10 and 11 of the pdb. Here is the modified pdb file (a-dna.pdb).
Run xleap:
$AMBERHOME/exe/xleap -s -f $AMBERHOME/dat/leap/cmd/leaprc.ff99
Load our pdb file into a new unit called adna. In the main xleap window enter:
adna = loadpdb a-dna.pdb
Charge neutralise our system:
addions adna Na+ 0
Now we will save a prmtop and inpcrd file for our neutralised (but not solvated) system. We will use the prmtop file later when we visualise our trajectory.
saveamberparm adna a-dna_charges_only.prmtop a-dna_charges_only.inpcrd
Output files: a-dna_charges_only.prmtop, a-dna_charges_only.inpcrd
Next step, solvate our system with a truncated octahedral box containing an 8 angstrom buffer of TIP3P water:
solvateoct adna TIP3PBOX 8.0
This should have added about 3,572 water molecules. Edit the model to check everything looks ok:
edit adna
Finally we save our solvated prmtop and inpcrd files:
saveamberparm adna a-dna_wat.prmtop a-dna_wat.inpcrd
Output files: a-dna_wat.prmtop, a-dna_wat.inpcrd
6.3) Minimising the system
The next stage is to minimise the system prior to performing MD in order to remove any bad contacts created by hydrogenation and solvation. For this we will use the same protocol as we used in the previous section of this tutorial.
Stage 1: 500 steps of steepest descent followed by 500 steps of conjugate gradient minimisation, with a 500 kcal/mol restraint force on the DNA molecule (Res 1-20)
a-dna_min1.in |
A-DNA 10-mer: initial minimisation solvent + ions &cntrl imin = 1, maxcyc = 1000, ncyc = 500, ntb = 1, ntr = 1, cut = 10 / Hold the DNA fixed 500.0 RES 1 20 END END |
sander -O -i a-dna_min1.in -o a-dna_min1.out -p a-dna_wat.prmtop -c a-dna_wat.inpcrd -r a-dna_min1.rst -ref a-dna_wat.inpcrd
Input files: a-dna_min1.in,
a-dna_wat.prmtop,
a-dna_wat.inpcrd
Output files: a-dna_min1.out,
a-dna_min1.rst
This takes approximately 71.5 s on four 1.3 GHz processors of a SGI ALTIX.
Stage 2: 1,000 steps of steepest descent followed by 1,500 steps of conjugate gradient minimisation with no restraints
a-dna_min2.in |
A-DNA 10-mer: initial minimisation solvent + ions &cntrl imin = 1, maxcyc = 2500, ncyc = 1000, ntb = 1, ntr = 0, cut = 10 / |
sander -O -i a-dna_min2.in -o a-dna_min2.out -p a-dna_wat.prmtop -c a-dna_min1.rst -r a-dna_min2.rst
Input files: a-dna_min2.in,
a-dna_wat.prmtop,
a-dna_min1.rst
Output files: a-dna_min2.out,
a-dna_min2.rst
This takes approximately 166 s on four 1.3 GHz processors of a SGI ALTIX.
6.4) Running initial equilibration
The next stage is to start running molecular dynamics. We will initially run 20 ps of heating/equilibration at constant volume in the same fashion as we did for the B-DNA explicit water simulation. This will be done using the same protocols: i.e. 20 ps at constant volume periodic boundaries with weak (10 kcal/mol) restraints on the DNA. An initial temperature of 0 K, final temperature of 300 K, Langevin temperature controls, SHAKE constraints on hydrogen atoms and a 2 fs time step. To save disk space we will only write to our output and mdcrd files every 250 steps (0.5 ps). We will also compress our mdcrd file upon completion using gzip. This will give us a compression ratio of about 60 %. This is important because the next step will be to run 1.8 ns of MD which can create some very big mdcrd files:
a-dna_md1.in |
A-DNA 10-mer: 20ps MD with res on DNA &cntrl imin = 0, irest = 0, ntx = 1, ntb = 1, cut = 10, ntr = 1, ntc = 2, ntf = 2, tempi = 0.0, temp0 = 300.0, ntt = 3, gamma_ln = 1.0, nstlim = 10000, dt = 0.002, ntpr = 250, ntwx = 250, ntwr = 10000 / Keep DNA fixed with weak restraints 10.0 RES 1 20 END END |
sander -O -i a-dna_md1.in -o a-dna_md1.out -p a-dna_wat.prmtop -c a-dna_min2.rst -r a-dna_md1.rst -ref a-dna_min2.rst -x a-dna_md1.mdcrd
gzip -9 -v a-dna_md1.mdcrd
Input files: a-dna_md1.in,
a-dna_wat.prmtop,
a-dna_min2.rst
Output files: a-dna_md1.out,
a-dna_md1.rst, a-dna_md1.mdcrd.gz
This takes approximately 522 s (8 mins 42 s) on eight 1.3 GHz processors of a SGI ALTIX.
NOTE: For this section of the tutorial I have not included links to the mdcrd files since they are too big to reliable host on a webserver. Not having the mdcrd files will, if you haven't run the simulations yourself, prevent you from interactively conducting some of the analysis below. This should not be a problem, however, since I provide links to the analysis output files. I will also provide links later on to smaller versions of these mdcrd files that have the explict water removed. These will be sufficient to watch the dynamics in VMD.
6.5) Running further equilibration and production MD
In the next stage of our simulation we are going to run a further 1.8 ns of MD. Rather than run a separate equilibration and production phase we will, for convenience, combine the two since our production conditions will be identical to our final equilibration conditions. Here we will use the same conditions as we did for the B-DNA explicit solvent equilibration. Namely we will run without restraints on the DNA. We shall maintain the temperature at 300 K using the Langevin thermostat, run with constant pressure (1 atm) periodic boundaries, use SHAKE constraints on the hydrogen atoms and a 2 fs time step. Since we will be writing 1.8 ns of trajectory we will only write to our output and mdcrd files every 500 fs (250 steps). We shall also compress all of our files to save disk space as 1.8 ns of simulation will lead to some very big files.
Since this simulation will take a long time to run it is probably not a good idea to run all 1.8 ns in one shot. Thus we will actually run 9 x 200 ps simulations with each successive simulation continuing on from the previous one. In this way if a job crashes we only lose the progress of that job and can restart from where we got to. The input file will be the same in each case:
a-dna_md_1800ps.in |
A-DNA 10-mer: 200ps of MD &cntrl imin = 0, irest = 1, ntx = 7, ntb = 2, pres0 = 1.0, ntp = 1, taup = 2.0, cut = 10, ntr = 0, ntc = 2, ntf = 2, tempi = 300.0, temp0 = 300.0, ntt = 3, gamma_ln = 1.0, nstlim = 100000, dt = 0.002, ntpr = 250, ntwx = 250, ntwr = 10000 / |
We will run the 9 simulations one after the other using the restrt file from the previous run as the input for the next run. While you could run each of these jobs manually using a command line as below (with the relevant filenames incremented by one each time):
$AMBERHOME/exe/sander -O -i a-dna_md_1800ps.in -o a-dna_md2.out -p a-dna_wat.prmtop -c a-dna_md1.rst -r a-dna_md2.rst -x a-dna_md2.mdcrd
gzip -9 -v a-dna_md2.mdcrd
It is probably best to write a script to run all of these jobs for us so that we can leave it running over night. Here is an example script:
run_a-dna_20-1820ps.x |
#!/bin/csh set AMBERHOME="/usr/local/AMBER8" set MDSTARTJOB=2 set MDENDJOB=10 set MDCURRENTJOB=$MDSTARTJOB set MDINPUT=0 echo -n "Starting Script at: " date echo "" while ( $MDCURRENTJOB <= $MDENDJOB ) echo -n "Job $MDCURRENTJOB started at: " date @ MDINPUT = $MDCURRENTJOB - 1 $AMBERHOME/exe/sander -O -i a-dna_md_1800ps.in \ -o a-dna_md$MDCURRENTJOB.out \ -p a-dna_wat.prmtop \ -c a-dna_md$MDINPUT.rst \ -r a-dna_md$MDCURRENTJOB.rst \ -x a-dna_md$MDCURRENTJOB.mdcrd gzip -9 -v a-dna_md$MDCURRENTJOB.mdcrd echo -n "Job $MDCURRENTJOB finished at: " date @ MDCURRENTJOB = $MDCURRENTJOB + 1 end echo "ALL DONE" |
In order to be able to run this script we have to ensure that it is executable:
chmod +x run_a-dna_20-1820ps.x
Now we will run the script - we use nohup so that we can logout without killing the job. The ampersand '&' means run in background. We 'pipe' the output to a log file so that we can simply look at the log file to see the current progress of the job:
nohup ./run_a-dna_20-1820ps.x >& run.log &
This will take a long time to run. Each 200ps stage takes approximate 5,700s (1.58 hours) on eight 1.3 GHz processors of a SGI ALTIX.
Here are all of the output files:
Time Span (ps) | mdout | rstrt | mdcrd |
20 - 220 | a-dna_md2.out | a-dna_md2.rst | a-dna_md2.mdcrd.gz |
220 - 420 | a-dna_md3.out | a-dna_md3.rst | a-dna_md3.mdcrd.gz |
420 - 620 | a-dna_md4.out | a-dna_md4.rst | a-dna_md4.mdcrd.gz |
620 - 820 | a-dna_md5.out | a-dna_md5.rst | a-dna_md5.mdcrd.gz |
820 - 1020 | a-dna_md6.out | a-dna_md6.rst | a-dna_md6.mdcrd.gz |
1020 - 1220 | a-dna_md7.out | a-dna_md7.rst | a-dna_md7.mdcrd.gz |
1220 - 1420 | a-dna_md8.out | a-dna_md8.rst | a-dna_md8.mdcrd.gz |
1420 - 1620 | a-dna_md9.out | a-dna_md9.rst | a-dna_md9.mdcrd.gz |
1620 - 1820 | a-dna_md10.out | a-dna_md10.rst | a-dna_md10.mdcrd.gz |
6.6) Analysing the results
The next stage is to analyse our results and look for anything interesting. First of all lets process the output files:
mkdir summary
cd summary
process_mdout.perl ../a-dna_md1.out ../a-dna_md2.out ../a-dna_md3.out ../a-dna_md4.out ../a-dna_md5.out ../a-dna_md6.out ../a-dna_md7.out ../a-dna_md8.out ../a-dna_md9.out ../a-dna_md10.out
Tar file containing the summary files: a-dna_output_summary.tar.gz
Take a look at the summary files and you will see that after initial relaxation the temperature, pressure, volume etc stay fairly constant. This indicates that there are no major problems with the simulation.
Lets calculate the RMSd of the backbone atoms and see what is happening to the DNA structure. While we are at it we will also get ptraj to re-image the whole trajectory and also remove the waters from the trajectory file. The reason we do this, instead of just hiding the waters in VMD, is that our mdcrd files are now considerably bigger than previously and loading everything into VMD may cause it to run very slowly. It is also fiddly to load 10 separate mdcrd files into a single molecule. This is why we needed the a-dna_charges_only.prmtop file. Thus we will get ptraj to strip the waters and append all of the mdcrd files to a single file. Here is the input file for ptraj (note we don't need to unzip the files before use, ptraj does this on the fly):
combine_mdcrds_and_strip.ptraj |
trajin a-dna_md1.mdcrd.gz trajin a-dna_md2.mdcrd.gz trajin a-dna_md3.mdcrd.gz trajin a-dna_md4.mdcrd.gz trajin a-dna_md5.mdcrd.gz trajin a-dna_md6.mdcrd.gz trajin a-dna_md7.mdcrd.gz trajin a-dna_md8.mdcrd.gz trajin a-dna_md9.mdcrd.gz trajin a-dna_md10.mdcrd.gz trajout a-dna_0-1820ps_no_wat.mdcrd rms first out a-dna_0-1820ps.rmsfit @P,O3',O5',C3',C4',C5' time 0.50 center :1-20 image familiar strip :WAT |
ptraj a-dna_wat.prmtop < combine_mdcrds_and_strip.ptraj
Output files: a-dna_0-1820ps.rmsfit, a-dna_0-1820ps_no_wat.mdcrd.gz [19MB]
First of all, lets plot the backbone atom RMSd fit:
xmgr a-dna_0-1820ps.rmsfit
Interesting... The final structure differs from the initial structure by over 5 angstroms. This is quite a large RMSd and would in some cases indicate a serious problem with the MD simulation. However, the RMSd increases slowly between 20 ps (when the restraints were switched off) and 500 ps. From 500 ps to end of the simulation the RMSd remains more or less constant at between 4.5 and 5.5 angstroms. This suggests that our starting structure changed slowly changed over the first 500 ps of our simulation into a more stable structure. This more stable structure then remains intact for the remainder of the simulation. Could it be that our A-form DNA converted to B-form? Lets visualise the trajectory and note what we see.
As already mentioned the trajectories including water are a little large to open comfortably in VMD so we shall use the stripped trajectory ptraj created for us. For this we need to use the prmtop file we created before solvation a-dna_charges_only.prmtop.
vmd
Load the a-dna_charges_only.prmtop file as parm7. Then load the a-dna_0-1820ps_no_wat.mdcrd file into this molecule as crdbox (Make sure you unzip it first). You should find that the trajectory loads just showing the DNA and charges (no water). Take a good look at the trajectory, have a look down the minor groove of the DNA and observe what happens to the structure:
0 ps Minor Groove (A-Form) |
0 ps Major Groove (A-Form) |
100 ps Minor Groove (Still A-Form) |
200 ps Minor Groove (Still A-Form) |
300 ps Minor Groove (A-Form) |
400 ps Minor Groove (Distorted A-Form) |
500 ps Minor Groove (B-Form) |
600 ps Minor Groove (B-Form) |
700 ps Minor Groove (B-Form) |
800 ps Minor Groove (B-Form) |
1820 ps Minor Groove (B-Form) |
1820 ps Major Groove (B-Form) |
As you can see somewhere between step 600 and 800 (300 ps and 400 ps) our structure begins to change from A-form DNA to B-form, you may need to rotate the structure in order for this to become apparent. By step 1,000 (500 ps) the conversion is complete. From this point on the structure of our DNA remains in B-form.
This demonstrates a practical use of such an MD simulation. We have been able to use molecular dynamics to move away from a local minima (our A-form DNA) to a lower minima (B-form DNA).
Finally, lets calculate the average structure over the 1,800 ps of our simulation in which the DNA was not restrained, mdcrd files 2 to 10. Again, for this we will use ptraj. Here is the input file to create an average pdb from all 3600 structures in our constant pressure mdcrd trajectory file:
a-dna_calc_av_struc.ptraj |
trajin a-dna_md2.mdcrd.gz trajin a-dna_md3.mdcrd.gz trajin a-dna_md4.mdcrd.gz trajin a-dna_md5.mdcrd.gz trajin a-dna_md6.mdcrd.gz trajin a-dna_md7.mdcrd.gz trajin a-dna_md8.mdcrd.gz trajin a-dna_md9.mdcrd.gz trajin a-dna_md10.mdcrd.gz average a-dna_20-1820ps_average.pdb pdb nowrap |
ptraj a-dna_wat.prmtop < a-dna_calc_av_struc.ptraj
Output file: a-dna_20-1820ps_average.pdb
vmd a-dna_20-1820ps_average.pdb
And there you have it, the average structure over the entire constant pressure stage of our simulation.
You now have the necessary knowledge to set-up and run your own simulations. Have a go and see how you get on. To learn about other aspects of AMBER 8.0, including more complex simulations that involve creating your own residues please see the other workshop tutorials (http://www.rosswalker.co.uk/tutorials/amber_workshop/).
Feel free to use what you have learned to experiment with alternative starting geometries. Do they all change to B-DNA if run for long enough? What about in a mixture of 80% water, 20% ethanol?
If you have any comments on this tutorial please email me (Ross Walker).
(Note: These tutorials are meant to provide
illustrative examples of how to use the AMBER software suite to carry out
simulations that can be run on a simple workstation in a reasonable period of
time. They do not necessarily provide the optimal choice of parameters or
methods for the particular application area.)
Copyright Ross Walker 2004