How to do umbrella sampling
Amber
PMF over RMSD
Prepare AMBER topologies
for ii in $(seq 1 32) do mkdir $ii cd $ii cat > tclfix << EOF # Fix your PDB file here so that it can be recognised by tleap mol new ../../pdb/crd.pdb.$ii type pdb set sel [atomselect top all] \$sel set beta 0 set sel [atomselect top "not hydrogen"] \$sel set beta 1 set sel [atomselect top "resid 1334 and name P OP1 OP2"] \$sel set beta 0 set sel [atomselect top "resid 1 to 1300"] \$sel set chain A set sel [atomselect top "resid 1301 to 1333"] \$sel set chain B set sel [atomselect top "resid 1334 to 1356"] \$sel set chain C set sel [atomselect top "resid 1357 to 1379"] \$sel set chain D set sel [atomselect top "beta > 0"] \$sel writepdb mol.pdb quit EOF cat > leapin << EOF # Tleap parameter file; Box size, ion concentration, etc. source leaprc.protein.ff14SB source leaprc.RNA.OL3 source leaprc.DNA.bsc1 source leaprc.water.tip3p mol = loadpdb mol.pdb solvatebox mol TIP3PBOX 10 addions mol K+ 157 Cl- 93 saveamberparm mol ionized.parm7 ionized.rst7 quit EOF vmd -dispdev text -e tclfix tleap -f leapin rm -f leapin tclfix mol.pdb cd .. done
Prepare the collective variable (e.g. RMSD)
for ii in $(seq 3 32) do cd $ii [ ! -f ionized.parm7 ] && echo "ionized.parm7 does not exist!" && exit 1 [ ! -f ionized.rst7 ] && echo "ionized.rst7 does not exist!" && exit 1 mv ionized.parm7 ionized.parm7.$ii mv ionized.rst7 ionized.rst7.$ii cat > tcl << 'EOF' mol new ionized.parm7.NN type parm7 mol addfile ionized.rst7.NN type rst7 set outf [open "cv.in" "w"] puts $outf "&colvar cv_type='MULTI_RMSD'," set sel [atomselect top "protein and resid 892 to 1078 1254 to 1300 and name CA"]. # Define the residues you wanna restrain the RMSD puts $outf "cv_ni=[$sel num]," puts $outf "cv_nr=[expr [$sel num] * 3]," puts -nonewline $outf "cv_i=" foreach ii [$sel get index] { set aa [atomselect top "index $ii"] puts -nonewline $outf "[$aa get serial]," } puts -nonewline $outf "\ncv_r=" foreach ii [$sel get index] { set aa [atomselect top "index $ii"] puts $outf "[$aa get x], [$aa get y], [$aa get z]," } puts $outf "\nanchor_position=-3,0,0,3,\nanchor_strength=5,5, /" # Define the strength in kcal/mol/A2 close $outf quit EOF sed "s/NN/$ii/g" tcl > tcl2 vmd -dispdev text -e tcl2 rm -f tcl* cd .. done
PMF over distance
Pull to create windows
You may pull using your preferred MD engines, e.g. NAMD SMD/Targeted MD or Gromacs Essential Dynamics. Created a series of PDB files (removing hydrogens) and then
$ cat step1.sh jj=0 for ii in $(seq 32.0 1.5 60.5) do ipdb=$(($jj * 7)) echo $jj $ii $ipdb rm -rf $ii mkdir -p $ii cp ../pulling/${ipdb}.pdb $ii/mol.pdb cd $ii; vmd -dispdev text mol.pdb -e ../fix.tcl; tleap -f ../build.leap; cd .. jj=$(($jj+1)) done
Minimization, annealing and equalibration with CA restrained
Double check your COM_dist.RST
$ cat COM_dist.RST &rst iat=-1,-1, r1=-199.0, r2=DISTHERE, r3=DISTHERE, r4=199.0, rk2=2.5, rk3=2.5, iresid=0, fxyz=1,1,1, outxyz=1, igr1=1,2,3, igr2=4,5,6, /
Then simply start parallel simulations of the windows
$ cat step2.sh jj=0 for ii in 50.0 48.5 47.0 45.5 do export CUDA_VISIBLE_DEVICES=$jj cp *.in COM_dist.RST $ii cd $ii sed -i "s/DISTHERE/${ii}/g" COM_dist.RST bash ../run.sh & cd .. jj=$((jj+1)) done
Restrain to the window centers before sampling
$ cat 05_Pull.in Pull 303K &cntrl imin=0, ntx=5, irest=1, ntc=2, ntf=2, tol=0.0000001, nstlim=50000, ntt=3, gamma_ln=1.0, temp0=303.0, ntpr=1000, ntwr=500000, ntwx=1000, dt=0.002, ig=-1, ntb=2, cut=10.0, ioutfm=1, ntxo=2, nmropt=1, ntp=1, pres0=1.0, comp=44.6, taup=2.0, / &wt type='DUMPFREQ', istep1=10 / &wt type='END', / DISANG=COM_dist.RST DUMPAVE=05_Pull_dist.dat LISTIN=POUT LISTOUT=POUT / / &ewald skinnb=3.0, /
Sampling
Same as 05_Pull.in except running longer.
Analysis using WHAM
Install WHAM from here.
First you will have to prepare data files from AMBER outputs
$ cat run_prepare.sh for ii in 38.0 51.5 53.0 do awk '{print $1,"",$8}' ../$ii/06_Prod_dist.dat > ../$ii/dist.dat done
Then calculate PMF using WHAM
$ cat run_pmf.sh wham 0 32 320 0.00000001 303 0 metadata out.pmf sed '1d' out.pmf | awk '{print $1,"",$2}' > plot_free_energy.dat
metadata looks like this
../60.5/dist.dat 60.5 5.0 ../59.0/dist.dat 59.0 5.0 ../57.5/dist.dat 57.5 5.0
Gromacs
Everyone should look that Justin's tutorial first: [1]
Choosing a reaction coordinate
Pull to create windows
Run MD
MD Inputs
Include this in mdp
; Pull code pull = yes pull_ngroups = 2 pull_ncoords = 1 pull_group1_name = Protein_&_r_918-923 pull_group2_name = Protein_&_r_396-407 pull_coord1_type = umbrella ; harmonic biasing force pull_coord1_geometry = distance ; simple distance increase pull_coord1_groups = 1 2 pull_coord1_dim = N N Y pull_coord1_rate = 0.0 pull_coord1_k = 1000 ; kJ mol^-1 nm^-2 pull_coord1_start = yes ; define initial COM distance > 0
Analyze using WHAM
cat tpr-files.dat '../output/umbrella114.tpr' cat pullf-files.dat 'pullf-umbrella114.xvg' gmx wham -it tpr-files.dat -if pullf-files.dat -o -hist -unit kCal