How to do umbrella sampling

From ChengLab
Revision as of 02:24, 10 March 2020 by Kevin (talk | contribs) (Amber)
Jump to: navigation, search

Amber

Tutorial here Git here

Pull to create windows

You may pull using your preferred MD engines, e.g. NAMD SMD/Targeted MD or Gromacs Essential Dynamics. Created a series of PDB files (removing hydrogens) and then

$ cat step1.sh
jj=0
for ii in $(seq 32.0 1.5 60.5)
do
   ipdb=$(($jj * 7))
   echo $jj $ii $ipdb
   rm -rf $ii
   mkdir -p $ii 
   cp ../pulling/${ipdb}.pdb $ii/mol.pdb
   cd $ii; vmd -dispdev text mol.pdb -e ../fix.tcl; tleap -f ../build.leap; cd ..
   jj=$(($jj+1))
done

Minimization, annealing and equalibration with CA restrained

Double check your COM_dist.RST

$ cat COM_dist.RST
&rst
iat=-1,-1,
r1=-199.0,
r2=DISTHERE,
r3=DISTHERE,
r4=199.0,
rk2=2.5,
rk3=2.5,
iresid=0,
fxyz=1,1,1,
outxyz=1,
igr1=1,2,3,
igr2=4,5,6,
/

Then simply start parallel simulations of the windows

$ cat step2.sh
jj=0
for ii in 50.0 48.5 47.0 45.5
do
   export CUDA_VISIBLE_DEVICES=$jj
   cp *.in COM_dist.RST $ii
   cd $ii
   sed -i "s/DISTHERE/${ii}/g" COM_dist.RST
   bash ../run.sh &
   cd ..
   jj=$((jj+1))
done

Restrain to the window centers before sampling

$ cat 05_Pull.in
Pull 303K
&cntrl
  imin=0, ntx=5, irest=1,
  ntc=2, ntf=2, tol=0.0000001,
  nstlim=50000, ntt=3, gamma_ln=1.0,
  temp0=303.0,
  ntpr=1000, ntwr=500000, ntwx=1000,
  dt=0.002, ig=-1,
  ntb=2, cut=10.0, ioutfm=1, ntxo=2,
  nmropt=1,
  ntp=1, pres0=1.0, comp=44.6, taup=2.0,
/
&wt type='DUMPFREQ', istep1=10 /
&wt type='END', /
DISANG=COM_dist.RST
DUMPAVE=05_Pull_dist.dat
LISTIN=POUT
LISTOUT=POUT
/
/
&ewald
 skinnb=3.0,
/

Sampling

Same as 05_Pull.in except running longer.

Analysis using WHAM

Install WHAM from here.

First you will have to prepare data files from AMBER outputs

$ cat run_prepare.sh
for ii in 38.0 51.5 53.0
do
   awk '{print $1,"",$8}' ../$ii/06_Prod_dist.dat > ../$ii/dist.dat
done

Then calculate PMF using WHAM

$ cat run_pmf.sh
wham 0 32 320 0.00000001 303 0 metadata out.pmf
sed '1d' out.pmf | awk '{print $1,"",$2}' > plot_free_energy.dat

metadata looks like this

../60.5/dist.dat 60.5 5.0
../59.0/dist.dat 59.0 5.0
../57.5/dist.dat 57.5 5.0

Gromacs

Everyone should look that Justin's tutorial first: [1]

Choosing a reaction coordinate

Pull to create windows

Run MD

MD Inputs

Include this in mdp

; Pull code
pull                    = yes
pull_ngroups            = 2
pull_ncoords            = 1
pull_group1_name        = Protein_&_r_918-923
pull_group2_name        = Protein_&_r_396-407
pull_coord1_type        = umbrella      ; harmonic biasing force
pull_coord1_geometry    = distance      ; simple distance increase
pull_coord1_groups      = 1 2
pull_coord1_dim         = N N Y
pull_coord1_rate        = 0.0
pull_coord1_k           = 1000          ; kJ mol^-1 nm^-2
pull_coord1_start       = yes           ; define initial COM distance > 0

Analyze using WHAM

cat tpr-files.dat
'../output/umbrella114.tpr'

cat pullf-files.dat
'pullf-umbrella114.xvg'

gmx wham -it tpr-files.dat -if pullf-files.dat -o -hist -unit kCal