How to do dockings
Contents
Prerequisite
Prepare receptor
You can use ADFSuite (Download here) or raccoon. In general you have to convert PDB to PDBQT which contains partial charge info.
Checklist
- Open your protein PDB in Chimera/VMD/PyMOL and LOOK AT IT
- Any missing residues
- Any unwanted ligands
- Add hydrogens (using Chimera/Gromacs/MGLTools)
Install
./install.sh -d ~/opt/ADFRsuite-1.0 -c 0
Usage
prepare_receptor -v -r <Input PDB> -o receptor.pdbqt
Prepare ligands
You can use ADFSuite (Download here) or Openbabel.
Usage
prepare_ligand -v -l <Input PDB>
Benchmark
Autodock 4: 50 LGA, 2,500,000 evaluation each
Autodock Vina: max. mode=50
Resolution=0.375 A
Box-pointse=80x80x80
Box-size=30x30x30 (A^3)
(compounds/day) | OSC (30 nodes) | Amber machine (1 node, 4 GPUs) |
---|---|---|
Autodock 4 | 79,120 (Potentially 1.3x) | |
Vina | 300,000 | 13,186 |
It seems that Vina can be accelerated by ~50% if you compile it well (Read this), but still lagging behind Autodock-GPU (Git here) on a multi-GPU machine.
Libraries
Glide
Ligprep
$SCHRODINGER/ligprep -WAIT -HOST ${host}:${nproc} -isd combine.sdf -omae out.maegz -bff 14 -i 2 -W i,-ph,7.0,-pht,0.0 -ac -s 10
Create a grid box
$SCHRODINGER/utilities/prepwizard recpt.pdb recpt.maegz
Feed this to glide
USECOMPMAE YES INNERBOX 15, 15, 15 ACTXRANGE 26.407652 ACTYRANGE 26.407652 ACTZRANGE 26.407652 GRID_CENTER 192, 202, 216 OUTERBOX 26.407652, 26.407652, 26.407652 GRIDFILE "grid.zip" RECEP_FILE "recpt.maegz"
The gridprep command
$SCHRODINGER/glide -WAIT -HOST ${host}:${nproc} gridprep.in
Docking
#PBS -A PAA0203 #PBS -N SDGR-dock #PBS -j oe #PBS -m ae #PBS -M chan.773@osu.edu #PBS -l walltime=1:00:00 #PBS -l mem=4315MB #PBS -l software=glide+1 #PBS -S /bin/sh qstat -f $PBS_JOBID module load schrodinger echo "Success loading schrodinger" cd $PBS_O_WORKDIR echo -e "Working in $(pwd)" host=`cat $PBS_NODEFILE|head -1` nproc=`cat $PBS_NODEFILE|wc -l` $SCHRODINGER/glide -WAIT -HOST ${host}:${nproc} docking.in date tmp=$(date +%Y%m%d%H%M%S) mkdir -p $tmp cd $tmp $SCHRODINGER/run split_structure.py ../docking_pv.maegz out.pdb -m pdb -many_files for ii in out_ligand*.pdb; do prefix=$(basename $ii .pdb) $SCHRODINGER/utilities/structconvert -ipdb $ii -omol $prefix.mol2 done
$ cat docking.in GRIDFILE grid.zip DOCKING_METHOD confgen PRECISION XP LIGANDFILE lig.maegz WRITEREPT True
Check Running Glide
If you wanna check who is running glide (as OSC only bought 10 licenses):
licadmin STAT
Therefore you may NOT be able to run glide at the OSC master node (even for testing). Also, remember to add a PBS option when you submit glide to avoid job cancellation due to too many people running glide
#PBS -l software=glide+1
Autodock
Compilation
Manual here
make DEVICE=GPU NUMWI=128
Get binding poses
A typical Autodock output looks like this:
MODEL 801 USER Run = 801 USER Cluster Rank = 1 USER Number of conformations in this cluster = 22 USER USER RMSD from reference structure = 58.478 A USER USER Estimated Free Energy of Binding = -11.77 kcal/mol [=(1)+(2)+(3)-(4)] USER Estimated Inhibition Constant, Ki = 2.38 nM (nanomolar) [Temperature = 298.15 K] USER USER (1) Final Intermolecular Energy = -13.85 kcal/mol USER vdW + Hbond + desolv Energy = -13.61 kcal/mol USER Electrostatic Energy = -0.24 kcal/mol USER (2) Final Total Internal Energy = -2.04 kcal/mol USER (3) Torsional Free Energy = +2.09 kcal/mol USER (4) Unbound System's Energy [=(2)] = -2.04 kcal/mol USER USER USER USER DPF = WD40.dpf USER NEWDPF move teg.pdbqt USER NEWDPF about 3.758800 0.183900 -6.839200 USER NEWDPF tran0 9.906563 33.794136 47.044031 USER NEWDPF axisangle0 -0.222817 -0.959206 0.174002 107.688080 USER NEWDPF quaternion0 -0.179906 -0.774476 0.140492 0.589985 USER NEWDPF dihe0 -162.13 -159.47 -121.18 -26.65 178.53 33.91 99.44 USER USER x y z vdW Elec q RMS ATOM 1 C15 RMT 0 13.047 32.564 44.515 -0.28 -0.00 +0.075 58.478
You will want to fetch the lines start with "ATOM" and remove the last two columns to have:
ATOM 1 C15 RMT 0 13.047 32.564 44.515 -0.28 -0.00 . . . TER
Remember to add "TER" after the coordinates.
awk '{print gensub (/[^[:blank:]]+/, " ", 11)}'
is useful to remove unwanted columns (the 11th) from coordinates produced by docking softwares.
Autodock Vina
Prepare all ligands (.pdbqt) in a directory ligand/pdbqt.
Prepare all receptors (.pdbqt) in a directory receptor.
Split ligands by
python2 split.py
The configuration file looks like this:
center_x = 77.009 center_y = 80.221 center_z = 60.623 size_x = 28 size_y = 28 size_z = 26 num_modes = 100
Usage
The basic run command is
vina --config <config> --receptor <pdbqt> --ligand <pdbqt> --out out.pdbqt --log log.txt
Get results using python
#! /usr/bin/env python import sys import glob def doit(n): file_names = glob.glob('*/*.pdbqt') everything = [] failures = [] print 'Found', len(file_names), 'pdbqt files' for file_name in file_names: file = open(file_name) lines = file.readlines() file.close() try: line = lines[1] result = float(line.split(':')[1].split()[0]) everything.append([result, file_name]) except: failures.append(file_name) everything.sort(lambda x,y: cmp(x[0], y[0])) part = everything[:n] for p in part: print p[1], print if len(failures) > 0: print 'WARNING:', len(failures), 'pdbqt files could not be processed' if __name__ == '__main__': doit(int(sys.argv[1]))
Combine the ligand with the receptor (proteins)
- Make a namelist
- grep receptor coordinates from receptor.pdbqt
- grep ligand coordinates from out.dat
- cat together
Rendering
It is assumed that the receptors are pre-aligned.
- Prepare a sample VMD visualisation state
- rename the ligand to LIG
- standard render
Contact Statistics
General steps are:
- Get ready with PDBs combining the ligands and the receptor
- make_res.sh: using VMD to print a list of residue ID which is within 3 A of the ligand
- make_reslist.py: turn the ResID list into histogram
- Plot with plot_bar.m using MATLAB
Select out the top-ranked ligands
Make a namelist
head -25 sort | awk '{print $1}' | cut -d/ -f1-5 > tmp
Move the directories
for ii in $(cat tmp); do cp -r $ii res; done