How to do dockings
Contents
Prerequisite
Prepare receptor
ADFSuite (Download here)
Prepare ligands
Openbabel
Benchmark
Autodock 4: 50 LGA, 2,500,000 evaluation each
Autodock Vina: max. mode=50
Resolution=0.375 A
Box-pointse=80x80x80
Box-size=30x30x30 (A^3)
(compounds/day) | OSC (30 nodes) | Amber machine (1 node, 4 GPUs) |
---|---|---|
Autodock 4 | 79,120 (Potentially 1.3x) | |
Vina | 300,000 | 13,186 |
It seems that Vina can be accelerated by ~50% if you compile it well (Read this), but still lagging behind Autodock-GPU (Git here) on a multi-GPU machine.
Libraries
Run Glide
Ligprep
ligprep -isd SP_input_cluster1_0.sdf -omae ligprep_cluster1_0-out.maegz -bff 14 -i 2 -ph 7.4 -pht 0.0 -ac -s 5
Create a grid box
Feed this to glide
GLIDECONS YES GLIDE_NTOTALCONS 1 GLIDE_NUMPOSITCONS 1 GRIDLIG NO USECOMPMAE YES INNERBOX 10, 10, 10 ACTXRANGE 26.407652 ACTYRANGE 26.407652 ACTZRANGE 26.407652 GRID_CENTER 6.928290, 6.346953, 21.799812 OUTERBOX 26.407652, 26.407652, 26.407652 GRIDFILE "factorXa_grid.zip" RECEP_FILE "factorXa_grid.maegz" GLIDECONSNAMES "position1", "S1_site" GLIDEXVOLNAMES "volume1"
Docking
glide -WAIT -HOST ${host}:${nproc} SP_cluster1_0.in
$ cat SP_cluster1_0.in GRIDFILE /users/PAS1146/osu0316/full_nci_database/cluster1/cluster1.zip DOCKING_METHOD confgen PRECISION SP LIGANDFILE /users/PAS1146/osu0316/full_nci_database/cluster1/SP/ligprep_cluster1_0-out.maegz OUTPUTDIR /users/PAS1146/osu0316/full_nci_database/cluster1/SP/ WRITEREPT True
PBS script
#PBS -A PAA0030 #PBS -N SP_cluster1_0 #PBS -j oe #PBS -l walltime=100:00:00 #PBS -l nodes=1:ppn=1 #PBS -S /bin/sh #PBS -l software=glide+1 qstat -f $PBS_JOBID module load schrodinger cd $PBS_O_WORKDIR host=`cat $PBS_NODEFILE|head -1` nproc=`cat $PBS_NODEFILE|wc -l` glide -WAIT -HOST ${host}:${nproc} SP_cluster1_0.in
Check Running Glide
If you wanna check who is running glide (as OSC only bought 10 licenses):
licadmin STAT
Therefore you may NOT be able to run glide at the OSC master node (even for testing). Also, remember to add a PBS option when you submit glide to avoid job cancellation due to too many people running glide
#PBS -l software=glide+1
Run Autodock
Compilation
Manual here
make DEVICE=GPU NUMWI=128
Run Autodock Vina
Prepare all ligands (.pdbqt) in a directory ligand/pdbqt.
Prepare all receptors (.pdbqt) in a directory receptor.
Split ligands by
python2 split.py
The configuration file looks like this:
center_x = 77.009 center_y = 80.221 center_z = 60.623 size_x = 28 size_y = 28 size_z = 26 num_modes = 100
Single receptor
You may find the template vina run script here.
Multiple receptors
You may find the template vina run script (multiple) here.
Define path to receptors (RECP), configuration file (CONF), and output directory (OUTPUT).
The basic run command is
vina --config <config> --receptor <pdbqt> --ligand <pdbqt> --out out.pdbqt --log log.txt
You may find the template job script here.
Prepare (and submit) the vina run scripts by
sh make_vina_rec.sh
Get results by
sh get_result_multiple.sh
Get results using python
#! /usr/bin/env python import sys import glob def doit(n): file_names = glob.glob('*/*.pdbqt') everything = [] failures = [] print 'Found', len(file_names), 'pdbqt files' for file_name in file_names: file = open(file_name) lines = file.readlines() file.close() try: line = lines[1] result = float(line.split(':')[1].split()[0]) everything.append([result, file_name]) except: failures.append(file_name) everything.sort(lambda x,y: cmp(x[0], y[0])) part = everything[:n] for p in part: print p[1], print if len(failures) > 0: print 'WARNING:', len(failures), 'pdbqt files could not be processed' if __name__ == '__main__': doit(int(sys.argv[1]))
Move selected directories
Make a namelist
head -25 sort | awk '{print $1}' | cut -d/ -f1-5 > tmp
Move the directories
for ii in $(cat tmp); do cp -r $ii res; done
Looking at the results
Get binding poses
A typical Autodock output looks like this:
MODEL 801 USER Run = 801 USER Cluster Rank = 1 USER Number of conformations in this cluster = 22 USER USER RMSD from reference structure = 58.478 A USER USER Estimated Free Energy of Binding = -11.77 kcal/mol [=(1)+(2)+(3)-(4)] USER Estimated Inhibition Constant, Ki = 2.38 nM (nanomolar) [Temperature = 298.15 K] USER USER (1) Final Intermolecular Energy = -13.85 kcal/mol USER vdW + Hbond + desolv Energy = -13.61 kcal/mol USER Electrostatic Energy = -0.24 kcal/mol USER (2) Final Total Internal Energy = -2.04 kcal/mol USER (3) Torsional Free Energy = +2.09 kcal/mol USER (4) Unbound System's Energy [=(2)] = -2.04 kcal/mol USER USER USER USER DPF = WD40.dpf USER NEWDPF move teg.pdbqt USER NEWDPF about 3.758800 0.183900 -6.839200 USER NEWDPF tran0 9.906563 33.794136 47.044031 USER NEWDPF axisangle0 -0.222817 -0.959206 0.174002 107.688080 USER NEWDPF quaternion0 -0.179906 -0.774476 0.140492 0.589985 USER NEWDPF dihe0 -162.13 -159.47 -121.18 -26.65 178.53 33.91 99.44 USER USER x y z vdW Elec q RMS ATOM 1 C15 RMT 0 13.047 32.564 44.515 -0.28 -0.00 +0.075 58.478
You will want to fetch the lines start with "ATOM" and remove the last two columns to have:
ATOM 1 C15 RMT 0 13.047 32.564 44.515 -0.28 -0.00 . . . TER
Remember to add "TER" after the coordinates.
awk '{print gensub (/[^[:blank:]]+/, " ", 11)}'
is useful to remove unwanted columns (the 11th) from coordinates produced by docking softwares.
adt_split_cluster.sh can be used to split the results from clustering Autodocks.
Combine the ligand with the receptor (proteins)
Make a namelist
find res -type d | tail -n +2 > namelist
Combine ligands and receptors
sh make_combine.sh
Now you will have a combined.pdb to look at.
Rendering
Here is a sample vmd visualization file for TBL1: render.vmd
The render command is
sh make_render_vmd.sh
You may use
sh make_render.sh
or just uncomment lines in make_combine.sh
Gather the pictures
sh cp_jpg.sh
Contact Statistics
Scripts can be found here
General steps are:
- Get ready with PDBs combining the ligands and the receptor
- make_res.sh: using VMD to print a list of residue ID which is within 3 A of the ligand
- make_reslist.py: turn the ResID list into histogram
- Plot with plot_bar.m using MATLAB