How to do dockings

From ChengLab
Revision as of 23:29, 27 August 2020 by Kevin (talk | contribs) (Prepare receptor)
Jump to: navigation, search

Prerequisite

Prepare receptor

ADFSuite (Download here)

Install

./install.sh -d ~/opt/ADFRsuite-1.0 -c 0

Prepare ligands

Openbabel

Benchmark

Autodock 4: 50 LGA, 2,500,000 evaluation each

Autodock Vina: max. mode=50

Resolution=0.375 A

Box-pointse=80x80x80

Box-size=30x30x30 (A^3)

(compounds/day) OSC (30 nodes) Amber machine (1 node, 4 GPUs)
Autodock 4 79,120 (Potentially 1.3x)
Vina 300,000 13,186

It seems that Vina can be accelerated by ~50% if you compile it well (Read this), but still lagging behind Autodock-GPU (Git here) on a multi-GPU machine.

Libraries

ZINC

ChemBridge

Run Glide

Ligprep

ligprep -isd SP_input_cluster1_0.sdf -omae ligprep_cluster1_0-out.maegz -bff 14 -i 2 -ph 7.4 -pht 0.0 -ac -s 5

Create a grid box

Feed this to glide

GLIDECONS YES
GLIDE_NTOTALCONS 1
GLIDE_NUMPOSITCONS 1
GRIDLIG NO
USECOMPMAE YES
INNERBOX 10, 10, 10
ACTXRANGE 26.407652
ACTYRANGE 26.407652
ACTZRANGE 26.407652
GRID_CENTER 6.928290, 6.346953, 21.799812
OUTERBOX 26.407652, 26.407652, 26.407652
GRIDFILE "factorXa_grid.zip"
RECEP_FILE "factorXa_grid.maegz"
GLIDECONSNAMES "position1", "S1_site"
GLIDEXVOLNAMES "volume1"

Docking

glide -WAIT -HOST ${host}:${nproc} SP_cluster1_0.in
$ cat SP_cluster1_0.in
GRIDFILE /users/PAS1146/osu0316/full_nci_database/cluster1/cluster1.zip
DOCKING_METHOD confgen
PRECISION SP
LIGANDFILE /users/PAS1146/osu0316/full_nci_database/cluster1/SP/ligprep_cluster1_0-out.maegz
OUTPUTDIR /users/PAS1146/osu0316/full_nci_database/cluster1/SP/
WRITEREPT True

PBS script

#PBS -A PAA0030
#PBS -N SP_cluster1_0
#PBS -j oe
#PBS -l walltime=100:00:00
#PBS -l nodes=1:ppn=1
#PBS -S /bin/sh
#PBS -l software=glide+1

qstat -f $PBS_JOBID
module load schrodinger
cd $PBS_O_WORKDIR
host=`cat $PBS_NODEFILE|head -1`
nproc=`cat $PBS_NODEFILE|wc -l`
glide -WAIT -HOST ${host}:${nproc} SP_cluster1_0.in

Check Running Glide

If you wanna check who is running glide (as OSC only bought 10 licenses):

licadmin STAT

Therefore you may NOT be able to run glide at the OSC master node (even for testing). Also, remember to add a PBS option when you submit glide to avoid job cancellation due to too many people running glide

#PBS -l software=glide+1

Run Autodock

Compilation

Manual here

make DEVICE=GPU NUMWI=128

Run Autodock Vina

Prepare all ligands (.pdbqt) in a directory ligand/pdbqt.

Prepare all receptors (.pdbqt) in a directory receptor.

Split ligands by

python2 split.py

The configuration file looks like this:

center_x = 77.009
center_y = 80.221
center_z = 60.623
size_x = 28
size_y = 28
size_z = 26
num_modes = 100

Single receptor

You may find the template vina run script here.

Multiple receptors

You may find the template vina run script (multiple) here.

Define path to receptors (RECP), configuration file (CONF), and output directory (OUTPUT).

The basic run command is

vina --config <config> --receptor <pdbqt> --ligand <pdbqt> --out out.pdbqt --log log.txt

You may find the template job script here.

Prepare (and submit) the vina run scripts by

sh make_vina_rec.sh

Get results by

sh get_result_multiple.sh

Get results using python

#! /usr/bin/env python

import sys
import glob

def doit(n):
   file_names = glob.glob('*/*.pdbqt')
   everything = []
   failures = []
   print 'Found', len(file_names), 'pdbqt files'
   for file_name in file_names:
       file = open(file_name)
       lines = file.readlines()
       file.close()
       try:
           line = lines[1]
           result = float(line.split(':')[1].split()[0])
           everything.append([result, file_name])
       except:
           failures.append(file_name)
   everything.sort(lambda x,y: cmp(x[0], y[0]))
   part = everything[:n]
   for p in part:
       print p[1],
   print
   if len(failures) > 0:
       print 'WARNING:', len(failures), 'pdbqt files could not be processed'

if __name__ == '__main__':
   doit(int(sys.argv[1]))

Move selected directories

Make a namelist

head -25 sort | awk '{print $1}' | cut -d/ -f1-5 > tmp

Move the directories

for ii in $(cat tmp); do cp -r $ii res; done

Looking at the results

Get binding poses

A typical Autodock output looks like this:

MODEL      801
USER    Run = 801
USER    Cluster Rank = 1
USER    Number of conformations in this cluster = 22
USER  
USER    RMSD from reference structure       = 58.478 A
USER  
USER    Estimated Free Energy of Binding    =  -11.77 kcal/mol  [=(1)+(2)+(3)-(4)]
USER    Estimated Inhibition Constant, Ki   =    2.38 nM (nanomolar)  [Temperature = 298.15 K]
USER    
USER    (1) Final Intermolecular Energy     =  -13.85 kcal/mol
USER        vdW + Hbond + desolv Energy     =  -13.61 kcal/mol
USER        Electrostatic Energy            =   -0.24 kcal/mol
USER    (2) Final Total Internal Energy     =   -2.04 kcal/mol
USER    (3) Torsional Free Energy           =   +2.09 kcal/mol
USER    (4) Unbound System's Energy  [=(2)] =   -2.04 kcal/mol
USER    
USER    
USER  
USER    DPF = WD40.dpf
USER    NEWDPF move	teg.pdbqt
USER    NEWDPF about	3.758800 0.183900 -6.839200
USER    NEWDPF tran0	9.906563 33.794136 47.044031
USER    NEWDPF axisangle0	-0.222817 -0.959206 0.174002 107.688080
USER    NEWDPF quaternion0	-0.179906 -0.774476 0.140492 0.589985
USER    NEWDPF dihe0	-162.13 -159.47 -121.18 -26.65 178.53 33.91 99.44 
USER  
USER                              x       y       z    vdW   Elec        q     RMS 
ATOM      1  C15 RMT     0      13.047  32.564  44.515 -0.28 -0.00    +0.075    58.478

You will want to fetch the lines start with "ATOM" and remove the last two columns to have:

ATOM      1  C15 RMT     0      13.047  32.564  44.515 -0.28 -0.00
.
.
.
TER

Remember to add "TER" after the coordinates.

awk '{print gensub (/[^[:blank:]]+/, " ", 11)}'

is useful to remove unwanted columns (the 11th) from coordinates produced by docking softwares.

adt_split_cluster.sh can be used to split the results from clustering Autodocks.

Combine the ligand with the receptor (proteins)

Make a namelist

find res -type d | tail -n +2 > namelist

Combine ligands and receptors

sh make_combine.sh

Now you will have a combined.pdb to look at.

Rendering

Here is a sample vmd visualization file for TBL1: render.vmd

The render command is

sh make_render_vmd.sh

You may use

sh make_render.sh

or just uncomment lines in make_combine.sh

Gather the pictures

sh cp_jpg.sh

Contact Statistics

Scripts can be found here

General steps are:

  1. Get ready with PDBs combining the ligands and the receptor
  2. make_res.sh: using VMD to print a list of residue ID which is within 3 A of the ligand
  3. make_reslist.py: turn the ResID list into histogram
  4. Plot with plot_bar.m using MATLAB