Molecular docking is a standard computational approach to predict binding modes of protein–ligand complexes by exploring alternative orientations and conformations of the ligand (i.e., by exploring ligand flexibility). Docking tools are widely used for virtual screening of small drug-like molecules, but their accuracy and efficiency decay sharply for ligands with more than 10 flexible bonds. This prevents a broader use of these tools to dock larger ligands, such as peptides, which are molecules of growing interest in cancer research. To overcome this limitation, our group has previously proposed a meta-docking strategy, called DINC, to predict binding modes of large ligands. By incrementally docking overlapping fragments of a ligand, DINC made it possible to predict the binding modes of peptide-based inhibitors of transcription factors involved in cancer. Here, we describe DINC 2.0, a revamped version of the DINC webserver with enhanced capabilities and a more user-friendly interface. DINC 2.0 allows docking ligands that were previously too challenging for DINC, such as peptides with more than 25 flexible bonds. The webserver is freely accessible at http://dinc.kavrakilab.org, together with additional documentation and video tutorials. Our team will provide continuous support for this tool and is working on extending its applicability to other challenging fields, such as personalized immunotherapy against cancer. Cancer Res; 77(21); e55–57. ©2017 AACR.
Given the importance of the structure–function relationship in proteins, knowledge of the structure of a protein or a macromolecular complex can provide insights into how to inhibit a protein target or disrupt a complex, which can be key to preventing a pathologic outcome (1). In this context, computational methods have been widely applied to the virtual screening of small molecules with potential use as inhibitors (or agonists), which constitutes one of the initial steps for drug discovery and rational drug design (2, 3). Molecular docking methods, for instance, can predict both the geometry of a protein–ligand complex and its corresponding binding energy (4). These methods are known to be generally accurate for small drug-like ligands with up to 10 flexible bonds, or degrees of freedom (DoF; ref. 5). In recent years, there has been a growing interest in the use of peptides as protein inhibitors, given their natural role as binders and regulators in a variety of pathways (6, 7). However, even small peptides and peptidomimetics (i.e., synthetic peptide-based inhibitors; ref. 8) are too large and too flexible for most available molecular docking methods (9). Peptide docking is a growing field of research, and other tools have recently been proposed (7); a detailed review of these methods is beyond the scope of this article.
To address the challenge of docking large ligands, our group has previously developed a meta-docking approach called DINC (Docking INCrementally). DINC was evaluated on a dataset of large ligands by attempting to reproduce 73 complexes reported in the Protein Data Bank (PDB; ref. 5). DINC was shown to be much faster than the standard tool AutoDock 4 (AD4; ref. 10) and to provide binding modes that were consistent with experimental data (5). However, this experiment also highlighted the limitations of the original implementation: good reproductions were only observed for drug-like ligands with up to 16 DoFs. In another study, our group investigated the use of peptidomimetics as inhibitors of the Src homology 2 (SH2; ref. 11) domain of STAT3, a transcription factor that was found to be constitutively activated in a number of human cancers (8). As no structural information was available on potential SH2 inhibitors, we used DINC to predict the binding modes of a group of peptidomimetic inhibitors, previously identified experimentally, with up to 22 DoFs. Besides predicting binding modes that were in agreement with data from previous studies, we were able to differentiate between strong and weak binders using predicted free energies and conformational analysis through molecular dynamics simulations. In addition, DINC predicted a previously unknown alternative binding mode to SH2, leading to new opportunities for developing stronger inhibitors (8).
Incremental Docking of Overlapping Fragments
DINC is a parallelized meta-docking method for the incremental docking of large ligands (Fig. 1). Instead of docking the whole ligand by exploring all its DoFs at once, DINC reduces the problem complexity by incrementally docking larger and larger overlapping fragments of the ligand. DINC starts by selecting a small fragment of the ligand and calling a standard molecular docking software (currently AD4; ref. 10) to dock this fragment in a first round of sampling and scoring. AD4 uses a Lamarckian genetic algorithm for sampling, and a semi-empirical free energy force-field for scoring, based on pair-wise evaluations (V) with predefined weights (Wi; ref. 10):
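In the form given in the AutoDock 4 literature (ref. 10), this pairwise evaluation sums dispersion/repulsion, hydrogen-bonding, electrostatic, and desolvation terms over ligand–receptor atom pairs. The symbol names below follow that literature and are not otherwise defined in this article:

```latex
V = W_{\mathrm{vdw}} \sum_{i,j} \left( \frac{A_{ij}}{r_{ij}^{12}} - \frac{B_{ij}}{r_{ij}^{6}} \right)
  + W_{\mathrm{hbond}} \sum_{i,j} E(t) \left( \frac{C_{ij}}{r_{ij}^{12}} - \frac{D_{ij}}{r_{ij}^{10}} \right)
  + W_{\mathrm{elec}} \sum_{i,j} \frac{q_i q_j}{\varepsilon(r_{ij})\, r_{ij}}
  + W_{\mathrm{sol}} \sum_{i,j} \left( S_i V_j + S_j V_i \right) e^{-r_{ij}^{2} / 2\sigma^{2}}
```

Here r_ij is the distance between atoms i and j, A to D are atom-type-dependent parameters, E(t) is a directional weight for hydrogen bonds, q denotes partial charges, ε(r) is a distance-dependent dielectric, and S and V (with an atom subscript) are solvation parameters and atomic volumes, distinct from the total V on the left-hand side.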
The best binding modes obtained in the first round of docking are then selected, and the corresponding fragment is expanded by adding atoms of the ligand to it. Then, the next round of docking is performed. This process of expanding and docking is repeated incrementally, until the whole ligand is reconstructed and docked (Fig. 1). By default, the initial fragment is defined to have only 6 DoFs. At each docking round, 3 new DoFs are added and only 3 of the previously explored DoFs are kept flexible. This way, each docking round considers only 6 internal DoFs of the ligand (in addition to its rotation and translation), regardless of fragment size.
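The default schedule described above can be sketched as a short calculation. This is a minimal illustration of the DoF bookkeeping only, not the DINC implementation: the function name `dinc_schedule` and its parameters are hypothetical, and the sketch does not model which atoms are added or which previously explored DoFs remain flexible.

```python
def dinc_schedule(total_dofs, initial=6, new_per_round=3, kept_per_round=3):
    """Return one (fragment_dofs, flexible_dofs) pair per docking round.

    fragment_dofs: number of ligand DoFs contained in the fragment.
    flexible_dofs: number of internal DoFs actually sampled in that round.
    Defaults mirror the values stated in the text (6 initial, 3 new, 3 kept).
    """
    if total_dofs <= 0:
        raise ValueError("total_dofs must be positive")
    fragment = min(initial, total_dofs)
    rounds = [(fragment, fragment)]  # first round: all fragment DoFs are flexible
    while fragment < total_dofs:
        added = min(new_per_round, total_dofs - fragment)  # last round may add fewer
        kept = min(kept_per_round, fragment)
        fragment += added
        rounds.append((fragment, added + kept))
    return rounds


if __name__ == "__main__":
    for number, (frag, flex) in enumerate(dinc_schedule(26), start=1):
        print(f"round {number}: fragment with {frag} DoFs, {flex} flexible")
```

Under these assumptions, a ligand with 26 DoFs is docked in 8 rounds, and no round samples more than 6 internal DoFs.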
Note that, in the context of DINC, the meaning of “fragment” differs from its most common use in fragment-based drug discovery (FBDD). FBDD involves creating new drugs or drug-like ligands using libraries of fragments, which are very small molecules with no more than two functional groups (12). A related strategy, anchor-and-grow, was implemented in the docking software DOCK (13). DINC was inspired by these developments, but provides a higher-level method that leverages and enhances traditional docking software (such as AD4; ref. 10), thereby improving efficiency for large ligands.
A New User-Friendly Webserver for Docking Peptides
Aiming to dock even larger ligands, such as peptides, we upgraded the DINC algorithm and created an entirely new webserver, DINC 2.0. New heuristics were introduced to select the initial fragment and to expand the subsequent fragments, based on maximizing the potential for hydrogen bonds with the receptor. DINC 2.0 also features improved management of the parallelization process, which is crucial to docking large peptides; users can increase the number of top conformations selected at each round, thereby expanding the sampling without increasing the running time. Finally, our team improved several technical components of the original software; we are continuously updating the algorithm to improve its efficiency and to allow for the processing of very challenging complexes.
The main page of DINC 2.0 is clear and user friendly, making docking accessible to both new and experienced users. To run a docking job, users have to provide the structures of the ligand and receptor of interest. Although the original DINC (5) required specific preprocessing of the ligands, DINC 2.0 accepts standard PDB files as input. Then, users have to provide information on the “grid box,” a volume that should be large enough to contain the ligand and the receptor's binding cleft. The docking search is based on the position of this box and the points of this three-dimensional grid are used for the energy precomputation (scoring).
In a redocking experiment (i.e., when trying to reproduce a known protein–ligand complex), users can easily define the grid box location and size based on the input ligand (see Supplementary Video S1). In other docking experiments, when the ligand coordinates do not correspond to the receptor's binding site, users must provide specific values for “grid center” and “grid size” (see Supplementary Video S2). To find appropriate values, users can use various graphical tools (see our “HELP” page). Users can also copy the values from previous experiments with AD4 (10) or AutoDock Vina (see Supplementary Video S2; ref. 14). The interface automatically converts all values to angstroms, facilitating the comparison to grid boxes defined in other docking software.
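As an illustration of how a grid box can be derived from an input ligand, the sketch below computes a padded bounding box from the ATOM/HETATM records of a PDB file, and converts an AD4-style grid (number of intervals times spacing) into edge lengths in angstroms. The function names and the 5 Å padding are assumptions made for this example; how DINC 2.0 computes these values internally is not described here. The 0.375 Å default is the standard AD4 grid spacing.

```python
def grid_box_from_ligand(pdb_text, padding=5.0):
    """Return (center, size) in angstroms for a box enclosing the ligand.

    Coordinates are read from the fixed PDB columns 31-38, 39-46 and 47-54.
    `padding` is an assumed margin added on every side of the ligand.
    """
    coords = []
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM", "HETATM")):
            coords.append(
                (float(line[30:38]), float(line[38:46]), float(line[46:54]))
            )
    if not coords:
        raise ValueError("no ATOM/HETATM records found")
    lows = [min(c[axis] for c in coords) for axis in range(3)]
    highs = [max(c[axis] for c in coords) for axis in range(3)]
    center = tuple((lo + hi) / 2.0 for lo, hi in zip(lows, highs))
    size = tuple((hi - lo) + 2.0 * padding for lo, hi in zip(lows, highs))
    return center, size


def ad4_npts_to_angstroms(npts, spacing=0.375):
    """Convert AD4 grid intervals per axis into box edge lengths in angstroms."""
    return tuple(n * spacing for n in npts)
```

For example, an AD4 grid defined with 40 intervals per axis at the default spacing corresponds to a cubic box with 15 Å edges, which can then be compared directly with a box specified in angstroms.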
Additional parameters that directly impact the search can be accessed in the “Advanced options” menu. Increasing the values of these parameters can improve the likelihood of finding accurate binding modes, but can also increase the running time. Further information can be found on the “HELP” page, which also contains video tutorials, links to related resources and tools, as well as examples of the expected output files for three large ligands; one of them is an 8-mer peptide with 26 DoFs, involved in a complex that DINC 2.0 reproduced with an accuracy of 1.61 Å.
Finally, another new feature in DINC 2.0 is a “Results” page that summarizes the input parameters and provides the best alternative binding modes, selected on the basis of binding energy ranking or RMSD clustering. Using the embedded visualization tool, users can quickly inspect the output and compare the predicted binding modes, which can also be downloaded for further analysis (see Supplementary Video S1). We encourage users to further evaluate the complexes produced by DINC 2.0 using other methods, such as molecular dynamics, especially when predicting complexes for which no crystal structure is available.
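The two selection criteria mentioned above can be illustrated with a small sketch. It computes the RMSD between two poses with identical atom ordering (without superposition, as is usual when comparing docked poses in the receptor frame) and groups poses greedily around the lowest-energy representatives. The function names, the greedy scheme, and the 2.0 Å threshold are assumptions for illustration; the clustering actually used by DINC 2.0 may differ.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two poses with matching atom order."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("poses must be non-empty and have the same number of atoms")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


def cluster_poses(poses, threshold=2.0):
    """Group poses by RMSD to the best-energy member of each cluster.

    poses: list of (energy, coords) pairs; lower energy is better.
    Returns clusters as lists of indices into `poses`; the first index of
    each cluster is its lowest-energy representative.
    """
    by_energy = sorted(range(len(poses)), key=lambda i: poses[i][0])
    clusters = []
    for index in by_energy:
        for cluster in clusters:
            representative = poses[cluster[0]][1]
            if rmsd(poses[index][1], representative) <= threshold:
                cluster.append(index)
                break
        else:  # no existing cluster is close enough: start a new one
            clusters.append([index])
    return clusters
```

Taking the first index of each cluster yields a set of structurally distinct binding modes ranked by energy, which is one simple way to combine energy ranking with RMSD clustering.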
Future Directions and Challenges in Cancer Immunotherapy
Following a meta-docking approach, we plan to extend our method to other popular docking tools to investigate the benefits of alternative scoring functions and consensus docking. Regarding future applications, one of the greatest challenges in peptide docking relates to the structural prediction of human leukocyte antigen (HLA) complexes (5). HLAs are key molecules for the cellular immune response, because they can bind peptides derived from intracellular proteins and display them at the cell surface for recognition by circulating cytotoxic T cells (15). Given their role in immunity against tumors (15), predicting the structure of such peptide–HLA complexes is becoming a major need for T-cell–based immunotherapy. The challenge, however, is the size of these peptides, which are typically 8 to 12 amino acids long and have more than 35 DoFs. Our group is now focused on further expanding the capabilities of DINC 2.0, specifically aiming at applications toward personalized T-cell–based immunotherapy against cancer.
Disclosure of Potential Conflicts of Interest
No potential conflicts of interest were disclosed.
Authors' Contributions
Conception and design: D.A. Antunes, L.E. Kavraki
Development of methodology: D.A. Antunes, D. Devaurs, L.E. Kavraki
Acquisition of data (provided animals, acquired and managed patients, provided facilities, etc.): D.A. Antunes, K.R. Jackson
Analysis and interpretation of data (e.g., statistical analysis, biostatistics, computational analysis): D.A. Antunes, L.E. Kavraki
Writing, review, and/or revision of the manuscript: D.A. Antunes, M. Moll, D. Devaurs, G. Lizée, L.E. Kavraki
Administrative, technical, or material support (i.e., reporting or organizing data, constructing databases): M. Moll
Study supervision: G. Lizée, L.E. Kavraki
Acknowledgments
We thank Sujay Tadwalkar, Angela Hoch, and Sarah Hall-Swan for their contributions to this project.
Grant Support
Work on this project by D.A. Antunes, L.E. Kavraki, M. Moll, G. Lizée, and K.R. Jackson was supported by NIH R21CA209941-01 (co-PIs: L.E. Kavraki and G. Lizée) through the Informatics Technology for Cancer Research (ITCR) initiative of the NCI. Work on this project by D. Devaurs was supported by a training fellowship from the Gulf Coast Consortia, on the Computational Cancer Biology Training Program (CPRIT grant no. RP170593, PI: M. Pettitt).