Challenge #6

Hit Identification
Method type (check all that apply)
High-throughput docking
Physics-based
Hybrid of the above
Physics-based docking method used
Description of your approach (min 200 and max 800 words)

We will run structure-based ultra-large virtual screens with VirtualFlow 2.0 [Gorgulla 2023]. The procedure will consist of the following steps.

Step 1: Protein preparation. Protein structures of the TTD of SETDB1 will be obtained from the Protein Data Bank (PDB codes 7CJT, 8UWP, 6AU3), and the co-crystallized ligands will be removed from the complex structures. The structures will then be prepared with Maestro from Schrödinger (assignment of protonation states, addition of missing atoms and side chains, addition of hydrogen atoms, ...).

Step 2: Hit identification. The hit identification step will consist of two virtual screening stages.

Step 2a: Primary virtual screen (stage 1). We will carry out a structure-based ultra-large virtual screen with a physics-based docking method (e.g. QuickVina 2). We will screen a ligand library of 69 billion molecules, the Enamine REAL Space (version 2022q12), with VirtualFlow 2.0, an open-source platform for ultra-large virtual screens. We will use a new adaptive screening technique that we have developed, called Adaptive Target Guided Virtual Screens (ATG-VS). Because this approach requires large-scale computation, we will run it on the AWS Cloud, which VirtualFlow 2.0 supports. We have extensive experience with cloud computing and have used over 5 million CPUs in parallel in the past [Gorgulla 2023]. The protein will be held rigid in stage 1 of the screens. We have already prepared the ligand library in a ready-to-dock format [Gorgulla 2023]: the ligands have been protonated and tautomerized, their 3D conformations have been computed, and they are stored in the PDBQT format.
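The idea of an adaptive screen can be illustrated with a toy sketch. It assumes that the library is partitioned into tranches, that a sparse prescreen docks a few representatives per tranche, and that only the best-scoring tranches are then screened in full. This is an illustration of the general principle only, not the VirtualFlow 2.0 implementation of ATG-VS; the function names, the `dock` callable, `n_probe`, and `keep_fraction` are all hypothetical.

```python
from statistics import mean
from typing import Callable, Dict, List


def select_tranches(
    tranches: Dict[str, List[str]],
    dock: Callable[[str], float],
    n_probe: int = 2,            # hypothetical: representatives docked per tranche
    keep_fraction: float = 0.5,  # hypothetical: fraction of tranches kept
) -> List[str]:
    """Rank tranches by the mean docking score of their first n_probe ligands.

    Lower (more negative) scores are treated as better, as in Vina-type docking.
    """
    probe_scores = {
        name: mean(dock(ligand) for ligand in ligands[:n_probe])
        for name, ligands in tranches.items()
        if ligands  # skip empty tranches
    }
    n_keep = max(1, int(len(probe_scores) * keep_fraction))
    return sorted(probe_scores, key=probe_scores.get)[:n_keep]


def adaptive_screen(
    tranches: Dict[str, List[str]],
    dock: Callable[[str], float],
    n_probe: int = 2,
    keep_fraction: float = 0.5,
) -> Dict[str, float]:
    """Dock every ligand in the selected tranches and return ligand -> score.

    For brevity the probe ligands are docked again here; a real workflow
    would reuse the prescreen results.
    """
    kept = select_tranches(tranches, dock, n_probe, keep_fraction)
    return {ligand: dock(ligand) for name in kept for ligand in tranches[name]}
```

The trade-off is that good binders in tranches with poor representatives are never docked, in exchange for docking only a fraction of the full library.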

Step 2b: Rescoring (stage 2). We will rescreen the top 1 million compounds from stage 1, this time allowing the protein side chains at the binding site to be flexible. GWOVina will be used for the flexible docking.
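Selecting the top-scoring compounds from stage 1 for rescoring can be sketched as follows. This is a minimal sketch that assumes the stage 1 results are available as (compound ID, docking score) pairs and that lower scores are better; the function name `top_hits` is hypothetical and not part of VirtualFlow.

```python
import heapq
from typing import Iterable, List, Tuple


def top_hits(scored: Iterable[Tuple[str, float]], n: int) -> List[Tuple[str, float]]:
    """Return the n best (lowest-score) pairs, best first.

    heapq.nsmallest consumes the input as a stream and keeps only n items
    in memory, which matters when the input holds billions of results.
    """
    return heapq.nsmallest(n, scored, key=lambda pair: pair[1])
```

For example, `top_hits(results, 1_000_000)` would yield the compounds passed on to stage 2.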

Step 3: Postprocessing of the results. The compounds screened in Step 2 will be ranked by their docking score. For the top 1000 compounds, we will compute biophysical and pharmacokinetic properties, calculate MM/GBSA binding energies, and carry out a visual inspection; all of these will be taken into account during the selection. Compounds with unfavorable properties (e.g. excessively high logP or PAINS motifs) will be filtered out. To ensure the novelty of the compounds, those whose scaffolds are similar to the scaffolds of the ligands co-crystallized with the TTD of SETDB1 (PDB codes 7CJT, 8UWP, 6AU3) will be removed.
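The filtering logic of this step can be sketched as follows, assuming that the properties of each compound have already been computed with a cheminformatics toolkit. The `Candidate` fields, the logP cutoff of 5.0, and the exact-match scaffold comparison (standing in for a proper scaffold similarity measure) are assumptions made for illustration, not settings taken from our workflow.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set


@dataclass(frozen=True)
class Candidate:
    name: str
    docking_score: float  # kcal/mol, lower is better
    logp: float           # precomputed elsewhere
    has_pains: bool       # precomputed PAINS flag
    scaffold: str         # e.g. a scaffold SMILES, precomputed


def select_candidates(
    candidates: Iterable[Candidate],
    known_scaffolds: Set[str],
    max_logp: float = 5.0,  # assumed cutoff, for illustration only
) -> List[Candidate]:
    """Drop compounds with high logP, PAINS motifs, or a known scaffold,
    then rank the remainder by docking score (best first)."""
    kept = [
        c for c in candidates
        if c.logp <= max_logp
        and not c.has_pains
        and c.scaffold not in known_scaffolds
    ]
    return sorted(kept, key=lambda c: c.docking_score)
```

Visual inspection and the MM/GBSA calculations are not covered by this sketch; they would be applied to the compounds that survive these filters.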

 

What makes your approach stand out from the community? (<100 words)

The ultra-large virtual screens that we run are among the largest reported to date. In 2020 we reported one of the first screens of over 1 billion compounds. In this work, we plan to screen 69 billion compounds, the largest ready-to-dock library available to date [Gorgulla 2023]. The scale of the library matters because the number of compounds screened correlates directly with the potency and the true hit rate observed during experimental validation [Gorgulla 2020, Lyu 2019, Alon 2021].

Method Name
VirtualFlow 2
Commercial software packages used

Maestro (protein preparation)

Free software packages used

VirtualFlow 2, AutoDock Vina, QuickVina, GWOVina

Relevant publications of previous uses by your group of this software/method

Gorgulla, Christoph, et al. "An open-source drug discovery platform enables ultra-large virtual screens." Nature 580.7805 (2020): 663-668. https://www.nature.com/articles/s41586-020-2117-z

Gorgulla, Christoph, et al. "A multi-pronged approach targeting SARS-CoV-2 proteins using ultra-large virtual screening." iScience 24.2 (2021): 102021. https://www.sciencedirect.com/science/article/pii/S2589004220312189

Gorgulla, Christoph, et al. "Accounting of receptor flexibility in ultra-large virtual screens with VirtualFlow using a grey wolf optimization method." Supercomputing Frontiers and Innovations 7.3 (2020): 4. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8530406/

Gorgulla, Christoph, et al. "VirtualFlow Ants—Ultra-Large Virtual Screenings with Artificial Intelligence Driven Docking Algorithm Based on Ant Colony Optimization." International Journal of Molecular Sciences 22.11 (2021): 5807. https://www.mdpi.com/1422-0067/22/11/5807

Gorgulla, Christoph, et al. "VirtualFlow 2.0 - The Next Generation Drug Discovery Platform Enabling Adaptive Screens of 69 Billion Molecules." bioRxiv (2023). https://www.biorxiv.org/content/10.1101/2023.04.25.537981v1.abstract