Computational methods

Hit Identification

Method Name

RosettaVS

Description of your approach (min 200 and max 800 words)

Our approach combines active learning with a state-of-the-art physics-based virtual screening method to screen ultra-large chemical compound libraries for hit discovery. Concretely, we will use the Virtual Screening Express (VSX) mode in RosettaVS and the OpenVS platform. We aim to screen either the Enamine REAL library (~4 billion compounds) or the ZINC22 library (~4 billion compounds) against multiple conformations of the target structure.

Active learning lets us explore the chemical space efficiently without docking every compound in the ultra-large library. Around ten iterations of docking will be performed. In each iteration, half a million compounds will be docked, and a surrogate model will be trained on the binding affinities predicted by the ligand docking. In total, about 5 million compounds are docked explicitly, roughly 0.1% of a ~4-billion-compound library.

The surrogate model will then be used to predict binding affinities for the entire library, and these predictions will guide the selection of another half a million compounds for the next iteration of docking. The iterative process will terminate when the predicted binding affinities of the top-ranked compounds converge or when the pre-specified maximum number of iterations (usually ten) has been reached.
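The iterative loop described above can be illustrated with a minimal, toy-scale sketch. Everything in it is an assumption made for illustration and not the RosettaVS or OpenVS implementation: `mock_dock` is a hypothetical stand-in for a docking run, each compound is reduced to a single numeric descriptor, the surrogate is a simple nearest-neighbour lookup, the next batch is chosen greedily by predicted score, and convergence is taken to mean that the mean score of the top-ranked docked compounds changes by less than a tolerance.

```python
import bisect
import random


def mock_dock(feature):
    # Hypothetical stand-in for a docking run.
    # Lower score = stronger predicted binding (assumed convention).
    return (feature - 0.7) ** 2


def fit_surrogate(docked):
    # Toy 1-nearest-neighbour surrogate over a single descriptor.
    # `docked` maps compound index -> (feature, docking score).
    points = sorted(docked.values())
    feats = [f for f, _ in points]

    def predict(feature):
        i = bisect.bisect_left(feats, feature)
        neighbours = points[max(i - 1, 0): i + 1]
        return min(neighbours, key=lambda p: abs(p[0] - feature))[1]

    return predict


def active_learning_screen(library, batch_size=50, max_iters=10,
                           top_k=20, tol=1e-4, seed=0):
    rng = random.Random(seed)
    docked = {}
    # Iteration 1 docks a random subset of the library.
    batch = rng.sample(range(len(library)), batch_size)
    prev_top_mean = None
    for iteration in range(1, max_iters + 1):
        for idx in batch:
            docked[idx] = (library[idx], mock_dock(library[idx]))
        # Assumed convergence test: mean score of the current top_k compounds.
        top_scores = sorted(s for _, s in docked.values())[:top_k]
        top_mean = sum(top_scores) / len(top_scores)
        if prev_top_mean is not None and abs(prev_top_mean - top_mean) < tol:
            break
        prev_top_mean = top_mean
        # Train the surrogate on all docked compounds, score the rest of the
        # library, and pick the best-predicted undocked compounds (greedy).
        predict = fit_surrogate(docked)
        undocked = [i for i in range(len(library)) if i not in docked]
        undocked.sort(key=lambda i: predict(library[i]))
        batch = undocked[:batch_size]
    return docked, iteration


# Toy library of 2000 "compounds", each a single made-up descriptor value.
lib_rng = random.Random(42)
library = [lib_rng.random() for _ in range(2000)]
docked, n_iter = active_learning_screen(library)
print(f"docked {len(docked)} of {len(library)} compounds in {n_iter} iterations")
```

In the real protocol the batch size would be half a million rather than 50, and the surrogate would be a trained model over proper molecular features; the sketch only shows the control flow of dock, train, predict, select, and stop.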

A flexible docking protocol in RosettaVS will be employed to re-dock the top-ranked compounds from the initial screen to account for the flexibility of the pocket. Finally, a set of filters, such as the number of unsatisfied hydrogen bonds and the number of torsion angle outliers, will be used to select the final compounds for experimental validation.
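The final filtering step can be sketched as follows. This is a minimal illustration under stated assumptions: the `DockedPose` record, the threshold values, and the example compounds are all hypothetical and are not taken from RosettaVS; lower predicted affinity is assumed to mean stronger binding.

```python
from dataclasses import dataclass


@dataclass
class DockedPose:
    compound_id: str
    predicted_affinity: float  # lower = stronger predicted binding (assumed)
    unsat_hbonds: int          # number of unsatisfied hydrogen bonds
    torsion_outliers: int      # number of torsion angle outliers


def select_final(poses, max_unsat_hbonds=1, max_torsion_outliers=0, n_select=2):
    # Keep poses that pass both filters, then rank the survivors by affinity.
    passing = [
        p for p in poses
        if p.unsat_hbonds <= max_unsat_hbonds
        and p.torsion_outliers <= max_torsion_outliers
    ]
    return sorted(passing, key=lambda p: p.predicted_affinity)[:n_select]


# Made-up example poses, for illustration only.
poses = [
    DockedPose("cmpd_A", -11.2, unsat_hbonds=3, torsion_outliers=0),
    DockedPose("cmpd_B", -10.5, unsat_hbonds=0, torsion_outliers=0),
    DockedPose("cmpd_C", -9.8, unsat_hbonds=1, torsion_outliers=2),
    DockedPose("cmpd_D", -9.1, unsat_hbonds=1, torsion_outliers=0),
]
selected = [p.compound_id for p in select_final(poses)]
print(selected)
```

Note that the best-scoring pose (`cmpd_A`) is rejected here because it exceeds the hydrogen-bond threshold, which is the purpose of applying filters after ranking by affinity alone.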

What makes your approach stand out from the community? (<100 words)

Our approach employs a state-of-the-art physics-based virtual screening method to predict ligand poses and binding affinities. Combined with active learning, it can efficiently screen multi-billion-compound libraries for hit discovery. It can also model the flexibility of the binding pocket.

Free software packages used

Rosetta software suite (free for academic and non-commercial purposes), OpenVS, CSD, RDKit, Open Babel, dimorphite_dl

Challenge #6