The UK-QSAR and Cheminformatics group holds two meetings annually at different venues in the UK. The group has been running since the first European QSAR meeting in Yugoslavia in 1986, and since then a host of scientists have met to discuss topical discoveries relevant to a wide range of academic and industrial sectors.
Recently, the Autumn group meeting was held at the Francis Crick Institute, with over 100 in-person attendees and more attending virtually. The morning session focused on recent developments in chemical space, whilst the afternoon highlighted how differing technologies can be used to analyse molecules for future medicines.
Amelie Heimann, MSD, UK-QSAR attendee
*****
See the PeerJ Physical Chemistry Special Issue:
AI-driven chemistry for drug design
*****
Morgan Thomas, PhD candidate (AI in Drug Discovery), University of Cambridge, UK.
Can you tell us a bit about yourself and your research interests?
After studying for an MChem in Pharmaceutical Chemistry at the University of Leicester, I was kindly offered a place on the highly sought-after AstraZeneca R&D Graduate Programme. This exposed me to the multiple facets of drug design (small-molecule design being my chief interest), and I gained valuable experience in Structural and Analytical Chemistry, Computational Chemistry and Bioinformatics. It was during my Computational Chemistry rotation that I became aware of the AI endeavours in property prediction, de novo design, synthesis planning and so on, mostly driven by the onset of deep learning. Since then, I have been interested in how we can leverage AI methods to speed up and improve the therapeutic design process, provided they are implemented and interpreted correctly. After completing the two-year AstraZeneca Graduate Programme, I was fortunate enough to be offered PhD funding by Sosei Heptares to conduct a collaborative PhD in AI in Drug Discovery with Prof. Andreas Bender at the University of Cambridge. This has led me to focus on the implementation, improvement and critical assessment of generative models for de novo design during my PhD, including the effect of scoring functions on goal-directed generative models, improvements in optimization efficiency, the design of a Python platform to facilitate evaluation, and implementation to support idea generation for drug discovery projects.
What first interested you in this field of research?
During my AstraZeneca Computational Chemistry rotation I was given the opportunity to prototype the REINVENT generative model platform on an Oncology project. Being naturally critical, I was surprised to realise the potential of generative models to produce medicinally relevant chemistry. It was this realisation that led me to pursue the AI aspect of drug design.
Can you briefly explain the research you presented at the UK-QSAR meeting?
The work I presented at the UK-QSAR meeting was on applying Augmented Hill-Climb to identify novel A2a antagonists with a structure-based approach that combines a generative model with docking. Augmented Hill-Climb is a simple adaptation of the REINVENT reinforcement learning strategy that improves learning efficiency while the de novo molecules retain similar chemical properties (something too often compromised in the same endeavour). As a result, fewer de novo molecules need to be sampled and evaluated by scoring functions for the generative model to adapt to an objective, which enables more practical use of medium-time scoring functions such as docking or computer-aided synthesis planning. We compared Augmented Hill-Climb with REINVENT, other common RL algorithms and 25 other generative models recently benchmarked on learning efficiency; Augmented Hill-Climb outperformed all other methods when the type of chemistry generated is jointly considered. The end result is a generative model able to optimize the docking score overnight on ~10 CPUs, compared to a week on ~30-40 CPUs previously. We used this to optimize the docking score against several different Sosei Heptares A2a StaR co-crystal structures and to assess the type of de novo molecules generated when different constraints were used in the objective. The model recovered many known A2a chemotypes as well as novel chemotypes. We triaged the de novo molecules down to 427 molecules of interest, 81% of which were not found in vendor libraries and 71% of which were predicted to be synthesizable. We have now filtered these down to 41 molecules of interest that we hope to validate prospectively, apart from three that are already known sub-micromolar binders.
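The interview does not spell out how the adaptation works, so the following is only a minimal sketch based on one common reading of Augmented Hill-Climb: the REINVENT-style loss (squared difference between an "augmented" prior log-likelihood and the agent log-likelihood) is computed only on the top-scoring fraction of each sampled batch. The function name `ahc_loss`, the parameter names, and the default values of `sigma` and `topk` are assumptions made for illustration, not the published implementation.

```python
from typing import Sequence


def ahc_loss(
    prior_ll: Sequence[float],
    agent_ll: Sequence[float],
    scores: Sequence[float],
    sigma: float = 60.0,  # assumed scaling of the score term
    topk: float = 0.5,    # assumed fraction of the batch to keep
) -> float:
    """Mean REINVENT-style loss over the top-scoring fraction of a batch.

    prior_ll / agent_ll are per-molecule log-likelihoods under the prior
    and the agent; scores are scoring-function outputs (e.g. a rescaled
    docking score in [0, 1]).
    """
    if not (len(prior_ll) == len(agent_ll) == len(scores)):
        raise ValueError("all inputs must have the same length")
    if len(scores) == 0:
        raise ValueError("batch must not be empty")

    # Hill-Climb step: rank the batch by score and keep the best fraction.
    n_keep = max(1, int(len(scores) * topk))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    kept = ranked[:n_keep]

    # REINVENT-style step: augmented likelihood = prior + sigma * score,
    # and the loss is the squared gap to the agent likelihood.
    losses = []
    for i in kept:
        augmented = prior_ll[i] + sigma * scores[i]
        losses.append((augmented - agent_ll[i]) ** 2)
    return sum(losses) / len(losses)


if __name__ == "__main__":
    prior = [-10.0, -20.0, -30.0, -40.0]
    agent = [-12.0, -18.0, -33.0, -41.0]
    batch_scores = [0.1, 0.9, 0.5, 0.3]
    print(ahc_loss(prior, agent, batch_scores, sigma=10.0, topk=0.5))
```

In a real training loop the log-likelihoods would come from the generative model and the returned value would be backpropagated; here plain floats are used so that the selection logic can be read in isolation. Setting `topk=1.0` in this sketch reduces it to the loss over the whole batch.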
What are your next steps? How will you continue to build on this research?
Our next step in this particular research is to synthesize and validate as many de novo molecules as possible and to try to identify novel chemistry active against this well-liganded target, which is a real challenge. We hope this may contribute to the validation of such generative model approaches moving forward. Moreover, this technology will have a bigger impact in supporting drug discovery projects in-house, where challenging targets have already been identified for its use. More broadly, I will focus on other questions surrounding the use of generative models, including how to effectively evaluate them and the effect of different scoring function parameters on de novo molecules.