Machine learning improves the discovery of “silver shotgun” drugs 

A novel machine learning approach may improve the efficiency of clinical drug discoveries. Photo credit: Conny Schneider on Unsplash.


The last 30 years have seen the meteoric rise of “omics” technologies in biomedical research. We can now sequence RNA (transcriptomics), profile proteins (proteomics) and study metabolic profiles (metabolomics) in individual cells from a sample (e.g. a tumour), capturing the sample’s full cellular diversity in rich detail. Surprisingly, however, this has not translated into more effective clinical treatments: clinical trial success rates have remained low (averaging ~10% across different diseases) and stagnant over the last few decades.

A major reason for this is the target-based approach of most drug discovery efforts. In a typical target-based campaign, a library of compounds generated against a single molecular target is filtered by successive screening experiments conducted in a model of the disease (such as a cell line derived from patients with the disease). Each screening round weeds out compounds that have undesirable properties, such as low target affinity, toxicity, and major off-target effects. Fewer than 1% of compounds make it past the final pre-clinical round and enter testing in clinical trials, in the hope that one will be a “silver bullet” (an effective cure) for the disease. There are several limitations to this approach. Even after filtering, most drugs raised against a single target are not perfectly specific, and their actions on other targets may lead to unwanted effects. A more fundamental flaw is the mismatch between target-based discovery and biological reality. Most diseases, particularly complex ones like cancer, have no single causative target, but instead result from the dysregulation of multi-gene networks operating across many different cell types. Moreover, cells can compensate for the effect of a single-target drug by adjusting non-target pathways.

Instead, we can take a fundamentally different approach. Rather than selecting compounds that best fit the target, we could select those that lead to a desired change in phenotype, regardless of which target(s) they act on. A phenotype is an observable characteristic of the disease that scientists care about because it indicates the nature, severity or progression of the disease, and that can be replicated and studied in models. For instance, uncontrolled cell division is a phenotype typical of cancer. We can test a compound library in a cancer cell line and select the compounds that decrease the number of dividing cells (our desired effect), even if we don’t know precisely which target(s) they act on to do this. Thus, we find “silver shotguns” that change disease phenotypes instead of “silver bullets” acting on a single target. Importantly, we can use the rich information from “omics” experiments to find and define new disease phenotypes. This strategy of phenotypic drug discovery is much more efficient at prioritising compounds that are likely to be clinically effective. Indeed, about 65% of all approved medicines today were discovered using it.

Screening around one million compounds (roughly the size of a typical corporate drug library) for phenotypic effects is gruelling and prohibitively expensive. Whittling this down to a number that is feasible to screen, typically fewer than 10,000, requires an efficient virtual prioritisation strategy. This should ideally be generalisable (so we don’t have to retrain our model for every new phenotype) as well as optimisable (capable of refinement through feedback). Unfortunately, most existing prioritisation algorithms meet only one of these criteria.

In a recent study, DeMeo et al. introduced “DrugReflector”, a deep learning framework that is both generalisable and optimisable. It is an ensemble of three deep neural networks trained using active reinforcement learning (see diagram below). The model was initially trained on CMap, an open-source dataset of RNA expression changes caused by different “perturbations” (in this case, different chemical compounds) in multiple cell lines. It learns the associations between user-defined target signatures (gene expression changes, used as a proxy for phenotype) and perturbations (the compounds used), and generates a ranked list of compounds likely to produce a given signature.
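
To make the idea of ranking compounds against a target signature more concrete, here is a minimal sketch. It is not DrugReflector's method: the study uses an ensemble of neural networks, whereas this example swaps in plain cosine similarity between a target signature and each compound's expression-change profile. The compound names, the four-gene profiles and all values are invented purely for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length expression-change vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # Treat an all-zero vector as having no similarity to anything.
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_compounds(target_signature, profiles):
    """Return compound names ordered from best to worst match to the target."""
    scores = {name: cosine(target_signature, profile)
              for name, profile in profiles.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical target signature: desired up/down changes for four genes.
target = [1.0, -1.0, 0.5, 0.0]

# Hypothetical expression changes caused by three made-up compounds.
profiles = {
    "compound_A": [0.9, -0.8, 0.4, 0.1],   # closely mirrors the target
    "compound_B": [-1.0, 1.0, -0.5, 0.0],  # produces the opposite change
    "compound_C": [0.1, 0.0, 0.0, 1.0],    # mostly unrelated change
}

ranking = rank_compounds(target, profiles)
print(ranking)
```

In this toy setting the compound whose profile mirrors the target comes first and the one that reverses it comes last. A learned model can capture far richer relationships than a single similarity score, but the output has the same shape: a priority list of compounds to take into the lab.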

DrugReflector was then tested using differences in gene set expression between two cellular states (such as “diseased” and “healthy”) as the input signature. Top-ranked compounds from the priority list were screened in the lab to identify “hits” (compounds that actually produced the expected phenotypic change) and “non-hits” (those that did not). Top hits and non-hits underwent paired measurements of phenotype and RNA expression (transcriptomics). Crucially, these ground-truth measurements were fed back into the model to update the original target signature used as input.

The result is a new target signature that more accurately represents the desired phenotypic change, yielding a better priority list in the next run. This is the core feature of active learning: the model uses new data to update its policy at each iteration, with the aim of maximising the improvement of the policy over several iterations. Incorporating lab-generated data into the ensemble’s training loop as feedback (“lab-in-the-loop”) is powerful because it directly reflects the real-world effects of the drugs, which the model compares against the expected effects to adjust its expectations.
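
The feedback step can be illustrated with a small sketch. This is an assumption-laden stand-in, not the update rule used in the study: it applies a simple Rocchio-style adjustment that nudges the target signature towards the average expression profile of lab-confirmed hits and away from that of non-hits. The function names, the weights `alpha`, `beta` and `gamma`, and all numbers are hypothetical.

```python
def mean_profile(profiles, n_genes):
    """Gene-wise average of a list of expression profiles (zeros if empty)."""
    if not profiles:
        return [0.0] * n_genes
    return [sum(p[i] for p in profiles) / len(profiles) for i in range(n_genes)]


def update_signature(signature, hits, non_hits, alpha=1.0, beta=0.5, gamma=0.25):
    """Rocchio-style update: move towards hits, away from non-hits.

    alpha, beta and gamma are illustrative weights, not values from the study.
    """
    n_genes = len(signature)
    hit_mean = mean_profile(hits, n_genes)
    miss_mean = mean_profile(non_hits, n_genes)
    return [alpha * s + beta * h - gamma * m
            for s, h, m in zip(signature, hit_mean, miss_mean)]


# Hypothetical starting signature over three genes.
signature = [1.0, -1.0, 0.0]

# Made-up transcriptomic measurements from one round of lab screening.
hits = [[1.0, -1.0, 1.0], [1.0, -1.0, 0.0]]  # compounds that changed the phenotype
non_hits = [[1.0, 1.0, 0.0]]                 # compounds that did not

updated = update_signature(signature, hits, non_hits)
print(updated)
```

In this toy round the third gene, which was absent from the starting signature but raised by one of the hits, gains a positive weight, while the second gene, which separates hits from non-hits, is weighted more strongly. Repeating the rank, screen and update cycle is what makes the approach "active".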

DrugReflector outperformed existing approaches for matching compounds to gene signatures by more than ten-fold. Its utility and accuracy were demonstrated on previously published breast cancer and leukaemia datasets, where it prioritised compounds that are currently used as standard-of-care drugs for these diseases. Most excitingly of all, the model was able to refine the target signature for platelet development (megakaryopoiesis) and rank inducers of this signature, with a greater than 20-fold improvement in hit rate on subsequent screening. These experiments also led to a new molecular insight: inhibition of cholesterol synthesis alone is sufficient to drive progenitor cells towards the megakaryocyte lineage (which generates platelets).
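
For readers unfamiliar with the metric, a "fold improvement in hit rate" is simply the ratio of two hit rates. The short sketch below shows the arithmetic. The counts are invented for illustration and are not figures from the study; the function names are likewise hypothetical.

```python
from fractions import Fraction


def hit_rate(n_hits, n_screened):
    """Fraction of screened compounds that produced the desired phenotype."""
    if n_screened <= 0:
        raise ValueError("n_screened must be positive")
    return Fraction(n_hits, n_screened)


def fold_improvement(prioritised_rate, baseline_rate):
    """How many times higher the prioritised hit rate is than the baseline."""
    if baseline_rate == 0:
        raise ValueError("baseline hit rate is zero; fold change is undefined")
    return prioritised_rate / baseline_rate


# Invented example counts (NOT data from the study):
baseline = hit_rate(2, 1000)    # unprioritised screen: 2 hits in 1,000 compounds
prioritised = hit_rate(8, 200)  # prioritised screen: 8 hits in 200 compounds

fold = fold_improvement(prioritised, baseline)
print(float(baseline), float(prioritised), float(fold))
```

With these made-up counts, a smaller prioritised screen finds hits at 4% rather than 0.2%, a 20-fold higher rate, which is why better prioritisation translates directly into fewer compounds that need to be tested in the lab.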

In short, DrugReflector improves the efficiency of phenotypic drug discovery campaigns, and its generalisability and optimisability through active learning could accelerate drug discovery for complex diseases.

DrugReflector, developed by DeMeo et al., uses active “lab-in-the-loop” feedback to update target signatures associated with desired phenotypic changes, optimising the prioritisation of compounds for subsequent screening. Adapted from DeMeo et al.
