Identifying novel therapeutic targets is the first step in drug development. It traditionally relies on a deep understanding of the underlying biology, a process that is lengthy and does not scale. The advent of high-throughput screening and advanced machine learning methods enables the rapid discovery of novel therapeutic targets even where the biology is not yet well understood. This, in turn, expands the universe of targets and even the reach of current compounds.
Using the CancerRx dataset of 987 human cell lines screened across 346 compounds, we train a gradient-boosted decision tree (GBDT) model to predict the sensitivity (IC50) of each cell line to each drug. Each cell line is represented as a binary vector encoding the wild-type (WT) or mutant (MT) status of 299 cancer-related genes. Once the model predicts the IC50 of each cell line-drug pair with high accuracy, we use SHAP values, which quantify how much each feature is responsible for a change in the model output, to identify the genetic features that most influence sensitivity. These features, specific mutations and combinations of mutations, represent potential therapeutic targets of each screened drug.
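The attribution step can be illustrated with a minimal sketch. It is not the pipeline used in this study: a hand-written toy function stands in for the trained GBDT, the gene names and effect sizes are hypothetical, and exact Shapley values are computed by brute-force enumeration against an all-wild-type reference rather than with the SHAP library.

```python
from itertools import combinations
from math import factorial

# Hypothetical gene names; the real feature set is 299 cancer-related genes.
GENES = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]


def toy_model(x):
    """Stand-in for a trained GBDT: predicted log IC50 from WT(0)/MT(1) flags.

    Assumed effects, for illustration only: a GENE_A mutation lowers IC50
    (sensitivity), while GENE_B and GENE_C mutated together raise it
    (resistance). GENE_D has no effect.
    """
    a, b, c, d = x
    return 5.0 - 2.0 * a + 1.0 * b * c


def shapley_values(model, x, baseline):
    """Exact Shapley values of each feature of x relative to a baseline."""
    n = len(x)

    def value(subset):
        # Features in `subset` take the cell line's value, others the baseline.
        mixed = [x[i] if i in subset else baseline[i] for i in range(n)]
        return model(mixed)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


cell_line = [1, 1, 1, 0]   # GENE_A, GENE_B, GENE_C mutated; GENE_D wild-type
wild_type = [0, 0, 0, 0]   # reference cell line with no mutations

phi = shapley_values(toy_model, cell_line, wild_type)

# Rank genes by the magnitude of their contribution to the prediction.
ranked = sorted(zip(GENES, phi), key=lambda g: abs(g[1]), reverse=True)
for gene, contribution in ranked:
    print(f"{gene}: {contribution:+.2f}")
```

In this toy setting the attributions sum to the difference between the prediction for the mutated cell line and the prediction for the wild-type reference. The interaction between GENE_B and GENE_C is split equally between the two genes, which is how a combination of mutations can surface as a pair of jointly important features. Brute-force enumeration is only feasible for a handful of features; a model over 299 genes would need a tree-specific algorithm instead.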
We report here a validation set consisting of 8 drugs with well-known targets and/or biomarkers. In all cases we identify the correct drug target without using the underlying biology or prior mechanistic knowledge. We then report additional genomic features that the model predicts to be important for sensitivity to specific drugs, representing potential targets that are currently unknown. Additionally, we identify specific genetic features that increase resistance to specific compounds.
Using an unbiased GBDT-based ML algorithm on publicly available high-throughput screening data, we accurately identify known drug targets as well as additional, previously unknown candidates. These new findings can expand the patient population for whom a given drug can provide benefit. Furthermore, owing to the unbiased nature of the algorithm and the wide array of drugs and drug targets covered by the screen, we identify additional potential drug targets that are not targeted by current drugs.
The authors.
Fore Biotherapeutics.
I.K. Kifer, G. Tarcic: Financial Interests, Personal, Full or part-time Employment: NovellusDx. E. Goldfarb, M. Vidne: Financial Interests, Personal, Stocks/Shares: NovellusDx. All other authors have declared no conflicts of interest.