Accurate inference of kinase activity from human tumor phosphoproteomics data nominates active kinases as therapeutic targets
Introduction
Aberrant kinase activity has been implicated in various cellular processes that promote cancer development. Specifically, protein kinases regulate key players in these processes via phosphorylation, and thus are frequently targeted for inhibition by small molecules. Accurately assessing kinase activity is critical for the identification of novel cancer drug targets, development of patient-specific therapeutics, and prediction of treatment outcomes. Various methods for inferring kinase activity from mass spectrometry-based tumor phosphoproteomics data have been described; they typically rely on measurements of known kinase target phosphosites. However, the number of known target sites is limited, and there is no consensus on which method to use. Efforts to systematically evaluate kinase activity inference (KAI) methods have been limited to cell lines treated with kinase inhibitors of variable specificity. The Clinical Proteomic Tumor Analysis Consortium (CPTAC) has generated a comprehensive resource of multi-omics data for multiple cancer types. Here, we compare multiple published KAI methods, benchmark these methods against a gold standard established using CPTAC proteomics data from 757 human tumors across seven cancer types, and assess the ability of top-performing methods to identify therapeutic kinase targets.
Methods
CPTAC pan-cancer datasets included harmonized somatic mutation, copy number, proteomics, and phosphoproteomics data. Single-sample kinase activity inference from site median-centered phosphorylation data was implemented in R for several published methods, as well as for simple metrics that either aggregate data for the set of target sites or compare target sites to non-target sites in a given sample. For AUROC analysis of kinase activity Z-scores, samples in the top and bottom 5% for each kinase (protein) in each cancer type were selected as kinase-tumor pairs for the gold standard positive (GS+) and negative (GS-) sets, respectively. To identify therapeutic targets for samples with genetic aberrations, XGBoost was used to predict kinase activity scores from mutation and copy number data for significantly mutated genes and established tumor suppressors and oncogenes.
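As an illustration of the simple mean-of-targets metric and the AUROC benchmark described above, the following is a minimal Python sketch rather than the R implementation used in the study. It assumes that the gold standard ranks samples by kinase protein abundance within one cancer type, and that the AUROC is the probability that a GS+ sample receives a higher activity Z-score than a GS- sample. All function names, site names, and toy values are hypothetical.

```python
from statistics import mean, pstdev

def mean_target_score(sample_sites, target_sites):
    """Mean of site median-centered phosphosite levels over a kinase's targets.
    Returns None when no target site was quantified in the sample."""
    values = [sample_sites[s] for s in target_sites if s in sample_sites]
    return mean(values) if values else None

def z_scores(values):
    """Standardize raw activity scores across samples (population SD)."""
    mu, sd = mean(values), pstdev(values)
    return [(v - mu) / sd if sd > 0 else 0.0 for v in values]

def gold_standard(protein_by_sample, fraction=0.05):
    """Assumed gold standard: top and bottom `fraction` of samples ranked by
    kinase protein level. Returns (GS+, GS-) as sets of sample IDs."""
    ranked = sorted(protein_by_sample, key=protein_by_sample.get)
    k = max(1, int(len(ranked) * fraction))
    return set(ranked[-k:]), set(ranked[:k])

def auroc(pos_scores, neg_scores):
    """Fraction of (GS+, GS-) pairs in which the GS+ score is higher;
    ties count as one half."""
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos_scores
        for n in neg_scores
    )
    return wins / (len(pos_scores) * len(neg_scores))

# Toy data (hypothetical): 20 samples, one kinase with two target sites.
samples = [f"S{i}" for i in range(20)]
protein = {s: float(i) for i, s in enumerate(samples)}
phospho = {
    s: {"siteA": 0.1 * i - 1.0, "siteB": 0.2 * i - 2.0, "other": 0.0}
    for i, s in enumerate(samples)
}
targets = ["siteA", "siteB"]

raw = [mean_target_score(phospho[s], targets) for s in samples]
z = dict(zip(samples, z_scores(raw)))
gs_pos, gs_neg = gold_standard(protein, fraction=0.05)
result = auroc([z[s] for s in gs_pos], [z[s] for s in gs_neg])
print(sorted(gs_pos), sorted(gs_neg), result)
```

In the toy data the target-site levels rise with protein abundance by construction, so the GS+ and GS- samples separate perfectly; real data would yield intermediate AUROC values that can be compared across inference methods.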
Results
The best-performing published KAI methods, which included RoKAI, KSEA, and PTM-SEA, yielded results that were comparable to those of a simple method that computes the mean of the relative levels of known kinase targets in each sample. Adding computationally predicted target sites provided a substantial boost in the number of kinases that can be assessed and in performance for kinases with known substrates. When the best-performing method was applied to a phosphoproteomics dataset from cell lines (Frejno et al., Nature Communications, 2020), higher kinase activity scores were more consistently associated with better responses to kinase inhibitors than were kinase protein levels or activity scores calculated using the established KSEA method with known targets alone. Finally, by using machine learning to predict kinase activity from somatic genetic aberration data for the same set of tumors and extracting the driving features, we identified candidate therapeutic targets (kinases with elevated activity) for tumors with potential MYC, CCND1, and EGFR gain-of-function aberrations (missense mutation or gene amplification); TP53, CDKN2A, RB1, and PTEN loss-of-function aberrations (frameshift or truncating mutation, or deletion); and SETDB1, SETD2, KDM5C, and STK11 mutations, among others. For example, TP53 loss-of-function was the feature that best informed ATM activity prediction, suggesting that ATM may be a viable therapeutic target in tumors with TP53 mutations.
Conclusion
The top-performing methods for kinase activity inference yielded results similar to those obtained by simply taking the mean of the relative levels of known plus predicted targets in a given sample, and facilitated the identification of cell lines and tumors with elevated kinase activity that could potentially be targeted therapeutically.