Reciprocal Perspective for Protein Interaction Prediction and Related Complete Graph Problems

Kevin Dick

PhD Candidate (Biomedical Engineering), Department of Systems and Computer Engineering, Carleton University

April 5, 2018 16:00 - 17:30

Canal Building Room 2104, Carleton University

Contact: Dr. James Green

Paid parking available on campus

abstract

A number of scientific fields are experiencing a big data revolution that enables researchers to appraise elements and their relationships in the context of all others. The prediction of protein-protein interactions (PPIs), the prediction of protein function, and the prediction of patient health outcomes are three such fields. Each involves the training, validation, and testing of classifiers as well as the selection of a globally applied decision threshold. For PPI prediction, computational methods have been leveraged to rapidly estimate protein interaction networks, and advances in high performance computing have recently made it possible to elucidate comprehensive interactomes (i.e. every protein in a proteome predicted against all others). Historically, the field has exhibited an intrinsic need to apply a threshold that distinguishes the set of high-quality or likely PPIs from the set of low-quality or unlikely PPIs. These thresholds have at times been arbitrarily defined and have been the source of considerable debate in the literature. While a single, globally defined threshold has been shown to be successful in many contexts, further investigation into the comprehensive prediction of all PPIs reveals that such a threshold is inappropriate for many proteins, thereby necessitating a more flexible definition. The fields of protein function prediction and disease prediction from electronic health records face similar challenges, which are further exacerbated by the incompleteness and biases inherent in their datasets.

In this work, we propose a novel method called Reciprocal Perspective (RP), a cascaded feature extraction and prediction layer that leverages the comprehensive set of prediction scores and jointly considers global and local thresholds in prediction tasks. In a recent study, we predicted the comprehensive set of intra-species interactions in Homo sapiens and Saccharomyces cerevisiae using the Protein-protein Interaction Prediction Engine (PIPE) and Scoring PRotein INTeractions (SPRINT) methods and demonstrated that the post hoc application of RP outperformed each state-of-the-art method alone. The use of RP for protein function prediction and disease prediction is under active investigation, in conjunction with several related complete graph problems. RP promises to be applicable to any complete graph (or complete bipartite graph) prediction problem in this new age of big data and high performance computing research.
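The abstract does not list the features RP extracts, so the following is only an illustrative sketch of the general idea of viewing a pair from both partners' perspectives. It assumes a symmetric all-vs-all score matrix and a hypothetical feature set (the raw score plus the rank of each protein within the other's list of candidate partners); the function names and toy scores are invented for this example and are not taken from the RP implementation.

```python
def rank_in_row(scores, protein, partner):
    """Rank of `partner` among all candidate partners of `protein` (1 = best)."""
    row = scores[protein]
    # Count the other candidates that score strictly higher than `partner`.
    return 1 + sum(1 for k, s in enumerate(row) if k != protein and s > row[partner])


def reciprocal_features(scores, i, j):
    """Hypothetical per-pair features from a symmetric all-vs-all score matrix."""
    return {
        "score": scores[i][j],                     # what a global threshold sees
        "rank_j_in_i": rank_in_row(scores, i, j),  # the pair seen from protein i
        "rank_i_in_j": rank_in_row(scores, j, i),  # the pair seen from protein j
    }


# Toy scores for four proteins; protein 0 scores highly against everything.
scores = [
    [0.0, 0.9, 0.8, 0.7],
    [0.9, 0.0, 0.2, 0.1],
    [0.8, 0.2, 0.0, 0.3],
    [0.7, 0.1, 0.3, 0.0],
]

for i, j in [(0, 3), (2, 3)]:
    print((i, j), reciprocal_features(scores, i, j))
```

In this toy matrix, a single global threshold would favour pair (0, 3) over pair (2, 3) on score alone, even though protein 3 is only the third-ranked partner from protein 0's perspective. Features like these could, under the assumptions above, be passed to a second-stage classifier in place of a fixed cut-off.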

biography

Kevin Dick is currently pursuing a PhD in biomedical engineering, specializing in data science and bioinformatics, at Carleton University in Ottawa, Canada. He received a B.Sc. degree in biology and computer science from McGill University in 2014 and spent a year working in the R&D department of Revision Military in Ottawa prior to joining the Dr. James Green Lab in 2015. He is a Queen Elizabeth II Scholar and his research interests include data science, artificial intelligence, machine learning, high performance computing, medical informatics, and intelligent system design. As a polymath, Kevin's interests extend throughout the arts and sciences in an earnest desire to apply his creativity. Kevin is currently the Co-Chair of the Carleton University Engineering in Medicine and Biology Society Student Chapter, a recipient of a Mitacs Globalink Research Grant to study malaria in Mumbai, India, and a recent recipient of the Dr. Ross Koningstein scholarship.

Last updated April 3, 2018