Using ontology embeddings for structural inductive bias in gene expression data analysis

by Maja Trębacz, et al.

Stratifying cancer patients by their gene expression levels can improve diagnosis, survival analysis and treatment planning. However, such data is extremely high-dimensional, containing expression values for over 20,000 genes per patient, while the number of samples in the datasets is low. To deal with such settings, we propose incorporating prior biological knowledge about genes, drawn from ontologies, into the machine learning system for the task of classifying patients from their gene expression data. We use ontology embeddings that capture the semantic similarities between genes to direct a Graph Convolutional Network and thereby sparsify the network connections. We show that this approach provides an advantage for predicting clinical targets from high-dimensional, low-sample data.
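As a rough illustration of the idea, the sketch below builds a sparse gene graph from embedding similarity and runs one neighborhood-averaging step over a patient's expression values. It is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not say how the graph is derived from the embeddings, so the cosine-similarity k-nearest-neighbor rule, the unweighted mean aggregation (with no learned parameters), the function names and the toy data are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def knn_gene_graph(embeddings, k):
    """Assumed rule: link each gene to itself and its k most similar genes."""
    n = len(embeddings)
    neighbors = []
    for i in range(n):
        sims = [(cosine(embeddings[i], embeddings[j]), j)
                for j in range(n) if j != i]
        sims.sort(reverse=True)  # most similar first
        neighbors.append([i] + [j for _, j in sims[:k]])
    return neighbors


def mean_graph_conv(expression, neighbors):
    """Simplified graph convolution: average each gene over its neighborhood."""
    return [sum(expression[j] for j in nbrs) / len(nbrs) for nbrs in neighbors]


# Toy data (hypothetical): 4 genes with 2-d embeddings, one patient's expression.
gene_embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
patient_expression = [2.0, 4.0, 10.0, 20.0]

neighbors = knn_gene_graph(gene_embeddings, k=1)
smoothed = mean_graph_conv(patient_expression, neighbors)
print(neighbors)  # [[0, 1], [1, 0], [2, 3], [3, 2]]
print(smoothed)   # [3.0, 3.0, 15.0, 15.0]
```

With k much smaller than the number of genes, each gene connects to only a handful of others rather than to all 20,000, which is the sense in which similarity-derived structure sparsifies the network's connections.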




