OTTER

Reference

Gene Regulatory Network Inference as Relaxed Graph Matching.

Deborah Weighill, Marouen Ben Guebila, Camila Lopes-Ramos, Kimberly Glass, John Quackenbush, John Platig, Rebekka Burkholz.

bioRxiv

doi.org/10.1101/2020.06.23.167999.

Abstract

Gene regulatory network inference is instrumental to the discovery of genetic mechanisms driving diverse diseases, including cancer. Here, we present a theoretical framework for PANDA, an established method for gene regulatory network inference. PANDA is based on iterative message passing updates that resemble the gradient descent of an optimization problem, OTTER, which can be interpreted as relaxed inexact graph matching between a gene-gene co-expression and a protein-protein interaction matrix. The solutions of OTTER can be derived explicitly and inspire an alternative spectral algorithm, for which we can provide network recovery guarantees. We compare different solution approaches of OTTER to other inference methods using three biological data sets, which we make publicly available to offer a new application venue for relaxed graph matching in gene regulatory network inference. We find that using modern gradient descent methods with superior convergence properties solving OTTER outperforms state-of-the-art gene regulatory network inference methods in predicting binding of transcription factors to regulatory regions.
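
To make the graph matching idea above concrete, the following is a minimal sketch and not the reference implementation. It assumes an objective of the form (1 - gamma)/4 * ||W W^T - P||^2 + gamma/4 * ||W^T W - C||^2, where W is a TF-by-gene matrix, P is the protein-protein interaction matrix, and C is the gene-gene co-expression matrix. The weight `gamma`, the step size, the omission of any regularization term, and the toy matrices are assumptions made for illustration; the exact objective and the optimizers actually used are defined in the paper. Plain gradient descent is used here for brevity.

```python
# Sketch of relaxed graph matching by plain gradient descent (standard library only).
# Assumed objective: (1-gamma)/4 * ||W W^T - P||_F^2 + gamma/4 * ||W^T W - C||_F^2

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def sub(a, b):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def sq_norm(a):
    return sum(x * x for row in a for x in row)

def objective(W, P, C, gamma):
    Wt = transpose(W)
    return ((1 - gamma) / 4 * sq_norm(sub(matmul(W, Wt), P))
            + gamma / 4 * sq_norm(sub(matmul(Wt, W), C)))

def gradient(W, P, C, gamma):
    # For symmetric P and C the gradient is
    # (1-gamma) * (W W^T - P) W + gamma * W (W^T W - C).
    Wt = transpose(W)
    g_p = matmul(sub(matmul(W, Wt), P), W)
    g_c = matmul(W, sub(matmul(Wt, W), C))
    return [[(1 - gamma) * x + gamma * y for x, y in zip(rp, rc)]
            for rp, rc in zip(g_p, g_c)]

def gradient_descent(W0, P, C, gamma=0.5, step=0.01, iters=200):
    W = [row[:] for row in W0]  # start from the initial guess W0
    for _ in range(iters):
        G = gradient(W, P, C, gamma)
        W = [[w - step * g for w, g in zip(rw, rg)] for rw, rg in zip(W, G)]
    return W

# Hypothetical toy problem: 2 TFs, 3 genes.
P = [[1.0, 0.2], [0.2, 1.0]]
C = [[1.0, 0.5, 0.1], [0.5, 1.0, 0.3], [0.1, 0.3, 1.0]]
W0 = [[1.0, 0.0, 1.0], [0.0, 1.0, 0.0]]
W = gradient_descent(W0, P, C)
```

On this toy input the objective at `W` is lower than at `W0`, which can be checked with `objective(W, P, C, 0.5) < objective(W0, P, C, 0.5)`.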

Supplementary data

Several OTTER networks are available in the GRAND database. In addition, the raw data for reconstruction and benchmarking of the networks are provided below.

| File name | Description | Link |
| --- | --- | --- |
| expressed_genes_tissue.txt | Column (gene) names of the gene regulatory matrix W or the initial guess W0. They are also the node names of the correlation matrix C. | breast, cervix, liver, liver_tcga_gtex |
| expressed_tf_names_tissue.txt | Row (TF) names of the gene regulatory matrix W or the initial guess W0. They are also the node names of the protein-protein interaction matrix P. | breast, cervix, liver, liver_tcga_gtex |
| motif_prior_matrix_tissue_otter.txt | Initial gene regulatory network W0, which was constructed based on TF binding motifs in the human reference genome. Row names: TF names in the order of file expressed_tf_names_tissue.txt. Column names: gene names in the order of file expressed_genes_tissue.txt. | breast, cervix, liver, liver_tcga_gtex |
| PPI_matrix_tissue.txt | Protein-protein interaction matrix P. The node names are provided in the file expressed_tf_names_tissue.txt. | breast, cervix, liver, liver_tcga_gtex |
| tcga_tissue_TPM_otter.txt | Gene expression data. Columns refer to samples (i.e. people) and rows to genes. The gene names are defined in expressed_genes_tissue.txt. This data is used to compute the correlation matrix C. | breast, cervix, tcga_liver |
| corTissue.csv | Correlation matrix C. Rows and columns refer to genes with names defined in expressed_genes_tissue.txt. | breast, cervix, liver |
| chipseq_postive_edges_tissue.txt | Validation set. Existing edges between TFs and genes. All TFs in the first column are tested with all genes. Thus, if an edge between any TF in the first column and any other gene is not listed, it was not measured and counts as non-existent in the validation. | breast, cervix, liver |
| otterTissue.txt | Gene regulatory network inferred by optimizing OTTER with gradient descent. | breast, cervix, liver |
| otterLiverTCGA.csv | Gene regulatory network inferred by optimizing OTTER with gradient descent. Liver cancer tissue. For comparison with the normal tissue network (otterLiverGTEX.csv), it has the same set of nodes. | liver_tcga |
| otterLiverGTEX.csv | Gene regulatory network inferred by optimizing OTTER with gradient descent. Normal liver tissue. For comparison with the liver cancer tissue network (otterLiverTCGA.csv), it has the same set of nodes. | liver_gtex |
| Supplementary_Figures.zip | Result figures of GO term enrichment analysis comparing TCGA and GTEx OTTER networks. | Supplementary figures |
| Supplementary_Tables.zip | Result tables of GO term enrichment analysis comparing TCGA and GTEx OTTER networks. | Supplementary tables |
| Otter_AAAI2021-12.pdf | Supplementary material file. | Supplementary material |
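
The labelling rule described for the validation set (chipseq_postive_edges_tissue.txt) can be sketched as follows. This is an illustrative sketch under stated assumptions: edges are assumed to be available as (TF, gene) pairs, predicted scores as a dictionary keyed by such pairs, and AUROC is an assumed choice of metric. The function names and the toy data are hypothetical and are not part of any netZoo package.

```python
# Sketch: label predicted edges against a ChIP-seq positive edge list, then score them.

def validation_labels(scores, positive_edges):
    """Return (score, label) pairs for edges whose TF appears in the positive list.

    scores: dict mapping (tf, gene) to a predicted edge weight.
    positive_edges: set of (tf, gene) pairs listed in the validation file.
    Edges of tested TFs that are not listed count as non-existent (label 0);
    edges of TFs that were never tested are excluded.
    """
    tested_tfs = {tf for tf, _ in positive_edges}
    return [(score, 1 if (tf, gene) in positive_edges else 0)
            for (tf, gene), score in scores.items() if tf in tested_tfs]

def auroc(labelled):
    """Rank-based AUROC: fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for s, y in labelled if y == 1]
    neg = [s for s, y in labelled if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative edge")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)  # ties count half
    return wins / (len(pos) * len(neg))

# Hypothetical toy data: TF2 is absent from the positive list, so it is not evaluated.
scores = {
    ("TF1", "G1"): 0.9, ("TF1", "G2"): 0.2, ("TF1", "G3"): 0.6, ("TF1", "G4"): 0.7,
    ("TF2", "G1"): 0.8, ("TF2", "G2"): 0.1,
}
positive_edges = {("TF1", "G1"), ("TF1", "G3")}

labelled = validation_labels(scores, positive_edges)
result = auroc(labelled)  # 3 of the 4 positive/negative pairs are ordered correctly: 0.75
```

Restricting the evaluation to TFs that appear in the first column avoids counting untested TFs as negatives, which is the distinction the table entry draws.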

Implementation

netZooR, netZooPy, netZooM

Netbook tutorials

The following netbooks use OTTER:

  • netZooR:

    • Inferring Gene Regulatory Networks from GTEx Gene Expression Data in R with OTTER