gpuZoo: Cost-effective estimation of gene regulatory networks using the Graphics Processing Unit.

Marouen Ben Guebila, Daniel C Morgan, Kimberly Glass, Marieke L. Kuijjer, Dawn L. DeMeo, John Quackenbush.



Gene regulatory network inference allows for the study of transcriptional control to identify the alteration of cellular processes in human diseases. Our group has developed several tools to model a variety of regulatory processes, including transcriptional (PANDA, SPIDER) and post-transcriptional (PUMA) gene regulation, and gene regulation in individual samples (LIONESS). These methods work by performing repeated operations on data matrices in order to integrate information across multiple lines of biological evidence. This limits their use for large-scale genomic studies due to the associated high computational burden. To address this limitation, we developed gpuZoo, which includes GPU-accelerated implementations of these algorithms. The runtime of the gpuZoo implementation in MATLAB and Python is up to 61 times faster and 28 times less expensive than the multi-core CPU implementation of the same methods. gpuZoo takes advantage of the modern multi-GPU device architecture to build a population of sample-specific gene regulatory networks with similar runtime and cost improvements by combining GPU acceleration with an efficient on-line derivation. Taken together, gpuZoo allows parallel and on-line gene regulatory network inference in large-scale genomic studies with cost-effective performance. gpuZoo is available in MATLAB through the netZooM package and in Python through the netZooPy package

Supplementary data

This is the data used for the benchmarks of gpuZoo.

Model Motif PPI Expression Mode TFs Genes
Small network download download download Intersection 652 1000
Protein-coding genes network download download download Union 652 27149
Transcript network download download download Union 1603 43698


netZooPy, netZooM

GitHub repository

To reproduce the benchmarks, check the github repository gpuZoo.