Shared Resource Spotlight: Biomedical Informatics

April 20, 2020

By Aris Floratos, PhD, Director of the Biomedical Informatics Shared Resource at the Herbert Irving Comprehensive Cancer Center

The Biomedical Informatics Shared Resource (BISR) provides our members with access to specialized personnel and high-performance computational infrastructure that supports integrative analyses of high-throughput data, such as transcriptomic, genomic, epigenomic, and proteomic data. These analyses have become an essential part of basic and translational cancer research. They are, however, complex and require the use of expensive personnel and IT resources that most labs do not have the capacity or the need to invest in. For such labs, the BISR offers a cost-effective way to gain access to technologies that can have a significant impact on their research.

Cutting-edge services for omics data analysis

The BISR offers a number of services to cancer center members. We provide assistance with study design, particularly platform and sample size selection, to ensure that downstream analyses are properly powered. Our computational workflows are based on best practices for the analysis of omics data (e.g., differential gene expression analysis, gene set enrichment analysis, analysis of genome-wide genotypic and exome sequence data, analysis of single-cell RNA-Seq data). Our specialized personnel can help author the informatics portion of grant applications and manuscripts and assist with data submission to NIH repositories. The BISR also offers access to the high-performance computing environment of the Department of Systems Biology (DSB).

Importantly, the BISR brings to our members’ projects innovative systems methodologies developed at the DSB by faculty including Andrea Califano, Raul Rabadan, and Barry Honig. These methodologies include approaches for the discovery of master regulator genes that mediate cancer establishment, maintenance, and progression, as well as response to treatment. They have been used with remarkable success to elucidate regulatory and signaling programs that play key roles in coordinating cell-state transitions across many cancer phenotypes. Through the BISR, cancer center members get the opportunity to employ these advanced methods in their own research.

Evolving technologies to support new frontiers in sequencing data

As technologies evolve and new methods of analysis emerge, BISR services are updated to meet the changing needs of HICCC members by adopting new methodologies that appear in the literature. BISR personnel are currently experimenting with novel tools recently developed by DSB faculty to support the analysis of single-cell RNA-Seq (scRNA-Seq) data. The Rabadan laboratory has developed scTDA (single-cell topological data analysis), a topology-based statistical framework for the detection of transient cellular populations and their transcriptional repertoires. MetaVIPER (from the Califano laboratory) is a network biology approach that leverages tumor- and tissue-specific gene regulatory networks to transform low-depth scRNA-Seq profiles into highly reproducible protein activity profiles. These profiles accurately reflect cell state, thus significantly increasing the ability to analyze the biological function and relevance of gene products whose mRNAs are undetectable in low-depth scRNA-Seq data (the dropout effect).

We expect that the steady decline in the cost of sequencing technologies will lead to more widespread use of whole genome sequencing for the profiling of mutational and structural variation. This will present opportunities for characterizing the contribution of coding and non-coding variation to cancer establishment and progression, as well as challenges related to the storage and mining of very large datasets. In this area, new computational techniques are already appearing that aim to facilitate the combined representation and exploration of multiple genomes (e.g., variation graphs). Additional data modalities that the BISR is prepared to support include those associated with emerging technology platforms such as mass cytometry, mass microscopy, and single-cell genomics.