Oct 31, 2018

Joint analysis of gene expression levels and histological images identifies genes associated with tissue morphology

BioRxiv: the Preprint Server for Biology
Jordan Ash, Barbara E Engelhardt

Abstract

Histological images are used to identify and to characterize complex phenotypes such as tumor stage. Our goal is to associate histological image phenotypes with high-dimensional genomic markers; the limitations to incorporating histological image phenotypes in genomic studies are that the relevant image features are difficult to identify and extract in an automated way, and confounders are difficult to control in this high-dimensional setting. In this paper, we use convolutional autoencoders and sparse canonical correlation analysis (CCA) on histological images and gene expression levels from paired samples to find subsets of genes whose expression values in a tissue sample correlate with subsets of morphological features from the corresponding sample image. We apply our approach, ImageCCA, to three data sets, two from TCGA and one from GTEx v6, and we find three types of biological associations. In TCGA, we find gene sets associated with the structure of the extracellular matrix and cell wall infrastructure, implicating uncharacterized genes in extracellular processes. Across studies, we find sets of genes associated with specific cell types, including muscle tissue and neuronal cells, and with cell type proportions in heter...
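To make the sparse CCA step in the abstract concrete, here is a minimal sketch on synthetic data. It is not the authors' ImageCCA implementation: it uses a generic soft-thresholded alternating update in the style of penalized matrix decomposition, and every name in it (`sparse_cca`, `lam_x`, `lam_y`, the array shapes) is an assumption made for illustration. The matrix `X` stands in for image features, such as autoencoder embeddings, and `Y` stands in for gene expression levels from the paired samples.

```python
import numpy as np


def soft_threshold(a, frac):
    """Shrink entries of `a` toward zero by frac * max|a| (L1-style sparsity)."""
    lam = frac * np.max(np.abs(a))
    return np.sign(a) * np.maximum(np.abs(a) - lam, 0.0)


def unit_norm(a):
    """Scale `a` to unit Euclidean norm; leave an all-zero vector unchanged."""
    n = np.linalg.norm(a)
    return a / n if n > 0 else a


def sparse_cca(X, Y, lam_x=0.5, lam_y=0.5, n_iter=50):
    """Return one pair of sparse weight vectors (u, v) for paired data.

    X: (n_samples, p) hypothetical image features.
    Y: (n_samples, q) hypothetical gene expression levels.
    lam_x, lam_y: sparsity levels in [0, 1); larger values zero out more weights.
    """
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    C = X.T @ Y  # cross-covariance between the two views, up to scaling

    # Start from the leading right singular vector of C.
    _, _, Vt = np.linalg.svd(C, full_matrices=False)
    v = Vt[0]

    # Alternate soft-thresholded power iterations on u and v.
    for _ in range(n_iter):
        u = unit_norm(soft_threshold(C @ v, lam_x))
        v = unit_norm(soft_threshold(C.T @ u, lam_y))
    return u, v


# Synthetic paired data: one latent factor drives 3 image features and 2 genes.
rng = np.random.default_rng(0)
n = 200
z = rng.standard_normal(n)
X = rng.standard_normal((n, 30))
Y = rng.standard_normal((n, 20))
X[:, :3] += 2.0 * z[:, None]
Y[:, :2] += 2.0 * z[:, None]

u, v = sparse_cca(X, Y)
print("selected image features:", np.flatnonzero(u))
print("selected genes:", np.flatnonzero(v))
```

The nonzero entries of `u` and `v` play the role of the "subsets of morphological features" and "subsets of genes" described above. In a real analysis the sparsity levels would need to be tuned, and confounders would need to be handled separately.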

Mentioned in this Paper

Computer Software
Biological Markers
Study
The Cancer Genome Atlas
2-Dimensional
Environmental Infrastructure
Genome
Extracellular
Genes
Science of Morphology

Related Feeds

Cancer Genomics (Keystone)

Cancer genomics approaches employ high-throughput technologies to identify the complete catalog of somatic alterations that characterize the genome, transcriptome and epigenome of cohorts of tumor samples. Discover the latest research using such technologies in this feed.

CZI Human Cell Atlas Seed Network

The aim of the Human Cell Atlas (HCA) is to build reference maps of all human cells in order to enhance our understanding of health and disease. The Seed Networks for the HCA project aim to bring together collaborators with different areas of expertise in order to facilitate the development of the HCA. Find the latest research from members of the HCA Seed Networks here.

Cancer Genomics (Preprints)

Cancer genomics approaches employ high-throughput technologies to identify the complete catalog of somatic alterations that characterize the genome, transcriptome and epigenome of cohorts of tumor samples. Discover the latest preprints here.

Cancer Genomics

Cancer genomics approaches employ high-throughput technologies to identify the complete catalog of somatic alterations that characterize the genome, transcriptome and epigenome of cohorts of tumor samples. Discover the latest research here.

BioRxiv & MedRxiv Preprints

BioRxiv and MedRxiv are the preprint servers for biology and health sciences respectively, operated by Cold Spring Harbor Laboratory. Here are the latest preprint articles (which are not peer-reviewed) from BioRxiv and MedRxiv.