Oct 28, 2016

TCGA Expedition: A Data Acquisition and Management System for TCGA Data

PloS One
Uma R Chandran, Rebecca S Jacobson

Abstract

The Cancer Genome Atlas Project (TCGA) is a National Cancer Institute effort to profile at least 500 cases of 20 different tumor types using genomic platforms and to make these data, both raw and processed, available to all researchers. TCGA data currently exceed 1.2 petabytes in size and include whole genome sequence (WGS), whole exome sequence, methylation, RNA expression, proteomic, and clinical datasets. Publicly accessible TCGA data are released through public portals, but many challenges exist in navigating and using data obtained from these sites. We developed TCGA Expedition to support the research community focused on computational methods for cancer research. Data obtained, versioned, and archived using TCGA Expedition support command-line access at high-performance computing facilities as well as some functionality with third-party tools. For a subset of TCGA data collected at the University of Pittsburgh, we also re-associate TCGA data with de-identified data from the electronic health records. Here we describe the software as well as the architecture of our repository, methods for loading TCGA data to multiple platforms, and security and regulatory controls that conform to federal best practices. TCGA Expedition s...


Mentioned in this Paper

Computer Software
Patient Portals
Acrocallosal Syndrome
Electronic Health Records
Size
Research
Environmental Infrastructure
Protein Methylation
Genome
Transcription, Genetic
