Proteogenomic annotation of the Chinese hamster reveals extensive novel translation events and endogenous retroviral elements

BioRxiv : the Preprint Server for Biology
Shangzhong Li, Nathan E Lewis

Abstract

A high-quality genome annotation greatly facilitates successful cell line engineering. Standard draft genome annotation pipelines are based largely on de novo gene prediction, homology, and RNA-Seq data. However, draft annotations can suffer from incorrect predictions of translated sequence, incorrect splice isoforms, and missing genes. Here we generated a draft annotation for the newly assembled Chinese hamster genome and used RNA-Seq, proteomics, and Ribo-Seq to experimentally annotate the genome. We identified 4,333 new proteins compared to the hamster RefSeq protein annotation and 2,503 novel translational events (e.g., alternative splices, mutations, novel splices). Finally, we used this pipeline to identify the source of translated retroviruses contaminating recombinant products from Chinese hamster ovary (CHO) cell lines, including 131 type-C retroviruses, thus enabling future efforts to eliminate these retroviruses and reduce the costs incurred with retroviral particle clearance. In summary, the improved annotation provides a more accurate platform for guiding CHO cell line engineering, including facilitating the interpretation of omics data, the definition of cellular pathways, and the engineering of complex phenotypes.
