In silico approach to designing rational metagenomic libraries for functional studies

BMC Bioinformatics
Anna Kusnezowa, Lars I Leichert

Abstract

With the development of next-generation sequencing technologies, the number of predicted proteins from entire (meta-)genomes has risen exponentially. While protein function can be inferred from homology for some of these sequences, experimental characterization is still required to determine protein function. However, functional characterization of proteins cannot keep pace with our capacity to generate ever more sequence data. Here, we present an approach to reduce the number of proteins from entire (meta-)genomes to a reasonably small number for further experimental characterization without loss of important information. About 6.1 million predicted proteins from the Global Ocean Sampling Expedition Metagenome project were distributed into classes based either on homology to existing hidden Markov models (HMMs) of known families, or de novo by assessment of pairwise similarity. In this way, 5.1 million of these proteins could be classified, yielding 18,437 families. For 4,129 protein families that did not match existing HMMs from databases, we could create novel HMMs. For each family, we then selected a representative protein, which showed the closest homology to all other proteins in this famil...
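
The representative-selection step described in the abstract can be illustrated with a minimal sketch. The abstract is truncated and does not state how "closest homology to all other proteins" is scored, so the code below assumes a simple criterion: pick the family member with the highest mean pairwise similarity to all other members. The function name `pick_representative`, the similarity dictionary layout, and the example scores are all hypothetical and are not taken from the paper.

```python
def pick_representative(members, similarity):
    """Return the member with the highest mean similarity to all other members.

    members:    list of protein identifiers belonging to one family.
    similarity: dict mapping frozenset({a, b}) -> score (higher = more similar).
                Pairs missing from the dict are treated as score 0 (assumption).
    """
    if not members:
        raise ValueError("family has no members")
    if len(members) == 1:
        return members[0]

    def mean_similarity(candidate):
        others = [m for m in members if m != candidate]
        total = sum(similarity.get(frozenset((candidate, m)), 0.0) for m in others)
        return total / len(others)

    # max() returns the first maximal element, so ties resolve by input order.
    return max(members, key=mean_similarity)


# Hypothetical family of four proteins with made-up pairwise scores.
family = ["p1", "p2", "p3", "p4"]
scores = {
    frozenset(("p1", "p2")): 0.9,
    frozenset(("p1", "p3")): 0.8,
    frozenset(("p1", "p4")): 0.7,
    frozenset(("p2", "p3")): 0.4,
    frozenset(("p2", "p4")): 0.3,
    frozenset(("p3", "p4")): 0.2,
}
representative = pick_representative(family, scores)
print(representative)
```

In practice the scores could come from any pairwise comparison, such as normalized alignment scores; the choice of mean similarity over, for example, the best HMM score is an assumption of this sketch.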

Datasets Mentioned

13694

Software Mentioned

Jasco
blastp
PFAM
TIGRFAMs
BLAST
Ubuntu
MCL
Linux
MAFFT
alistat
