DIVCLUS: an automatic method in the GEANFAMMER package that finds homologous domains in single- and multi-domain proteins

Bioinformatics
J Park, Sarah A Teichmann

Abstract

Large-scale determination of relationships between the proteins encoded by genome sequences is now common. All protein sequences are matched against each other, and those with high match scores are clustered into families. When the proteins are built of several domains or duplication modules, this can lead to misleading results. Consider the very simple example of three proteins: 1, formed by duplication modules A and B; 2, formed by duplication modules B' and C; and 3, formed by duplication modules C' and D. Duplication modules B and B' are homologous, as are C and C'. Matching the sequences of 1, 2 and 3 followed by simple single-linkage clustering would put all three in the same family, even though proteins 1 and 3 are not related. This happens because different parts of 2 match 1 and 3. This paper describes a procedure, DIVCLUS, that divides such complex clusters of partially related sequences into simple clusters that contain only related duplication modules. In the example just given, it would produce two groups of sequences: the first with domain B of sequence 1 and domain B' of sequence 2, and the second with domain C of sequence 2 and domain C' of sequence 3. DIVCLUS is part of a package called GEANFAMMER, for GEnome ANalysis and protein ...
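The three-protein example can be reproduced with a small Python sketch. This is only an illustration of the problem and of the kind of output described above, not the DIVCLUS algorithm itself, whose details are not given in the abstract. The protein names, residue coordinates, the `single_linkage` helper and the overlap rule are all assumptions made for the illustration.

```python
from collections import defaultdict


def single_linkage(nodes, edges):
    """Group nodes into connected components using a small union-find."""
    parent = {n: n for n in nodes}

    def find(x):
        # Follow parent links to the root, compressing the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in edges:
        parent[find(a)] = find(b)

    groups = defaultdict(set)
    for n in nodes:
        groups[find(n)].add(n)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical pairwise matches between regions, written as
# (protein, start, end). Each protein is assumed to be 200 residues long,
# with one module per half.
matches = [
    (("P1", 101, 200), ("P2", 1, 100)),    # B of protein 1 ~ B' of protein 2
    (("P2", 101, 200), ("P3", 1, 100)),    # C of protein 2 ~ C' of protein 3
]

# Whole-protein clustering: any match links the two full sequences.
proteins = {seg[0] for pair in matches for seg in pair}
protein_edges = [(a[0], b[0]) for a, b in matches]
whole_protein_clusters = single_linkage(proteins, protein_edges)


def overlaps(r, s):
    """True if two regions lie on the same protein and share residues."""
    return r[0] == s[0] and r[1] <= s[2] and s[1] <= r[2]


# Region-level clustering (assumed rule): matched regions are linked, and
# two regions on the same protein are linked only if they overlap.
regions = {seg for pair in matches for seg in pair}
region_edges = list(matches) + [
    (r, s) for r in regions for s in regions if r < s and overlaps(r, s)
]
region_clusters = single_linkage(regions, region_edges)

print(whole_protein_clusters)
print(region_clusters)
```

Under these assumptions, whole-protein clustering merges all three proteins into one family, while region-level clustering keeps the B/B' pair and the C/C' pair in separate groups because the two matched regions of protein 2 do not overlap.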

