How frequently do clusters occur in hierarchical clustering analysis? A graph theoretical approach to studying ties in proximity

Journal of Cheminformatics
Wilmer Leal, Manuel Elkin Patarroyo

Abstract

Hierarchical cluster analysis (HCA) is a widely used classificatory technique in many areas of scientific knowledge. Applications usually yield a dendrogram from an HCA run over a given data set, using a grouping algorithm and a similarity measure. However, even when such parameters are fixed, ties in proximity (i.e. two clusters equidistant from a third one) may produce several different dendrograms with different clustering patterns (different classifications). This situation is usually disregarded and conclusions are based on a single result, which raises questions about the permanence of clusters across all the resulting dendrograms; this happens, for example, when using HCA to group molecular descriptors and select the least similar ones in QSAR studies. Representing dendrograms in graph theoretical terms allowed us to introduce four measures of cluster frequency in a canonical way, and to use them to calculate cluster frequencies over the set of all possible dendrograms, taking all ties in proximity into account. A toy example of well separated clusters was used, as well as a set of 1666 molecular descriptors calculated for a group of molecules having hepatotoxic activity, to show how our functions may be used f...
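The effect of ties described in the abstract can be illustrated with a small Python sketch. It is not the authors' method and does not reproduce their four frequency measures; it assumes complete linkage, exact integer distances (so ties can be detected with `==`), and a single simple measure: the fraction of distinct dendrograms in which a cluster appears. The names `all_dendrograms`, `cluster_frequencies`, and the four-point data set are hypothetical and chosen only for illustration.

```python
from fractions import Fraction
from itertools import combinations


def linkage(a, b, dist):
    """Complete linkage (an assumption): largest pairwise distance between a and b."""
    return max(dist[frozenset((i, j))] for i in a for j in b)


def all_dendrograms(points, dist):
    """Enumerate every dendrogram reachable by branching on each tie in proximity.

    A dendrogram is represented as the frozenset of its non-singleton clusters,
    so merge orders that yield the same clusters count as one dendrogram.
    """
    results = set()

    def recurse(clusters, formed):
        if len(clusters) == 1:
            results.add(formed)
            return
        pairs = list(combinations(clusters, 2))
        d = {pair: linkage(pair[0], pair[1], dist) for pair in pairs}
        best = min(d.values())
        # Follow every closest pair, not just the first one found.
        for (a, b), value in d.items():
            if value == best:
                merged = a | b
                rest = [c for c in clusters if c != a and c != b]
                recurse(rest + [merged], formed | {merged})

    recurse([frozenset([p]) for p in points], frozenset())
    return results


def cluster_frequencies(points, dist):
    """Fraction of all tie-induced dendrograms that contain each cluster."""
    dendrograms = all_dendrograms(points, dist)
    counts = {}
    for dendrogram in dendrograms:
        for cluster in dendrogram:
            counts[cluster] = counts.get(cluster, 0) + 1
    return {c: Fraction(n, len(dendrograms)) for c, n in counts.items()}


# Hypothetical data: four points on a line; A-B and B-C are tied at distance 1.
coords = {"A": 0, "B": 1, "C": 2, "D": 5}
dist = {
    frozenset((p, q)): abs(coords[p] - coords[q])
    for p, q in combinations(coords, 2)
}
freqs = cluster_frequencies(list(coords), dist)
for cluster, f in sorted(freqs.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print("".join(sorted(cluster)), f)
```

In this toy case the tie between A-B and B-C yields two dendrograms: {A, B} and {B, C} each appear in half of them, while {A, B, C} and the full set appear in both. Exhaustive enumeration grows quickly with the number of ties, so the sketch is only practical for very small inputs.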

Software Mentioned

Common Lisp (CL)
LTKB
Dragon
GAWK
Linux
