The ground truth about metadata and community detection in networks

Science Advances
Leto Peel, Aaron Clauset

Abstract

Across many scientific domains, there is a common need to automatically extract a simplified view or coarse-graining of how a complex system's components interact. This general task is called community detection in networks and is analogous to searching for clusters in independent vector data. It is common to evaluate the performance of community detection algorithms by their ability to find so-called ground truth communities. This works well in synthetic networks with planted communities because these networks' links are formed explicitly based on those known communities. However, there are no planted communities in real-world networks. Instead, it is standard practice to treat some observed discrete-valued node attributes, or metadata, as ground truth. We show that metadata are not the same as ground truth and that treating them as such induces severe theoretical and practical problems. We prove that no algorithm can uniquely solve community detection, and we prove a general No Free Lunch theorem for community detection, which implies that there can be no algorithm that is optimal for all possible community detection tasks. However, community detection remains a powerful tool and node metadata still have value, so a careful e...
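To make the distinction between planted communities and metadata concrete, here is a minimal Python sketch. It is not the authors' code or method: the function names (`planted_partition`, `nmi`), the group sizes, and the edge probabilities are all illustrative assumptions. It samples a synthetic network whose links depend only on planted groups, then uses normalized mutual information (a common choice for comparing partitions, assumed here) to score two labelings against the planted one.

```python
import math
import random
from collections import Counter
from itertools import combinations


def planted_partition(group_sizes, p_in, p_out, seed=0):
    """Sample an undirected graph whose edges depend only on planted groups.

    Hypothetical helper: node pairs in the same group link with probability
    p_in, pairs in different groups with probability p_out.
    """
    rng = random.Random(seed)
    labels = [g for g, size in enumerate(group_sizes) for _ in range(size)]
    edges = []
    for u, v in combinations(range(len(labels)), 2):
        p = p_in if labels[u] == labels[v] else p_out
        if rng.random() < p:
            edges.append((u, v))
    return edges, labels


def nmi(a, b):
    """Normalized mutual information between two labelings of the same nodes."""
    n = len(a)
    ca, cb, cab = Counter(a), Counter(b), Counter(zip(a, b))
    h_a = -sum(c / n * math.log(c / n) for c in ca.values())
    h_b = -sum(c / n * math.log(c / n) for c in cb.values())
    mi = sum(
        c / n * math.log(c * n / (ca[x] * cb[y])) for (x, y), c in cab.items()
    )
    denom = (h_a + h_b) / 2
    # Two single-group labelings carry no information; treat them as identical.
    return mi / denom if denom > 0 else 1.0


# Two planted groups of 30 nodes; links are dense within groups (assumed values).
edges, planted = planted_partition([30, 30], p_in=0.5, p_out=0.05, seed=1)
within = sum(planted[u] == planted[v] for u, v in edges)

# Illustrative "metadata" that is unrelated to how the links were formed.
metadata = [i % 2 for i in range(len(planted))]

print("edges within / between groups:", within, len(edges) - within)
print("NMI(planted, planted): ", nmi(planted, planted))   # identical labelings
print("NMI(planted, metadata):", nmi(planted, metadata))  # independent labelings
```

In the synthetic case the planted labels are the ground truth by construction, so scoring an algorithm against them is meaningful. In a real network only something like `metadata` is observed, and a low score against it cannot by itself distinguish a poor algorithm from metadata that are irrelevant to the network's structure.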

Software Mentioned

BESTest
neoSBM
