Abstract
Modern biomedical data mining requires feature selection methods that can (1) be applied to large-scale feature spaces (e.g. 'omics' data), (2) function in noisy problems, (3) detect complex patterns of association (e.g. gene-gene interactions), (4) be flexibly adapted to various problem domains and data types (e.g. genetic variants, gene expression, and clinical data), and (5) remain computationally tractable. To that end, this work examines a set of filter-style feature selection algorithms inspired by the 'Relief' algorithm, i.e. Relief-Based algorithms (RBAs). We implement and expand these RBAs in an open source framework called ReBATE (Relief-Based Algorithm Training Environment). We conduct a comprehensive genetic simulation study comparing existing RBAs, a proposed RBA called MultiSURF, and other established feature selection methods over a variety of problems. The results of this study (1) support the assertion that RBAs are particularly flexible, efficient, and powerful feature selection methods that differentiate relevant features having univariate, multivariate, epistatic, or heterogeneous associations, (2) confirm the efficacy of expansions for classification vs. regression, discrete vs. continuous features, missing data...