Rough sets and Laplacian score based cost-sensitive feature selection

PloS One
Shenglong Yu, Hong Zhao

Abstract

Cost-sensitive feature selection is an important preprocessing step in machine learning and data mining. Most existing cost-sensitive feature selection algorithms are heuristic: they evaluate the importance of each feature individually and select features one by one, so they do not consider the relationships among features. In this paper, we propose a new algorithm for minimal-cost feature selection, called rough sets and Laplacian score based cost-sensitive feature selection. The importance of each feature is evaluated by both rough sets and the Laplacian score. Compared with heuristic algorithms, the proposed algorithm takes the relationships among features into account through the locality preservation of the Laplacian score. We select a feature subset with maximal feature importance and minimal cost when cost is undertaken in parallel, where the cost is given by three different distributions to simulate different applications. Unlike existing cost-sensitive feature selection algorithms, our algorithm simultaneously selects a predetermined number of "good" features. Extensive experimental results show that the approach is efficient and able to effectively obtain the mini...
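The Laplacian score mentioned in the abstract can be illustrated with a minimal sketch. The code below computes the standard Laplacian score (lower means better locality preservation) on a k-nearest-neighbour graph, then picks a fixed number of features at once by trading the score off against a per-feature cost. The rough-set component of the paper is not shown, and the combination rule, the weight `lam`, and the function names `laplacian_scores` and `select_features` are assumptions made for illustration only; they are not the authors' formulation.

```python
import numpy as np

def laplacian_scores(X, k=5, t=1.0):
    """Standard Laplacian score per feature; lower means better locality preservation."""
    n = X.shape[0]
    # Pairwise squared Euclidean distances between samples.
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    np.fill_diagonal(d2, np.inf)  # a sample is never its own neighbour
    # k-nearest-neighbour graph with heat-kernel weights, then symmetrised.
    nbrs = np.argsort(d2, axis=1)[:, :k]
    rows = np.repeat(np.arange(n), k)
    cols = nbrs.ravel()
    W = np.zeros((n, n))
    W[rows, cols] = np.exp(-d2[rows, cols] / t)
    W = np.maximum(W, W.T)
    deg = W.sum(axis=1)
    L = np.diag(deg) - W  # graph Laplacian
    # Remove the degree-weighted mean from every feature column.
    Xc = X - (deg @ X) / deg.sum()
    num = np.sum(Xc * (L @ Xc), axis=0)            # f^T L f per feature
    den = np.sum(Xc * (deg[:, None] * Xc), axis=0)  # f^T D f per feature
    # Constant features have zero weighted variance; give them the worst score.
    return np.divide(num, den, out=np.full_like(num, np.inf), where=den > 0)

def select_features(X, costs, m, lam=0.0, k=5, t=1.0):
    """Pick m features at once by a hypothetical score-plus-cost key (assumption)."""
    costs = np.asarray(costs, dtype=float)
    scores = laplacian_scores(X, k=k, t=t)
    # Hypothetical trade-off: low Laplacian score and low normalised cost are both good.
    key = scores + lam * costs / costs.max()
    return sorted(int(i) for i in np.argsort(key)[:m])
```

With `lam=0.0` the selection ignores cost and reduces to ranking by Laplacian score alone; larger values of `lam` increasingly favour cheap features.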

Citations

Apr 5, 2019·BMC Bioinformatics·Yosef Masoudi-Sobhanzadeh, Ali Masoudi-Nejad

Related Concepts

Heuristics
Machine Learning
Knowledge Representation (Computer)
Pattern Recognition System
Two-Parameter Models
Computational Molecular Biology
Data Mining
Learning

