Feature Selection

Journal of Machine Learning Research 3 (2003) 1157-1182

Submitted 11/02; Published 3/03

An Introduction to Variable and Feature Selection
Isabelle Guyon

ISABELLE@CLOPINET.COM

Clopinet
955 Creston Road
Berkeley, CA 94708-1501, USA

André Elisseeff

ANDRE@TUEBINGEN.MPG.DE

Empirical Inference for Machine Learning and Perception Department
Max Planck Institute for Biological Cybernetics
Spemannstrasse 38
72076 Tübingen, Germany

Editor: Leslie Pack Kaelbling

Abstract
Variable and feature selection have become the focus of much research in areas of application for
which datasets with tens or hundreds of thousands of variables are available. These areas include
text processing of internet documents, gene expression array analysis, and combinatorial chemistry.
The objective of variable selection is three-fold: improving the prediction performance of the predictors, providing faster and more cost-effective predictors, and providing a better understanding of
the underlying process that generated the data. The contributions of this special issue cover a wide
range of aspects of such problems: providing a better definition of the objective function, feature
construction, feature ranking, multivariate feature selection, efficient search methods, and feature
validity assessment methods.
Keywords: Variable selection, feature selection, space dimensionality reduction, pattern discovery, filters, wrappers, clustering, information theory, support vector machines, model selection,
statistical testing, bioinformatics, computational biology, gene expression, microarray, genomics,
proteomics, QSAR, text classification, information retrieval.

1 Introduction
As of 1997, when a special issue on relevance including several papers on variable and feature
selection was published (Blum and Langley, 1997, Kohavi and John, 1997), few domains explored
used more than 40 features. The situation has changed considerably in the past few...
