
Fast Correlation-Based Filter

Contact: Haralambos Sarimveis

Categories: Feature selection

Exposed methods:

Feature selection

Input format: Weka's ARFF format
Output format: Weka's ARFF format
User-specified parameters: A predefined threshold
Reporting information: The optimal subset of variables


The FCBF (Fast Correlation-Based Filter) algorithm consists of two stages. The first is a relevance analysis,
which ranks the input variables by a relevance score, computed as the symmetric uncertainty between each
variable and the target output. This stage also discards irrelevant variables, which are those whose
relevance score is below a predefined threshold. The second stage is a redundancy analysis, which selects
the predominant features from the relevant set obtained in the first stage. This selection is an iterative
process that removes every variable for which a more relevant, already selected variable forms an
approximate Markov blanket. The method is described in detail in [YUL04].
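
The two stages can be sketched in a few lines of Python. This is only an illustrative sketch, not the component's Java/WEKA code: the function names (entropy, symmetric_uncertainty, fcbf), the dictionary-based data layout and the toy data are assumptions made for this example. It uses the definition SU(X, Y) = 2 * IG(X|Y) / (H(X) + H(Y)) from [YUL04], and treats a variable as redundant when an already selected variable is at least as correlated with it as it is with the target.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def symmetric_uncertainty(x, y):
    """SU(X, Y) = 2 * IG(X|Y) / (H(X) + H(Y)), a value between 0 and 1."""
    hx, hy = entropy(x), entropy(y)
    if hx + hy == 0.0:
        return 0.0  # both variables are constant: treat as uninformative
    hxy = entropy(list(zip(x, y)))  # joint entropy H(X, Y)
    info_gain = hx + hy - hxy       # information gain IG(X|Y)
    return 2.0 * info_gain / (hx + hy)


def fcbf(features, target, threshold):
    """Return the names of the selected variables, most relevant first.

    `features` maps a variable name to its list of discrete values
    (hypothetical layout chosen for this sketch).
    """
    # Stage 1: relevance analysis. Score every variable against the target,
    # drop those below the threshold, and rank the rest in descending order.
    relevance = {name: symmetric_uncertainty(col, target)
                 for name, col in features.items()}
    ranked = sorted((name for name, su in relevance.items() if su >= threshold),
                    key=lambda name: relevance[name], reverse=True)

    # Stage 2: redundancy analysis. Keep a variable only if no already
    # selected (more relevant) variable forms an approximate Markov blanket
    # for it, i.e. SU(selected, candidate) >= SU(candidate, target).
    selected = []
    for cand in ranked:
        redundant = any(
            symmetric_uncertainty(features[s], features[cand]) >= relevance[cand]
            for s in selected
        )
        if not redundant:
            selected.append(cand)
    return selected


# Toy data (invented for illustration): f1_copy duplicates f1,
# and noise is independent of the target.
target = [0, 0, 0, 0, 1, 1, 1, 1]
features = {
    "f1":      [0, 0, 0, 0, 1, 1, 1, 0],
    "f1_copy": [0, 0, 0, 0, 1, 1, 1, 0],
    "f2":      [0, 0, 0, 1, 1, 1, 1, 1],
    "noise":   [0, 1, 0, 1, 0, 1, 0, 1],
}
print(fcbf(features, target, threshold=0.1))
```

In this toy run the noise variable is dropped by the threshold in the first stage, and the duplicated variable is dropped in the second stage because its copy already accounts for it.
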
More information can be found in the following Web page:

Background (publication date, popularity/level of familiarity, rationale of approach, further comments)
A widely used, standard feature selection method. Disadvantage: the input variables
should be discretized before the method is applied.
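
As an illustration of the discretization requirement, the sketch below bins a continuous variable into equal-width intervals so that entropy-based scores such as symmetric uncertainty can be computed on it. Equal-width binning, the function name equal_width_bins and the default of three bins are assumptions made for this example; this page does not state which discretization scheme the component expects.

```python
def equal_width_bins(values, n_bins=3):
    """Map continuous values to integer bin indices 0 .. n_bins - 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)  # constant variable: a single bin
    width = (hi - lo) / n_bins
    # The maximum value would fall just past the last bin, so clamp it.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


print(equal_width_bins([0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0], n_bins=3))
# prints [0, 0, 1, 1, 2, 2, 2]
```

The number of bins affects the resulting entropy estimates, so the chosen threshold may need to be revisited when the binning changes.
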

Class-blind/class-sensitive feature selection
Class-sensitive feature selection

Type (optimal, greedy, randomized)

Filter/wrapper/hybrid approach

Type of Descriptor:


Priority: Medium

Development status:


External components: WEKA

Technical details

Data: No

Software: Yes

Programming language(s): Java

Operating system(s): Linux, Win, Mac OS

Input format: Weka's ARFF format

Output format: Weka's ARFF format

License: GPL


[YUL04] Yu, L., Liu, H. (2004). Efficient Feature Selection via Analysis of Relevance and Redundancy. Journal of Machine Learning Research 5:1205-1224.
