Implementation of Permutation Feature Importance (PFI) using a modular sampling approach. PFI measures the importance of a feature as the increase in model error when the feature's values are randomly permuted, which breaks the relationship between the feature and the target variable.
Details
Permutation Feature Importance was originally introduced by Breiman (2001) as part of the Random Forest algorithm. The method works by:
1. Computing baseline model performance on the original dataset
2. For each feature, randomly permuting its values while keeping other features unchanged
3. Computing model performance on the permuted dataset
4. Calculating importance as the difference (or ratio) between permuted and original performance
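The steps above can be sketched in base R without any package infrastructure. This is only an illustration under stated assumptions, not the implementation used by this class: the linear model on `mtcars`, the `mse()` helper, and the in-sample evaluation (no resampling) are all chosen here for brevity.

```r
set.seed(1)

# Illustrative model and error measure (assumptions, not part of the package)
fit <- lm(mpg ~ wt + hp + qsec, data = mtcars)
mse <- function(model, data) {
  mean((data$mpg - predict(model, newdata = data))^2)
}

# Step 1: baseline performance on the original data
baseline <- mse(fit, mtcars)

features <- c("wt", "hp", "qsec")

# Steps 2-4: permute one feature at a time, re-score, compare to baseline
importance <- t(vapply(features, function(f) {
  permuted <- mtcars
  permuted[[f]] <- sample(permuted[[f]])  # shuffle only this column
  perm_error <- mse(fit, permuted)
  c(difference = perm_error - baseline, ratio = perm_error / baseline)
}, numeric(2)))

importance
```

Each row holds one feature; a larger difference (or a ratio further above 1) means the model's error grew more when that feature was shuffled.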
References
Breiman, Leo (2001). “Random Forests.” Machine Learning, 45(1), 5–32. doi:10.1023/A:1010933404324.
Fisher, Aaron, Rudin, Cynthia, Dominici, Francesca (2019). “All Models Are Wrong, but Many Are Useful: Learning a Variable's Importance by Studying an Entire Class of Prediction Models Simultaneously.” Journal of Machine Learning Research, 20, 177. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8323609/.
Strobl, Carolin, Boulesteix, Anne-Laure, Kneib, Thomas, Augustin, Thomas, Zeileis, Achim (2008). “Conditional Variable Importance for Random Forests.” BMC Bioinformatics, 9(1), 307. ISSN 1471-2105, doi:10.1186/1471-2105-9-307.
Super classes
xplainfi::FeatureImportanceMethod
-> xplainfi::PerturbationImportance
-> PFI
Methods
Method new()
Creates a new instance of the PFI class.
Usage
PFI$new(
  task,
  learner,
  measure,
  resampling = NULL,
  features = NULL,
  relation = "difference",
  iters_perm = 1L
)
Examples
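library(mlr3)         # tgen(), lrn(), msr(), rsmp()
library(mlr3learners) # provides the "classif.ranger" learner
library(xplainfi)     # provides PFI

# Simulated XOR classification task with 5 features and 100 observations
task = tgen("xor", d = 5)$generate(n = 100)

pfi = PFI$new(
  task = task,
  learner = lrn("classif.ranger", num.trees = 50, predict_type = "prob"),
  measure = msr("classif.ce"),
  resampling = rsmp("cv", folds = 3),
  iters_perm = 3
)
pfi$compute()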
#> Warning: Dropped unused factor level(s) in dependent variable: P.
#> Key: <feature>
#>    feature importance         sd
#>     <char>      <num>      <num>
#> 1:      x1 0.11309170 0.06806294
#> 2:      x2 0.07001386 0.04512373
#> 3:      x3 0.06625074 0.04849623
#> 4:      x4 0.07318281 0.03983978
#> 5:      x5 0.08001584 0.04774260
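To give some intuition for `iters_perm` and the `sd` column, the following base R sketch repeats the permutation several times per feature and summarises the scores. It assumes, purely for illustration, that scores are aggregated by their mean and standard deviation over repetitions; the package may aggregate differently (for example, across resampling folds as well). The `permute_once()` helper, the `mtcars` model, and `n_repeats` are hypothetical names introduced here, not part of xplainfi.

```r
set.seed(42)

# Illustrative model and error measure (assumptions, not part of the package)
fit <- lm(mpg ~ wt + hp + qsec, data = mtcars)
mse <- function(model, data) {
  mean((data$mpg - predict(model, newdata = data))^2)
}
baseline <- mse(fit, mtcars)

# Hypothetical helper: error increase after one random shuffle of `feature`
permute_once <- function(feature) {
  permuted <- mtcars
  permuted[[feature]] <- sample(permuted[[feature]])
  mse(fit, permuted) - baseline
}

n_repeats <- 3L  # plays the role that iters_perm appears to play above
features <- c("wt", "hp", "qsec")

# Matrix with one row per repetition and one column per feature
scores <- vapply(
  features,
  function(f) replicate(n_repeats, permute_once(f)),
  numeric(n_repeats)
)

result <- data.frame(
  feature = features,
  importance = colMeans(scores),
  sd = apply(scores, 2, sd),
  row.names = NULL
)
result
```

More repetitions reduce the noise that a single random shuffle introduces, at the cost of proportionally more prediction calls.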