
Implementation of Permutation Feature Importance (PFI) using a modular sampling approach. PFI measures the importance of a feature as the increase in model error observed when that feature's values are randomly permuted, which breaks the relationship between the feature and the target variable.

Details

Permutation Feature Importance was originally introduced by Breiman (2001) as part of the Random Forest algorithm. The method works by:

  1. Computing baseline model performance on the original dataset

  2. For each feature, randomly permuting its values while keeping other features unchanged

  3. Computing model performance on the permuted dataset

  4. Calculating importance as the difference (or ratio) between permuted and original performance
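
The four steps above can be sketched in base R. This is a minimal illustration under stated assumptions, not this package's implementation: the simulated data, the linear model, the `mse()` helper and the `permutation_importance()` function are all hypothetical names introduced here, and a single train/test split stands in for a resampling strategy.

```r
# Hypothetical base-R sketch of permutation feature importance.
set.seed(1)
n <- 200
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
dat$y <- 2 * dat$x1 + 0.5 * dat$x2 + rnorm(n, sd = 0.1)  # x3 is pure noise

train <- dat[1:100, ]
test  <- dat[101:200, ]
model <- lm(y ~ x1 + x2 + x3, data = train)

# Mean squared error of `model` on `data` (lower is better)
mse <- function(model, data) {
  mean((data$y - predict(model, newdata = data))^2)
}

permutation_importance <- function(model, data, features, iters_perm = 5L,
                                   relation = c("difference", "ratio")) {
  relation <- match.arg(relation)
  baseline <- mse(model, data)                            # step 1: baseline score
  sapply(features, function(feature) {
    scores <- vapply(seq_len(iters_perm), function(i) {
      permuted <- data
      permuted[[feature]] <- sample(permuted[[feature]])  # step 2: permute one feature
      mse(model, permuted)                                # step 3: score on permuted data
    }, numeric(1))
    perturbed <- mean(scores)                             # average over permutations
    # step 4: relate the perturbed score to the baseline
    if (relation == "difference") perturbed - baseline else perturbed / baseline
  })
}

importance <- permutation_importance(model, test, c("x1", "x2", "x3"))
print(round(importance, 3))
```

Because the error measure here is a loss, larger values indicate more important features. In this simulation `x1` should rank highest and the noise feature `x3` should score close to zero.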

References

Breiman, Leo (2001). “Random Forests.” Machine Learning, 45(1), 5–32. doi:10.1023/A:1010933404324.

Fisher, Aaron, Rudin, Cynthia, Dominici, Francesca (2019). “All Models Are Wrong, but Many Are Useful: Learning a Variable's Importance by Studying an Entire Class of Prediction Models Simultaneously.” Journal of Machine Learning Research, 20, 177. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8323609/.

Strobl, Carolin, Boulesteix, Anne-Laure, Kneib, Thomas, Augustin, Thomas, Zeileis, Achim (2008). “Conditional Variable Importance for Random Forests.” BMC Bioinformatics, 9(1), 307. ISSN 1471-2105, doi:10.1186/1471-2105-9-307.

Methods


Method new()

Creates a new instance of the PFI class

Usage

PFI$new(
  task,
  learner,
  measure,
  resampling = NULL,
  features = NULL,
  relation = "difference",
  iters_perm = 1L
)

Arguments

task, learner, measure, resampling, features

Passed to PerturbationImportance

relation

(character(1)) How to relate perturbed scores to the original scores. Defaults to "difference". Can be overridden in $compute().

iters_perm

(integer(1)) Number of permutation iterations. Defaults to 1L. Can be overridden in $compute().


Method compute()

Compute PFI scores

Usage

PFI$compute(relation = NULL, iters_perm = NULL, store_backends = TRUE)

Arguments

relation

(character(1)) How to relate perturbed scores to originals. If NULL, uses stored value.

iters_perm

(integer(1)) Number of permutation iterations. If NULL, uses stored value.

store_backends

(logical(1)) Whether to store backends


Method clone()

The objects of this class are cloneable with this method.

Usage

PFI$clone(deep = FALSE)

Arguments

deep

Whether to make a deep clone.

Examples

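library(mlr3learners)  # provides the classif.ranger learner
# Simulated XOR classification task with 5 features and 100 observations
task = tgen("xor", d = 5)$generate(n = 100)
# Set up PFI with 3-fold cross-validation and 3 permutation iterations
pfi = PFI$new(
  task = task,
  learner = lrn("classif.ranger", num.trees = 50, predict_type = "prob"),
  measure = msr("classif.ce"),
  resampling = rsmp("cv", folds = 3),
  iters_perm = 3L
)
# Compute importance scores using the stored relation and iters_perm
pfi$compute()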
#> Warning: Dropped unused factor level(s) in dependent variable: P.
#> Key: <feature>
#>    feature importance         sd
#>     <char>      <num>      <num>
#> 1:      x1 0.11309170 0.06806294
#> 2:      x2 0.07001386 0.04512373
#> 3:      x3 0.06625074 0.04849623
#> 4:      x4 0.07318281 0.03983978
#> 5:      x5 0.08001584 0.04774260