
Calculates Leave-One-Covariate-In (LOCI) scores. This method is implemented primarily for completeness; in general, there are more informative measures of the marginal association between a feature and the target.

Details

LOCI measures feature importance by training one model per feature, each using only that single feature, and comparing its performance to a featureless baseline model (optimal constant prediction). The importance is calculated as (featureless_model_loss - single_feature_loss). Positive values indicate that the feature performs better than the baseline; negative values indicate that it performs worse.
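The calculation described above can be sketched in base R. This is a minimal illustration, not the package's implementation: the simulated data, the single holdout split, the use of lm() as the learner, and all names (dat, mse, loci_scores) are assumptions made for this example.

```r
# Minimal base-R sketch of the LOCI idea (illustrative only, not the package code)
set.seed(1)
n = 200
dat = data.frame(x1 = runif(n), x2 = runif(n), noise = runif(n))
dat$y = 10 * dat$x1 + 2 * dat$x2 + rnorm(n, sd = 0.5)

# Single holdout split, mirroring a 0.67 training ratio
train_ids = sample(n, size = round(0.67 * n))
train = dat[train_ids, ]
test = dat[-train_ids, ]

mse = function(truth, response) mean((truth - response)^2)

# Featureless baseline: under squared error the optimal constant is the training mean
loss_featureless = mse(test$y, mean(train$y))

# One model per feature, each trained on that feature alone
features = c("x1", "x2", "noise")
loci_scores = vapply(features, function(feature) {
  fit = lm(reformulate(feature, response = "y"), data = train)
  loss_single = mse(test$y, predict(fit, newdata = test))
  loss_featureless - loss_single
}, numeric(1))

loci_scores
```

In this simulated setup, x1 carries most of the signal, so its score should be clearly positive, while the score for noise should be close to zero.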

Methods


Method new()

Creates a new instance of this R6 class.

Usage

LOCI$new(
  task,
  learner,
  measure,
  resampling = NULL,
  features = NULL,
  iters_refit = 1L,
  obs_loss = FALSE
)

Arguments

task

(mlr3::Task) Task to compute importance for.

learner

(mlr3::Learner) Learner to use for prediction.

measure

(mlr3::Measure) Measure to use for scoring.

resampling

(mlr3::Resampling) Resampling strategy. If NULL (default), rsmp("holdout") with its default ratio = 0.67 is used.

features

(character()) Features to compute importance for. Defaults to all features.

iters_refit

(integer(1)) Number of refit iterations per resampling iteration.

obs_loss

(logical(1)) Whether to use observation-wise loss calculation (analogous to LOCO) when supported by the measure. If FALSE (default), importance is computed from aggregated scores. If TRUE, observation-wise loss differences are aggregated with the measure's aggregation function, falling back to the mean if the measure does not define one.
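The difference between the two modes of obs_loss can be illustrated with a small base-R sketch. The loss values below are made up, and the variable names only mirror the columns shown in the obs_losses output; this is an assumption about how the two routes relate, not the package's internal code.

```r
# Hypothetical per-observation absolute errors on a test set
loss_ref = c(1.3, 2.4, 0.5, 3.0, 0.9)      # featureless baseline model
loss_feature = c(0.1, 2.3, 1.5, 1.0, 0.8)  # model using a single feature

# Aggregated route (obs_loss = FALSE): difference of the aggregated losses
score_aggregated = mean(loss_ref) - mean(loss_feature)

# Observation-wise route (obs_loss = TRUE): aggregate per-observation differences
obs_diff = loss_ref - loss_feature
score_obs_mean = mean(obs_diff)
score_obs_median = median(obs_diff)

# With the mean as aggregator both routes agree; with the median they can differ
c(aggregated = score_aggregated, obs_mean = score_obs_mean, obs_median = score_obs_median)
```

With a mean aggregator the two routes coincide, so the observation-wise mode mainly matters for aggregators such as the median, where the result is less sensitive to a few observations with large loss differences.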


Method clone()

The objects of this class are cloneable with this method.

Usage

LOCI$clone(deep = FALSE)

Arguments

deep

Whether to make a deep clone.
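Since a LOCI object holds other R6 objects (task, learner, measure), the deep argument decides whether those are copied or shared. The following sketch shows general R6 behaviour with two hypothetical classes, Inner and Outer, which are not part of this package.

```r
library(R6)

# Hypothetical classes used only to illustrate shallow vs. deep cloning
Inner = R6Class("Inner", public = list(value = 1))
Outer = R6Class("Outer", public = list(
  inner = NULL,
  initialize = function() {
    self$inner = Inner$new()
  }
))

original = Outer$new()
shallow = original$clone()            # nested R6 object is shared by reference
deep = original$clone(deep = TRUE)    # nested R6 object is copied as well

# Modify the nested object of the original
original$inner$value = 2

shallow$inner$value  # 2: the shallow clone sees the change
deep$inner$value     # 1: the deep clone is unaffected
```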

Examples

library(mlr3)
library(mlr3learners)
task = tgen("friedman1")$generate(n = 200)

# Standard LOCI with aggregated scores
loci = LOCI$new(
  task = task,
  learner = lrn("regr.ranger", num.trees = 50),
  measure = msr("regr.mse")
)
#>  No <Resampling> provided
#> Using `resampling = rsmp("holdout")` with default `ratio = 0.67`.
loci$compute()

# Using observation-wise losses with measure's aggregation function
loci_obsloss = LOCI$new(
  task = task,
  learner = lrn("regr.ranger", num.trees = 50),
  measure = msr("regr.mae"), # uses MAE's aggregation function (mean) internally
  obs_loss = TRUE
)
#>  No <Resampling> provided
#> Using `resampling = rsmp("holdout")` with default `ratio = 0.67`.
loci_obsloss$compute()
loci_obsloss$obs_losses
#>      row_ids      feature iteration iter_refit    truth response_ref
#>        <int>       <char>     <int>      <int>    <num>        <num>
#>   1:       4   important1         1          1 15.36222     14.04979
#>   2:       4   important2         1          1 15.36222     14.04979
#>   3:       4   important3         1          1 15.36222     14.04979
#>   4:       4   important4         1          1 15.36222     14.04979
#>   5:       4   important5         1          1 15.36222     14.04979
#>  ---                                                                
#> 666:     190 unimportant1         1          1 11.60406     14.04979
#> 667:     190 unimportant2         1          1 11.60406     14.04979
#> 668:     190 unimportant3         1          1 11.60406     14.04979
#> 669:     190 unimportant4         1          1 11.60406     14.04979
#> 670:     190 unimportant5         1          1 11.60406     14.04979
#>      response_feature loss_ref loss_feature   obs_diff
#>                 <num>    <num>        <num>      <num>
#>   1:         15.42979 1.312421   0.06757879  1.2448426
#>   2:         12.11727 1.312421   3.24494453 -1.9325232
#>   3:         20.14413 1.312421   4.78191717 -3.4694958
#>   4:         11.69276 1.312421   3.66945606 -2.3570347
#>   5:         13.48251 1.312421   1.87971032 -0.5672890
#>  ---                                                  
#> 666:         13.96568 2.445737   2.36161915  0.0841178
#> 667:         18.78093 2.445737   7.17687679 -4.7311398
#> 668:         13.16344 2.445737   1.55938273  0.8863542
#> 669:         15.96397 2.445737   4.35991736 -1.9141804
#> 670:         12.86600 2.445737   1.26193892  1.1837980

# LOCI with median aggregation (analogous to original LOCO)
mae_median = msr("regr.mae")
mae_median$aggregator = median
loci_median = LOCI$new(
  task = task,
  learner = lrn("regr.ranger", num.trees = 50),
  measure = mae_median,
  obs_loss = TRUE
)
#>  No <Resampling> provided
#> Using `resampling = rsmp("holdout")` with default `ratio = 0.67`.
loci_median$compute()