org.tribuo.classification.explanations.lime.LIMEBase

All Implemented Interfaces: TabularExplainer<Regressor>

Direct Known Subclasses: LIMEColumnar, LIMEText

public class LIMEBase extends Object implements TabularExplainer<Regressor>

LIMEBase merges the lime_base.py and lime_tabular.py implementations, and handles simple matrices of numerical or categorical data. For a mixture of text, numerical, and categorical data, use LIMEColumnar. For plain text data, use LIMEText.

See:

 Ribeiro MT, Singh S, Guestrin C.
 "Why should I trust you?: Explaining the predictions of any classifier"
 Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining 2016.

Field Summary

Fields

- static final double DISTANCE_DELTA
- protected static final RegressionEvaluator evaluator
- protected final SparseTrainer<Regressor> explanationTrainer
- protected final Model<Label> innerModel
- protected final double kernelWidth
- protected final int numSamples
- protected final long numTrainingExamples
- protected static final OutputFactory<Regressor> regressionFactory
- protected final SplittableRandom rng
- static final double WIDTH_CONSTANT

Constructor Summary

Constructors

- LIMEBase(SplittableRandom rng, Model<Label> innerModel, SparseTrainer<Regressor> explanationTrainer, int numSamples)
  Constructs a LIME explainer for a model which uses tabular data (i.e., no special treatment for text features).

Method Summary

- LIMEExplanation explain(Example<Label> example)
  Explain why the supplied Example is classified a certain way.
- protected com.oracle.labs.mlrg.olcut.util.Pair<LIMEExplanation, List<Example<Regressor>>> explainWithSamples(Example<Label> example)
- static double kernelDist(double input, double width)
  Calculates an RBF kernel of a specific width.
- static double measureDistance(ImmutableFeatureMap fMap, long numTrainingExamples, SparseVector input, SparseVector sample)
  Measures the distance between an input point and a sampled point.
- static Example<Label> samplePoint(Random rng, ImmutableFeatureMap fMap, long numTrainingExamples, SparseVector input)
  Samples a single example from the supplied feature map and input vector.
- protected SparseModel<Regressor> trainExplainer(Example<Regressor> target, List<Example<Regressor>> samples)
  Trains the explanation model using the supplied sampled data and the input example.
- static Regressor transformOutput(Prediction<Label> prediction)
  Transforms a Prediction for a multiclass problem into a Regressor output which represents the probability for each class.

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Field Details
- WIDTH_CONSTANT
  public static final double WIDTH_CONSTANT
  
  See Also:
  
  Constant Field Values
- DISTANCE_DELTA
  public static final double DISTANCE_DELTA
  
  See Also:
  
  Constant Field Values
- regressionFactory
  
  protected static final OutputFactory<Regressor> regressionFactory
- evaluator
  
  protected static final RegressionEvaluator evaluator
- rng
  
  protected final SplittableRandom rng
- innerModel
  
  protected final Model<Label> innerModel
- explanationTrainer
  
  protected final SparseTrainer<Regressor> explanationTrainer
- numSamples
  
  protected final int numSamples
- numTrainingExamples
  
  protected final long numTrainingExamples
- kernelWidth
  
  protected final double kernelWidth
Constructor Details
- LIMEBase
  
  public LIMEBase(SplittableRandom rng, Model<Label> innerModel, SparseTrainer<Regressor> explanationTrainer, int numSamples)
  
  Constructs a LIME explainer for a model which uses tabular data (i.e., no special treatment for text features).
  
  Parameters:
  
  rng - The random number generator to use for sampling.
  
  innerModel - The model to explain.
  
  explanationTrainer - The sparse trainer used to explain predictions.
  
  numSamples - The number of samples to generate for an explanation.
Method Details
- explain
  
  public LIMEExplanation explain(Example<Label> example)
  
  Description copied from interface: TabularExplainer
  
  Explain why the supplied Example is classified a certain way.
  
  Specified by:
  
  explain in interface TabularExplainer<Regressor>
  
  Parameters:
  
  example - The Example to explain.
  
  Returns:
  
  An Explanation for this example.
- explainWithSamples
  
  protected com.oracle.labs.mlrg.olcut.util.Pair<LIMEExplanation, List<Example<Regressor>>> explainWithSamples(Example<Label> example)
- samplePoint
  
  public static Example<Label> samplePoint(Random rng, ImmutableFeatureMap fMap, long numTrainingExamples, SparseVector input)
  
  Samples a single example from the supplied feature map and input vector.
  
  Parameters:
  
  rng - The random number generator to use.
  
  fMap - The feature map describing the domain of the features.
  
  numTrainingExamples - The number of training examples the fMap has seen.
  
  input - The input sparse vector to use.
  
  Returns:
  
  An Example sampled from the supplied feature map and input vector.
- trainExplainer
  
  protected SparseModel<Regressor> trainExplainer(Example<Regressor> target, List<Example<Regressor>> samples)
  
  Trains the explanation model using the supplied sampled data and the input example.
  The labels are usually the predicted probabilities from the real model.
  
  Parameters:
  
  target - The input example to explain.
  
  samples - The sampled data around the input.
  
  Returns:
  
  An explanation model.
- kernelDist
  
  public static double kernelDist(double input, double width)
  
  Calculates an RBF kernel of a specific width.
  
  Parameters:
  
  input - The input value.
  
  width - The width of the kernel.
  
  Returns:
  
  sqrt ( exp ( - input*input / width))
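
As a rough illustration of the return expression documented above, the following stand-alone sketch evaluates it in plain Java. The names `KernelSketch` and `rbf` are hypothetical and not part of Tribuo, and the code follows the expression exactly as written on this page rather than the library source, so treat it as an assumption about the behaviour.

```java
/** Hypothetical stand-alone helper for illustration; not part of Tribuo. */
final class KernelSketch {
    private KernelSketch() {}

    /**
     * Evaluates sqrt(exp(-input*input / width)), i.e. the return
     * expression as documented for kernelDist (assumed, not verified
     * against the library source).
     */
    static double rbf(double input, double width) {
        return Math.sqrt(Math.exp(-(input * input) / width));
    }

    public static void main(String[] args) {
        // The weight is 1 at distance 0 and decays towards 0 as the distance grows.
        for (double d : new double[] {0.0, 0.5, 1.0, 2.0}) {
            System.out.println(d + " -> " + rbf(d, 1.0));
        }
    }
}
```

In a LIME-style explainer a weight of this kind is typically applied to each sampled point, so that samples close to the input influence the explanation model more than distant ones.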
- measureDistance
  
  public static double measureDistance(ImmutableFeatureMap fMap, long numTrainingExamples, SparseVector input, SparseVector sample)
  
  Measures the distance between an input point and a sampled point.
  This distance function takes into account categorical and real values. It uses the Hamming distance for categorical features and the Euclidean distance for real-valued features.
  
  Parameters:
  
  fMap - The feature map used to determine if a feature is categorical or real.
  
  numTrainingExamples - The number of training examples the fMap has seen.
  
  input - The input point.
  
  sample - The sampled point.
  
  Returns:
  
  The distance between the two points.
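
The idea of mixing a Hamming distance over categorical features with a Euclidean distance over real-valued features can be sketched with plain arrays. This is a minimal illustration only: `MixedDistanceSketch`, its `distance` method, and the `categorical` mask are hypothetical, it does not use Tribuo's ImmutableFeatureMap or SparseVector, and adding the two parts together is an assumption, since this page does not say how the library combines them.

```java
/** Hypothetical stand-alone helper for illustration; not part of Tribuo. */
final class MixedDistanceSketch {
    private MixedDistanceSketch() {}

    /**
     * Counts mismatches on categorical dimensions (Hamming) and takes the
     * Euclidean distance over the remaining real-valued dimensions.
     * ASSUMPTION: the two parts are simply added together.
     */
    static double distance(double[] input, double[] sample, boolean[] categorical) {
        if (input.length != sample.length || input.length != categorical.length) {
            throw new IllegalArgumentException("All arrays must have the same length");
        }
        double hamming = 0.0;
        double squared = 0.0;
        for (int i = 0; i < input.length; i++) {
            if (categorical[i]) {
                // Categorical: only equality matters, not the size of the difference.
                if (input[i] != sample[i]) {
                    hamming += 1.0;
                }
            } else {
                double diff = input[i] - sample[i];
                squared += diff * diff;
            }
        }
        return hamming + Math.sqrt(squared);
    }

    public static void main(String[] args) {
        double[] input = {1.0, 0.0, 2.0, 5.0};
        double[] sample = {1.0, 1.0, 5.0, 1.0};
        boolean[] categorical = {true, true, false, false};
        // One categorical mismatch plus a Euclidean part of sqrt(9 + 16).
        System.out.println(distance(input, sample, categorical));
    }
}
```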
- transformOutput
  
  public static Regressor transformOutput(Prediction<Label> prediction)
  
  Transforms a Prediction for a multiclass problem into a Regressor output which represents the probability for each class.
  Used as the target for LIME models.
  
  Parameters:
  
  prediction - A multiclass prediction object. Must contain probabilities.
  
  Returns:
  
  The n-dimensional probability output.
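
To make the shape of that output concrete, here is a minimal sketch that turns per-class probabilities into an n-dimensional vector. It uses a plain `Map<String, Double>` in place of Tribuo's Prediction and Regressor types; `ProbabilityVectorSketch`, `toVector`, and the alphabetical ordering of the dimensions are assumptions made for illustration, not the library's behaviour.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical stand-alone helper for illustration; not part of Tribuo. */
final class ProbabilityVectorSketch {
    private ProbabilityVectorSketch() {}

    /**
     * Returns one probability per class label.
     * ASSUMPTION: dimensions are ordered alphabetically by label name.
     */
    static double[] toVector(Map<String, Double> classProbabilities) {
        // TreeMap gives a deterministic, sorted iteration order over labels.
        Map<String, Double> sorted = new TreeMap<>(classProbabilities);
        double[] out = new double[sorted.size()];
        int i = 0;
        for (double p : sorted.values()) {
            out[i++] = p;
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, Double> probs = new TreeMap<>();
        probs.put("cat", 0.7);
        probs.put("dog", 0.2);
        probs.put("bird", 0.1);
        // Each class contributes one regression dimension to the explanation target.
        System.out.println(java.util.Arrays.toString(toVector(probs)));
    }
}
```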
