java.lang.Object
org.tribuo.classification.explanations.lime.LIMEBase
All Implemented Interfaces:
TabularExplainer<Regressor>
Direct Known Subclasses:
LIMEColumnar, LIMEText

public class LIMEBase extends Object implements TabularExplainer<Regressor>
LIMEBase merges the lime_base.py and lime_tabular.py implementations, and handles simple matrices of numerical or categorical data. For a mixture of text, numerical, and categorical data, use LIMEColumnar. For plain text data, use LIMEText.

See:

 Ribeiro MT, Singh S, Guestrin C.
 "Why should I trust you?: Explaining the predictions of any classifier"
 Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining 2016.
 
  • Field Details

    • WIDTH_CONSTANT

      public static final double WIDTH_CONSTANT
      Width of the noise Gaussian.
    • DISTANCE_DELTA

      public static final double DISTANCE_DELTA
      Tolerance within which two distances are considered equal.
    • regressionFactory

      protected static final OutputFactory<Regressor> regressionFactory
    • evaluator

      protected static final RegressionEvaluator evaluator
    • rng

      protected final SplittableRandom rng
    • innerModel

      protected final Model<Label> innerModel
    • explanationTrainer

      protected final SparseTrainer<Regressor> explanationTrainer
    • numSamples

      protected final int numSamples
    • numTrainingExamples

      protected final long numTrainingExamples
    • kernelWidth

      protected final double kernelWidth
  • Constructor Details

    • LIMEBase

      public LIMEBase(SplittableRandom rng, Model<Label> innerModel, SparseTrainer<Regressor> explanationTrainer, int numSamples)
      Constructs a LIME explainer for a model which uses tabular data (i.e., no special treatment for text features).
      Parameters:
      rng - The rng to use for sampling.
      innerModel - The model to explain.
      explanationTrainer - The sparse trainer used to explain predictions.
      numSamples - The number of samples to generate for an explanation.
  • Method Details

    • explain

      public LIMEExplanation explain(Example<Label> example)
      Description copied from interface: TabularExplainer
      Explain why the supplied Example is classified a certain way.
      Specified by:
      explain in interface TabularExplainer<Regressor>
      Parameters:
      example - The Example to explain.
      Returns:
      An Explanation for this example.
    • explainWithSamples

      protected com.oracle.labs.mlrg.olcut.util.Pair<LIMEExplanation,List<Example<Regressor>>> explainWithSamples(Example<Label> example)
    • samplePoint

      public static Example<Label> samplePoint(Random rng, ImmutableFeatureMap fMap, long numTrainingExamples, SparseVector input)
      Samples a single example from the supplied feature map and input vector.
      Parameters:
      rng - The rng to use.
      fMap - The feature map describing the domain of the features.
      numTrainingExamples - The number of training examples the fMap has seen.
      input - The input sparse vector to use.
      Returns:
      An Example sampled from the supplied feature map and input vector.
    • trainExplainer

      protected SparseModel<Regressor> trainExplainer(Example<Regressor> target, List<Example<Regressor>> samples)
      Trains the explanation model using the supplied sampled data and the input example.

      The labels on the samples are usually the probabilities predicted by the model being explained.

      Parameters:
      target - The input example to explain.
      samples - The sampled data around the input.
      Returns:
      An explanation model.
    • kernelDist

      public static double kernelDist(double input, double width)
      Calculates an RBF kernel of a specific width.
      Parameters:
      input - The input value.
      width - The width of the kernel.
      Returns:
      sqrt(exp(-(input * input) / width))
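
As a rough illustration of the expression in the Returns entry above, the following minimal sketch evaluates it with plain `java.lang.Math`. The class name `KernelSketch` is hypothetical and the code is not taken from the Tribuo source; it follows the documented expression literally.

```java
/**
 * Hypothetical stand-in for the documented kernel, written only to
 * illustrate the expression; it is not the Tribuo implementation.
 */
final class KernelSketch {
    private KernelSketch() {}

    /**
     * Evaluates sqrt(exp(-(input * input) / width)).
     *
     * @param input The input value (typically a distance).
     * @param width The width of the kernel.
     * @return The kernel value for the input.
     */
    static double kernelDist(double input, double width) {
        return Math.sqrt(Math.exp(-(input * input) / width));
    }
}
```

Under this reading, an input of 0 maps to 1, and the value decays toward 0 as the input grows, so nearby samples would receive larger values than distant ones.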
    • measureDistance

      public static double measureDistance(ImmutableFeatureMap fMap, long numTrainingExamples, SparseVector input, SparseVector sample)
      Measures the distance between an input point and a sampled point.

      This distance function takes into account both categorical and real values. It uses the Hamming distance for categorical features and the Euclidean distance for real-valued features.

      Parameters:
      fMap - The feature map used to determine if a feature is categorical or real.
      numTrainingExamples - The number of training examples the fMap has seen.
      input - The input point.
      sample - The sampled point.
      Returns:
      The distance between the two points.
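
To make the mixed distance described above more concrete, here is a minimal sketch over plain arrays rather than Tribuo's `SparseVector` and `ImmutableFeatureMap`. Everything in it is an assumption made for illustration: the class name `MixedDistanceSketch`, the `categorical` flag array, the `TOLERANCE` value, and in particular the choice to add the Hamming and Euclidean parts together, since this page does not say how the two are combined.

```java
/**
 * Hypothetical illustration of a distance that treats categorical and
 * real-valued features differently; not the Tribuo implementation.
 */
final class MixedDistanceSketch {
    /** Assumed tolerance for treating two categorical values as equal. */
    private static final double TOLERANCE = 1e-12;

    private MixedDistanceSketch() {}

    /**
     * @param input       The input point.
     * @param sample      The sampled point.
     * @param categorical True at index i if feature i is categorical.
     * @return Hamming distance over the categorical features plus
     *         Euclidean distance over the real-valued features
     *         (the sum is an assumption of this sketch).
     */
    static double distance(double[] input, double[] sample, boolean[] categorical) {
        if (input.length != sample.length || input.length != categorical.length) {
            throw new IllegalArgumentException("All arrays must have the same length");
        }
        int mismatches = 0;      // Hamming part: count of differing categorical values
        double squaredSum = 0.0; // Euclidean part: sum of squared differences
        for (int i = 0; i < input.length; i++) {
            double diff = input[i] - sample[i];
            if (categorical[i]) {
                if (Math.abs(diff) > TOLERANCE) {
                    mismatches++;
                }
            } else {
                squaredSum += diff * diff;
            }
        }
        return mismatches + Math.sqrt(squaredSum);
    }
}
```

For example, with two categorical features of which one differs, and two real features differing by 3 and 4, this sketch yields 1 + 5 = 6.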
    • transformOutput

      public static Regressor transformOutput(Prediction<Label> prediction)
      Transforms a Prediction for a multiclass problem into a Regressor output which represents the probability for each class.

      Used as the target for LIME models.

      Parameters:
      prediction - A multiclass prediction object. Must contain probabilities.
      Returns:
      The n-dimensional probability output.
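
The idea of turning per-class probabilities into an n-dimensional regression target can be sketched without Tribuo types. In the sketch below, a `Map<String, Double>` stands in for the `Prediction<Label>` scores and a `double[]` stands in for the `Regressor`. The class name `ProbabilityVectorSketch`, the alphabetical ordering of dimensions, and the range check are assumptions of this example, not documented behaviour.

```java
import java.util.Map;
import java.util.TreeMap;

/**
 * Hypothetical illustration of mapping class probabilities to an
 * n-dimensional vector; not the Tribuo implementation.
 */
final class ProbabilityVectorSketch {
    private ProbabilityVectorSketch() {}

    /**
     * @param classProbabilities Probability per class name.
     * @return One value per class, ordered by class name so the
     *         dimension order is deterministic (an assumed ordering).
     */
    static double[] toVector(Map<String, Double> classProbabilities) {
        // TreeMap sorts by key, which fixes the order of the output dimensions.
        Map<String, Double> sorted = new TreeMap<>(classProbabilities);
        double[] values = new double[sorted.size()];
        int i = 0;
        for (Map.Entry<String, Double> e : sorted.entrySet()) {
            double p = e.getValue();
            // Reject NaN and anything outside [0, 1]: the input must hold probabilities.
            if (!(p >= 0.0 && p <= 1.0)) {
                throw new IllegalArgumentException(
                        "Not a probability for class " + e.getKey() + ": " + p);
            }
            values[i++] = p;
        }
        return values;
    }
}
```

With three classes the result has three dimensions, one per class, which is the shape a multidimensional regression target needs.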