All Implemented Interfaces:
TabularExplainer<Regressor>, TextExplainer<Regressor>

public class LIMEText extends LIMEBase implements TextExplainer<Regressor>
Uses a Tribuo TextFeatureExtractor to explain the prediction for a given piece of text.

LIME uses a naive sampling procedure which blanks out words, then trains the linear model on a set of binary features, each of which indicates the presence of a word+position combination. Words are not permuted and new words are not added, so the explanation only captures cases where the absence of a word would change the prediction.


 Ribeiro MT, Singh S, Guestrin C.
 "Why Should I Trust You?": Explaining the Predictions of Any Classifier.
 Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2016.
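
To make the binary word+position representation described above more concrete, here is a minimal sketch of how a linear surrogate over presence features could be evaluated. It is not Tribuo code: the class `SurrogateSketch`, the method `predict`, and every weight are made up for illustration, and no real model produced these numbers.

```java
import java.util.List;

/** Illustrative only: a linear surrogate over word+position presence features. */
final class SurrogateSketch {

    /**
     * Returns the intercept plus the weights of the words that are still present.
     * present[i] is true when the token at position i was kept in the sample.
     */
    static double predict(double intercept, double[] weights, boolean[] present) {
        if (weights.length != present.length) {
            throw new IllegalArgumentException("One weight is needed per token position");
        }
        double score = intercept;
        for (int i = 0; i < weights.length; i++) {
            if (present[i]) {
                score += weights[i];
            }
        }
        return score;
    }

    public static void main(String[] args) {
        List<String> tokens = List.of("not", "a", "good", "film");
        // Made-up weights, one per token position (an assumption for this sketch).
        double[] weights = {-0.5, 0.0, 0.25, 0.0};
        double intercept = 0.5;

        double full = predict(intercept, weights, new boolean[]{true, true, true, true});
        double withoutFirst = predict(intercept, weights, new boolean[]{false, true, true, true});
        // Blanking a word shifts the surrogate output by the negative of that word's weight.
        System.out.println("Blanking '" + tokens.get(0) + "' changes the output by " + (withoutFirst - full));
    }
}
```

Because every feature can only switch from present to absent, a weight in such a surrogate can only be read as the effect of removing that word at that position, not of replacing or reordering it.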
  • Constructor Details

    • LIMEText

      public LIMEText(SplittableRandom rng, Model<Label> innerModel, SparseTrainer<Regressor> explanationTrainer, int numSamples, TextFeatureExtractor<Label> extractor, Tokenizer tokenizer)
      Constructs a LIME explainer for a model which uses text data.
      Parameters:
      rng - The random number generator to use for sampling.
      innerModel - The model to explain.
      explanationTrainer - The sparse trainer to use to generate explanations.
      numSamples - The number of samples to generate for each explanation.
      extractor - The TextFeatureExtractor used to generate text features from a string.
      tokenizer - The tokenizer used to tokenize the examples.
  • Method Details

    • explain

      public LIMEExplanation explain(String inputText)
      Description copied from interface: TextExplainer
      Converts the supplied text into an Example, and generates an explanation of the contained Model's prediction.
      Specified by:
      explain in interface TextExplainer<Regressor>
      Parameters:
      inputText - The text to explain.
      Returns:
      An explanation of the prediction on this input text.
    • nameFeature

      protected String nameFeature(String name, int idx)
      Generates the feature name by combining the word and its index.
      Parameters:
      name - The word.
      idx - The index of the word in the tokenized text.
      Returns:
      A string representing both of the inputs.
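
As a rough illustration of why the index is part of the name, the sketch below combines a word with its position so that repeated words stay distinct. The class `FeatureNameSketch` and the `@` separator are assumptions for this example; the description above does not state the exact format that is produced.

```java
/** Illustrative only: one way to combine a word and its position into a feature name. */
final class FeatureNameSketch {

    // Hypothetical separator; the actual format is not specified in this documentation.
    private static final String SEPARATOR = "@";

    static String nameFeature(String name, int idx) {
        return name + SEPARATOR + idx;
    }

    public static void main(String[] args) {
        String[] words = {"good", "film", "good"};
        for (int i = 0; i < words.length; i++) {
            // The two occurrences of "good" receive different feature names.
            System.out.println(nameFeature(words[i], i));
        }
    }
}
```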
    • sampleData

      protected List<Example<Regressor>> sampleData(String inputText, List<Token> tokens)
      Samples a new dataset from the input text. Uses the tokenized representation and removes words by blanking them out. Sampling only ever removes words to generate a new sentence, and never generates the empty sentence.
      Parameters:
      inputText - The input text.
      tokens - The tokenized representation of the input text.
      Returns:
      A list of samples from the input text.
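
The blank-out sampling described above can be sketched in plain Java as follows. This is a simplified stand-in rather than the method's actual body: `BlankingSketch`, its nested `Sample` class, the `{start, end}` span arrays, and the choice to skip draws that remove every word are all assumptions made for illustration, and the step that labels each sample with the inner model is left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.SplittableRandom;

/** Illustrative only: blank-out sampling over a tokenized string. */
final class BlankingSketch {

    /** One perturbed sample: the blanked text and a 0/1 indicator per token position. */
    static final class Sample {
        final String text;
        final int[] present;

        Sample(String text, int[] present) {
            this.text = text;
            this.present = present;
        }
    }

    /** Each span is {start, end} with an exclusive end, a stand-in for a token's offsets. */
    static List<Sample> sample(String input, int[][] spans, int numSamples, SplittableRandom rng) {
        List<Sample> output = new ArrayList<>();
        for (int i = 0; i < numSamples; i++) {
            int[] present = new int[spans.length];
            char[] chars = input.toCharArray();
            int kept = 0;
            for (int j = 0; j < spans.length; j++) {
                present[j] = rng.nextInt(2); // 1 keeps the word, 0 blanks it
                if (present[j] == 1) {
                    kept++;
                } else {
                    // Overwrite the word with spaces; surviving words keep their order.
                    Arrays.fill(chars, spans[j][0], spans[j][1], ' ');
                }
            }
            // Assumption: a draw that removes every word is skipped, so the
            // result can hold fewer than numSamples entries.
            if (kept > 0) {
                output.add(new Sample(new String(chars), present));
            }
        }
        return output;
    }

    public static void main(String[] args) {
        String input = "a very good film";
        int[][] spans = {{0, 1}, {2, 6}, {7, 11}, {12, 16}};
        for (Sample s : sample(input, spans, 5, new SplittableRandom(42L))) {
            System.out.println("[" + s.text + "] " + Arrays.toString(s.present));
        }
    }
}
```

Passing a seeded `SplittableRandom` keeps the sampled dataset reproducible, which matters when comparing explanations across runs.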