Uses of Interface
org.tribuo.util.tokens.Tokenizer
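Packages that use Tokenizer
org.tribuo.classification.explanations.lime: Provides an implementation of LIME (Locally Interpretable Model Explanations).
org.tribuo.data.text.impl: Provides implementations of text data processors.
org.tribuo.util.tokens: Core definitions for tokenization.
org.tribuo.util.tokens.impl: Simple fixed rule tokenizers.
org.tribuo.util.tokens.options: OLCUT Options implementations which can construct Tokenizers of various types.
org.tribuo.util.tokens.universal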
Uses of Tokenizer in org.tribuo.classification.explanations.lime
Constructors in org.tribuo.classification.explanations.lime with parameters of type Tokenizer
LIMEColumnar(SplittableRandom rng, Model<Label> innerModel, SparseTrainer<Regressor> explanationTrainer, int numSamples, RowProcessor<Label> exampleGenerator, Tokenizer tokenizer): Constructs a LIME explainer for a model which uses the columnar data processing system.
LIMEText(SplittableRandom rng, Model<Label> innerModel, SparseTrainer<Regressor> explanationTrainer, int numSamples, TextFeatureExtractor<Label> extractor, Tokenizer tokenizer): Constructs a LIME explainer for a model which uses text data.
Uses of Tokenizer in org.tribuo.data.text.impl
Constructors in org.tribuo.data.text.impl with parameters of type Tokenizer
BasicPipeline(Tokenizer tokenizer, int ngram)
NgramProcessor(Tokenizer tokenizer, int n, double value): Creates a processor that will generate token ngrams of size n.
TokenPipeline(Tokenizer tokenizer, int ngram, boolean termCounting): Creates a new token pipeline.
TokenPipeline(Tokenizer tokenizer, int ngram, boolean termCounting, int dimension): Creates a new token pipeline.
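The constructors above take an n-gram size alongside the tokenizer. As a rough illustration of what "token ngrams of size n" means, the sketch below uses only the Java standard library. The class NgramSketch, the method ngrams, and the "/" separator are hypothetical names chosen for this example and are not part of Tribuo; how NgramProcessor actually names or weights its features is not stated on this page.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for illustration only; not a Tribuo class.
final class NgramSketch {
    private NgramSketch() {}

    /** Returns every contiguous run of exactly n tokens, joined with '/'. */
    static List<String> ngrams(List<String> tokens, int n) {
        if (n < 1) {
            throw new IllegalArgumentException("n must be positive, found " + n);
        }
        List<String> out = new ArrayList<>();
        // The window [i, i + n) slides one token at a time until it no longer fits.
        for (int i = 0; i + n <= tokens.size(); i++) {
            out.add(String.join("/", tokens.subList(i, i + n)));
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("the", "quick", "brown", "fox");
        System.out.println(ngrams(tokens, 2)); // prints [the/quick, quick/brown, brown/fox]
    }
}
```

If fewer than n tokens are available, this sketch returns an empty list rather than padding the input.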
Uses of Tokenizer in org.tribuo.util.tokens
Methods in org.tribuo.util.tokens that return Tokenizer
Methods in org.tribuo.util.tokens that return types with arguments of type Tokenizer
Tokenizer.createSupplier(Tokenizer tokenizer)
static ThreadLocal<Tokenizer> Tokenizer.createThreadLocal(Tokenizer tokenizer)
Methods in org.tribuo.util.tokens with parameters of type Tokenizer
Tokenizer.createSupplier(Tokenizer tokenizer)
static ThreadLocal<Tokenizer> Tokenizer.createThreadLocal(Tokenizer tokenizer)
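This page lists createSupplier and createThreadLocal without describing their behaviour. The sketch below shows the general pattern such helpers usually serve, under the assumption that the goal is to give each thread its own copy of a stateful, non-thread-safe object. SimpleTokenizer, copy, and the two static helpers are hypothetical stand-ins written against the Java standard library only; they are not Tribuo's implementation.

```java
import java.util.function.Supplier;

// Hypothetical stand-in for illustration only; not a Tribuo class.
final class ThreadLocalTokenizerSketch {
    private ThreadLocalTokenizerSketch() {}

    /** A small stateful tokenizer; the mutable counter makes sharing across threads unsafe. */
    static final class SimpleTokenizer {
        private final String splitRegex;
        private int calls = 0;

        SimpleTokenizer(String splitRegex) {
            this.splitRegex = splitRegex;
        }

        /** Creates a fresh instance with the same configuration and no shared state. */
        SimpleTokenizer copy() {
            return new SimpleTokenizer(splitRegex);
        }

        String[] split(String text) {
            calls++;
            return text.split(splitRegex);
        }

        int calls() {
            return calls;
        }
    }

    /** Each call to get() on the returned supplier copies the prototype. */
    static Supplier<SimpleTokenizer> createSupplier(SimpleTokenizer prototype) {
        return prototype::copy;
    }

    /** Each thread lazily receives its own copy the first time it calls get(). */
    static ThreadLocal<SimpleTokenizer> createThreadLocal(SimpleTokenizer prototype) {
        return ThreadLocal.withInitial(createSupplier(prototype));
    }

    public static void main(String[] args) throws InterruptedException {
        ThreadLocal<SimpleTokenizer> local = createThreadLocal(new SimpleTokenizer("\\s+"));
        SimpleTokenizer mine = local.get();
        SimpleTokenizer[] other = new SimpleTokenizer[1];
        Thread worker = new Thread(() -> other[0] = local.get());
        worker.start();
        worker.join();
        System.out.println(mine == local.get()); // prints true: same thread, same instance
        System.out.println(mine == other[0]);    // prints false: the worker got its own copy
    }
}
```

The prototype itself is never handed out, so its state stays untouched however many threads request a copy.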
Uses of Tokenizer in org.tribuo.util.tokens.impl
Classes in org.tribuo.util.tokens.impl that implement Tokenizer
class: A tokenizer wrapping a BreakIterator instance.
class: A convenience class for when you are required to provide a tokenizer but you don't actually want to split up the text into tokens.
class: This tokenizer is loosely based on the notion of word shape, which is a common feature used in NLP.
class: This implementation of Tokenizer is instantiated with an array of characters that are considered split characters.
class: This implementation of Tokenizer is instantiated with a regular expression pattern which determines how to split a string into tokens.
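The three splitting strategies described above (wrapping a BreakIterator, splitting on a set of characters, and splitting on a regular expression) can be sketched with the Java standard library alone. TokenizerSketches and its methods are hypothetical names for this example and do not reproduce Tribuo's classes; in particular, dropping whitespace-only and empty tokens is an assumption made here, not documented behaviour.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.regex.Pattern;

// Hypothetical stand-in for illustration only; not a Tribuo class.
final class TokenizerSketches {
    private TokenizerSketches() {}

    /** Walks the word boundaries reported by a BreakIterator, skipping whitespace-only spans. */
    static List<String> breakIteratorTokens(String text, Locale locale) {
        BreakIterator it = BreakIterator.getWordInstance(locale);
        it.setText(text);
        List<String> tokens = new ArrayList<>();
        int start = it.first();
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            String candidate = text.substring(start, end);
            if (!candidate.trim().isEmpty()) {
                tokens.add(candidate);
            }
        }
        return tokens;
    }

    /** Splits wherever one of the given characters occurs; empty tokens are dropped. */
    static List<String> splitOnCharacters(String text, char[] splitChars) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (contains(splitChars, c)) {
                flush(current, tokens);
            } else {
                current.append(c);
            }
        }
        flush(current, tokens);
        return tokens;
    }

    /** Splits wherever the pattern matches; empty tokens are dropped. */
    static List<String> splitOnPattern(String text, Pattern pattern) {
        List<String> tokens = new ArrayList<>();
        for (String piece : pattern.split(text)) {
            if (!piece.isEmpty()) {
                tokens.add(piece);
            }
        }
        return tokens;
    }

    private static boolean contains(char[] chars, char c) {
        for (char candidate : chars) {
            if (candidate == c) {
                return true;
            }
        }
        return false;
    }

    private static void flush(StringBuilder current, List<String> tokens) {
        if (current.length() > 0) {
            tokens.add(current.toString());
            current.setLength(0);
        }
    }

    public static void main(String[] args) {
        System.out.println(breakIteratorTokens("hello world", Locale.US));            // prints [hello, world]
        System.out.println(splitOnCharacters("a,b;;c", new char[] {',', ';'}));       // prints [a, b, c]
        System.out.println(splitOnPattern(" a, b;c", Pattern.compile("[,;\\s]+")));   // prints [a, b, c]
    }
}
```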
Uses of Tokenizer in org.tribuo.util.tokens.options
Methods in org.tribuo.util.tokens.options that return Tokenizer
BreakIteratorTokenizerOptions.getTokenizer()
CoreTokenizerOptions.getTokenizer()
SplitCharactersTokenizerOptions.getTokenizer()
SplitPatternTokenizerOptions.getTokenizer()
TokenizerOptions.getTokenizer(): Creates the appropriately configured tokenizer.
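The options classes above each expose a getTokenizer() method that builds a tokenizer from configuration. A minimal sketch of that shape, assuming a single regular-expression setting, might look like the following. The interface, the class, the field splitPatternRegex, and the use of Function<String, List<String>> in place of a tokenizer type are all hypothetical; the real options classes and their fields are not shown on this page.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;
import java.util.regex.Pattern;

// Hypothetical stand-in for illustration only; not a Tribuo class.
final class OptionsSketch {
    private OptionsSketch() {}

    /** Hypothetical options contract: one factory method that returns a configured tokenizer. */
    interface TokenizerOptions {
        Function<String, List<String>> getTokenizer();
    }

    /** Hypothetical options holder; in practice the field would be filled from configuration. */
    static final class SplitPatternOptions implements TokenizerOptions {
        String splitPatternRegex = "\\s+";

        @Override
        public Function<String, List<String>> getTokenizer() {
            // Compile once per tokenizer so the pattern is not re-parsed on every call.
            Pattern pattern = Pattern.compile(splitPatternRegex);
            return text -> Arrays.asList(pattern.split(text.trim()));
        }
    }

    public static void main(String[] args) {
        SplitPatternOptions options = new SplitPatternOptions();
        System.out.println(options.getTokenizer().apply("a b  c")); // prints [a, b, c]
    }
}
```

Keeping construction behind getTokenizer() means callers depend only on the options contract, not on which concrete tokenizer is built.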
Uses of Tokenizer in org.tribuo.util.tokens.universal
Classes in org.tribuo.util.tokens.universal that implement Tokenizer
class: This class was originally written for the purpose of document indexing in an information retrieval context (principally used in Sun Labs' Minion search engine).
Methods in org.tribuo.util.tokens.universal that return Tokenizer