Uses of Package
org.tribuo.util.tokens.impl
Packages that use org.tribuo.util.tokens.impl
Package: Description
org.tribuo.util.tokens.impl: Simple fixed rule tokenizers.
org.tribuo.util.tokens.impl.wordpiece: Provides an implementation of a Wordpiece tokenizer that conforms to the Tribuo Tokenizer API.
Classes in org.tribuo.util.tokens.impl used by org.tribuo.util.tokens.impl
Class: Description
BreakIteratorTokenizer: A tokenizer wrapping a BreakIterator instance.
NonTokenizer: A convenience class for when you are required to provide a tokenizer but you don't actually want to split up the text into tokens.
ShapeTokenizer: This tokenizer is loosely based on the notion of word shape which is a common feature used in NLP.
SplitCharactersTokenizer: This implementation of Tokenizer is instantiated with an array of characters that are considered split characters.
SplitFunctionTokenizer: This class supports character-by-character (that is, codepoint-by-codepoint) iteration over input text to create tokens.
SplitFunctionTokenizer.SplitFunction: An interface for checking if the text should be split at the supplied codepoint.
SplitFunctionTokenizer.SplitResult: A combination of a SplitFunctionTokenizer.SplitType and a Token.TokenType.
SplitFunctionTokenizer.SplitType: Defines different ways that a tokenizer can split the input text at a given character.
SplitPatternTokenizer: This implementation of Tokenizer is instantiated with a regular expression pattern which determines how to split a string into tokens.
WhitespaceTokenizer: A simple tokenizer that splits on whitespace.
Classes in org.tribuo.util.tokens.impl used by org.tribuo.util.tokens.impl.wordpiece
Class: Description
SplitFunctionTokenizer: This class supports character-by-character (that is, codepoint-by-codepoint) iteration over input text to create tokens.
SplitFunctionTokenizer.SplitFunction: An interface for checking if the text should be split at the supplied codepoint.
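
The idea of codepoint-by-codepoint iteration driven by a split check can be sketched with the standard library alone. The example below is a rough illustration, not Tribuo code: the class name CodepointSplitSketch and its split method are hypothetical, and a plain java.util.function.IntPredicate stands in for the split-function interface. It also simply discards the codepoints it splits on, which is only one of the behaviours a real tokenizer might offer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntPredicate;

// Hypothetical illustration only: this class is not part of Tribuo.
final class CodepointSplitSketch {

    private CodepointSplitSketch() {}

    /**
     * Walks the text one codepoint at a time and closes the current token
     * whenever shouldSplit accepts the codepoint. The split codepoint itself
     * is discarded, and empty tokens are never emitted.
     */
    static List<String> split(CharSequence text, IntPredicate shouldSplit) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int i = 0;
        while (i < text.length()) {
            // Read a full codepoint so surrogate pairs are handled as one unit.
            int cp = Character.codePointAt(text, i);
            if (shouldSplit.test(cp)) {
                if (current.length() > 0) {
                    tokens.add(current.toString());
                    current.setLength(0);
                }
            } else {
                current.appendCodePoint(cp);
            }
            // Advance by one or two chars depending on the codepoint width.
            i += Character.charCount(cp);
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }
}
```

Under these assumptions, a whitespace split is CodepointSplitSketch.split(text, Character::isWhitespace), and a split on a fixed set of characters is CodepointSplitSketch.split(text, cp -> cp == ',' || cp == ';').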