Uses of Package org.tribuo.util.tokens.impl (Tribuo 4.1.1 API)

Packages that use org.tribuo.util.tokens.impl

Package

Description

org.tribuo.util.tokens.impl

Simple fixed rule tokenizers.

org.tribuo.util.tokens.impl.wordpiece

Provides an implementation of a Wordpiece tokenizer which implements the Tribuo Tokenizer API.

Classes in org.tribuo.util.tokens.impl used by org.tribuo.util.tokens.impl

Class

Description

BreakIteratorTokenizer

A tokenizer wrapping a BreakIterator instance.

NonTokenizer

A convenience class for when you are required to provide a tokenizer but you don't actually want to split up the text into tokens.

ShapeTokenizer

This tokenizer is loosely based on the notion of word shape, which is a common feature used in NLP.

SplitCharactersTokenizer

This implementation of Tokenizer is instantiated with an array of characters that are considered split characters.

SplitFunctionTokenizer

This class supports character-by-character (that is, codepoint-by-codepoint) iteration over input text to create tokens.

SplitFunctionTokenizer.SplitFunction

An interface for checking if the text should be split at the supplied codepoint.

SplitFunctionTokenizer.SplitResult

A combination of a SplitFunctionTokenizer.SplitType and a Token.TokenType.

SplitFunctionTokenizer.SplitType

Defines different ways that a tokenizer can split the input text at a given character.

SplitPatternTokenizer

This implementation of Tokenizer is instantiated with a regular expression pattern which determines how to split a string into tokens.

WhitespaceTokenizer

A simple tokenizer that splits on whitespace.
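
To illustrate the idea behind a tokenizer that wraps a BreakIterator, here is a minimal sketch that uses only java.text.BreakIterator from the JDK. It is not Tribuo's implementation: the class name BreakIteratorSketch and the method words are hypothetical, and dropping whitespace-only spans is an assumption made for the example.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration; not part of the Tribuo API.
public class BreakIteratorSketch {

    /** Returns the text spans between word boundaries, skipping whitespace-only spans. */
    static List<String> words(String text, Locale locale) {
        BreakIterator it = BreakIterator.getWordInstance(locale);
        it.setText(text);
        List<String> tokens = new ArrayList<>();
        int start = it.first();
        // Walk consecutive boundaries; each (start, end) pair delimits one candidate token.
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            String span = text.substring(start, end);
            if (!span.trim().isEmpty()) {
                tokens.add(span);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(words("hello world", Locale.US));
    }
}
```

The boundary rules come from the locale passed to BreakIterator.getWordInstance, so the same text may be segmented differently under different locales.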

Classes in org.tribuo.util.tokens.impl used by org.tribuo.util.tokens.impl.wordpiece

Class

Description

SplitFunctionTokenizer

This class supports character-by-character (that is, codepoint-by-codepoint) iteration over input text to create tokens.

SplitFunctionTokenizer.SplitFunction

An interface for checking if the text should be split at the supplied codepoint.
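
The codepoint-by-codepoint approach described above can be sketched with the JDK alone. This is a simplified illustration, not Tribuo's SplitFunctionTokenizer: the class SplitFunctionSketch is hypothetical, and the split function is reduced to a java.util.function.IntPredicate that only answers "split here and discard this codepoint", whereas the Tribuo interface is described as returning a SplitResult that combines a split type with a token type.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntPredicate;

// Hypothetical illustration; not part of the Tribuo API.
public class SplitFunctionSketch {

    /** Splits text wherever splitAt accepts a codepoint; the matching codepoint is dropped. */
    static List<String> tokenize(String text, IntPredicate splitAt) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            // Read a full codepoint so surrogate pairs are never cut in half.
            int cp = text.codePointAt(i);
            if (splitAt.test(cp)) {
                if (current.length() > 0) {
                    tokens.add(current.toString());
                    current.setLength(0);
                }
            } else {
                current.appendCodePoint(cp);
            }
            i += Character.charCount(cp);
        }
        // Flush the trailing token, if any.
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("one two\tthree", Character::isWhitespace));
    }
}
```

Passing the split rule as a function keeps the iteration logic in one place, so a whitespace splitter or a punctuation splitter differs only in the predicate supplied.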
