Package org.tribuo.util.tokens.universal
An implementation of a "universal" tokenizer which will split on word boundaries, or on character boundaries for languages where word boundaries are contextual.
 
It was originally developed to support information retrieval and forms a useful baseline tokenizer for generating features for machine learning.
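The two splitting strategies above can be sketched as follows. This is a minimal, self-contained illustration of the idea, not Tribuo's actual UniversalTokenizer: it splits alphabetic text on non-letter boundaries, but emits one token per character for ideographic scripts (e.g. CJK), where word boundaries are contextual. The class name `UniversalSketch` and its behavior are assumptions for illustration only.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of "universal" tokenization: word-boundary splitting for
// alphabetic scripts, per-character tokens for ideographic scripts.
public class UniversalSketch {
    public static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            if (Character.isIdeographic(cp)) {
                // Contextual-boundary script: flush any pending word
                // and emit this character as its own token.
                flush(tokens, current);
                tokens.add(new String(Character.toChars(cp)));
            } else if (Character.isLetterOrDigit(cp)) {
                current.appendCodePoint(cp);
            } else {
                // Whitespace or punctuation ends the current word.
                flush(tokens, current);
            }
            i += Character.charCount(cp);
        }
        flush(tokens, current);
        return tokens;
    }

    private static void flush(List<String> tokens, StringBuilder sb) {
        if (sb.length() > 0) {
            tokens.add(sb.toString());
            sb.setLength(0);
        }
    }

    public static void main(String[] args) {
        // Mixed English and Chinese input: the English words stay
        // whole while each ideograph becomes its own token.
        System.out.println(tokenize("hello 世界 test"));
    }
}
```

The real tokenizer additionally produces token offsets and types rather than plain strings, which is why a feature-generation pipeline can map tokens back to their positions in the source text.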
Classes

Range: A range currently being segmented.
UniversalTokenizer: This class was originally written for the purpose of document indexing in an information retrieval context (principally used in Sun Labs' Minion search engine).