Class NonTokenizer
java.lang.Object
org.tribuo.util.tokens.impl.NonTokenizer
- All Implemented Interfaces:
com.oracle.labs.mlrg.olcut.config.Configurable, com.oracle.labs.mlrg.olcut.provenance.Provenancable<com.oracle.labs.mlrg.olcut.provenance.ConfiguredObjectProvenance>, Cloneable, Tokenizer
-
Constructor Summary
Constructors
NonTokenizer()
Method Summary
Modifier and Type / Method / Description
boolean advance(): Advances the tokenizer to the next token.
clone(): Clones a tokenizer with its configuration.
int getEnd(): Gets the ending offset (exclusive) of the current token in the character sequence.
com.oracle.labs.mlrg.olcut.provenance.ConfiguredObjectProvenance getProvenance()
int getStart(): Gets the starting character offset of the current token in the character sequence.
getText(): Gets the text of the current token, as a string.
getType(): Gets the type of the current token.
void reset(CharSequence cs): Resets the tokenizer so that it operates on a new sequence of characters.
Methods inherited from class java.lang.Object
equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface com.oracle.labs.mlrg.olcut.config.Configurable
postConfig
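The methods summarised above imply a calling order: reset the tokenizer onto some text, call advance() until it returns false, and read the current token through the getters in between. The following is a minimal sketch of that loop. WholeTextTokenizerSketch is a hypothetical stand-in written for this example, not Tribuo's NonTokenizer source. It assumes, going only by the class name, that the whole input is served as a single token, and it assumes that calling a getter without a current token throws IllegalStateException. Neither assumption is stated on this page.

```java
// Hypothetical stand-in for illustration; NOT Tribuo's NonTokenizer implementation.
// Assumption: the entire input is handed out as one token.
class WholeTextTokenizerSketch {
    private CharSequence cs;   // text supplied by reset(); null until then
    private boolean served;    // true once the single token has been handed out

    // Points the tokenizer at a new character sequence.
    void reset(CharSequence cs) {
        this.cs = cs;
        this.served = false;
    }

    // Moves to the next token; returns false once no tokens remain.
    boolean advance() {
        if (cs == null) {
            throw new IllegalStateException("call reset() before advance()");
        }
        if (served) {
            return false;
        }
        served = true;
        return true;
    }

    String getText() { requireToken(); return cs.toString(); }

    int getStart() { requireToken(); return 0; }

    // Exclusive end offset, matching the description of getEnd().
    int getEnd() { requireToken(); return cs.length(); }

    private void requireToken() {
        if (cs == null || !served) {
            throw new IllegalStateException("no current token");
        }
    }
}

class TokenizerProtocolDemo {
    // Collects "start-end:text" for each token using reset -> advance -> getters.
    static java.util.List<String> describeTokens(WholeTextTokenizerSketch t, CharSequence text) {
        java.util.List<String> out = new java.util.ArrayList<>();
        t.reset(text);
        while (t.advance()) {
            out.add(t.getStart() + "-" + t.getEnd() + ":" + t.getText());
        }
        return out;
    }

    public static void main(String[] args) {
        // prints [0-11:hello world]
        System.out.println(describeTokens(new WholeTextTokenizerSketch(), "hello world"));
    }
}
```

The same reset/advance/getter loop should carry over to any class implementing the Tokenizer interface, since it relies only on the methods listed in the summary.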
-
Constructor Details
-
NonTokenizer
public NonTokenizer()
-
-
Method Details
-
getProvenance
public com.oracle.labs.mlrg.olcut.provenance.ConfiguredObjectProvenance getProvenance()
- Specified by:
getProvenance in interface com.oracle.labs.mlrg.olcut.provenance.Provenancable<com.oracle.labs.mlrg.olcut.provenance.ConfiguredObjectProvenance>
-
reset
Description copied from interface: Tokenizer
Resets the tokenizer so that it operates on a new sequence of characters.
advance
-
getText
-
getStart
-
getEnd
-
getType
Description copied from interface: Tokenizer
Gets the type of the current token.
clone
Description copied from interface: Tokenizer
Clones a tokenizer with its configuration. A cloned tokenizer does not process the same text as the original tokenizer and needs to be reset with a fresh CharSequence.
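The clone contract described here can be sketched as follows: a clone carries over configuration but not the text being processed, so it must be reset before use. CloneableTokenizerSketch is a hypothetical stand-in written for this example, not Tribuo's implementation. It has no configuration fields, and throwing IllegalStateException when advance() is called before reset() is an assumption of this sketch.

```java
// Hypothetical stand-in for illustration; NOT Tribuo's implementation.
class CloneableTokenizerSketch implements Cloneable {
    private CharSequence cs;   // text being processed; deliberately not carried over by clone()
    private boolean served;    // true once the single token has been handed out

    void reset(CharSequence cs) {
        this.cs = cs;
        this.served = false;
    }

    boolean advance() {
        if (cs == null) {
            throw new IllegalStateException("call reset() before advance()");
        }
        if (served) {
            return false;
        }
        served = true;
        return true;
    }

    String getText() {
        if (cs == null || !served) {
            throw new IllegalStateException("no current token");
        }
        return cs.toString();
    }

    // Returns a copy with the same (here: empty) configuration and no text attached.
    @Override
    public CloneableTokenizerSketch clone() {
        try {
            CloneableTokenizerSketch copy = (CloneableTokenizerSketch) super.clone();
            copy.cs = null;
            copy.served = false;
            return copy;
        } catch (CloneNotSupportedException e) {
            throw new AssertionError("Cloneable is implemented, so this cannot happen", e);
        }
    }
}

class CloneResetDemo {
    // Clones the prototype, resets the clone onto new text, and returns its first token.
    static String firstToken(CloneableTokenizerSketch prototype, CharSequence text) {
        CloneableTokenizerSketch worker = prototype.clone();
        worker.reset(text);   // a clone has no text until it is reset
        return worker.advance() ? worker.getText() : null;
    }

    public static void main(String[] args) {
        CloneableTokenizerSketch prototype = new CloneableTokenizerSketch();
        prototype.reset("original text");
        System.out.println(firstToken(prototype, "fresh text"));                // fresh text
        System.out.println(prototype.advance() ? prototype.getText() : null);   // original text
    }
}
```

Handing each worker its own clone keeps the prototype's position in its text untouched, which is the usual reason to clone a stateful tokenizer instead of sharing one instance.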
-