public class BreakIteratorTokenizer extends Object implements Tokenizer
A tokenizer wrapping a BreakIterator instance.
Constructor and Description |
---|
BreakIteratorTokenizer(Locale locale) |
Modifier and Type | Method and Description |
---|---|
boolean | advance(): Advances the tokenizer to the next token. |
BreakIteratorTokenizer | clone(): Clones a tokenizer with its configuration. |
int | getEnd(): Gets the ending offset (exclusive) of the current token in the character sequence. |
String | getLanguageTag(): Returns the locale string this tokenizer uses. |
com.oracle.labs.mlrg.olcut.provenance.ConfiguredObjectProvenance | getProvenance() |
int | getStart(): Gets the starting character offset of the current token in the character sequence. |
String | getText(): Gets the text of the current token, as a string. |
Token.TokenType | getType(): Gets the type of the current token. |
void | postConfig(): Used by the OLCUT configuration system, and should not be called by external code. |
void | reset(CharSequence cs): Resets the tokenizer so that it operates on a new sequence of characters. |
Methods inherited from class Object: equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface Tokenizer: createSupplier, createThreadLocal, getToken, split, tokenize
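The reset/advance/getText cycle summarized above can be illustrated with a small stand-in built directly on java.text.BreakIterator. This is a minimal sketch, not the Tribuo implementation: the class name WordBreakSketch is hypothetical, and the choice of a word instance and the skipping of whitespace-only segments are assumptions made for the example.

```java
import java.text.BreakIterator;
import java.util.Locale;

// Hypothetical stand-in for illustration only; not the Tribuo class.
class WordBreakSketch {
    private final BreakIterator breakIterator;
    private String text = "";
    private int start = 0;
    private int end = 0;

    WordBreakSketch(Locale locale) {
        // Assumption: word-level boundaries for the given locale.
        this.breakIterator = BreakIterator.getWordInstance(locale);
    }

    // Points the iterator at a new character sequence and rewinds it.
    void reset(CharSequence cs) {
        text = cs.toString();
        breakIterator.setText(text);
        start = breakIterator.first();
        end = start;
    }

    // Moves to the next non-whitespace segment; returns false when the text is exhausted.
    boolean advance() {
        while (true) {
            int next = breakIterator.next();
            if (next == BreakIterator.DONE) {
                return false;
            }
            start = end;
            end = next;
            // Assumption: segments that are only whitespace are not tokens.
            if (!text.substring(start, end).trim().isEmpty()) {
                return true;
            }
        }
    }

    int getStart() { return start; }          // inclusive offset of the current token
    int getEnd() { return end; }              // exclusive offset of the current token
    String getText() { return text.substring(start, end); }

    public static void main(String[] args) {
        WordBreakSketch tokenizer = new WordBreakSketch(Locale.US);
        tokenizer.reset("Hello world");
        while (tokenizer.advance()) {
            System.out.println(tokenizer.getText()
                + " [" + tokenizer.getStart() + ", " + tokenizer.getEnd() + ")");
        }
    }
}
```

The accessors are only meaningful after advance() has returned true, which mirrors the call order implied by the method descriptions: reset once per input, then advance in a loop.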
public BreakIteratorTokenizer(Locale locale)
public void postConfig()
Specified by: postConfig in interface com.oracle.labs.mlrg.olcut.config.Configurable
public String getLanguageTag()
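As a hedged illustration of what a locale string looks like, the sketch below converts between java.util.Locale and an IETF BCP 47 language tag using standard JDK calls. It assumes the returned string is such a tag, as the method name suggests; the class LanguageTagSketch and its helper are hypothetical and not part of this API.

```java
import java.util.Locale;

// Hypothetical helper for illustration only; not part of the documented API.
class LanguageTagSketch {
    // Converts a Locale to its BCP 47 language tag, e.g. for storing in configuration.
    static String languageTagOf(Locale locale) {
        return locale.toLanguageTag();
    }

    public static void main(String[] args) {
        String tag = languageTagOf(Locale.US);
        System.out.println(tag); // prints "en-US"
        // The tag can be turned back into an equivalent Locale.
        Locale restored = Locale.forLanguageTag(tag);
        System.out.println(restored.equals(Locale.US)); // prints "true"
    }
}
```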
public com.oracle.labs.mlrg.olcut.provenance.ConfiguredObjectProvenance getProvenance()
Specified by: getProvenance in interface com.oracle.labs.mlrg.olcut.provenance.Provenancable<com.oracle.labs.mlrg.olcut.provenance.ConfiguredObjectProvenance>
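public void reset(CharSequence cs)
Specified by: reset in interface Tokenizer
public boolean advance()
Specified by: advance in interface Tokenizer
public String getText()
Specified by: getText in interface Tokenizer
public int getStart()
Specified by: getStart in interface Tokenizer
public int getEnd()
Specified by: getEnd in interface Tokenizer
public Token.TokenType getType()
Specified by: getType in interface Tokenizer
public BreakIteratorTokenizer clone()
Specified by: clone in interface Tokenizer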
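One plausible reason to clone a stateful tokenizer is to give each thread its own copy. The sketch below shows that pattern with plain java.text.BreakIterator, which is Cloneable; it is an assumption-laden illustration, not a description of how this class or createThreadLocal is implemented, and the names PerThreadCloneSketch, PROTOTYPE, LOCAL, and lastBoundary are hypothetical.

```java
import java.text.BreakIterator;
import java.util.Locale;

// Hypothetical illustration of cloning a configured prototype per thread.
class PerThreadCloneSketch {
    // Configured once and never mutated afterwards; only its clones are used.
    static final BreakIterator PROTOTYPE = BreakIterator.getWordInstance(Locale.US);

    // Each thread lazily receives its own clone, so iteration state is not shared.
    static final ThreadLocal<BreakIterator> LOCAL =
        ThreadLocal.withInitial(() -> (BreakIterator) PROTOTYPE.clone());

    // Returns the final boundary offset, which equals the length of the text.
    static int lastBoundary(String text) {
        BreakIterator it = LOCAL.get();
        it.setText(text);
        return it.last();
    }

    public static void main(String[] args) throws InterruptedException {
        Thread worker = new Thread(() -> System.out.println(lastBoundary("a b c")));
        worker.start();
        worker.join();
        System.out.println(lastBoundary("x"));
    }
}
```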
Copyright © 2015–2021 Oracle and/or its affiliates. All rights reserved.