Package org.tribuo.util.tokens.options
Class SplitCharactersTokenizerOptions
java.lang.Object
org.tribuo.util.tokens.options.SplitCharactersTokenizerOptions
All Implemented Interfaces:
com.oracle.labs.mlrg.olcut.config.Options, TokenizerOptions

CLI options for a SplitCharactersTokenizer.

Field Summary
Modifier and Type    Field                Description
char[]               splitChars           The characters to split on.
char[]               splitXDigitsChars    Characters to split on unless they appear between digits.

Fields inherited from interface com.oracle.labs.mlrg.olcut.config.Options
header

Constructor Summary
Constructor                          Description
SplitCharactersTokenizerOptions()

Method Summary
Modifier and Type    Method            Description
Tokenizer            getTokenizer()    Creates the appropriately configured tokenizer.

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface com.oracle.labs.mlrg.olcut.config.Options
getOptionsDescription

Field Details
splitChars
@Option(longName="sc-tokenizer-split-characters", usage="The characters to split on.")
public char[] splitChars
The characters to split on.
splitXDigitsChars
@Option(longName="sc-tokenizer-split-x-digits", usage="Characters to split on unless they appear between digits")
public char[] splitXDigitsChars
Characters to split on unless they appear between digits.
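The difference between the two fields can be illustrated with a small standalone sketch. It does not use Tribuo and is not the library's implementation: the class name SplitSketch and its tokenize helper are hypothetical, and the rule it applies (a character from the second array splits the text unless both of its neighbours are digits) is one reading of the field descriptions above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not Tribuo's SplitCharactersTokenizer.
class SplitSketch {

    // Returns true if c occurs in chars.
    static boolean contains(char[] chars, char c) {
        for (char x : chars) {
            if (x == c) {
                return true;
            }
        }
        return false;
    }

    // Splits text on every character in splitChars, and on every character in
    // splitXDigitsChars unless that character sits directly between two digits.
    static List<String> tokenize(String text, char[] splitChars, char[] splitXDigitsChars) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            boolean split = contains(splitChars, c);
            if (!split && contains(splitXDigitsChars, c)) {
                boolean betweenDigits = i > 0 && i + 1 < text.length()
                        && Character.isDigit(text.charAt(i - 1))
                        && Character.isDigit(text.charAt(i + 1));
                split = !betweenDigits;
            }
            if (split) {
                // Emit the token collected so far, skipping empty tokens.
                if (current.length() > 0) {
                    tokens.add(current.toString());
                    current.setLength(0);
                }
            } else {
                current.append(c);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        char[] splitChars = {' ', ':'};
        char[] splitXDigitsChars = {',', '.'};
        System.out.println(tokenize("cost: 1,000.50, paid", splitChars, splitXDigitsChars));
    }
}
```

With these inputs the sketch yields three tokens: cost, 1,000.50 and paid. The comma and period inside the number are kept because they sit between digits, while the comma after the number splits because a space follows it.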

Constructor Details
SplitCharactersTokenizerOptions
public SplitCharactersTokenizerOptions()

Method Details
getTokenizer
Description copied from interface: TokenizerOptions
Creates the appropriately configured tokenizer.
Specified by:
getTokenizer in interface TokenizerOptions
Returns:
The configured tokenizer.