Package | Description |
---|---|
org.tribuo.util.tokens.impl | Simple fixed rule tokenizers. |
org.tribuo.util.tokens.impl.wordpiece | Provides an implementation of a Wordpiece tokenizer which implements the Tribuo Tokenizer API. |
Modifier and Type | Class and Description |
---|---|
class | SplitCharactersTokenizer: This implementation of Tokenizer is instantiated with an array of characters that are considered split characters. |
class | WhitespaceTokenizer: A simple tokenizer that splits on whitespace. |
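
The two descriptions above amount to the same idea: scan the text and cut a token wherever a split character (or whitespace) occurs. The following is a minimal, self-contained sketch of that idea in plain Java. It does not use the Tribuo library and is not the Tribuo implementation; the class name `SplitCharactersSketch`, its method names, and the choice to drop empty tokens are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntPredicate;

/** Illustrative sketch only; hypothetical names, not part of Tribuo. */
public class SplitCharactersSketch {

    /** Splits text at every character contained in splitChars. */
    public static List<String> split(String text, char[] splitChars) {
        return splitWhere(text, c -> contains(splitChars, (char) c));
    }

    /** Splits text at every whitespace character. */
    public static List<String> splitOnWhitespace(String text) {
        return splitWhere(text, c -> Character.isWhitespace(c));
    }

    // Shared scan: accumulate characters until a split character is seen,
    // then emit the accumulated token. Empty tokens are dropped (assumption).
    private static List<String> splitWhere(String text, IntPredicate isSplit) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (isSplit.test(c)) {
                if (current.length() > 0) {
                    tokens.add(current.toString());
                    current.setLength(0);
                }
            } else {
                current.append(c);
            }
        }
        // Emit the trailing token, if any.
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    private static boolean contains(char[] chars, char target) {
        for (char c : chars) {
            if (c == target) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // prints [a, b, c d]
        System.out.println(split("a,b;c d", new char[] {',', ';'}));
        // prints [hello, big, world]
        System.out.println(splitOnWhitespace("hello  big\tworld"));
    }
}
```

Note that the space in `"c d"` survives the first call because only `,` and `;` were supplied as split characters. Consult the class documentation for how the actual tokenizers report token offsets and handle consecutive split characters.
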
Modifier and Type | Class and Description |
---|---|
class | WordpieceBasicTokenizer: This is a tokenizer that is used "upstream" of WordpieceTokenizer and implements much of the functionality of the 'BasicTokenizer' implementation in Hugging Face. |
Copyright © 2015–2021 Oracle and/or its affiliates. All rights reserved.