Package | Description |
---|---|
org.tribuo.util.tokens.impl.wordpiece | Provides an implementation of a Wordpiece tokenizer that conforms to the Tribuo Tokenizer API. |
Class and Description |
---|
Wordpiece
This is a vanilla implementation of the Wordpiece algorithm as found here:
https://github.com/huggingface/transformers/blob/master/src/transformers/models/bert/tokenization_bert.py
|
WordpieceBasicTokenizer
This is a tokenizer that is used "upstream" of
WordpieceTokenizer and
implements much of the functionality of the 'BasicTokenizer'
implementation in huggingface. |
WordpieceTokenizer
This Tokenizer is meant to be a reasonable approximation of the BertTokenizer
defined here.
|
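
For orientation, the greedy longest-match-first strategy used by the Wordpiece algorithm in the linked Hugging Face reference can be sketched as below. This is a minimal illustration only, not Tribuo's actual API: the class name `WordpieceSketch`, the `tokenize` signature, and the toy vocabulary are all hypothetical, and the sketch assumes the input token has already been split on whitespace and punctuation by an upstream basic tokenizer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical illustration class; it is not part of org.tribuo.util.tokens.impl.wordpiece.
public class WordpieceSketch {

    /**
     * Splits one pre-tokenized word into wordpieces by repeatedly taking the
     * longest vocabulary entry that matches at the current position.
     * Non-initial pieces are looked up with a "##" prefix.
     * Note: this sketch indexes by UTF-16 code units, a simplification.
     */
    public static List<String> tokenize(String token, Set<String> vocab,
                                        String unknownToken, int maxInputChars) {
        // Overly long tokens are mapped straight to the unknown token.
        if (token.length() > maxInputChars) {
            return List.of(unknownToken);
        }
        List<String> pieces = new ArrayList<>();
        int start = 0;
        while (start < token.length()) {
            int end = token.length();
            String match = null;
            // Shrink the candidate from the right until it is in the vocabulary.
            while (start < end) {
                String candidate = token.substring(start, end);
                if (start > 0) {
                    candidate = "##" + candidate;
                }
                if (vocab.contains(candidate)) {
                    match = candidate;
                    break;
                }
                end--;
            }
            // If no piece matches at this position, the whole token is unknown.
            if (match == null) {
                return List.of(unknownToken);
            }
            pieces.add(match);
            start = end;
        }
        return pieces;
    }

    public static void main(String[] args) {
        // Toy vocabulary, assumed for illustration only.
        Set<String> vocab = Set.of("un", "##aff", "##able");
        System.out.println(tokenize("unaffable", vocab, "[UNK]", 100));
    }
}
```

With the toy vocabulary above, `unaffable` is split into `un`, `##aff`, `##able`, while a word with no matching pieces collapses to the single unknown token.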
Copyright © 2015–2021 Oracle and/or its affiliates. All rights reserved.