Package org.tribuo.util.tokens.impl.wordpiece
Provides an implementation of a Wordpiece tokenizer which implements the Tribuo Tokenizer API.
Classes
- Wordpiece: This is a vanilla implementation of the Wordpiece algorithm as found here: https://github.com/huggingface/transformers/blob/master/src/transformers/models/bert/tokenization_bert.py
- WordpieceBasicTokenizer: This is a tokenizer that is used "upstream" of WordpieceTokenizer and implements much of the functionality of the 'BasicTokenizer' implementation in huggingface.
- WordpieceTokenizer: This Tokenizer is meant to be a reasonable approximation of the huggingface BertTokenizer.
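The Wordpiece algorithm referenced in the class descriptions can be illustrated with a short, self-contained sketch. This is not the Tribuo source and does not use the Tribuo API: the class name WordpieceSketch, the method tokenize and its parameters are hypothetical names chosen for illustration. It assumes the greedy longest-match-first strategy and the "##" continuation prefix used by the huggingface BERT tokenizer, applied to a single word that has already been split on whitespace and punctuation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/** Illustrative sketch only; names here are hypothetical, not Tribuo's API. */
final class WordpieceSketch {
    private WordpieceSketch() {}

    /**
     * Splits one pre-tokenized word into wordpieces by repeatedly taking the
     * longest vocabulary entry that matches at the current position.
     * Lengths are measured in Java chars (UTF-16 code units).
     */
    static List<String> tokenize(String word, Set<String> vocab,
                                 String unknownToken, int maxInputChars) {
        // Overly long words are mapped straight to the unknown token.
        if (word.length() > maxInputChars) {
            return List.of(unknownToken);
        }
        List<String> pieces = new ArrayList<>();
        int start = 0;
        while (start < word.length()) {
            int end = word.length();
            String match = null;
            // Shrink the candidate from the right until it is in the vocabulary.
            while (start < end) {
                String candidate = word.substring(start, end);
                if (start > 0) {
                    // Pieces that do not begin the word carry a continuation marker.
                    candidate = "##" + candidate;
                }
                if (vocab.contains(candidate)) {
                    match = candidate;
                    break;
                }
                end--;
            }
            if (match == null) {
                // Nothing matched at this position, so the whole word is unknown.
                return List.of(unknownToken);
            }
            pieces.add(match);
            start = end;
        }
        return pieces;
    }
}
```

With a vocabulary containing "un", "##aff" and "##able", the word "unaffable" splits into un, ##aff, ##able, while a word with any unmatched remainder collapses to the single unknown token. In this package the analogous work is split across classes: an upstream tokenizer handles whitespace and punctuation before the wordpiece step is applied to each word.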