An interface for things that can pre-process documents before they are broken into features.
An interface for aggregating feature values into other values.
A feature transformer maps a list of features to a new list of features. This is useful, for example, for applying the hashing trick to a set of features.
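As a rough illustration of the hashing trick mentioned above, the sketch below maps a list of named features onto a fixed number of buckets and sums the values that collide. It is only a minimal sketch: `HashingTrickSketch`, `NamedValue` and `hash` are hypothetical names invented for this example and are not this package's API, and the use of `String.hashCode()` as the hash function is an assumption made for brevity.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class HashingTrickSketch {

    /** Hypothetical stand-in for a named feature; not the library's own feature class. */
    static final class NamedValue {
        final String name;
        final double value;

        NamedValue(String name, double value) {
            this.name = name;
            this.value = value;
        }
    }

    /**
     * Maps each feature name into one of numBuckets buckets and sums the
     * values of features that land in the same bucket.
     */
    static List<NamedValue> hash(List<NamedValue> input, int numBuckets) {
        if (numBuckets <= 0) {
            throw new IllegalArgumentException("numBuckets must be positive");
        }
        // LinkedHashMap keeps buckets in first-seen order, so output is deterministic.
        Map<Integer, Double> buckets = new LinkedHashMap<>();
        for (NamedValue feature : input) {
            // floorMod keeps the bucket index non-negative for negative hash codes.
            int bucket = Math.floorMod(feature.name.hashCode(), numBuckets);
            buckets.merge(bucket, feature.value, Double::sum);
        }
        List<NamedValue> output = new ArrayList<>();
        for (Map.Entry<Integer, Double> entry : buckets.entrySet()) {
            output.add(new NamedValue("hash=" + entry.getKey(), entry.getValue()));
        }
        return output;
    }

    public static void main(String[] args) {
        List<NamedValue> features = new ArrayList<>();
        features.add(new NamedValue("the", 2.0));
        features.add(new NamedValue("quick", 1.0));
        features.add(new NamedValue("fox", 1.0));
        for (NamedValue hashed : hash(features, 8)) {
            System.out.println(hashed.name + " -> " + hashed.value);
        }
    }
}
```

The output list never has more than `numBuckets` entries, which bounds the feature space regardless of vocabulary size. The cost is that unrelated features can share a bucket.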
TextFeatureExtractor<T extends Output<T>>
An interface for things that take text and turn it into examples that we can use to train or evaluate a classifier.
A pipeline that takes a String and returns a List of
A TextProcessor takes some text and optionally a feature tag and generates a list of
DirectoryFileSource<T extends Output<T>>
A data source for a somewhat-common format for text classification datasets: a top level directory that contains a number of subdirectories.
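The directory layout described above can be walked with the standard `java.nio.file` API. The sketch below is not this package's implementation. It assumes, as is common for such datasets but not stated here, that each subdirectory name is the label for the files inside it and that each file holds one UTF-8 document. `DirectoryLayoutSketch`, `LabelledText` and `read` are hypothetical names used only for illustration.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class DirectoryLayoutSketch {

    /** Hypothetical holder for one document and the label taken from its directory name. */
    static final class LabelledText {
        final String label;
        final String text;

        LabelledText(String label, String text) {
            this.label = label;
            this.text = text;
        }
    }

    /** Lists the entries of a directory in sorted order so results are deterministic. */
    static List<Path> sortedChildren(Path dir) throws IOException {
        try (Stream<Path> entries = Files.list(dir)) {
            return entries.sorted().collect(Collectors.toList());
        }
    }

    /** Reads root/LABEL/FILE, treating LABEL as the label (an assumption of this sketch). */
    static List<LabelledText> read(Path root) throws IOException {
        List<LabelledText> examples = new ArrayList<>();
        for (Path labelDir : sortedChildren(root)) {
            if (!Files.isDirectory(labelDir)) {
                continue; // ignore stray files at the top level
            }
            String label = labelDir.getFileName().toString();
            for (Path file : sortedChildren(labelDir)) {
                if (!Files.isRegularFile(file)) {
                    continue; // this sketch does not recurse into nested directories
                }
                String text = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
                examples.add(new LabelledText(label, text));
            }
        }
        return examples;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java DirectoryLayoutSketch.java <root-directory>");
            return;
        }
        for (LabelledText example : read(Paths.get(args[0]))) {
            System.out.println(example.label + "\t" + example.text.length() + " chars");
        }
    }
}
```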
Splits data in our standard text format into training and testing portions.
Command line options.
TextDataSource<T extends Output<T>>
A base class for textual data sets.
An exception thrown by the text processing system.
Copyright © 2015–2021 Oracle and/or its affiliates. All rights reserved.