Package org.tribuo.data.text.impl
Provides implementations of text data processors.
Classes

A feature aggregator that averages feature values across a feature list.
An example implementation of TextPipeline.
Hashes the feature names to reduce the dimensionality.
A text processor that will generate token ngrams of a particular size.
SimpleStringDataSource<T extends Output<T>>: A version of SimpleTextDataSource that accepts an Iterable of Strings.
Provenance for SimpleStringDataSource.
SimpleTextDataSource<T extends Output<T>>: A dataset for a simple data format for text classification experiments.
Provenance for SimpleTextDataSource.
A feature aggregator that aggregates occurrence counts across a number of feature lists.
TextFeatureExtractorImpl<T extends Output<T>>
A pipeline for generating ngram features.
Aggregates feature tokens, generating unique features.
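
Several of the descriptions above name simple operations: generating token ngrams, hashing feature names into a smaller space, and aggregating features that share a name by sum, average, or uniqueness. The following is a minimal, standalone Java sketch of those ideas. It does not use or reproduce Tribuo's classes; the class name `TextFeatureSketch`, its method names, the `/` ngram separator, and the `h` bucket prefix are all hypothetical choices made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;

// Illustrative only: hypothetical helpers, not part of org.tribuo.data.text.impl.
public class TextFeatureSketch {

    /** Joins each run of n consecutive tokens with '/' to form token ngrams. */
    public static List<String> ngrams(List<String> tokens, int n) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i + n <= tokens.size(); i++) {
            out.add(String.join("/", tokens.subList(i, i + n)));
        }
        return out;
    }

    /** Maps a feature name onto one of numBuckets names, bounding the feature space. */
    public static String hashName(String name, int numBuckets) {
        // floorMod keeps the bucket index non-negative even for negative hash codes.
        return "h" + Math.floorMod(name.hashCode(), numBuckets);
    }

    /** Sums the values of features that share a name. */
    public static Map<String, Double> sum(List<String> names, List<Double> values) {
        Map<String, Double> sums = new LinkedHashMap<>();
        for (int i = 0; i < names.size(); i++) {
            sums.merge(names.get(i), values.get(i), Double::sum);
        }
        return sums;
    }

    /** Averages the values of features that share a name. */
    public static Map<String, Double> average(List<String> names, List<Double> values) {
        Map<String, Double> sums = sum(names, values);
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String name : names) {
            counts.merge(name, 1, Integer::sum);
        }
        Map<String, Double> out = new LinkedHashMap<>();
        for (Map.Entry<String, Double> e : sums.entrySet()) {
            out.put(e.getKey(), e.getValue() / counts.get(e.getKey()));
        }
        return out;
    }

    /** Keeps the first occurrence of each feature name, preserving order. */
    public static List<String> unique(List<String> names) {
        return new ArrayList<>(new LinkedHashSet<>(names));
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("the", "quick", "brown", "fox");
        List<String> bigrams = ngrams(tokens, 2);
        System.out.println(bigrams);

        // Hash each bigram name into 8 buckets; collisions are expected and accepted.
        List<String> hashed = new ArrayList<>();
        for (String b : bigrams) {
            hashed.add(hashName(b, 8));
        }
        System.out.println(hashed);

        List<String> names = Arrays.asList("a", "b", "a");
        List<Double> values = Arrays.asList(1.0, 4.0, 3.0);
        System.out.println(sum(names, values));
        System.out.println(average(names, values));
        System.out.println(unique(names));
    }
}
```

The three aggregation helpers differ only in how they resolve repeated feature names: summing suits occurrence counts, averaging suits real-valued features, and the unique variant records presence only. Hashing trades some collisions for a fixed upper bound on the number of distinct feature names.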