Uses of Interface
org.tribuo.data.text.DocumentPreprocessor
Package
Description
Provides implementations of text data processors.
Uses of DocumentPreprocessor in org.tribuo.data.text
Modifier and Type
Field
Description

protected List<DocumentPreprocessor>
DirectoryFileSource.preprocessors
Document preprocessors that should be run on the documents that make up this data set.

protected List<DocumentPreprocessor>
TextDataSource.preprocessors
Document preprocessors that should be run on the documents that make up this data set.

Modifier
Constructor
Description

DirectoryFileSource(Path dataDir, OutputFactory<T> outputFactory, TextFeatureExtractor<T> extractor, DocumentPreprocessor... preprocessors)
Creates a data source that will use the given feature extractor and document preprocessors on the data read from the files in the directories representing classes.

TextDataSource(File file, OutputFactory<T> outputFactory, TextFeatureExtractor<T> extractor, DocumentPreprocessor... preprocessors)
Creates a text data set by reading it from a file.

TextDataSource(Path path, OutputFactory<T> outputFactory, TextFeatureExtractor<T> extractor, DocumentPreprocessor... preprocessors)
Creates a text data set by reading it from a path.
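The constructors above accept a varargs list of preprocessors and keep them in a protected List field. The following is a minimal, self-contained sketch of that pattern, not Tribuo's implementation. Preprocessor and SketchSource are hypothetical stand-ins, the single method name processDoc is an assumption about the interface, and applying the preprocessors in argument order is also an assumption.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

public class PreprocessorChainSketch {

    /** Hypothetical stand-in for DocumentPreprocessor: maps one document string to another. */
    interface Preprocessor {
        String processDoc(String doc);
    }

    /** Hypothetical stand-in for a text data source that stores its preprocessors. */
    static class SketchSource {
        // Mirrors the protected List field listed in the table above.
        protected final List<Preprocessor> preprocessors;

        SketchSource(Preprocessor... preprocessors) {
            this.preprocessors = Collections.unmodifiableList(Arrays.asList(preprocessors));
        }

        /** Runs every preprocessor on the document, in the order given (an assumption). */
        String preprocess(String doc) {
            String current = doc;
            for (Preprocessor p : preprocessors) {
                current = p.processDoc(current);
            }
            return current;
        }
    }

    public static void main(String[] args) {
        SketchSource source = new SketchSource(
                String::trim,
                doc -> doc.toLowerCase(Locale.ROOT));
        System.out.println(source.preprocess("  Hello Tribuo  ")); // prints "hello tribuo"
    }
}
```

Because each stand-in preprocessor only maps a String to a String, the order of the varargs arguments determines the final text in this sketch.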
Uses of DocumentPreprocessor in org.tribuo.data.text.impl
Modifier and Type
Class
Description

class
A document preprocessor which uppercases or lowercases the input.

class
A document pre-processor for 20 newsgroup data.

final class
A simple document preprocessor which applies regular expressions to the input.
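To illustrate the casing and regular-expression behaviours described above, here is a small sketch written against a hypothetical stand-in interface. CasingSketch, RegexSketch, the Casing enum, and the processDoc method name are all assumptions made for illustration; they are not the classes listed on this page, and the real constructors may differ.

```java
import java.util.Locale;
import java.util.regex.Pattern;

public class SimplePreprocessorsSketch {

    /** Hypothetical stand-in for DocumentPreprocessor. */
    interface Preprocessor {
        String processDoc(String doc);
    }

    /** Hypothetical casing options. */
    enum Casing { LOWERCASE, UPPERCASE }

    /** Uppercases or lowercases the whole document. */
    static class CasingSketch implements Preprocessor {
        private final Casing casing;

        CasingSketch(Casing casing) {
            this.casing = casing;
        }

        @Override
        public String processDoc(String doc) {
            // Locale.ROOT keeps the result independent of the JVM's default locale.
            return casing == Casing.LOWERCASE
                    ? doc.toLowerCase(Locale.ROOT)
                    : doc.toUpperCase(Locale.ROOT);
        }
    }

    /** Replaces every match of a regular expression with a fixed replacement. */
    static class RegexSketch implements Preprocessor {
        private final Pattern pattern;
        private final String replacement;

        RegexSketch(String regex, String replacement) {
            this.pattern = Pattern.compile(regex); // compile once, reuse per document
            this.replacement = replacement;
        }

        @Override
        public String processDoc(String doc) {
            return pattern.matcher(doc).replaceAll(replacement);
        }
    }

    public static void main(String[] args) {
        Preprocessor collapseWhitespace = new RegexSketch("\\s+", " ");
        Preprocessor lower = new CasingSketch(Casing.LOWERCASE);
        String cleaned = lower.processDoc(collapseWhitespace.processDoc("Tribuo   TEXT\tData"));
        System.out.println(cleaned); // prints "tribuo text data"
    }
}
```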