public class SimpleTextDataSource<T extends Output<T>> extends TextDataSource<T>
OUTPUT##Document text

Each line in the file specifies a single output and document pair. Leading and trailing spaces will be trimmed from outputs and documents. Outputs will be converted to upper case.
As with all of our text data, the file should be in UTF-8.
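
The line format described above can be illustrated with a small, self-contained sketch. This is not Tribuo's implementation: the class `SimpleLineFormatSketch`, its `Pair` holder and the `parse` method are hypothetical names used only for illustration. Splitting at the first `##` and returning an empty `Optional` for a line without a separator are assumptions of the sketch, not documented behaviour.

```java
import java.util.Locale;
import java.util.Optional;

/** Hypothetical sketch of the OUTPUT##Document text format; not Tribuo code. */
final class SimpleLineFormatSketch {

    /** A parsed (output, document) pair. */
    static final class Pair {
        final String output;
        final String document;

        Pair(String output, String document) {
            this.output = output;
            this.document = document;
        }
    }

    /**
     * Splits a line at the first "##", trims both halves and
     * upper-cases the output, as the class description states.
     * Assumption: a line without "##" yields Optional.empty().
     */
    static Optional<Pair> parse(String line) {
        int idx = line.indexOf("##");
        if (idx < 0) {
            return Optional.empty();
        }
        String output = line.substring(0, idx).trim().toUpperCase(Locale.ROOT);
        String document = line.substring(idx + 2).trim();
        return Optional.of(new Pair(output, document));
    }

    public static void main(String[] args) {
        Pair p = parse("  spam ##  Win a free prize now  ").get();
        System.out.println(p.output + " -> " + p.document); // prints "SPAM -> Win a free prize now"
    }
}
```

In the real class, the document text would then be passed to the configured TextFeatureExtractor and the output string to the OutputFactory; the sketch stops at the string level to stay free of library dependencies.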
Modifier and Type | Class and Description |
---|---|
static class | SimpleTextDataSource.SimpleTextDataSourceProvenance Provenance for SimpleTextDataSource. |
Modifier and Type | Field and Description |
---|---|
protected ConfiguredDataSourceProvenance | provenance |

Fields inherited from class TextDataSource: data, extractor, outputFactory, path, preprocessors
Modifier | Constructor and Description |
---|---|
protected | SimpleTextDataSource() For OLCUT. |
public | SimpleTextDataSource(File file, OutputFactory<T> outputFactory, TextFeatureExtractor<T> extractor) |
protected | SimpleTextDataSource(OutputFactory<T> outputFactory, TextFeatureExtractor<T> extractor) |
public | SimpleTextDataSource(Path path, OutputFactory<T> outputFactory, TextFeatureExtractor<T> extractor) |
Modifier and Type | Method and Description |
---|---|
protected ConfiguredDataSourceProvenance | cacheProvenance() |
ConfiguredDataSourceProvenance | getProvenance() |
protected Optional<Example<T>> | parseLine(String line, int n) |
void | postConfig() Used by the OLCUT configuration system, and should not be called by external code. |
protected void | read() Reads the data from the Path. |
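
Methods inherited from class TextDataSource: getOutputFactory, handleDoc, iterator, toString
Methods inherited from class java.lang.Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
Methods inherited from interface java.lang.Iterable: forEach, spliterator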

protected ConfiguredDataSourceProvenance provenance

protected SimpleTextDataSource()

public SimpleTextDataSource(Path path, OutputFactory<T> outputFactory, TextFeatureExtractor<T> extractor) throws IOException
Throws: IOException

public SimpleTextDataSource(File file, OutputFactory<T> outputFactory, TextFeatureExtractor<T> extractor) throws IOException
Throws: IOException

protected SimpleTextDataSource(OutputFactory<T> outputFactory, TextFeatureExtractor<T> extractor)

public void postConfig() throws IOException
Throws: IOException

protected void read() throws IOException
Specified by: read in class TextDataSource<T extends Output<T>>
Throws: IOException - if there is any error reading the data.

public ConfiguredDataSourceProvenance getProvenance()

protected ConfiguredDataSourceProvenance cacheProvenance()
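
To make the file-level contract concrete (UTF-8 input, one output and document pair per line, IOException on read failure), here is a minimal standard-library sketch. It does not call Tribuo and is not how read() is implemented; `SimpleTextFileSketch` and `readPairs` are hypothetical names, and skipping lines that lack a `##` separator is an assumption made for the example.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

/** Hypothetical sketch of reading an OUTPUT##Document text file; not Tribuo code. */
final class SimpleTextFileSketch {

    /** Reads the file as UTF-8 and returns one {OUTPUT, document} array per well-formed line. */
    static List<String[]> readPairs(Path path) throws IOException {
        List<String[]> pairs = new ArrayList<>();
        try (BufferedReader reader = Files.newBufferedReader(path, StandardCharsets.UTF_8)) {
            String line;
            int n = 0; // 1-based line counter, useful for diagnostics
            while ((line = reader.readLine()) != null) {
                n++;
                int idx = line.indexOf("##");
                if (idx < 0) {
                    // Assumption of this sketch: report and skip malformed lines.
                    System.err.println("Skipping line " + n + ": no ## separator");
                    continue;
                }
                pairs.add(new String[] {
                    line.substring(0, idx).trim().toUpperCase(Locale.ROOT),
                    line.substring(idx + 2).trim()
                });
            }
        }
        return pairs;
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("simple-text", ".txt");
        try {
            Files.write(tmp,
                Arrays.asList("pos## Great film ", "malformed line", "neg##Dull plot"),
                StandardCharsets.UTF_8);
            for (String[] pair : readPairs(tmp)) {
                System.out.println(pair[0] + " | " + pair[1]);
            }
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

Passing the charset explicitly keeps the sketch independent of the platform default encoding, which matches the requirement that the file be UTF-8.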
Copyright © 2015–2021 Oracle and/or its affiliates. All rights reserved.