Package org.tribuo.data.text.impl
Class SimpleStringDataSource<T extends Output<T>>
java.lang.Object
org.tribuo.data.text.TextDataSource<T>
org.tribuo.data.text.impl.SimpleTextDataSource<T>
org.tribuo.data.text.impl.SimpleStringDataSource<T>
All Implemented Interfaces:
com.oracle.labs.mlrg.olcut.config.Configurable, com.oracle.labs.mlrg.olcut.provenance.Provenancable<DataSourceProvenance>, Iterable<Example<T>>, ConfigurableDataSource<T>, DataSource<T>
A version of SimpleTextDataSource that accepts a List of Strings.
Uses the parsing logic from SimpleTextDataSource.
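To make the idea of parsing an in-memory list of lines concrete, here is a minimal standalone sketch. It is not Tribuo code: the class LineListSketch, the ParsedLine holder, and the readAll helper are hypothetical names used only for illustration. It also assumes the "OUTPUT##text" line layout described in SimpleTextDataSource's own documentation, which this page does not restate, and it keeps the output as a plain String instead of calling an OutputFactory or a TextFeatureExtractor.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

/**
 * Hypothetical stand-in for illustration only; this is not Tribuo code.
 * It mimics reading a list of "OUTPUT##text" lines held in memory.
 */
public class LineListSketch {

    /** A parsed line: the raw output string and the document text. */
    public static final class ParsedLine {
        public final String output;
        public final String text;

        ParsedLine(String output, String text) {
            this.output = output;
            this.text = text;
        }
    }

    /** Parses one line, returning empty for blank or malformed lines. */
    public static Optional<ParsedLine> parseLine(String line) {
        String trimmed = line.trim();
        if (trimmed.isEmpty()) {
            return Optional.empty();
        }
        // Assumed separator: "##" between the output and the text.
        String[] fields = trimmed.split("##");
        if (fields.length != 2) {
            return Optional.empty();
        }
        return Optional.of(new ParsedLine(fields[0].trim(), fields[1].trim()));
    }

    /** Parses every line in the list, skipping the ones that fail to parse. */
    public static List<ParsedLine> readAll(List<String> rawLines) {
        List<ParsedLine> parsed = new ArrayList<>();
        for (String line : rawLines) {
            parseLine(line).ifPresent(parsed::add);
        }
        return parsed;
    }

    public static void main(String[] args) {
        List<String> rawLines = Arrays.asList(
            "positive##I loved this film",
            "",
            "negative##Dull and far too long",
            "no separator here");
        // Only the two well-formed lines are printed.
        for (ParsedLine p : readAll(rawLines)) {
            System.out.println(p.output + " -> " + p.text);
        }
    }
}
```

In the real class, the output string would presumably go to the supplied OutputFactory and the text to the supplied TextFeatureExtractor; the sketch stops at the split step so that it has no dependencies.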
-
Nested Class Summary
Nested classes/interfaces inherited from class org.tribuo.data.text.impl.SimpleTextDataSource
SimpleTextDataSource.SimpleTextDataSourceProvenance
-
Field Summary
Field | Description
rawLines | Used because OLCUT doesn't support generic Iterables.
Fields inherited from class org.tribuo.data.text.impl.SimpleTextDataSource
provenance
Fields inherited from class org.tribuo.data.text.TextDataSource
data, extractor, outputFactory, path, preprocessors
-
Constructor Summary
Constructor | Description
SimpleStringDataSource(List<String> rawLines, OutputFactory<T> outputFactory, TextFeatureExtractor<T> extractor) | Constructs a simple string data source from the supplied lines.
-
Method Summary
Modifier and Type | Method | Description
protected ConfiguredDataSourceProvenance | cacheProvenance() |
void | postConfig() | Used by the OLCUT configuration system, and should not be called by external code.
protected void | read() | Reads the data from the Path.
String | toString() |
Methods inherited from class org.tribuo.data.text.impl.SimpleTextDataSource
getProvenance, parseLine
Methods inherited from class org.tribuo.data.text.TextDataSource
getOutputFactory, handleDoc, iterator
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
Methods inherited from interface java.lang.Iterable
forEach, spliterator
-
Field Details
-
rawLines
Used because OLCUT doesn't support generic Iterables.
-
-
Constructor Details
-
SimpleStringDataSource
public SimpleStringDataSource(List<String> rawLines, OutputFactory<T> outputFactory, TextFeatureExtractor<T> extractor)
Constructs a simple string data source from the supplied lines.
Parameters:
rawLines - The lines to parse.
outputFactory - The output factory.
extractor - The feature extractor.
-
-
Method Details
-
postConfig
public void postConfig()
Used by the OLCUT configuration system, and should not be called by external code.
Specified by:
postConfig in interface com.oracle.labs.mlrg.olcut.config.Configurable
Overrides:
postConfig in class SimpleTextDataSource<T extends Output<T>>
-
toString
Overrides:
toString in class TextDataSource<T extends Output<T>>
-
read
protected void read()
Description copied from class: TextDataSource
Reads the data from the Path.
Overrides:
read in class SimpleTextDataSource<T extends Output<T>>
-
cacheProvenance
Overrides:
cacheProvenance in class SimpleTextDataSource<T extends Output<T>>
-