Class BasicPipeline
java.lang.Object
org.tribuo.data.text.impl.BasicPipeline
All Implemented Interfaces:
com.oracle.labs.mlrg.olcut.config.Configurable, com.oracle.labs.mlrg.olcut.provenance.Provenancable<com.oracle.labs.mlrg.olcut.provenance.ConfiguredObjectProvenance>, TextPipeline
An example implementation of TextPipeline. Generates unique n-grams.
Constructor Summary
Constructors
BasicPipeline(Tokenizer tokenizer, int ngram) - Constructs a basic text pipeline which tokenizes the input and generates word n-gram features in the range 1 to ngram.
Method Summary
Methods
com.oracle.labs.mlrg.olcut.provenance.ConfiguredObjectProvenance getProvenance()
void postConfig() - Used by the OLCUT configuration system, and should not be called by external code.
List<Feature> process(String tag, String data) - Extracts a list of features from the supplied text, prepending the tag to each feature name.
String toString()
Constructor Details
BasicPipeline
Constructs a basic text pipeline which tokenizes the input and generates word n-gram features in the range 1 to ngram.
Parameters:
tokenizer - The tokenizer.
ngram - The size of the n-grams to generate.
Method Details
postConfig
public void postConfig()
Used by the OLCUT configuration system, and should not be called by external code.
Specified by:
postConfig in interface com.oracle.labs.mlrg.olcut.config.Configurable
toString
process
Description copied from interface: TextPipeline
Extracts a list of features from the supplied text, prepending the tag to each feature name.
Specified by:
process in interface TextPipeline
Parameters:
tag - The feature name tag.
data - The text to extract features from.
Returns:
The extracted features.
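To make the documented behaviour concrete, the following is a minimal standalone sketch of what "unique word n-grams in the range 1 to ngram, with the tag prepended" could look like. It is not Tribuo's implementation and does not use Tribuo classes. The class name NgramSketch, the method uniqueNgrams, the whitespace tokenization, and the "tag:token_token" naming scheme are all assumptions made for illustration. The real pipeline delegates tokenization to the supplied Tokenizer and may name its features differently.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; not part of Tribuo.
public class NgramSketch {

    /**
     * Returns the unique word n-gram names for every size from 1 to maxN.
     * The "tag:tok_tok" naming scheme is an assumption, not Tribuo's format.
     */
    public static List<String> uniqueNgrams(String tag, String text, int maxN) {
        String trimmed = text.trim();
        // Whitespace splitting stands in for the Tokenizer given to the constructor.
        String[] tokens = trimmed.isEmpty() ? new String[0] : trimmed.split("\\s+");
        // A LinkedHashSet drops repeated n-grams while keeping first-seen order.
        Set<String> names = new LinkedHashSet<>();
        for (int n = 1; n <= maxN; n++) {
            // Slide a window of n tokens across the token sequence.
            for (int start = 0; start + n <= tokens.length; start++) {
                String joined = String.join("_", Arrays.copyOfRange(tokens, start, start + n));
                names.add(tag + ":" + joined);
            }
        }
        return new ArrayList<>(names);
    }

    public static void main(String[] args) {
        // Prints the unigram and bigram names, one per line.
        for (String name : uniqueNgrams("body", "the cat saw the cat", 2)) {
            System.out.println(name);
        }
    }
}
```

In this sketch, repeated n-grams such as the second "the cat" collapse into a single entry, which mirrors the "unique" wording in the class description. Whether the real pipeline keeps counts or only presence is not stated on this page.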
getProvenance
public com.oracle.labs.mlrg.olcut.provenance.ConfiguredObjectProvenance getProvenance()
Specified by:
getProvenance in interface com.oracle.labs.mlrg.olcut.provenance.Provenancable<com.oracle.labs.mlrg.olcut.provenance.ConfiguredObjectProvenance>