Class DefaultFeatureExtractor

java.lang.Object
org.tribuo.classification.sequence.viterbi.DefaultFeatureExtractor
All Implemented Interfaces:
com.oracle.labs.mlrg.olcut.config.Configurable, com.oracle.labs.mlrg.olcut.provenance.Provenancable<com.oracle.labs.mlrg.olcut.provenance.ConfiguredObjectProvenance>, Serializable, LabelFeatureExtractor, ProtoSerializable<org.tribuo.classification.protos.LabelFeatureExtractorProto>

public class DefaultFeatureExtractor extends Object implements LabelFeatureExtractor
A label feature extractor that produces several kinds of label-based features.

The options are: the most recent outcome to include, the least recent outcome to include, and whether to add bigrams, trigrams, and 4-grams of the recent outcomes.
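The following is a minimal, self-contained sketch of how options like these could turn a list of previous labels into feature names. It is not Tribuo's implementation. The class LabelNgramSketch, the feature name formats, the "<pad>" symbol, and the reading of mostRecent and leastRecent as offsets counted back from the end of the list are all assumptions made for illustration. Labels are plain strings here, and the value argument that extractFeatures assigns to each feature is omitted.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative stand-in only, not Tribuo code. Labels are plain strings and a
// feature is reduced to its name; the names and the padding symbol are made up.
class LabelNgramSketch {

    // Assumed placeholder for offsets that reach before the start of the sequence.
    static final String PAD = "<pad>";

    // Returns the label `offset` steps back, where 1 is the immediately preceding label.
    static String back(List<String> previous, int offset) {
        int index = previous.size() - offset;
        return index >= 0 ? previous.get(index) : PAD;
    }

    // Joins the last n labels with underscores, oldest first.
    static String ngram(List<String> previous, int n) {
        List<String> parts = new ArrayList<>();
        for (int offset = n; offset >= 1; offset--) {
            parts.add(back(previous, offset));
        }
        return String.join("_", parts);
    }

    // Builds feature names for the five options listed in the class description.
    static List<String> extract(List<String> previous, int mostRecent, int leastRecent,
                                boolean useBigram, boolean useTrigram, boolean use4gram) {
        List<String> names = new ArrayList<>();
        // One single-label feature per offset from mostRecent to leastRecent inclusive.
        for (int offset = mostRecent; offset <= leastRecent; offset++) {
            names.add("prev_" + offset + "=" + back(previous, offset));
        }
        if (useBigram) {
            names.add("bigram=" + ngram(previous, 2));
        }
        if (useTrigram) {
            names.add("trigram=" + ngram(previous, 3));
        }
        if (use4gram) {
            names.add("4gram=" + ngram(previous, 4));
        }
        return names;
    }
}
```

Under these assumptions, LabelNgramSketch.extract(List.of("B-PER", "I-PER", "O"), 1, 3, true, true, false) returns the names prev_1=O, prev_2=I-PER, prev_3=B-PER, bigram=I-PER_O, and trigram=B-PER_I-PER_O. The actual names and padding behaviour of DefaultFeatureExtractor may differ.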

  • Field Details

    • CURRENT_VERSION

      public static final int CURRENT_VERSION
      Protobuf serialization version.
  • Constructor Details

    • DefaultFeatureExtractor

      public DefaultFeatureExtractor()
      Constructs a default feature extractor for bigrams and trigrams using the past 3 outcomes.
    • DefaultFeatureExtractor

      public DefaultFeatureExtractor(int mostRecentOutcome, int leastRecentOutcome, boolean useBigram, boolean useTrigram, boolean use4gram)
      Constructs a default feature extractor using the supplied parameters.
      Parameters:
      mostRecentOutcome - The most recent outcome to include as a feature.
      leastRecentOutcome - The least recent outcome to include as a feature.
      useBigram - Use bigrams of the outcomes.
      useTrigram - Use trigrams of the outcomes.
      use4gram - Use 4-grams of the outcomes.
  • Method Details

    • deserializeFromProto

      public static DefaultFeatureExtractor deserializeFromProto(int version, String className, com.google.protobuf.Any message) throws com.google.protobuf.InvalidProtocolBufferException
      Deserialization factory.
      Parameters:
      version - The serialized object version.
      className - The class name.
      message - The serialized data.
      Returns:
      The deserialized object.
      Throws:
      com.google.protobuf.InvalidProtocolBufferException - If the protobuf could not be parsed from the message.
    • toString

      public String toString()
      Overrides:
      toString in class Object
    • extractFeatures

      public List<Feature> extractFeatures(List<Label> previousOutcomes, double value)
      Description copied from interface: LabelFeatureExtractor
      Generates features based on the previously produced labels.
      Specified by:
      extractFeatures in interface LabelFeatureExtractor
      Parameters:
      previousOutcomes - The labels produced at the previous steps.
      value - The value to give to the features.
      Returns:
      Features.
    • getProvenance

      public com.oracle.labs.mlrg.olcut.provenance.ConfiguredObjectProvenance getProvenance()
      Specified by:
      getProvenance in interface com.oracle.labs.mlrg.olcut.provenance.Provenancable<com.oracle.labs.mlrg.olcut.provenance.ConfiguredObjectProvenance>
    • serialize

      public org.tribuo.classification.protos.LabelFeatureExtractorProto serialize()
      Description copied from interface: ProtoSerializable
      Serializes this object to a protobuf.
      Specified by:
      serialize in interface ProtoSerializable<org.tribuo.classification.protos.LabelFeatureExtractorProto>
      Returns:
      The protobuf.