Class SequenceExample<T extends Output<T>>

java.lang.Object
org.tribuo.sequence.SequenceExample<T>
All Implemented Interfaces:
Serializable, Iterable<Example<T>>, ProtoSerializable<org.tribuo.protos.core.SequenceExampleProto>

public class SequenceExample<T extends Output<T>> extends Object implements Iterable<Example<T>>, ProtoSerializable<org.tribuo.protos.core.SequenceExampleProto>, Serializable
A sequence of examples, used for sequence classification.
  • Field Details

    • CURRENT_VERSION

      public static final int CURRENT_VERSION
      Protobuf serialization version.
    • DEFAULT_WEIGHT

      public static final float DEFAULT_WEIGHT
      The default sequence example weight.
  • Constructor Details

    • SequenceExample

      public SequenceExample()
      Creates an empty sequence example.
    • SequenceExample

      public SequenceExample(List<Example<T>> examples)
      Creates a sequence example from the list of examples.

      The examples are encapsulated by this constructor, not copied.

      Parameters:
      examples - The examples to incorporate.
    • SequenceExample

      public SequenceExample(List<Example<T>> examples, float weight)
      Creates a sequence example from the list of examples, setting the weight.

      The examples are encapsulated by this constructor, not copied.

      Parameters:
      examples - The examples to incorporate.
      weight - The weight of this sequence.
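
      Encapsulating a list rather than copying it means that later changes the caller makes to that list can show through. The sketch below illustrates the general distinction only. `AliasingSketch` and its methods are hypothetical stand-ins that use plain strings in place of `Example<T>`; they do not call Tribuo, and it is an assumption that the constructor's internal storage behaves like the `encapsulate` case.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in, not Tribuo code: contrasts wrapping a list with copying it. */
class AliasingSketch {
    /** Keeps a reference to the caller's list (no copy is made). */
    static List<String> encapsulate(List<String> items) {
        return items;
    }

    /** Takes a defensive copy of the caller's list. */
    static List<String> defensiveCopy(List<String> items) {
        return new ArrayList<>(items);
    }

    public static void main(String[] args) {
        List<String> source = new ArrayList<>(List.of("a", "b"));
        List<String> wrapped = encapsulate(source);
        List<String> copied = defensiveCopy(source);

        // Mutate the caller's list after handing it over.
        source.add("c");

        System.out.println("wrapped sees " + wrapped.size() + " items");
        System.out.println("copied sees " + copied.size() + " items");
    }
}
```

      If the caller intends to keep modifying the list, passing a fresh copy to the constructor avoids this kind of sharing.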
    • SequenceExample

      public SequenceExample(List<T> outputs, List<? extends List<? extends Feature>> features)
      Creates a sequence example from the supplied outputs and list of list of features.

      The features are copied out by this constructor. The outputs and features lists must be of the same length. Sets the weight to DEFAULT_WEIGHT.

      Parameters:
      outputs - The outputs for each sequence element.
      features - The features for each sequence element.
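
      The same-length requirement exists because element i of outputs is paired with element i of features. A minimal sketch of that pairing follows. `PairingSketch` is a hypothetical stand-in that uses strings for both outputs and features, and throwing `IllegalArgumentException` on a mismatch is an assumption rather than documented behaviour.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in, not Tribuo code: pairs outputs with per-element feature lists. */
class PairingSketch {
    /** Pairs each output with its feature list, rejecting mismatched lengths. */
    static List<String> pair(List<String> outputs, List<? extends List<String>> features) {
        if (outputs.size() != features.size()) {
            // Assumed failure mode; the real constructor may report this differently.
            throw new IllegalArgumentException(
                "outputs.size() = " + outputs.size() + ", features.size() = " + features.size());
        }
        List<String> paired = new ArrayList<>();
        for (int i = 0; i < outputs.size(); i++) {
            // Copy the inner list so later edits to the caller's list are not seen.
            List<String> copiedFeatures = new ArrayList<>(features.get(i));
            paired.add(outputs.get(i) + " <- " + copiedFeatures);
        }
        return paired;
    }

    public static void main(String[] args) {
        List<String> outputs = List.of("DET", "NOUN");
        List<List<String>> features = List.of(List.of("w=the"), List.of("w=cat", "suffix=at"));
        pair(outputs, features).forEach(System.out::println);
    }
}
```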
    • SequenceExample

      public SequenceExample(List<T> outputs, List<? extends List<? extends Feature>> features, float weight)
      Creates a sequence example from the supplied weight, outputs and list of list of features.

      The features are copied out by this constructor. The outputs and features lists must be of the same length.

      Parameters:
      outputs - The outputs for each sequence element.
      features - The features for each sequence element.
      weight - The weight for this sequence example.
    • SequenceExample

      public SequenceExample(List<T> outputs, List<? extends List<? extends Feature>> features, boolean attemptBinaryFeatures)
      Creates a sequence example from the supplied outputs and list of list of features.

      The features are copied out by this constructor. The outputs and features lists must be of the same length. Sets the weight to DEFAULT_WEIGHT.

      Parameters:
      outputs - The outputs for each sequence element.
      features - The features for each sequence element.
      attemptBinaryFeatures - Attempt to use BinaryFeaturesExample as the inner examples.
    • SequenceExample

      public SequenceExample(List<T> outputs, List<? extends List<? extends Feature>> features, float weight, boolean attemptBinaryFeatures)
      Creates a sequence example from the supplied weight, outputs and list of list of features.

      The features are copied out by this constructor. The outputs and features lists must be of the same length.

      Parameters:
      outputs - The outputs for each sequence element.
      features - The features for each sequence element.
      weight - The weight for this sequence example.
      attemptBinaryFeatures - Attempt to use BinaryFeaturesExample as the inner examples.
    • SequenceExample

      public SequenceExample(SequenceExample<T> other)
      Creates a deep copy of the supplied sequence example.
      Parameters:
      other - The sequence example to copy.
  • Method Details

    • deserializeFromProto

      public static SequenceExample<?> deserializeFromProto(int version, String className, com.google.protobuf.Any message) throws com.google.protobuf.InvalidProtocolBufferException
      Deserialization factory.
      Parameters:
      version - The serialized object version.
      className - The class name.
      message - The serialized data.
      Returns:
      The deserialized object.
      Throws:
      com.google.protobuf.InvalidProtocolBufferException - If the protobuf could not be parsed from the message.
    • deserialize

      public static SequenceExample<?> deserialize(org.tribuo.protos.core.SequenceExampleProto e)
      Deserialization shortcut, used to firm up the types.
      Parameters:
      e - The proto to deserialize.
      Returns:
      The sequence example.
    • size

      public int size()
      Returns the number of examples in this sequence.
      Returns:
      The number of examples.
    • removeFeatures

      public void removeFeatures(List<Feature> features)
      Removes the features in the supplied list from each example contained in this sequence.
      Parameters:
      features - The features to remove.
    • get

      public Example<T> get(int i)
      Gets the example found at the specified index.
      Parameters:
      i - The index to look up.
      Returns:
      The Example for index i.
    • validateExample

      public boolean validateExample()
      Checks that each Example in this sequence is valid.
      Returns:
      True if each Example is valid, false otherwise.
    • reduceByName

      public void reduceByName(Merger merger)
      Reduces the features in each example using the supplied Merger.
      Parameters:
      merger - The merger to use in the reduction.
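
      As a rough illustration of a by-name reduction, the sketch below collapses features that share a name into a single value. It reads the method name as "merge features with the same name", which is an assumption. `ReduceByNameSketch` is a hypothetical stand-in, and a `DoubleBinaryOperator` takes the place of Tribuo's `Merger`, whose actual interface is not shown here.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.function.DoubleBinaryOperator;

/** Hypothetical stand-in, not Tribuo code: merges feature values that share a name. */
class ReduceByNameSketch {
    /** Combines the values of equally named features with the supplied operator. */
    static Map<String, Double> reduce(String[] names, double[] values, DoubleBinaryOperator merger) {
        Map<String, Double> merged = new TreeMap<>();
        for (int i = 0; i < names.length; i++) {
            // The first occurrence is stored as-is; later ones are folded into it.
            merged.merge(names[i], values[i], merger::applyAsDouble);
        }
        return merged;
    }

    public static void main(String[] args) {
        String[] names = {"a", "b", "a"};
        double[] values = {1.0, 2.0, 3.0};
        System.out.println(reduce(names, values, Double::sum));
        System.out.println(reduce(names, values, Math::max));
    }
}
```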
    • setWeight

      public void setWeight(float weight)
      Sets the weight of this sequence.
      Parameters:
      weight - The new weight.
    • getWeight

      public float getWeight()
      Gets the weight of this sequence.
      Returns:
      The weight of this sequence.
    • addExample

      public void addExample(Example<T> e)
      Adds an Example to this sequence.
      Parameters:
      e - The example to add.
    • copy

      public SequenceExample<T> copy()
      Returns a deep copy of this SequenceExample.
      Returns:
      A deep copy.
    • iterator

      public Iterator<Example<T>> iterator()
      Specified by:
      iterator in interface Iterable<Example<T>>
    • featureIterator

      public Iterator<Feature> featureIterator()
      Creates an iterator over every feature in this sequence.
      Returns:
      An iterator over features.
    • isDense

      public boolean isDense(FeatureMap fMap)
      Checks whether this sequence example is dense with respect to the supplied feature map.

      A sequence example is "dense" if each example inside it contains all the features in the map, and only those features.

      Parameters:
      fMap - The feature map to check against.
      Returns:
      True if this sequence example contains only the features in the map, and all the features in the map.
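
      The definition of "dense" above amounts to an exact set-equality test per example. A minimal sketch of that test follows. `DenseCheckSketch` is a hypothetical stand-in that represents each example and the feature map as a set of feature names; it does not use `FeatureMap`. Treating an empty sequence as dense is a property of this sketch, not a documented guarantee.

```java
import java.util.List;
import java.util.Set;

/** Hypothetical stand-in, not Tribuo code: density as per-example set equality. */
class DenseCheckSketch {
    /** True only if every example's feature names equal the map's names exactly. */
    static boolean isDense(List<Set<String>> sequence, Set<String> featureMapNames) {
        for (Set<String> exampleNames : sequence) {
            // Fails on a missing feature and on an extra feature alike.
            if (!exampleNames.equals(featureMapNames)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Set<String> mapNames = Set.of("a", "b");
        System.out.println(isDense(List.of(Set.of("a", "b"), Set.of("b", "a")), mapNames));
        System.out.println(isDense(List.of(Set.of("a", "b"), Set.of("a")), mapNames));
    }
}
```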
    • densify

      public void densify(FeatureMap fMap)
      Converts all implicit zeros into explicit zeros based on the supplied feature map.
      Parameters:
      fMap - The feature map to use for densification.
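
      Turning implicit zeros into explicit ones can be pictured as adding a 0.0 entry for every feature in the map that an example does not already carry. The sketch below shows this for a single example. `DensifySketch` is a hypothetical stand-in that uses a name-to-value map, and it returns a new map, whereas the method documented here returns void and so presumably modifies the sequence in place.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

/** Hypothetical stand-in, not Tribuo code: makes implicit zeros explicit. */
class DensifySketch {
    /** Returns a copy of the sparse features with 0.0 added for each absent name. */
    static Map<String, Double> densify(Map<String, Double> sparse, Set<String> featureMapNames) {
        Map<String, Double> dense = new TreeMap<>(sparse);
        for (String name : featureMapNames) {
            // Existing values are kept; only absent features receive an explicit zero.
            dense.putIfAbsent(name, 0.0);
        }
        return dense;
    }

    public static void main(String[] args) {
        Map<String, Double> sparse = Map.of("b", 2.0);
        System.out.println(densify(sparse, Set.of("a", "b", "c")));
    }
}
```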
    • canonicalise

      public void canonicalise(FeatureMap featureMap)
      Reassigns feature name Strings in each Example inside this SequenceExample to point to those in the FeatureMap. This significantly reduces memory allocation. It is called when a SequenceExample is added to a MutableSequenceDataset, and should not be called outside of that context as it may interact unexpectedly with HashedFeatureMap.
      Parameters:
      featureMap - The feature map containing canonical feature names.
    • serialize

      public org.tribuo.protos.core.SequenceExampleProto serialize()
      Description copied from interface: ProtoSerializable
      Serializes this object to a protobuf.
      Specified by:
      serialize in interface ProtoSerializable<org.tribuo.protos.core.SequenceExampleProto>
      Returns:
      The protobuf.
    • equals

      public boolean equals(Object o)
      Overrides:
      equals in class Object
    • hashCode

      public int hashCode()
      Overrides:
      hashCode in class Object
    • createWithEmptyOutputs

      public static <T extends Output<T>> SequenceExample<T> createWithEmptyOutputs(List<? extends List<? extends Feature>> features, OutputFactory<T> outputFactory)
      Creates a SequenceExample using OutputFactory.getUnknownOutput() as the output for each sequence element.

      Note: this method is used to create SequenceExamples at prediction time when there is no ground truth Output.

      Type Parameters:
      T - The type of the Output.
      Parameters:
      features - The features for each sequence element.
      outputFactory - The output factory to use.
      Returns:
      A new SequenceExample.