Class SequenceDataGenerator

java.lang.Object
org.tribuo.classification.sequence.example.SequenceDataGenerator

public final class SequenceDataGenerator extends Object
A data generator for smoke testing sequence label models.
  • Method Details

    • generateGorillaDataset

      public static MutableSequenceDataset<Label> generateGorillaDataset(int numCopies)
      Generates a simple dataset consisting of numCopies repeats of two sequences.
      Parameters:
      numCopies - The number of times to repeat the two sequence examples.
      Returns:
      The dataset.
    • generateGorillaA

      public static SequenceExample<Label> generateGorillaA()
      Generates a sequence example with a mixture of features and three labels "O", "Status" and "Monkey".
      Returns:
      A sequence example.
    • generateGorillaB

      public static SequenceExample<Label> generateGorillaB()
      Generates a sequence example with a mixture of features and three labels "O", "Status" and "Monkey".
      Returns:
      A sequence example.
    • generateInvalidExample

      public static SequenceExample<Label> generateInvalidExample()
      Generates a sequence example whose features are not used by the training data.
      Returns:
      A SequenceExample which is invalid in the context of the Gorilla example data.
    • generateOtherInvalidExample

      public static SequenceExample<Label> generateOtherInvalidExample()
      Generates a sequence example in which the first example has no features.
      Returns:
      A SequenceExample which is invalid because its first example contains no features.
    • generateEmptyExample

      public static SequenceExample<Label> generateEmptyExample()
      Generates a sequence example that contains no examples.
      Returns:
      A SequenceExample which is invalid as it contains no examples.
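
The three invalid cases above differ in why a model should reject them. The following is a minimal plain-Java sketch of that distinction, not Tribuo code. It assumes a sequence can be reduced to a list of examples, each represented only by its set of feature names. The class GorillaSketch, the Verdict enum, the feature names "A" to "D" and "Z", and the rule that an example is invalid when none of its features were seen in training are all hypothetical choices made for illustration. How a real Tribuo model reacts to each case is not specified here.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in, not Tribuo code: a sequence is a list of examples,
// and each example is reduced to the set of its feature names.
final class GorillaSketch {

    enum Verdict { VALID, NO_EXAMPLES, EXAMPLE_WITHOUT_FEATURES, UNKNOWN_FEATURES_ONLY }

    // Invented feature names standing in for generateGorillaA().
    static List<Set<String>> sequenceA() {
        return List.of(Set.of("A", "B"), Set.of("C"));
    }

    // Invented feature names standing in for generateGorillaB().
    static List<Set<String>> sequenceB() {
        return List.of(Set.of("B"), Set.of("C", "D"));
    }

    // Adds both sequences numCopies times, so the result holds 2 * numCopies sequences.
    static List<List<Set<String>>> generateDataset(int numCopies) {
        List<List<Set<String>>> dataset = new ArrayList<>();
        for (int i = 0; i < numCopies; i++) {
            dataset.add(sequenceA());
            dataset.add(sequenceB());
        }
        return dataset;
    }

    // Collects every feature name that appears anywhere in the dataset.
    static Set<String> featureNames(List<List<Set<String>>> dataset) {
        Set<String> names = new HashSet<>();
        for (List<Set<String>> sequence : dataset) {
            for (Set<String> example : sequence) {
                names.addAll(example);
            }
        }
        return names;
    }

    // Classifies a sequence against the feature names seen in training.
    static Verdict check(List<Set<String>> sequence, Set<String> known) {
        if (sequence.isEmpty()) {
            return Verdict.NO_EXAMPLES;
        }
        for (Set<String> example : sequence) {
            if (example.isEmpty()) {
                return Verdict.EXAMPLE_WITHOUT_FEATURES;
            }
            // Assumed rule: an example sharing no feature with training is invalid.
            if (Collections.disjoint(example, known)) {
                return Verdict.UNKNOWN_FEATURES_ONLY;
            }
        }
        return Verdict.VALID;
    }

    public static void main(String[] args) {
        Set<String> known = featureNames(generateDataset(2));
        System.out.println(check(sequenceA(), known));
        System.out.println(check(List.of(), known));
        System.out.println(check(List.of(Set.<String>of(), Set.of("A")), known));
        System.out.println(check(List.of(Set.of("Z")), known));
    }
}
```

In a smoke test, the same pattern would apply to the real generators: train on the output of generateGorillaDataset, then confirm that the model handles generateInvalidExample, generateOtherInvalidExample and generateEmptyExample in whatever way the model's contract requires.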