Class RegressionDataGenerator

java.lang.Object
org.tribuo.regression.example.RegressionDataGenerator

public abstract class RegressionDataGenerator extends Object
Generates two example train and test datasets, used for unit testing. The regressed values are not necessarily linear; the datasets exist to test the machinery rather than accuracy.

Can also generate a variety of single-dimensional Gaussian datasets.
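
As a rough illustration of how these generators might be called from a test, the sketch below unpacks the pair returned by denseTrainTest() (documented under Method Details) and prints some basic sizes. It assumes the Tribuo regression module and OLCUT are on the classpath, and it assumes the first element of the pair is the training set and the second is the test set, which the "train/test" wording suggests but this page does not state outright. The class name GeneratorUsageSketch is hypothetical.

```java
import com.oracle.labs.mlrg.olcut.util.Pair;
import org.tribuo.Dataset;
import org.tribuo.regression.Regressor;
import org.tribuo.regression.example.RegressionDataGenerator;

// Hypothetical class name, used only for this illustration.
public class GeneratorUsageSketch {
    public static void main(String[] args) {
        // Build the dense single-output train/test pair.
        Pair<Dataset<Regressor>, Dataset<Regressor>> pair =
                RegressionDataGenerator.denseTrainTest();

        // Assumption: A is the training set and B is the test set.
        Dataset<Regressor> train = pair.getA();
        Dataset<Regressor> test = pair.getB();

        // Inspect the datasets rather than asserting on specific values,
        // since this page does not list the generated examples.
        System.out.println("train examples: " + train.size());
        System.out.println("test examples:  " + test.size());
        System.out.println("train features: " + train.getFeatureMap().size());
    }
}
```

The other generator methods return the same Pair type, so the same unpacking should apply to them.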

  • Field Details

  • Method Details

    • multiDimDenseTrainTest

      public static com.oracle.labs.mlrg.olcut.util.Pair<Dataset<Regressor>,Dataset<Regressor>> multiDimDenseTrainTest()
      Generates a train/test dataset pair which is dense in the features; each example has 4 features, {A,B,C,D}.
      Returns:
      A pair of datasets.
    • multiDimDenseTrainTest

      public static com.oracle.labs.mlrg.olcut.util.Pair<Dataset<Regressor>,Dataset<Regressor>> multiDimDenseTrainTest(double negate)
      Generates a train/test dataset pair which is dense in the features; each example has 4 features, {A,B,C,D}.
      Parameters:
      negate - Supply -1.0 to negate some features.
      Returns:
      A pair of datasets.
    • threeDimDenseTrainTest

      public static com.oracle.labs.mlrg.olcut.util.Pair<Dataset<Regressor>,Dataset<Regressor>> threeDimDenseTrainTest(double negate, boolean remapIndices)
      Generates a train/test dataset pair which is dense in the features; each example has 4 features, {A,B,C,D}.
      Parameters:
      negate - Supply -1.0 to negate some features.
      remapIndices - If true, invert the indices of the output features. Warning: this should only be used as part of unit testing; it is not expected from standard datasets.
      Returns:
      A pair of datasets.
    • multiDimSparseTrainTest

      public static com.oracle.labs.mlrg.olcut.util.Pair<Dataset<Regressor>,Dataset<Regressor>> multiDimSparseTrainTest()
      Generates a pair of datasets, where the features are sparse, and unknown features appear in the test data.
      Returns:
      A pair of datasets.
    • multiDimSparseTrainTest

      public static com.oracle.labs.mlrg.olcut.util.Pair<Dataset<Regressor>,Dataset<Regressor>> multiDimSparseTrainTest(double negate)
      Generates a pair of datasets, where the features are sparse, and unknown features appear in the test data.
      Parameters:
      negate - Supply -1.0 to negate some features.
      Returns:
      A pair of datasets.
    • invalidMultiDimSparseExample

      public static Example<Regressor> invalidMultiDimSparseExample()
      Generates an example with the feature ids 1,5,8, which do not intersect with the ids used elsewhere in this class. This should make the example empty at prediction time.
      Returns:
      An example with features {1:1.0,5:5.0,8:8.0}.
    • emptyMultiDimExample

      public static Example<Regressor> emptyMultiDimExample()
      Generates an example with no features.
      Returns:
      An example with no features.
    • denseTrainTest

      public static com.oracle.labs.mlrg.olcut.util.Pair<Dataset<Regressor>,Dataset<Regressor>> denseTrainTest()
      Generates a train/test dataset pair which is dense in the features; each example has 4 features, {A,B,C,D}.
      Returns:
      A pair of datasets.
    • denseTrainTest

      public static com.oracle.labs.mlrg.olcut.util.Pair<Dataset<Regressor>,Dataset<Regressor>> denseTrainTest(double negate)
      Generates a train/test dataset pair which is dense in the features; each example has 4 features, {A,B,C,D}.
      Parameters:
      negate - Supply -1.0 to negate some values in this dataset.
      Returns:
      A pair of datasets.
    • sparseTrainTest

      public static com.oracle.labs.mlrg.olcut.util.Pair<Dataset<Regressor>,Dataset<Regressor>> sparseTrainTest()
      Generates a pair of datasets, where the features are sparse, and unknown features appear in the test data.
      Returns:
      A pair of datasets.
    • sparseTrainTest

      public static com.oracle.labs.mlrg.olcut.util.Pair<Dataset<Regressor>,Dataset<Regressor>> sparseTrainTest(double negate)
      Generates a pair of datasets, where the features are sparse, and unknown features appear in the test data.
      Parameters:
      negate - Supply -1.0 to negate some values in this dataset.
      Returns:
      A pair of datasets.
    • invalidSparseExample

      public static Example<Regressor> invalidSparseExample()
      Generates an example with the feature ids 1,5,8, which do not intersect with the ids used elsewhere in this class. This should make the example empty at prediction time.
      Returns:
      An example with features {1:1.0,5:5.0,8:8.0}.
    • emptyExample

      public static Example<Regressor> emptyExample()
      Generates an example with no features.
      Returns:
      An example with no features.
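
The descriptions of invalidMultiDimSparseExample and invalidSparseExample above say that an example whose feature ids do not overlap the training ids should end up empty at prediction time. The sketch below is one way to picture that filtering step using only the Java standard library. It is not Tribuo's implementation: the class UnknownFeatureSketch and the method keepKnown are hypothetical, and the training feature set {A,B,C,D} is an assumption borrowed from the dense generators' descriptions.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in, not part of Tribuo.
final class UnknownFeatureSketch {

    // Keeps only the features whose names were observed at training time.
    static Map<String, Double> keepKnown(Map<String, Double> example,
                                         Set<String> trainingFeatures) {
        Map<String, Double> kept = new LinkedHashMap<>();
        for (Map.Entry<String, Double> e : example.entrySet()) {
            if (trainingFeatures.contains(e.getKey())) {
                kept.put(e.getKey(), e.getValue());
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Assumption: the model was trained on features named A, B, C and D.
        Set<String> trainingFeatures = Set.of("A", "B", "C", "D");

        // Mirrors the documented invalid example {1:1.0,5:5.0,8:8.0}.
        Map<String, Double> invalid = Map.of("1", 1.0, "5", 5.0, "8", 8.0);
        // No id overlaps the training set, so nothing survives the filter.
        System.out.println(keepKnown(invalid, trainingFeatures).isEmpty()); // prints "true"

        // An example with one known and one unknown feature keeps the known one.
        Map<String, Double> partial = Map.of("A", 2.0, "8", 8.0);
        System.out.println(keepKnown(partial, trainingFeatures)); // prints "{A=2.0}"
    }
}
```

After filtering, the invalid example carries no features, which is the same state emptyMultiDimExample and emptyExample produce directly. How a given model reacts to such an example is outside what this page documents.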