Package org.tribuo.regression.example
Class RegressionDataGenerator
java.lang.Object
org.tribuo.regression.example.RegressionDataGenerator
Generates two example train and test datasets for use in unit tests.
The regressed values are not necessarily linear in the features;
the data is meant to exercise the machinery rather than to measure accuracy.
It can also generate a variety of single-dimensional Gaussian datasets.
-
Field Summary
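Fields
firstDimensionName - Name of the first output dimension.
secondDimensionName - Name of the second output dimension.
thirdDimensionName - Name of the third output dimension.
SINGLE_DIM_NAME - Name of the single dimension.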
Method Summary
Method - Description
denseTrainTest() - Generates a train/test dataset pair which is dense in the features; each example has 4 features, {A,B,C,D}.
denseTrainTest(double negate) - Generates a train/test dataset pair which is dense in the features; each example has 4 features, {A,B,C,D}.
emptyExample() - Generates an example with no features.
emptyMultiDimExample() - Generates an example with no features.
invalidMultiDimSparseExample() - Generates an example with the feature ids 1, 5 and 8, which do not intersect with the ids used elsewhere in this class.
invalidSparseExample() - Generates an example with the feature ids 1, 5 and 8, which do not intersect with the ids used elsewhere in this class.
multiDimDenseTrainTest() - Generates a train/test dataset pair which is dense in the features; each example has 4 features, {A,B,C,D}.
multiDimDenseTrainTest(double negate) - Generates a train/test dataset pair which is dense in the features; each example has 4 features, {A,B,C,D}.
multiDimSparseTrainTest() - Generates a pair of datasets, where the features are sparse, and unknown features appear in the test data.
multiDimSparseTrainTest(double negate) - Generates a pair of datasets, where the features are sparse, and unknown features appear in the test data.
sparseTrainTest() - Generates a pair of datasets, where the features are sparse, and unknown features appear in the test data.
sparseTrainTest(double negate) - Generates a pair of datasets, where the features are sparse, and unknown features appear in the test data.
threeDimDenseTrainTest(double negate, boolean remapIndices) - Generates a train/test dataset pair which is dense in the features; each example has 4 features, {A,B,C,D}.
-
Field Details
-
firstDimensionName
Name of the first output dimension.
-
secondDimensionName
Name of the second output dimension.
-
thirdDimensionName
Name of the third output dimension.
-
SINGLE_DIM_NAME
Name of the single dimension.
-
-
Method Details
-
multiDimDenseTrainTest
public static com.oracle.labs.mlrg.olcut.util.Pair<Dataset<Regressor>,Dataset<Regressor>> multiDimDenseTrainTest()
Generates a train/test dataset pair which is dense in the features; each example has 4 features, {A,B,C,D}.
Returns: A pair of datasets.
-
multiDimDenseTrainTest
public static com.oracle.labs.mlrg.olcut.util.Pair<Dataset<Regressor>,Dataset<Regressor>> multiDimDenseTrainTest(double negate)
Generates a train/test dataset pair which is dense in the features; each example has 4 features, {A,B,C,D}.
Parameters:
negate - Supply -1.0 to negate some features.
Returns: A pair of datasets.
-
threeDimDenseTrainTest
public static com.oracle.labs.mlrg.olcut.util.Pair<Dataset<Regressor>,Dataset<Regressor>> threeDimDenseTrainTest(double negate, boolean remapIndices)
Generates a train/test dataset pair which is dense in the features; each example has 4 features, {A,B,C,D}.
Parameters:
negate - Supply -1.0 to negate some features.
remapIndices - If true, invert the indices of the output features. Warning: this should only be used as part of unit testing; it is not expected from standard datasets.
Returns: A pair of datasets.
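As a rough illustration of what the remapIndices flag might do, the sketch below assigns indices to three output dimensions either in their natural order or in reverse. This is only one reading of "invert the indices", not the library's implementation. The class name OutputIndexSketch, the method indexDimensions, and the placeholder dimension names are hypothetical; the placeholder names are not the values of the firstDimensionName, secondDimensionName, or thirdDimensionName fields.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: a plain map stands in for the output dimension index mapping.
class OutputIndexSketch {

    // Assigns index i to the i-th name, or (n - 1 - i) when remapIndices is true.
    static Map<String, Integer> indexDimensions(List<String> names, boolean remapIndices) {
        Map<String, Integer> ids = new LinkedHashMap<>();
        int n = names.size();
        for (int i = 0; i < n; i++) {
            ids.put(names.get(i), remapIndices ? n - 1 - i : i);
        }
        return ids;
    }

    public static void main(String[] args) {
        // Placeholder names, assumed for this sketch only.
        List<String> names = List.of("first", "second", "third");
        System.out.println(indexDimensions(names, false)); // prints {first=0, second=1, third=2}
        System.out.println(indexDimensions(names, true));  // prints {first=2, second=1, third=0}
    }
}
```

A test that trains on the remapped data can use this kind of mapping to check that code looks up output dimensions by name rather than by assuming a fixed index order.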
-
multiDimSparseTrainTest
public static com.oracle.labs.mlrg.olcut.util.Pair<Dataset<Regressor>,Dataset<Regressor>> multiDimSparseTrainTest()
Generates a pair of datasets, where the features are sparse, and unknown features appear in the test data.
Returns: A pair of datasets.
-
multiDimSparseTrainTest
public static com.oracle.labs.mlrg.olcut.util.Pair<Dataset<Regressor>,Dataset<Regressor>> multiDimSparseTrainTest(double negate)
Generates a pair of datasets, where the features are sparse, and unknown features appear in the test data.
Parameters:
negate - Supply -1.0 to negate some features.
Returns: A pair of datasets.
-
invalidMultiDimSparseExample
Generates an example with the feature ids 1, 5 and 8, which do not intersect with the ids used elsewhere in this class. This should make the example empty at prediction time.
Returns: An example with features {1:1.0,5:5.0,8:8.0}.
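The sketch below illustrates why an example whose feature ids are all unseen ends up empty at prediction time: once the features unknown to the model are dropped, nothing remains. It uses plain Java collections rather than the Tribuo types. The class name UnknownFeatureSketch, the method keepKnown, and the assumption that the model was trained on features named A, B, C and D are all hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Illustrative only: plain maps stand in for examples and feature domains.
class UnknownFeatureSketch {

    // Keeps only the features whose names were observed at training time.
    static Map<String, Double> keepKnown(Map<String, Double> example, Set<String> knownFeatures) {
        Map<String, Double> kept = new LinkedHashMap<>();
        for (Map.Entry<String, Double> e : example.entrySet()) {
            if (knownFeatures.contains(e.getKey())) {
                kept.put(e.getKey(), e.getValue());
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Assumption: the model was trained on features named A, B, C and D.
        Set<String> known = Set.of("A", "B", "C", "D");

        // The feature ids and values documented for the invalid example.
        Map<String, Double> invalid = new LinkedHashMap<>();
        invalid.put("1", 1.0);
        invalid.put("5", 5.0);
        invalid.put("8", 8.0);

        Map<String, Double> usable = keepKnown(invalid, known);
        System.out.println("Usable features: " + usable.size()); // prints "Usable features: 0"
    }
}
```

A unit test can pass such an example to a trained model to check how the model handles input with no usable features.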
-
emptyMultiDimExample
Generates an example with no features.
Returns: An example with no features.
-
denseTrainTest
public static com.oracle.labs.mlrg.olcut.util.Pair<Dataset<Regressor>,Dataset<Regressor>> denseTrainTest()
Generates a train/test dataset pair which is dense in the features; each example has 4 features, {A,B,C,D}.
Returns: A pair of datasets.
-
denseTrainTest
public static com.oracle.labs.mlrg.olcut.util.Pair<Dataset<Regressor>,Dataset<Regressor>> denseTrainTest(double negate)
Generates a train/test dataset pair which is dense in the features; each example has 4 features, {A,B,C,D}.
Parameters:
negate - Supply -1.0 to negate some values in this dataset.
Returns: A pair of datasets.
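The negate parameter can be thought of as a multiplier applied to part of the data: passing -1.0 flips the sign of the affected values. The documentation does not say which values are affected, so the sketch below picks features A and C purely as a hypothetical choice. The class name NegateSketch and the method applyFactor are also hypothetical, and plain Java maps stand in for the Tribuo types.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Illustrative only: shows a sign-flipping multiplier applied to selected values.
class NegateSketch {

    // Multiplies the values of the selected features by the factor; others are copied unchanged.
    static Map<String, Double> applyFactor(Map<String, Double> features, Set<String> selected, double negate) {
        Map<String, Double> out = new LinkedHashMap<>();
        for (Map.Entry<String, Double> e : features.entrySet()) {
            double v = e.getValue();
            out.put(e.getKey(), selected.contains(e.getKey()) ? v * negate : v);
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, Double> features = new LinkedHashMap<>();
        features.put("A", 1.0);
        features.put("B", 0.5);
        features.put("C", 1.0);
        features.put("D", 3.0);

        // Hypothetical choice: only A and C are affected by the factor.
        Set<String> selected = Set.of("A", "C");
        System.out.println(applyFactor(features, selected, -1.0)); // prints {A=-1.0, B=0.5, C=-1.0, D=3.0}
    }
}
```

With a factor of 1.0 the same call returns the values unchanged.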
-
sparseTrainTest
public static com.oracle.labs.mlrg.olcut.util.Pair<Dataset<Regressor>,Dataset<Regressor>> sparseTrainTest()
Generates a pair of datasets, where the features are sparse, and unknown features appear in the test data.
Returns: A pair of datasets.
-
sparseTrainTest
public static com.oracle.labs.mlrg.olcut.util.Pair<Dataset<Regressor>,Dataset<Regressor>> sparseTrainTest(double negate)
Generates a pair of datasets, where the features are sparse, and unknown features appear in the test data.
Parameters:
negate - Supply -1.0 to negate some values in this dataset.
Returns: A pair of datasets.
-
invalidSparseExample
Generates an example with the feature ids 1, 5 and 8, which do not intersect with the ids used elsewhere in this class. This should make the example empty at prediction time.
Returns: An example with features {1:1.0,5:5.0,8:8.0}.
-
emptyExample
Generates an example with no features.
Returns: An example with no features.
-