T - The output type of the examples in the data source.

public class TrainTestSplitter<T extends Output<T>> extends Object

Splits data into training and testing sets. Note that this class does not operate on Dataset, but rather on DataSource.

Modifier and Type | Class and Description |
---|---|
static class | TrainTestSplitter.SplitDataSourceProvenance: Provenance for a split data source. |

Constructor and Description |
---|
TrainTestSplitter(DataSource<T> data): Creates a splitter that splits the data 70/30 into training and testing sets using a default seed. |
TrainTestSplitter(DataSource<T> data, double trainProportion, long seed): Creates a splitter that splits the given data into a training set and a testing set. |
TrainTestSplitter(DataSource<T> data, long seed): Creates a splitter that splits the data 70/30 into training and testing sets using the supplied seed. |

Modifier and Type | Method and Description |
---|---|
DataSource<T> | getTest(): Gets the testing data source. |
DataSource<T> | getTrain(): Gets the training data source. |
int | totalSize(): The total amount of data in train and test combined. |

public TrainTestSplitter(DataSource<T> data)
data - The data to split.

public TrainTestSplitter(DataSource<T> data, long seed)
data - The data to split.
seed - The seed for the RNG.

public TrainTestSplitter(DataSource<T> data, double trainProportion, long seed)
data - The data to split.
trainProportion - The proportion of the data to select for training. This should be a number between 0 and 1. For example, a value of 0.7 means that 70% of the data should be selected for the training set.
seed - The seed for the RNG.

public int totalSize()
public DataSource<T> getTrain()
public DataSource<T> getTest()
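
To make the roles of trainProportion and seed concrete, here is a minimal sketch of a seeded proportional split in plain Java. It is not the TrainTestSplitter implementation: SplitSketch, Split, and split are hypothetical names, the data is an ordinary List and not a DataSource, and shuffling followed by rounding the boundary to the nearest index is an assumption made for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical helper, not part of the documented API.
final class SplitSketch {

    // Holds both halves of a split; field names loosely mirror getTrain() and getTest().
    static final class Split<E> {
        final List<E> train;
        final List<E> test;

        Split(List<E> train, List<E> test) {
            this.train = train;
            this.test = test;
        }

        // Combined size of both halves, analogous to totalSize().
        int totalSize() {
            return train.size() + test.size();
        }
    }

    // Shuffles a copy of the data with a seeded RNG, then cuts it at trainProportion.
    static <E> Split<E> split(List<E> data, double trainProportion, long seed) {
        if (trainProportion < 0.0 || trainProportion > 1.0) {
            throw new IllegalArgumentException("trainProportion must be between 0 and 1");
        }
        List<E> shuffled = new ArrayList<>(data);
        Collections.shuffle(shuffled, new Random(seed));
        // Assumption: the boundary is rounded to the nearest whole example.
        int trainSize = (int) Math.round(trainProportion * shuffled.size());
        return new Split<>(
            new ArrayList<>(shuffled.subList(0, trainSize)),
            new ArrayList<>(shuffled.subList(trainSize, shuffled.size())));
    }

    public static void main(String[] args) {
        List<Integer> data = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            data.add(i);
        }
        Split<Integer> s = split(data, 0.7, 42L);
        // Prints "7 train, 3 test, 10 total"
        System.out.println(s.train.size() + " train, " + s.test.size()
            + " test, " + s.totalSize() + " total");
    }
}
```

In this sketch, reusing the same seed reproduces the same partition, which is the usual reason for exposing a seed parameter on a splitter.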
Copyright © 2015–2021 Oracle and/or its affiliates. All rights reserved.