Uses of Class
org.tribuo.Dataset
Packages that use Dataset
Package
Description
Provides the core interfaces and classes for using Tribuo.
Provides an anomaly data generator used for testing implementations.
Provides an interface to LibSVM for anomaly detection problems.
Provides simple baseline multiclass classifiers.
Provides implementations of decision trees for classification problems.
Provides internal implementation classes for classification decision trees.
Provides majority vote ensemble combiners for classification along with an implementation of multiclass Adaboost.
Provides a multiclass data generator used for testing implementations.
Provides a set of main methods for interacting with classification tasks.
Provides an interface to LibLinear-java for classification problems.
Provides an interface to LibSVM for classification problems.
Provides an implementation of multinomial naive Bayes (i.e., naive Bayes for non-negative count data).
Provides an SGD implementation of a Kernel SVM using the Pegasos algorithm.
Provides an implementation of a classification linear model using Stochastic Gradient Descent.
Provides an interface to XGBoost for classification problems.
Provides a clustering data generator used for testing implementations.
Provides a multithreaded implementation of K-Means, with a configurable distance function.
Provides base classes for using liblinear from Tribuo.
The base interface to LibSVM.
Provides a K-Nearest Neighbours implementation which works across all Tribuo Output types.
Provides common functionality for building decision trees, irrespective of the predicted Output.
Provides abstract classes for interfacing with XGBoost abstracting away all the Output dependent parts.
Provides classes for loading in data from disk, processing it into examples, and splitting datasets for things like cross-validation and train-test splits.
Provides classes which can load columnar data (using a RowProcessor) from a CSV (or other character delimited format) file.
Provides utility datasets which subsample or otherwise transform the wrapped dataset.
Simple data sources for ingesting or aggregating data.
Provides an interface for model prediction combinations, two base classes for ensemble models, a base class for ensemble excuses, and a Bagging implementation.
Evaluation base classes, along with code for train/test splits and cross validation.
This package contains the infrastructure classes for building evaluation metrics.
Provides the base interface and implementations of the Model hashing which obscures the feature names stored in a model.
Provides an interface to TensorFlow, allowing the training of non-sequential models using any supported Tribuo output type.
Provides a linear algebra system used for numerical operations in Tribuo.
Provides a multi-label data generator for testing implementations.
Provides Tribuo specific infrastructure for the Provenance system which tracks models and datasets.
Provides simple baseline regression predictors.
Provides some example regression data generators for testing implementations.
Provides an interface to liblinear for regression problems.
Provides an interface to LibSVM for regression problems.
Provides an implementation of decision trees for regression problems.
Provides internal implementation classes for the regression trees.
Provides an implementation of linear regression using Stochastic Gradient Descent.
Provides implementations of sparse linear regression using various forms of regularisation penalty.
Provides an interface to XGBoost for regression problems.
Provides core classes for working with sequences of Examples.
Provides infrastructure for applying transformations to a Dataset.
Uses of Dataset in org.tribuo
Subclasses of Dataset in org.tribuo
| Modifier and Type | Class | Description |
| class | ImmutableDataset<T extends Output<T>> | This is a Dataset which has an ImmutableFeatureMap to store the feature information. |
| class | MutableDataset<T extends Output<T>> | A MutableDataset is a Dataset with a MutableFeatureMap which grows over time. |
Methods in org.tribuo with parameters of type Dataset
| Modifier and Type | Method | Description |
| static <T extends Output<T>> ImmutableDataset<T> | ImmutableDataset.copyDataset(Dataset<T> dataset) | Creates an immutable deep copy of the supplied dataset. |
| static <T extends Output<T>> ImmutableDataset<T> | ImmutableDataset.copyDataset(Dataset<T> dataset, ImmutableFeatureMap featureIDMap, ImmutableOutputInfo<T> outputIDInfo) | Creates an immutable deep copy of the supplied dataset, using a different feature and output map. |
| static <T extends Output<T>> ImmutableDataset<T> | ImmutableDataset.copyDataset(Dataset<T> dataset, ImmutableFeatureMap featureIDMap, ImmutableOutputInfo<T> outputIDInfo, Merger merger) | Creates an immutable deep copy of the supplied dataset. |
| static <T extends Output<T>> MutableDataset<T> | MutableDataset.createDeepCopy(Dataset<T> other) | Creates a deep copy of the supplied Dataset which is mutable. |
| static <T extends Output<T>> ImmutableDataset<T> | ImmutableDataset.hashFeatureMap(Dataset<T> dataset, Hasher hasher) | Creates an immutable shallow copy of the supplied dataset, using the hasher to generate a HashedFeatureMap which transparently maps from the feature name to the hashed variant. |
|  | IncrementalTrainer.incrementalTrain(Dataset<T> newData, U model) | Incrementally trains the supplied model with the new data. |
| List<Prediction<T>> |  | Uses the model to predict the outputs for multiple examples contained in a data set. |
| default SparseModel<T> |  | Trains a sparse predictive model using the examples in the given data set. |
|  | SparseTrainer.train(Dataset<T> examples, Map<String, com.oracle.labs.mlrg.olcut.provenance.Provenance> runProvenance) | Trains a sparse predictive model using the examples in the given data set. |
|  |  | Trains a predictive model using the examples in the given data set. |
|  | Trainer.train(Dataset<T> examples, Map<String, com.oracle.labs.mlrg.olcut.provenance.Provenance> runProvenance) | Trains a predictive model using the examples in the given data set. |
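The name-to-hash mapping described for ImmutableDataset.hashFeatureMap can be pictured with a small standalone sketch. It does not use Tribuo's Hasher or HashedFeatureMap; the class FeatureHashSketch, the salt handling, and the choice of SHA-256 with Base64 encoding are all assumptions made for illustration, not a description of what Tribuo does internally.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;
import java.util.Base64;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; this is not Tribuo's Hasher or HashedFeatureMap.
final class FeatureHashSketch {

    // Hashes one feature name with a salt (assumed scheme: SHA-256, Base64 encoded).
    static String hash(String salt, String featureName) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            byte[] bytes = digest.digest((salt + featureName).getBytes(StandardCharsets.UTF_8));
            return Base64.getEncoder().encodeToString(bytes);
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 must be available on every Java platform, so this is not expected.
            throw new IllegalStateException(e);
        }
    }

    // Builds a lookup from each original feature name to its hashed variant.
    static Map<String, String> hashAll(String salt, Collection<String> featureNames) {
        Map<String, String> mapping = new LinkedHashMap<>();
        for (String name : featureNames) {
            mapping.put(name, hash(salt, name));
        }
        return mapping;
    }

    public static void main(String[] args) {
        Map<String, String> mapping = hashAll("example-salt", Arrays.asList("A", "B"));
        mapping.forEach((name, hashed) -> System.out.println(name + " -> " + hashed));
    }
}
```

A model that stores only the hashed keys can still be queried by hashing each incoming feature name with the same salt, which is the sense in which such a mapping is transparent to the caller.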
Uses of Dataset in org.tribuo.anomaly.example
Methods in org.tribuo.anomaly.example that return types with arguments of type Dataset
| Modifier and Type | Method | Description |
|  | AnomalyDataGenerator.denseTrainTest() | Makes a simple dataset for training and testing. |
|  | AnomalyDataGenerator.denseTrainTest(double negate) | Generates a train/test dataset pair which is dense in the features, each example has 4 features, {A,B,C,D}, and there are 4 clusters, {0,1,2,3}. |
|  | AnomalyDataGenerator.gaussianAnomaly() | Generates two datasets, one without anomalies drawn from a single gaussian and the second drawn from a mixture of two gaussians, with the second tagged anomalous. |
|  | AnomalyDataGenerator.gaussianAnomaly(long size, double fractionAnomalous) | Generates two datasets, one without anomalies drawn from a single gaussian and the second drawn from a mixture of two gaussians, with the second tagged anomalous. |
|  | AnomalyDataGenerator.sparseTrainTest() | Makes a simple dataset for training and testing. |
|  | AnomalyDataGenerator.sparseTrainTest(double negate) | Generates a pair of datasets, where the features are sparse, and unknown features appear in the test data. |
Uses of Dataset in org.tribuo.anomaly.libsvm
Methods in org.tribuo.anomaly.libsvm with parameters of type Dataset
| Modifier and Type | Method | Description |
| protected com.oracle.labs.mlrg.olcut.util.Pair<libsvm.svm_node[][], double[][]> | LibSVMAnomalyTrainer.extractData(Dataset<Event> data, ImmutableOutputInfo<Event> outputInfo, ImmutableFeatureMap featureMap) |  |
|  | LibSVMAnomalyTrainer.train(Dataset<Event> dataset, Map<String, com.oracle.labs.mlrg.olcut.provenance.Provenance> instanceProvenance) |  |
Uses of Dataset in org.tribuo.classification.baseline
Methods in org.tribuo.classification.baseline with parameters of type Dataset
Uses of Dataset in org.tribuo.classification.dtree
Methods in org.tribuo.classification.dtree with parameters of type Dataset
| Modifier and Type | Method | Description |
| protected AbstractTrainingNode<Label> | CARTClassificationTrainer.mkTrainingNode(Dataset<Label> examples) |  |
Uses of Dataset in org.tribuo.classification.dtree.impl
Constructors in org.tribuo.classification.dtree.impl with parameters of type Dataset
| Modifier | Constructor | Description |
|  | ClassifierTrainingNode(LabelImpurity impurity, Dataset<Label> examples) | Constructor which creates the inverted file. |
Uses of Dataset in org.tribuo.classification.ensemble
Methods in org.tribuo.classification.ensemble with parameters of type Dataset
Uses of Dataset in org.tribuo.classification.example
Methods in org.tribuo.classification.example that return types with arguments of type Dataset
| Modifier and Type | Method | Description |
|  | LabelledDataGenerator.binarySparseTrainTest() |  |
|  | LabelledDataGenerator.binarySparseTrainTest(double negate) | Generates a pair of datasets with sparse features and unknown features in the test data. |
|  | LabelledDataGenerator.denseTrainTest() |  |
|  | LabelledDataGenerator.denseTrainTest(double negate) | Generates a train/test dataset pair which is dense in the features, each example has 4 features, {A,B,C,D}, and there are 4 classes, {Foo,Bar,Baz,Quux}. |
|  | LabelledDataGenerator.sparseTrainTest() |  |
|  | LabelledDataGenerator.sparseTrainTest(double negate) | Generates a pair of datasets, where the features are sparse, and unknown features appear in the test data. |
Uses of Dataset in org.tribuo.classification.experiments
Methods in org.tribuo.classification.experiments that return types with arguments of type Dataset
Uses of Dataset in org.tribuo.classification.liblinear
Methods in org.tribuo.classification.liblinear with parameters of type Dataset
| Modifier and Type | Method | Description |
| protected com.oracle.labs.mlrg.olcut.util.Pair<de.bwaldvogel.liblinear.FeatureNode[][], double[][]> | LibLinearClassificationTrainer.extractData(Dataset<Label> data, ImmutableOutputInfo<Label> outputInfo, ImmutableFeatureMap featureMap) |  |
Uses of Dataset in org.tribuo.classification.libsvm
Methods in org.tribuo.classification.libsvm with parameters of type Dataset
| Modifier and Type | Method | Description |
| protected com.oracle.labs.mlrg.olcut.util.Pair<libsvm.svm_node[][], double[][]> | LibSVMClassificationTrainer.extractData(Dataset<Label> data, ImmutableOutputInfo<Label> outputInfo, ImmutableFeatureMap featureMap) |  |
Uses of Dataset in org.tribuo.classification.mnb
Methods in org.tribuo.classification.mnb with parameters of type Dataset
Uses of Dataset in org.tribuo.classification.sgd.kernel
Methods in org.tribuo.classification.sgd.kernel with parameters of type Dataset
Uses of Dataset in org.tribuo.classification.sgd.linear
Methods in org.tribuo.classification.sgd.linear with parameters of type Dataset
Uses of Dataset in org.tribuo.classification.xgboost
Methods in org.tribuo.classification.xgboost with parameters of type Dataset
Uses of Dataset in org.tribuo.clustering.example
Methods in org.tribuo.clustering.example that return Dataset
| Modifier and Type | Method | Description |
|  | ClusteringDataGenerator.gaussianClusters(long size, long seed) | Generates a dataset drawn from a mixture of 5 2d gaussians. |
Methods in org.tribuo.clustering.example that return types with arguments of type Dataset
| Modifier and Type | Method | Description |
|  | ClusteringDataGenerator.denseTrainTest() |  |
|  | ClusteringDataGenerator.denseTrainTest(double negate) | Generates a train/test dataset pair which is dense in the features, each example has 4 features, {A,B,C,D}, and there are 4 clusters, {0,1,2,3}. |
|  | ClusteringDataGenerator.sparseTrainTest() |  |
|  | ClusteringDataGenerator.sparseTrainTest(double negate) | Generates a pair of datasets, where the features are sparse, and unknown features appear in the test data. |
Uses of Dataset in org.tribuo.clustering.kmeans
Methods in org.tribuo.clustering.kmeans with parameters of type Dataset
| Modifier and Type | Method | Description |
| protected static DenseVector[] | KMeansTrainer.initialiseCentroids(int centroids, Dataset<ClusterID> examples, ImmutableFeatureMap featureMap, SplittableRandom rng) | Initialisation method called at the start of each train call. |
|  | KMeansTrainer.train(Dataset<ClusterID> examples, Map<String, com.oracle.labs.mlrg.olcut.provenance.Provenance> runProvenance) |  |
Uses of Dataset in org.tribuo.common.liblinear
Methods in org.tribuo.common.liblinear with parameters of type Dataset
| Modifier and Type | Method | Description |
| protected abstract com.oracle.labs.mlrg.olcut.util.Pair<de.bwaldvogel.liblinear.FeatureNode[][], double[][]> | LibLinearTrainer.extractData(Dataset<T> data, ImmutableOutputInfo<T> outputInfo, ImmutableFeatureMap featureMap) | Extracts the features and Outputs in LibLinear's format. |
|  | LibLinearTrainer.train(Dataset<T> examples, Map<String, com.oracle.labs.mlrg.olcut.provenance.Provenance> runProvenance) |  |
Uses of Dataset in org.tribuo.common.libsvm
Methods in org.tribuo.common.libsvm with parameters of type Dataset
| Modifier and Type | Method | Description |
| protected abstract com.oracle.labs.mlrg.olcut.util.Pair<libsvm.svm_node[][], double[][]> | LibSVMTrainer.extractData(Dataset<T> data, ImmutableOutputInfo<T> outputInfo, ImmutableFeatureMap featureMap) | Extracts the features and Outputs in LibSVM's format. |
|  | LibSVMTrainer.train(Dataset<T> examples, Map<String, com.oracle.labs.mlrg.olcut.provenance.Provenance> runProvenance) |  |
Uses of Dataset in org.tribuo.common.nearest
Methods in org.tribuo.common.nearest with parameters of type Dataset
Uses of Dataset in org.tribuo.common.tree
Methods in org.tribuo.common.tree with parameters of type Dataset
| Modifier and Type | Method | Description |
| protected abstract AbstractTrainingNode<T> | AbstractCARTTrainer.mkTrainingNode(Dataset<T> examples) |  |
|  | AbstractCARTTrainer.train(Dataset<T> examples, Map<String, com.oracle.labs.mlrg.olcut.provenance.Provenance> runProvenance) |  |
Uses of Dataset in org.tribuo.common.xgboost
Methods in org.tribuo.common.xgboost with parameters of type Dataset
| Modifier and Type | Method | Description |
| protected static <T extends Output<T>> XGBoostTrainer.DMatrixTuple<T> | XGBoostTrainer.convertDataset(Dataset<T> examples) |  |
| protected static <T extends Output<T>> XGBoostTrainer.DMatrixTuple<T> | XGBoostTrainer.convertDataset(Dataset<T> examples, Function<T, Float> responseExtractor) |  |
| List<Prediction<T>> |  | Uses the model to predict the labels for multiple examples contained in a data set. |
Uses of Dataset in org.tribuo.data
Methods in org.tribuo.data that return types with arguments of type Dataset
| Modifier and Type | Method | Description |
|  | DataOptions.load(OutputFactory<T> outputFactory) |  |
Uses of Dataset in org.tribuo.data.csv
Methods in org.tribuo.data.csv with parameters of type Dataset
Uses of Dataset in org.tribuo.dataset
Subclasses of Dataset in org.tribuo.dataset
| Modifier and Type | Class | Description |
| final class | DatasetView<T extends Output<T>> | DatasetView provides an immutable view on another Dataset that only exposes selected examples. |
| class | MinimumCardinalityDataset<T extends Output<T>> | This class creates a pruned dataset in which low frequency features that occur less than the provided minimum cardinality have been removed. |
Methods in org.tribuo.dataset with parameters of type Dataset
| Modifier and Type | Method | Description |
| static <T extends Output<T>> DatasetView<T> | DatasetView.createBootstrapView(Dataset<T> dataset, int size, long seed) | Generates a DatasetView bootstrapped from the supplied Dataset. |
| static <T extends Output<T>> DatasetView<T> | DatasetView.createBootstrapView(Dataset<T> dataset, int size, long seed, ImmutableFeatureMap featureIDs, ImmutableOutputInfo<T> outputIDs) | Generates a DatasetView bootstrapped from the supplied Dataset. |
| static <T extends Output<T>> DatasetView<T> | DatasetView.createView(Dataset<T> dataset, Predicate<Example<T>> predicate, String tag) | Creates a view from the supplied dataset, using the specified predicate to test if each example should be in this view. |
| static <T extends Output<T>> DatasetView<T> | DatasetView.createWeightedBootstrapView(Dataset<T> dataset, int size, long seed, float[] exampleWeights) | Generates a DatasetView bootstrapped from the supplied Dataset using the supplied example weights. |
| static <T extends Output<T>> DatasetView<T> | DatasetView.createWeightedBootstrapView(Dataset<T> dataset, int size, long seed, float[] exampleWeights, ImmutableFeatureMap featureIDs, ImmutableOutputInfo<T> outputIDs) | Generates a DatasetView bootstrapped from the supplied Dataset using the supplied example weights. |
Constructors in org.tribuo.dataset with parameters of type Dataset
| Modifier | Constructor | Description |
|  | DatasetView(Dataset<T> dataset, int[] exampleIndices, String tag) | Creates a DatasetView which includes the supplied indices from the dataset. |
|  | DatasetView(Dataset<T> dataset, int[] exampleIndices, ImmutableFeatureMap featureIDs, ImmutableOutputInfo<T> labelIDs, String tag) | Creates a DatasetView which includes the supplied indices from the dataset. |
|  | MinimumCardinalityDataset(Dataset<T> dataset, int minCardinality) |  |
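The DatasetView constructors take an int[] of example indices, and createBootstrapView takes a size and a seed, so a bootstrapped view can be thought of as an index array sampled with replacement. The sketch below shows only that index-sampling idea in plain Java. BootstrapIndexSketch and its method are hypothetical names, and the use of SplittableRandom with uniform sampling is an assumption rather than a statement of how Tribuo draws its indices.

```java
import java.util.Arrays;
import java.util.SplittableRandom;

// Hypothetical illustration only; not part of Tribuo.
final class BootstrapIndexSketch {

    // Draws `size` indices uniformly, with replacement, from [0, datasetSize).
    static int[] bootstrapIndices(int datasetSize, int size, long seed) {
        if (datasetSize <= 0 || size < 0) {
            throw new IllegalArgumentException("datasetSize must be positive and size non-negative");
        }
        SplittableRandom rng = new SplittableRandom(seed);
        int[] indices = new int[size];
        for (int i = 0; i < size; i++) {
            indices[i] = rng.nextInt(datasetSize);
        }
        return indices;
    }

    public static void main(String[] args) {
        // The same seed always yields the same index array.
        System.out.println(Arrays.toString(bootstrapIndices(10, 5, 42L)));
    }
}
```

Because the output is only an index array, the underlying examples are never copied, which matches the idea of a view that exposes selected examples of another dataset.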
Uses of Dataset in org.tribuo.datasource
Methods in org.tribuo.datasource with parameters of type Dataset
| Modifier and Type | Method | Description |
| static <T extends Output<T>> void | LibSVMDataSource.writeLibSVMFormat(Dataset<T> dataset, PrintStream out, boolean zeroIndexed, Function<T, Number> transformationFunc) | Writes out a dataset in LibSVM format. |
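For readers unfamiliar with the LibSVM text format, each example is one line of the form "label index:value index:value ...", with feature indices in ascending order. The following plain-Java sketch formats a single such line. LibSVMLineSketch is a hypothetical helper, and treating zeroIndexed as a switch between 0-based and 1-based feature indices is an assumption about what that parameter means in writeLibSVMFormat.

```java
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical illustration only; not Tribuo's LibSVMDataSource.
final class LibSVMLineSketch {

    // Formats one example; `features` maps zero-based feature ids to values.
    static String formatLine(Number label, SortedMap<Integer, Double> features, boolean zeroIndexed) {
        StringBuilder sb = new StringBuilder();
        sb.append(label);
        // Assumed meaning of zeroIndexed: shift ids by one when the output is 1-based.
        int offset = zeroIndexed ? 0 : 1;
        for (Map.Entry<Integer, Double> e : features.entrySet()) {
            sb.append(' ').append(e.getKey() + offset).append(':').append(e.getValue());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        SortedMap<Integer, Double> features = new TreeMap<>();
        features.put(0, 0.5);
        features.put(3, 2.0);
        System.out.println(formatLine(1, features, false));
        System.out.println(formatLine(1, features, true));
    }
}
```

Only the non-zero features are written, which is what makes the format suitable for sparse data.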
Uses of Dataset in org.tribuo.ensemble
Methods in org.tribuo.ensemble with parameters of type Dataset
| Modifier and Type | Method | Description |
|  | BaggingTrainer.train(Dataset<T> examples, Map<String, com.oracle.labs.mlrg.olcut.provenance.Provenance> runProvenance) |  |
|  | BaggingTrainer.trainSingleModel(Dataset<T> examples, ImmutableFeatureMap featureIDs, ImmutableOutputInfo<T> labelIDs, SplittableRandom localRNG, Map<String, com.oracle.labs.mlrg.olcut.provenance.Provenance> runProvenance) |  |
Uses of Dataset in org.tribuo.evaluation
Methods in org.tribuo.evaluation with parameters of type Dataset
| Modifier and Type | Method | Description |
| static <T extends Output<T>, C extends MetricContext<T>> com.oracle.labs.mlrg.olcut.util.Pair<Integer, Double> | EvaluationAggregator.argmax(EvaluationMetric<T, C> metric, List<? extends Model<T>> models, Dataset<T> dataset) | Calculates the argmax of a metric across the supplied models (i.e., the index of the model which performed the best). |
| final E |  | Produces an evaluation for the supplied model and dataset, by calling Model.predict(org.tribuo.Example<T>) to create the predictions, then aggregating the appropriate statistics. |
|  |  | Evaluates the dataset using the supplied model, returning an immutable Evaluation of the appropriate type. |
|  |  | Splits a dataset into k consecutive folds; for each fold, the remaining k-1 folds form the training set. |
| static <T extends Output<T>, C extends MetricContext<T>> DescriptiveStats | EvaluationAggregator.summarize(List<? extends EvaluationMetric<T, C>> metrics, Model<T> model, Dataset<T> dataset) | Summarize model performance on dataset across several metrics. |
| static <T extends Output<T>, R extends Evaluation<T>> Map<MetricID<T>, DescriptiveStats> | EvaluationAggregator.summarize(Evaluator<T, R> evaluator, List<? extends Model<T>> models, Dataset<T> dataset) | Summarize performance using the supplied evaluator across several models on one dataset. |
| static <T extends Output<T>, C extends MetricContext<T>> DescriptiveStats | EvaluationAggregator.summarize(EvaluationMetric<T, C> metric, List<? extends Model<T>> models, Dataset<T> dataset) | Summarize performance w.r.t. |
Method parameters in org.tribuo.evaluation with type arguments of type Dataset
| Modifier and Type | Method | Description |
| static <T extends Output<T>, C extends MetricContext<T>> com.oracle.labs.mlrg.olcut.util.Pair<Integer, Double> | EvaluationAggregator.argmax(EvaluationMetric<T, C> metric, Model<T> model, List<? extends Dataset<T>> datasets) | Calculates the argmax of a metric across the supplied datasets. |
| static <T extends Output<T>, R extends Evaluation<T>> Map<MetricID<T>, DescriptiveStats> | EvaluationAggregator.summarize(Evaluator<T, R> evaluator, Model<T> model, List<? extends Dataset<T>> datasets) | Summarize performance according to evaluator for a single model across several datasets. |
| static <T extends Output<T>, C extends MetricContext<T>> DescriptiveStats | EvaluationAggregator.summarize(EvaluationMetric<T, C> metric, Model<T> model, List<? extends Dataset<T>> datasets) | Summarize a model's performance w.r.t. |
Constructors in org.tribuo.evaluation with parameters of type Dataset
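The description "splits a dataset into k consecutive folds; for each fold, the remaining k-1 folds form the training set" can be made concrete with index arithmetic alone. The sketch below is plain Java and independent of Tribuo. KFoldSketch and Split are hypothetical names, and giving the first (n mod k) folds one extra example is an assumption, since the text above does not say how a remainder is distributed or whether examples are shuffled first.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not Tribuo's splitter.
final class KFoldSketch {

    // One train/test split expressed as example indices.
    static final class Split {
        final int[] train;
        final int[] test;

        Split(int[] train, int[] test) {
            this.train = train;
            this.test = test;
        }
    }

    // Splits indices 0..numExamples-1 into k consecutive folds (no shuffling).
    static List<Split> split(int numExamples, int k) {
        if (k < 2 || k > numExamples) {
            throw new IllegalArgumentException("k must be in [2, numExamples]");
        }
        List<Split> splits = new ArrayList<>();
        int base = numExamples / k;
        int remainder = numExamples % k;
        int start = 0;
        for (int fold = 0; fold < k; fold++) {
            // Assumption: the first `remainder` folds take one extra example.
            int foldSize = base + (fold < remainder ? 1 : 0);
            int end = start + foldSize;
            int[] test = new int[foldSize];
            int[] train = new int[numExamples - foldSize];
            int t = 0;
            for (int i = 0; i < numExamples; i++) {
                if (i >= start && i < end) {
                    test[i - start] = i;   // held-out fold
                } else {
                    train[t++] = i;        // the remaining k-1 folds
                }
            }
            splits.add(new Split(train, test));
            start = end;
        }
        return splits;
    }

    public static void main(String[] args) {
        for (Split s : split(10, 3)) {
            System.out.println("test=" + Arrays.toString(s.test) + " train=" + Arrays.toString(s.train));
        }
    }
}
```

Each index array could then be used to select the corresponding examples for one round of training and evaluation.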
Uses of Dataset in org.tribuo.evaluation.metrics
Methods in org.tribuo.evaluation.metrics with parameters of type Dataset
| Modifier and Type | Method | Description |
| default C | EvaluationMetric.createContext(Model<T> model, Dataset<T> dataset) | Creates the metric context used to compute this metric's value, generating Predictions for each Example in the supplied dataset. |
Uses of Dataset in org.tribuo.hash
Methods in org.tribuo.hash with parameters of type Dataset
Uses of Dataset in org.tribuo.interop.tensorflow
Methods in org.tribuo.interop.tensorflow that return types with arguments of type Dataset
| Modifier and Type | Method | Description |
|  | TrainTest.load(Path trainingPath, Path testingPath, OutputFactory<Label> outputFactory) |  |
Methods in org.tribuo.interop.tensorflow with parameters of type Dataset
Uses of Dataset in org.tribuo.math.la
Methods in org.tribuo.math.la with parameters of type Dataset
| Modifier and Type | Method | Description |
| static <T extends Output<T>> SparseVector[] |  | Converts a dataset of row-major examples into an array of column-major sparse vectors. |
| static <T extends Output<T>> SparseVector[] | SparseVector.transpose(Dataset<T> dataset, ImmutableFeatureMap fMap) | Converts a dataset of row-major examples into an array of column-major sparse vectors. |
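Converting row-major examples into column-major sparse vectors means regrouping the same non-zero values by feature instead of by example. The sketch below shows that regrouping with ordinary maps. SparseTransposeSketch is a hypothetical name, and representing a sparse vector as a TreeMap from index to value is an assumption made for illustration; it is not Tribuo's SparseVector.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; not Tribuo's SparseVector.transpose.
final class SparseTransposeSketch {

    // rows.get(i) maps feature id -> value for example i.
    // The result's element j maps example index -> value for feature j.
    static List<TreeMap<Integer, Double>> transpose(List<? extends Map<Integer, Double>> rows, int numFeatures) {
        List<TreeMap<Integer, Double>> columns = new ArrayList<>();
        for (int j = 0; j < numFeatures; j++) {
            columns.add(new TreeMap<>());
        }
        for (int i = 0; i < rows.size(); i++) {
            for (Map.Entry<Integer, Double> e : rows.get(i).entrySet()) {
                int feature = e.getKey();
                if (feature < 0 || feature >= numFeatures) {
                    throw new IllegalArgumentException("Unknown feature id " + feature);
                }
                columns.get(feature).put(i, e.getValue());
            }
        }
        return columns;
    }

    public static void main(String[] args) {
        Map<Integer, Double> first = new HashMap<>();
        first.put(0, 1.0);
        first.put(2, 3.0);
        Map<Integer, Double> second = new HashMap<>();
        second.put(2, 4.0);
        List<Map<Integer, Double>> rows = new ArrayList<>();
        rows.add(first);
        rows.add(second);
        System.out.println(transpose(rows, 3));
    }
}
```

A column-major layout of this kind is convenient for algorithms that scan one feature at a time across all examples.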
Uses of Dataset in org.tribuo.multilabel.baseline
Methods in org.tribuo.multilabel.baseline with parameters of type Dataset
| Modifier and Type | Method | Description |
|  | IndependentMultiLabelTrainer.train(Dataset<MultiLabel> examples, Map<String, com.oracle.labs.mlrg.olcut.provenance.Provenance> runProvenance) |  |
Uses of Dataset in org.tribuo.multilabel.example
Methods in org.tribuo.multilabel.example that return Dataset
| Modifier and Type | Method | Description |
| static Dataset<MultiLabel> | MultiLabelDataGenerator.generateTestData() |  |
| static Dataset<MultiLabel> | MultiLabelDataGenerator.generateTrainData() |  |
Methods in org.tribuo.multilabel.example that return types with arguments of type Dataset
| Modifier and Type | Method | Description |
| static com.oracle.labs.mlrg.olcut.util.Pair<Dataset<MultiLabel>, Dataset<MultiLabel>> | MultiLabelDataGenerator.generateDataset() |  |
Uses of Dataset in org.tribuo.provenance
Constructors in org.tribuo.provenance with parameters of type Dataset
| Modifier | Constructor | Description |
| <T extends Output<T>> | DatasetProvenance(DataProvenance sourceProvenance, com.oracle.labs.mlrg.olcut.provenance.ListProvenance<com.oracle.labs.mlrg.olcut.provenance.ObjectProvenance> transformationProvenance, Dataset<T> dataset) |  |
Uses of Dataset in org.tribuo.regression.baseline
Methods in org.tribuo.regression.baseline with parameters of type Dataset
Uses of Dataset in org.tribuo.regression.example
Methods in org.tribuo.regression.example that return Dataset
| Modifier and Type | Method | Description |
|  | GaussianDataSource.generateDataset(int numSamples, float slope, float intercept, float variance, float xMin, float xMax, long seed) | Generates a single dimensional output drawn from N(slope*x + intercept,variance). |
Methods in org.tribuo.regression.example that return types with arguments of type Dataset
| Modifier and Type | Method | Description |
|  | RegressionDataGenerator.denseTrainTest() |  |
|  | RegressionDataGenerator.denseTrainTest(double negate) | Generates a train/test dataset pair which is dense in the features, each example has 4 features, {A,B,C,D}. |
|  | RegressionDataGenerator.multiDimDenseTrainTest() |  |
|  | RegressionDataGenerator.multiDimDenseTrainTest(double negate) | Generates a train/test dataset pair which is dense in the features, each example has 4 features, {A,B,C,D}. |
|  | RegressionDataGenerator.multiDimSparseTrainTest() |  |
|  | RegressionDataGenerator.multiDimSparseTrainTest(double negate) | Generates a pair of datasets, where the features are sparse, and unknown features appear in the test data. |
|  | RegressionDataGenerator.sparseTrainTest() |  |
|  | RegressionDataGenerator.sparseTrainTest(double negate) | Generates a pair of datasets, where the features are sparse, and unknown features appear in the test data. |
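The GaussianDataSource.generateDataset entry above describes outputs drawn from N(slope*x + intercept, variance). A minimal standalone sketch of that sampling scheme follows. GaussianLineSketch is a hypothetical name, drawing x uniformly from [xMin, xMax) is an assumption (the text only gives the range parameters), and the noise is scaled by the square root of the variance because nextGaussian has unit variance.

```java
import java.util.Arrays;
import java.util.Random;

// Hypothetical illustration only; not Tribuo's GaussianDataSource.
final class GaussianLineSketch {

    // Returns numSamples rows of {x, y} with y ~ N(slope * x + intercept, variance).
    static double[][] generate(int numSamples, double slope, double intercept,
                               double variance, double xMin, double xMax, long seed) {
        if (numSamples < 0 || variance < 0.0 || xMax <= xMin) {
            throw new IllegalArgumentException("Invalid sample count, variance or x range");
        }
        Random rng = new Random(seed);
        double stdDev = Math.sqrt(variance);
        double[][] data = new double[numSamples][2];
        for (int i = 0; i < numSamples; i++) {
            // Assumption: x is uniform over the requested range.
            double x = xMin + rng.nextDouble() * (xMax - xMin);
            double y = slope * x + intercept + stdDev * rng.nextGaussian();
            data[i][0] = x;
            data[i][1] = y;
        }
        return data;
    }

    public static void main(String[] args) {
        for (double[] row : generate(3, 2.0, 1.0, 0.25, 0.0, 10.0, 1L)) {
            System.out.println(Arrays.toString(row));
        }
    }
}
```

Fixing the seed makes the generated data reproducible, which is the usual reason such generators accept one.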
Uses of Dataset in org.tribuo.regression.impl
Methods in org.tribuo.regression.impl with parameters of type Dataset
| Modifier and Type | Method | Description |
|  | SkeletalIndependentRegressionSparseTrainer.train(Dataset<Regressor> examples, Map<String, com.oracle.labs.mlrg.olcut.provenance.Provenance> runProvenance) |  |
|  | SkeletalIndependentRegressionTrainer.train(Dataset<Regressor> examples, Map<String, com.oracle.labs.mlrg.olcut.provenance.Provenance> runProvenance) |  |
Uses of Dataset in org.tribuo.regression.liblinear
Methods in org.tribuo.regression.liblinear with parameters of type Dataset
| Modifier and Type | Method | Description |
| protected com.oracle.labs.mlrg.olcut.util.Pair<de.bwaldvogel.liblinear.FeatureNode[][], double[][]> | LibLinearRegressionTrainer.extractData(Dataset<Regressor> data, ImmutableOutputInfo<Regressor> outputInfo, ImmutableFeatureMap featureMap) |  |
Uses of Dataset in org.tribuo.regression.libsvm
Methods in org.tribuo.regression.libsvm with parameters of type Dataset
| Modifier and Type | Method | Description |
| protected com.oracle.labs.mlrg.olcut.util.Pair<libsvm.svm_node[][], double[][]> | LibSVMRegressionTrainer.extractData(Dataset<Regressor> data, ImmutableOutputInfo<Regressor> outputInfo, ImmutableFeatureMap featureMap) |  |
Uses of Dataset in org.tribuo.regression.rtree
Methods in org.tribuo.regression.rtree with parameters of type Dataset
| Modifier and Type | Method | Description |
| protected AbstractTrainingNode<Regressor> | CARTJointRegressionTrainer.mkTrainingNode(Dataset<Regressor> examples) |  |
| protected AbstractTrainingNode<Regressor> | CARTRegressionTrainer.mkTrainingNode(Dataset<Regressor> examples) |  |
|  | CARTRegressionTrainer.train(Dataset<Regressor> examples, Map<String, com.oracle.labs.mlrg.olcut.provenance.Provenance> runProvenance) |  |
Uses of Dataset in org.tribuo.regression.rtree.impl
Methods in org.tribuo.regression.rtree.impl with parameters of type Dataset
| Modifier and Type | Method | Description |
|  | RegressorTrainingNode.invertData(Dataset<Regressor> examples) | Inverts a training dataset from row major to column major. |
Constructors in org.tribuo.regression.rtree.impl with parameters of type Dataset
| Modifier | Constructor | Description |
|  | JointRegressorTrainingNode(RegressorImpurity impurity, Dataset<Regressor> examples, boolean normalize) | Constructor which creates the inverted file. |
Uses of Dataset in org.tribuo.regression.sgd.linear
Methods in org.tribuo.regression.sgd.linear with parameters of type Dataset
Uses of Dataset in org.tribuo.regression.slm
Methods in org.tribuo.regression.slm with parameters of type Dataset
| Modifier and Type | Method | Description |
|  | ElasticNetCDTrainer.train(Dataset<Regressor> examples, Map<String, com.oracle.labs.mlrg.olcut.provenance.Provenance> runProvenance) |  |
|  | SLMTrainer.train(Dataset<Regressor> examples, Map<String, com.oracle.labs.mlrg.olcut.provenance.Provenance> runProvenance) | Trains a sparse linear model. |
Uses of Dataset in org.tribuo.regression.xgboost
Methods in org.tribuo.regression.xgboost with parameters of type Dataset
Uses of Dataset in org.tribuo.sequence
Methods in org.tribuo.sequence that return Dataset
| Modifier and Type | Method | Description |
|  | SequenceDataset.getFlatDataset() | Returns a view on this SequenceDataset which aggregates all the examples and ignores the sequence structure. |
Uses of Dataset in org.tribuo.transform
Methods in org.tribuo.transform with parameters of type Dataset
| Modifier and Type | Method | Description |
| List<Prediction<T>> |  |  |
|  | TransformTrainer.train(Dataset<T> examples, Map<String, com.oracle.labs.mlrg.olcut.provenance.Provenance> instanceProvenance) |  |
| <T extends Output<T>> MutableDataset<T> | TransformerMap.transformDataset(Dataset<T> dataset) | Copies the supplied dataset and applies the transformers to each example in it. |
| <T extends Output<T>> MutableDataset<T> | TransformerMap.transformDataset(Dataset<T> dataset, boolean densify) | Copies the supplied dataset and applies the transformers to each example in it. |