Uses of Class org.tribuo.Dataset (Tribuo 4.3.2 API)

Provides a multiclass data generator used for testing implementations, along with several synthetic data generators for 2d binary classification problems to be used in demos or tutorials.

org.tribuo.classification.experiments

Provides a set of main methods for interacting with classification tasks.

org.tribuo.classification.fs

Information theoretic feature selection algorithms.

org.tribuo.classification.liblinear

Provides an interface to LibLinear-java for classification problems.

org.tribuo.classification.libsvm

Provides an interface to LibSVM for classification problems.

org.tribuo.classification.mnb

Provides an implementation of multinomial naive Bayes (i.e., naive Bayes for non-negative count data).

org.tribuo.classification.sgd.kernel

Provides an SGD implementation of a Kernel SVM using the Pegasos algorithm.

org.tribuo.classification.xgboost

Provides an interface to XGBoost for classification problems.

org.tribuo.clustering.example

Provides clustering data generators used for demos and testing implementations.

org.tribuo.clustering.hdbscan

Provides an implementation of HDBSCAN*.

org.tribuo.clustering.kmeans

Provides a multithreaded implementation of K-Means, with a configurable distance function.

org.tribuo.common.liblinear

Provides base classes for using liblinear from Tribuo.

org.tribuo.common.libsvm

The base interface to LibSVM.

org.tribuo.common.nearest

Provides a K-Nearest Neighbours implementation which works across all Tribuo Output types.

org.tribuo.common.sgd

Provides the base classes for models trained with stochastic gradient descent.

org.tribuo.common.tree

Provides common functionality for building decision trees, irrespective of the predicted Output.

org.tribuo.common.xgboost

Provides abstract classes for interfacing with XGBoost, abstracting away all the Output-dependent parts.

org.tribuo.data

Provides classes for loading in data from disk, processing it into examples, and splitting datasets for things like cross-validation and train-test splits.

org.tribuo.data.csv

Provides classes which can load columnar data (using a RowProcessor) from a CSV (or other character delimited format) file.

org.tribuo.dataset

Provides utility datasets which subsample or otherwise transform the wrapped dataset.

org.tribuo.datasource

Simple data sources for ingesting or aggregating data.

org.tribuo.ensemble

Provides an interface for model prediction combinations, two base classes for ensemble models, a base class for ensemble excuses, and a Bagging implementation.

org.tribuo.evaluation

Evaluation base classes, along with code for train/test splits and cross validation.

org.tribuo.evaluation.metrics

This package contains the infrastructure classes for building evaluation metrics.

org.tribuo.hash

Provides the base interface and implementations for Model hashing, which obscures the feature names stored in a model.

org.tribuo.interop.tensorflow

Provides an interface to TensorFlow, allowing the training of non-sequential models using any supported Tribuo output type.

org.tribuo.math.la

Provides a linear algebra system used for numerical operations in Tribuo.

org.tribuo.multilabel.baseline

Provides implementations of binary relevance based multi-label classification algorithms.

org.tribuo.multilabel.ensemble

Provides a multi-label ensemble combiner that performs a (possibly weighted) majority vote among each label independently, along with an implementation of classifier chain ensembles.

org.tribuo.multilabel.example

Provides a multi-label data generator for testing implementations and a configurable data source suitable for demos and tests.

org.tribuo.provenance

Provides Tribuo specific infrastructure for the Provenance system which tracks models and datasets.

org.tribuo.regression.baseline

Provides simple baseline regression predictors.

org.tribuo.regression.example

Provides some example regression data generators for testing implementations.

org.tribuo.regression.impl

Provides skeletal implementations of Regressor Trainer that can wrap a single dimension trainer/model and produce one prediction per dimension independently.

org.tribuo.regression.liblinear

Provides an interface to liblinear for regression problems.

org.tribuo.regression.libsvm

Provides an interface to LibSVM for regression problems.

org.tribuo.regression.rtree

Provides an implementation of decision trees for regression problems.

org.tribuo.regression.rtree.impl

Provides internal implementation classes for the regression trees.

org.tribuo.regression.slm

Provides implementations of sparse linear regression using various forms of regularisation penalty.

org.tribuo.regression.xgboost

Provides an interface to XGBoost for regression problems.

org.tribuo.reproducibility

Reproducibility utility based on Tribuo's provenance objects.

org.tribuo.sequence

Provides core classes for working with sequences of Examples.

org.tribuo.transform

Provides infrastructure for applying transformations to a Dataset.

Uses of Dataset in org.tribuo

Subclasses of Dataset in org.tribuo

Modifier and Type

Class

Description

class

ImmutableDataset<T extends Output<T>>

This is a Dataset which has an ImmutableFeatureMap to store the feature information.

class

MutableDataset<T extends Output<T>>

A MutableDataset is a Dataset with a MutableFeatureMap which grows over time.

Methods in org.tribuo that return Dataset

Modifier and Type

Method

Description

static <T extends Output<T>> Dataset<T>

Dataset.castDataset(Dataset<?> inputDataset, Class<T> outputType)

Casts the dataset to the specified output type, assuming it is valid.

static Dataset<?>

Dataset.deserialize(org.tribuo.protos.core.DatasetProto datasetProto)

Deserializes a dataset proto into a dataset.

static Dataset<?>

Dataset.deserializeFromFile(Path path)

Reads an instance of DatasetProto from the supplied path and deserializes it.

static Dataset<?>

Dataset.deserializeFromStream(InputStream is)

Reads an instance of DatasetProto from the supplied input stream and deserializes it.
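
The castDataset signature above pairs a wildcard-typed input with a Class token for the target output type. The sketch below shows that general Java idiom on a hypothetical `Box` container, which is a stand-in and not a Tribuo class. It assumes the check inspects the stored values; how Tribuo itself validates the output type is not stated in this listing.

```java
import java.util.List;

/** Hypothetical stand-in for a generically typed container; not part of Tribuo. */
final class Box<T> {
    final List<T> items;

    Box(List<T> items) {
        this.items = List.copyOf(items);
    }

    /**
     * Casts a wildcard-typed box to Box<T> after checking every item
     * against the supplied class token.
     */
    @SuppressWarnings("unchecked")
    static <T> Box<T> castBox(Box<?> input, Class<T> type) {
        for (Object item : input.items) {
            if (!type.isInstance(item)) {
                throw new ClassCastException(
                    "Expected " + type.getName() + " but found " + item.getClass().getName());
            }
        }
        return (Box<T>) input; // unchecked cast is safe: every item was checked above
    }
}
```

The class token is needed because generic type arguments are erased at runtime, so the cast itself cannot verify the element type.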

Methods in org.tribuo with parameters of type Dataset

Modifier and Type

Method

Description

static <T extends Output<T>> Dataset<T>

Dataset.castDataset(Dataset<?> inputDataset, Class<T> outputType)

Casts the dataset to the specified output type, assuming it is valid.

static <T extends Output<T>> ImmutableDataset<T>

ImmutableDataset.copyDataset(Dataset<T> dataset)

Creates an immutable deep copy of the supplied dataset.

static <T extends Output<T>> ImmutableDataset<T>

ImmutableDataset.copyDataset(Dataset<T> dataset, ImmutableFeatureMap featureIDMap, ImmutableOutputInfo<T> outputIDInfo)

Creates an immutable deep copy of the supplied dataset, using a different feature and output map.

static <T extends Output<T>> ImmutableDataset<T>

ImmutableDataset.copyDataset(Dataset<T> dataset, ImmutableFeatureMap featureIDMap, ImmutableOutputInfo<T> outputIDInfo, Merger merger)

Creates an immutable deep copy of the supplied dataset.

static <T extends Output<T>> MutableDataset<T>

MutableDataset.createDeepCopy(Dataset<T> other)

Creates a deep copy of the supplied Dataset which is mutable.

static <T extends Output<T>> ImmutableDataset<T>

ImmutableDataset.hashFeatureMap(Dataset<T> dataset, Hasher hasher)

Creates an immutable shallow copy of the supplied dataset, using the hasher to generate a HashedFeatureMap which transparently maps from the feature name to the hashed variant.

U

IncrementalTrainer.incrementalTrain(Dataset<T> newData, U model)

Incrementally trains the supplied model with the new data.

List<Prediction<T>>

Model.predict(Dataset<T> examples)

Uses the model to predict the outputs for multiple examples contained in a data set.

SelectedFeatureSet

FeatureSelector.select(Dataset<T> dataset)

Selects features according to this selection algorithm from the specified dataset.

default SparseModel<T>

SparseTrainer.train(Dataset<T> examples)

Trains a sparse predictive model using the examples in the given data set.

SparseModel<T>

SparseTrainer.train(Dataset<T> examples, Map<String, com.oracle.labs.mlrg.olcut.provenance.Provenance> runProvenance)

Trains a sparse predictive model using the examples in the given data set.

default SparseModel<T>

SparseTrainer.train(Dataset<T> examples, Map<String, com.oracle.labs.mlrg.olcut.provenance.Provenance> runProvenance, int invocationCount)

Trains a predictive model using the examples in the given data set.

default Model<T>

Trainer.train(Dataset<T> examples)

Trains a predictive model using the examples in the given data set.

Model<T>

Trainer.train(Dataset<T> examples, Map<String, com.oracle.labs.mlrg.olcut.provenance.Provenance> runProvenance)

Trains a predictive model using the examples in the given data set.

default Model<T>

Trainer.train(Dataset<T> examples, Map<String, com.oracle.labs.mlrg.olcut.provenance.Provenance> runProvenance, int invocationCount)

Trains a predictive model using the examples in the given data set.
Uses of Dataset in org.tribuo.anomaly.example

Methods in org.tribuo.anomaly.example that return types with arguments of type Dataset

Modifier and Type

Method

Description

static com.oracle.labs.mlrg.olcut.util.Pair<Dataset<Event>, Dataset<Event>>

AnomalyDataGenerator.denseTrainTest()

Makes a simple dataset for training and testing.

static com.oracle.labs.mlrg.olcut.util.Pair<Dataset<Event>, Dataset<Event>>

AnomalyDataGenerator.denseTrainTest(double negate)

Generates a train/test dataset pair which is dense in the features; each example has 4 features, {A,B,C,D}, and there are 4 clusters, {0,1,2,3}.

static com.oracle.labs.mlrg.olcut.util.Pair<Dataset<Event>, Dataset<Event>>

AnomalyDataGenerator.gaussianAnomaly()

Generates two datasets, one without anomalies drawn from a single gaussian and the second drawn from a mixture of two gaussians, with the second tagged anomalous.

static com.oracle.labs.mlrg.olcut.util.Pair<Dataset<Event>, Dataset<Event>>

AnomalyDataGenerator.gaussianAnomaly(long size, double fractionAnomalous)

Generates two datasets, one without anomalies drawn from a single gaussian and the second drawn from a mixture of two gaussians, with the second tagged anomalous.

static com.oracle.labs.mlrg.olcut.util.Pair<Dataset<Event>, Dataset<Event>>

AnomalyDataGenerator.sparseTrainTest()

Makes a simple dataset for training and testing.

static com.oracle.labs.mlrg.olcut.util.Pair<Dataset<Event>, Dataset<Event>>

AnomalyDataGenerator.sparseTrainTest(double negate)

Generates a pair of datasets, where the features are sparse, and unknown features appear in the test data.

Uses of Dataset in org.tribuo.anomaly.liblinear

Methods in org.tribuo.anomaly.liblinear with parameters of type Dataset

Modifier and Type

Method

Description

protected com.oracle.labs.mlrg.olcut.util.Pair<de.bwaldvogel.liblinear.FeatureNode[][], double[][]>

LibLinearAnomalyTrainer.extractData(Dataset<Event> data, ImmutableOutputInfo<Event> outputInfo, ImmutableFeatureMap featureMap)
Uses of Dataset in org.tribuo.anomaly.libsvm

Methods in org.tribuo.anomaly.libsvm with parameters of type Dataset

Modifier and Type

Method

Description

protected com.oracle.labs.mlrg.olcut.util.Pair<libsvm.svm_node[][], double[][]>

LibSVMAnomalyTrainer.extractData(Dataset<Event> data, ImmutableOutputInfo<Event> outputInfo, ImmutableFeatureMap featureMap)

LibSVMModel<Event>

LibSVMAnomalyTrainer.train(Dataset<Event> dataset, Map<String, com.oracle.labs.mlrg.olcut.provenance.Provenance> instanceProvenance)
Uses of Dataset in org.tribuo.classification.baseline

Methods in org.tribuo.classification.baseline with parameters of type Dataset

Modifier and Type

Method

Description

Model<Label>

DummyClassifierTrainer.train(Dataset<Label> examples, Map<String, com.oracle.labs.mlrg.olcut.provenance.Provenance> instanceProvenance)

Model<Label>

DummyClassifierTrainer.train(Dataset<Label> examples, Map<String, com.oracle.labs.mlrg.olcut.provenance.Provenance> instanceProvenance, int invocationCount)
Uses of Dataset in org.tribuo.classification.dtree

Methods in org.tribuo.classification.dtree with parameters of type Dataset

Modifier and Type

Method

Description

protected AbstractTrainingNode<Label>

CARTClassificationTrainer.mkTrainingNode(Dataset<Label> examples, AbstractTrainingNode.LeafDeterminer leafDeterminer)
Uses of Dataset in org.tribuo.classification.dtree.impl

Constructors in org.tribuo.classification.dtree.impl with parameters of type Dataset

Modifier

Constructor

Description

ClassifierTrainingNode(LabelImpurity impurity, Dataset<Label> examples, AbstractTrainingNode.LeafDeterminer leafDeterminer)

Constructor which creates the inverted file.
Uses of Dataset in org.tribuo.classification.ensemble

Methods in org.tribuo.classification.ensemble with parameters of type Dataset

Modifier and Type

Method

Description

WeightedEnsembleModel<Label>

AdaBoostTrainer.train(Dataset<Label> examples)

If the trainer implements WeightedExamples then do boosting by weighting, otherwise do boosting by sampling.

WeightedEnsembleModel<Label>

AdaBoostTrainer.train(Dataset<Label> examples, Map<String, com.oracle.labs.mlrg.olcut.provenance.Provenance> runProvenance)

If the trainer implements WeightedExamples then do boosting by weighting, otherwise do boosting by sampling.

WeightedEnsembleModel<Label>

AdaBoostTrainer.train(Dataset<Label> examples, Map<String, com.oracle.labs.mlrg.olcut.provenance.Provenance> runProvenance, int invocationCount)
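
The "boosting by weighting" path mentioned above relies on reweighting examples after each round. The sketch below shows one such round, assuming the SAMME multi-class form of the AdaBoost update; this listing does not say which update rule AdaBoostTrainer uses. The class and method names are hypothetical.

```java
/** Hypothetical sketch of one boosting round's weight update; not Tribuo code. */
final class BoostingRoundSketch {
    /**
     * Returns the normalised example weights after one round.
     * correct[i] records whether the current weak learner classified example i correctly.
     * Assumes the SAMME update: alpha = ln((1 - err) / err) + ln(K - 1).
     */
    static double[] reweight(double[] weights, boolean[] correct, int numClasses) {
        double total = 0.0;
        double wrong = 0.0;
        for (int i = 0; i < weights.length; i++) {
            total += weights[i];
            if (!correct[i]) {
                wrong += weights[i];
            }
        }
        double error = wrong / total;
        if (error <= 0.0 || error >= 1.0) {
            throw new IllegalArgumentException("Weighted error must lie strictly between 0 and 1");
        }
        double alpha = Math.log((1.0 - error) / error) + Math.log(numClasses - 1.0);

        // Increase the weight of misclassified examples, then renormalise to sum to 1.
        double[] updated = new double[weights.length];
        double sum = 0.0;
        for (int i = 0; i < weights.length; i++) {
            updated[i] = correct[i] ? weights[i] : weights[i] * Math.exp(alpha);
            sum += updated[i];
        }
        for (int i = 0; i < updated.length; i++) {
            updated[i] /= sum;
        }
        return updated;
    }
}
```

When the inner trainer cannot consume example weights, the same weight vector can instead drive a weighted resample of the training data, which corresponds to the "boosting by sampling" path.
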
Uses of Dataset in org.tribuo.classification.example

Methods in org.tribuo.classification.example that return types with arguments of type Dataset

Modifier and Type

Method

Description

static com.oracle.labs.mlrg.olcut.util.Pair<Dataset<Label>, Dataset<Label>>

LabelledDataGenerator.binarySparseTrainTest()

Generates a pair of datasets with sparse features and unknown features in the test data.

static com.oracle.labs.mlrg.olcut.util.Pair<Dataset<Label>, Dataset<Label>>

LabelledDataGenerator.binarySparseTrainTest(double negate)

Generates a pair of datasets with sparse features and unknown features in the test data.

static com.oracle.labs.mlrg.olcut.util.Pair<Dataset<Label>, Dataset<Label>>

LabelledDataGenerator.denseTrainTest()

Generates a train/test dataset pair which is dense in the features; each example has 4 features, {A,B,C,D}, and there are 4 classes, {Foo,Bar,Baz,Quux}.

static com.oracle.labs.mlrg.olcut.util.Pair<Dataset<Label>, Dataset<Label>>

LabelledDataGenerator.denseTrainTest(double negate)

Generates a train/test dataset pair which is dense in the features; each example has 4 features, {A,B,C,D}, and there are 4 classes, {Foo,Bar,Baz,Quux}.

static com.oracle.labs.mlrg.olcut.util.Pair<Dataset<Label>, Dataset<Label>>

LabelledDataGenerator.sparseTrainTest()

Generates a pair of datasets, where the features are sparse, and unknown features appear in the test data.

static com.oracle.labs.mlrg.olcut.util.Pair<Dataset<Label>, Dataset<Label>>

LabelledDataGenerator.sparseTrainTest(double negate)

Generates a pair of datasets, where the features are sparse, and unknown features appear in the test data.

Uses of Dataset in org.tribuo.classification.experiments

Methods in org.tribuo.classification.experiments that return types with arguments of type Dataset

Modifier and Type

Method

Description

static com.oracle.labs.mlrg.olcut.util.Pair<Model<Label>, Dataset<Label>>

Test.load(Test.ConfigurableTestOptions o)

Loads in the model and the dataset from the options.
Uses of Dataset in org.tribuo.classification.fs

Methods in org.tribuo.classification.fs with parameters of type Dataset

Modifier and Type

Method

Description

SelectedFeatureSet

CMIM.select(Dataset<Label> dataset)

SelectedFeatureSet

JMI.select(Dataset<Label> dataset)

SelectedFeatureSet

MIM.select(Dataset<Label> dataset)

SelectedFeatureSet

mRMR.select(Dataset<Label> dataset)
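
Of the selectors listed above, MIM (mutual information maximisation) is the simplest to illustrate: score each feature by its mutual information with the label and keep the top k. The sketch below is a self-contained illustration with hypothetical names, not the Tribuo implementation. It assumes features are already discretised into integer states and laid out one array per feature.

```java
import java.util.Arrays;
import java.util.Comparator;

/** Hypothetical sketch of mutual-information feature ranking; not Tribuo code. */
final class MimSketch {
    /** Mutual information in nats between two discrete variables with values in [0, states). */
    static double mutualInformation(int[] x, int xStates, int[] y, int yStates) {
        int n = x.length;
        double[][] joint = new double[xStates][yStates];
        double[] px = new double[xStates];
        double[] py = new double[yStates];
        for (int i = 0; i < n; i++) {
            joint[x[i]][y[i]] += 1.0 / n;
            px[x[i]] += 1.0 / n;
            py[y[i]] += 1.0 / n;
        }
        double mi = 0.0;
        for (int a = 0; a < xStates; a++) {
            for (int b = 0; b < yStates; b++) {
                if (joint[a][b] > 0.0) {
                    mi += joint[a][b] * Math.log(joint[a][b] / (px[a] * py[b]));
                }
            }
        }
        return mi;
    }

    /** Returns the indices of the k features with the highest mutual information with the label. */
    static int[] selectTopK(int[][] features, int featureStates, int[] labels, int labelStates, int k) {
        double[] scores = new double[features.length];
        Integer[] order = new Integer[features.length];
        for (int f = 0; f < features.length; f++) {
            scores[f] = mutualInformation(features[f], featureStates, labels, labelStates);
            order[f] = f;
        }
        // Sort feature indices by descending score; ties keep their original order.
        Arrays.sort(order, Comparator.comparingDouble((Integer f) -> -scores[f]));
        int[] top = new int[Math.min(k, order.length)];
        for (int i = 0; i < top.length; i++) {
            top[i] = order[i];
        }
        return top;
    }
}
```

MIM scores each feature independently, so it can pick redundant features; the other selectors in this package name criteria (CMIM, JMI, mRMR) that also account for interactions between features.
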
Uses of Dataset in org.tribuo.classification.liblinear

Methods in org.tribuo.classification.liblinear with parameters of type Dataset

Modifier and Type

Method

Description

protected com.oracle.labs.mlrg.olcut.util.Pair<de.bwaldvogel.liblinear.FeatureNode[][], double[][]>

LibLinearClassificationTrainer.extractData(Dataset<Label> data, ImmutableOutputInfo<Label> outputInfo, ImmutableFeatureMap featureMap)
Uses of Dataset in org.tribuo.classification.libsvm

Methods in org.tribuo.classification.libsvm with parameters of type Dataset

Modifier and Type

Method

Description

protected com.oracle.labs.mlrg.olcut.util.Pair<libsvm.svm_node[][], double[][]>

LibSVMClassificationTrainer.extractData(Dataset<Label> data, ImmutableOutputInfo<Label> outputInfo, ImmutableFeatureMap featureMap)
Uses of Dataset in org.tribuo.classification.mnb

Methods in org.tribuo.classification.mnb with parameters of type Dataset

Modifier and Type

Method

Description

Model<Label>

MultinomialNaiveBayesTrainer.train(Dataset<Label> examples, Map<String, com.oracle.labs.mlrg.olcut.provenance.Provenance> runProvenance)

Model<Label>

MultinomialNaiveBayesTrainer.train(Dataset<Label> examples, Map<String, com.oracle.labs.mlrg.olcut.provenance.Provenance> runProvenance, int invocationCount)
Uses of Dataset in org.tribuo.classification.sgd.kernel

Methods in org.tribuo.classification.sgd.kernel with parameters of type Dataset

Modifier and Type

Method

Description

KernelSVMModel

KernelSVMTrainer.train(Dataset<Label> examples, Map<String, com.oracle.labs.mlrg.olcut.provenance.Provenance> runProvenance)

KernelSVMModel

KernelSVMTrainer.train(Dataset<Label> examples, Map<String, com.oracle.labs.mlrg.olcut.provenance.Provenance> runProvenance, int invocationCount)
Uses of Dataset in org.tribuo.classification.xgboost

Methods in org.tribuo.classification.xgboost with parameters of type Dataset

Modifier and Type

Method

Description

XGBoostModel<Label>

XGBoostClassificationTrainer.train(Dataset<Label> examples)

XGBoostModel<Label>

XGBoostClassificationTrainer.train(Dataset<Label> examples, Map<String, com.oracle.labs.mlrg.olcut.provenance.Provenance> runProvenance)

XGBoostModel<Label>

XGBoostClassificationTrainer.train(Dataset<Label> examples, Map<String, com.oracle.labs.mlrg.olcut.provenance.Provenance> runProvenance, int invocationCount)
Uses of Dataset in org.tribuo.clustering.example

Methods in org.tribuo.clustering.example that return Dataset

Modifier and Type

Method

Description

static Dataset<ClusterID>

ClusteringDataGenerator.gaussianClusters(long size, long seed)

Generates a dataset drawn from a mixture of 5 2d gaussians.
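
As a rough illustration of what sampling from a seeded mixture of 2d Gaussians involves, the sketch below draws points from caller-supplied component means. It is not the ClusteringDataGenerator code: the class name, the uniform mixing weights, the shared isotropic standard deviation, and the `int` size are all assumptions made for the example.

```java
import java.util.Random;

/** Hypothetical sketch of sampling from a 2d Gaussian mixture; not Tribuo code. */
final class GaussianMixtureSketch {
    /** One generated 2d point and the index of the mixture component it was drawn from. */
    static final class Point {
        final double x;
        final double y;
        final int component;

        Point(double x, double y, int component) {
            this.x = x;
            this.y = y;
            this.component = component;
        }
    }

    /**
     * Draws `size` points. Each point picks a component uniformly at random
     * (assumed mixing weights), then adds isotropic Gaussian noise to its mean.
     */
    static Point[] sample(int size, long seed, double[][] means, double stdDev) {
        Random rng = new Random(seed); // fixed seed makes the output reproducible
        Point[] points = new Point[size];
        for (int i = 0; i < size; i++) {
            int c = rng.nextInt(means.length);
            double x = means[c][0] + stdDev * rng.nextGaussian();
            double y = means[c][1] + stdDev * rng.nextGaussian();
            points[i] = new Point(x, y, c);
        }
        return points;
    }
}
```

Passing the same seed twice yields identical points, which is the property that makes seeded generators convenient for tests and demos.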

Methods in org.tribuo.clustering.example that return types with arguments of type Dataset

Modifier and Type

Method

Description

static com.oracle.labs.mlrg.olcut.util.Pair<Dataset<ClusterID>, Dataset<ClusterID>>

ClusteringDataGenerator.denseTrainTest()

Generates a train/test dataset pair which is dense in the features; each example has 4 features, {A,B,C,D}, and there are 4 clusters, {0,1,2,3}.

static com.oracle.labs.mlrg.olcut.util.Pair<Dataset<ClusterID>, Dataset<ClusterID>>

ClusteringDataGenerator.denseTrainTest(double negate)

Generates a train/test dataset pair which is dense in the features; each example has 4 features, {A,B,C,D}, and there are 4 clusters, {0,1,2,3}.

static com.oracle.labs.mlrg.olcut.util.Pair<Dataset<ClusterID>, Dataset<ClusterID>>

ClusteringDataGenerator.sparseTrainTest()

Generates a pair of datasets, where the features are sparse, and unknown features appear in the test data.

static com.oracle.labs.mlrg.olcut.util.Pair<Dataset<ClusterID>, Dataset<ClusterID>>

ClusteringDataGenerator.sparseTrainTest(double negate)

Generates a pair of datasets, where the features are sparse, and unknown features appear in the test data.

Uses of Dataset in org.tribuo.clustering.hdbscan

Methods in org.tribuo.clustering.hdbscan with parameters of type Dataset

Modifier and Type

Method

Description

HdbscanModel

HdbscanTrainer.train(Dataset<ClusterID> dataset)

HdbscanModel

HdbscanTrainer.train(Dataset<ClusterID> examples, Map<String, com.oracle.labs.mlrg.olcut.provenance.Provenance> runProvenance)
Uses of Dataset in org.tribuo.clustering.kmeans

Methods in org.tribuo.clustering.kmeans with parameters of type Dataset

Modifier and Type

Method

Description

KMeansModel

KMeansTrainer.train(Dataset<ClusterID> dataset)

KMeansModel

KMeansTrainer.train(Dataset<ClusterID> examples, Map<String, com.oracle.labs.mlrg.olcut.provenance.Provenance> runProvenance)

KMeansModel

KMeansTrainer.train(Dataset<ClusterID> examples, Map<String, com.oracle.labs.mlrg.olcut.provenance.Provenance> runProvenance, int invocationCount)
Uses of Dataset in org.tribuo.common.liblinear

Methods in org.tribuo.common.liblinear with parameters of type Dataset

Modifier and Type

Method

Description

protected abstract com.oracle.labs.mlrg.olcut.util.Pair<de.bwaldvogel.liblinear.FeatureNode[][], double[][]>

LibLinearTrainer.extractData(Dataset<T> data, ImmutableOutputInfo<T> outputInfo, ImmutableFeatureMap featureMap)

Extracts the features and Outputs in LibLinear's format.

LibLinearModel<T>

LibLinearTrainer.train(Dataset<T> examples)

LibLinearModel<T>

LibLinearTrainer.train(Dataset<T> examples, Map<String, com.oracle.labs.mlrg.olcut.provenance.Provenance> runProvenance)

LibLinearModel<T>

LibLinearTrainer.train(Dataset<T> examples, Map<String, com.oracle.labs.mlrg.olcut.provenance.Provenance> runProvenance, int invocationCount)
Uses of Dataset in org.tribuo.common.libsvm

Methods in org.tribuo.common.libsvm with parameters of type Dataset

Modifier and Type

Method

Description

protected abstract com.oracle.labs.mlrg.olcut.util.Pair<libsvm.svm_node[][], double[][]>

LibSVMTrainer.extractData(Dataset<T> data, ImmutableOutputInfo<T> outputInfo, ImmutableFeatureMap featureMap)

Extracts the features and Outputs in LibSVM's format.

LibSVMModel<T>

LibSVMTrainer.train(Dataset<T> examples)

LibSVMModel<T>

LibSVMTrainer.train(Dataset<T> examples, Map<String, com.oracle.labs.mlrg.olcut.provenance.Provenance> runProvenance)

LibSVMModel<T>

LibSVMTrainer.train(Dataset<T> examples, Map<String, com.oracle.labs.mlrg.olcut.provenance.Provenance> runProvenance, int invocationCount)
Uses of Dataset in org.tribuo.common.nearest

Methods in org.tribuo.common.nearest with parameters of type Dataset

Modifier and Type

Method

Description

Model<T>

KNNTrainer.train(Dataset<T> examples, Map<String, com.oracle.labs.mlrg.olcut.provenance.Provenance> runProvenance)

Model<T>

KNNTrainer.train(Dataset<T> examples, Map<String, com.oracle.labs.mlrg.olcut.provenance.Provenance> runProvenance, int invocationCount)
Uses of Dataset in org.tribuo.common.sgd

Methods in org.tribuo.common.sgd with parameters of type Dataset

Modifier and Type

Method

Description

V

AbstractSGDTrainer.train(Dataset<T> examples)

V

AbstractSGDTrainer.train(Dataset<T> examples, Map<String, com.oracle.labs.mlrg.olcut.provenance.Provenance> runProvenance)

V

AbstractSGDTrainer.train(Dataset<T> examples, Map<String, com.oracle.labs.mlrg.olcut.provenance.Provenance> runProvenance, int invocationCount)
Uses of Dataset in org.tribuo.common.tree

Methods in org.tribuo.common.tree with parameters of type Dataset

Modifier and Type

Method

Description

protected abstract AbstractTrainingNode<T>

AbstractCARTTrainer.mkTrainingNode(Dataset<T> examples, AbstractTrainingNode.LeafDeterminer leafDeterminer)

Makes the initial training node.

TreeModel<T>

AbstractCARTTrainer.train(Dataset<T> examples)

TreeModel<T>

AbstractCARTTrainer.train(Dataset<T> examples, Map<String, com.oracle.labs.mlrg.olcut.provenance.Provenance> runProvenance)

TreeModel<T>

AbstractCARTTrainer.train(Dataset<T> examples, Map<String, com.oracle.labs.mlrg.olcut.provenance.Provenance> runProvenance, int invocationCount)
Uses of Dataset in org.tribuo.common.xgboost

Methods in org.tribuo.common.xgboost with parameters of type Dataset

Modifier and Type

Method

Description

protected static <T extends Output<T>> XGBoostTrainer.DMatrixTuple<T>

XGBoostTrainer.convertDataset(Dataset<T> examples)

Converts a dataset into a DMatrix.

protected static <T extends Output<T>> XGBoostTrainer.DMatrixTuple<T>

XGBoostTrainer.convertDataset(Dataset<T> examples, Function<T,Float> responseExtractor)

Converts a dataset into a DMatrix.

List<Prediction<T>>

XGBoostModel.predict(Dataset<T> examples)

Uses the model to predict the labels for multiple examples contained in a data set.
Uses of Dataset in org.tribuo.data

Methods in org.tribuo.data that return types with arguments of type Dataset

Modifier and Type

Method

Description

<T extends Output<T>> com.oracle.labs.mlrg.olcut.util.Pair<Dataset<T>, Dataset<T>>

DataOptions.load(OutputFactory<T> outputFactory)

Loads the training and testing data from DataOptions.trainingPath and DataOptions.testingPath according to the other parameters specified in this class.

Uses of Dataset in org.tribuo.data.csv

Methods in org.tribuo.data.csv with parameters of type Dataset

Modifier and Type

Method

Description

<T extends Output<T>> void

CSVSaver.save(Path csvPath, Dataset<T> dataset, String responseName)

Saves the dataset to the specified path.

<T extends Output<T>> void

CSVSaver.save(Path csvPath, Dataset<T> dataset, Set<String> responseNames)

Saves the dataset to the specified path.
Uses of Dataset in org.tribuo.dataset

Subclasses of Dataset in org.tribuo.dataset

Modifier and Type

Class

Description

final class

DatasetView<T extends Output<T>>

DatasetView provides an immutable view on another Dataset that only exposes selected examples.

class

MinimumCardinalityDataset<T extends Output<T>>

This class creates a pruned dataset in which low frequency features that occur less than the provided minimum cardinality have been removed.

final class

SelectedFeatureDataset<T extends Output<T>>

This class creates a pruned dataset which only contains the selected features.

Methods in org.tribuo.dataset with parameters of type Dataset

Modifier and Type

Method

Description

static <T extends Output<T>> DatasetView<T>

DatasetView.createBootstrapView(Dataset<T> dataset, int size, long seed)

Generates a DatasetView bootstrapped from the supplied Dataset.

static <T extends Output<T>> DatasetView<T>

DatasetView.createBootstrapView(Dataset<T> dataset, int size, long seed, ImmutableFeatureMap featureIDs, ImmutableOutputInfo<T> outputIDs)

Generates a DatasetView bootstrapped from the supplied Dataset.

static <T extends Output<T>> DatasetView<T>

DatasetView.createView(Dataset<T> dataset, Predicate<Example<T>> predicate, String tag)

Creates a view from the supplied dataset, using the specified predicate to test if each example should be in this view.

static <T extends Output<T>> DatasetView<T>

DatasetView.createWeightedBootstrapView(Dataset<T> dataset, int size, long seed, float[] exampleWeights)

Generates a DatasetView bootstrapped from the supplied Dataset using the supplied example weights.

static <T extends Output<T>> DatasetView<T>

DatasetView.createWeightedBootstrapView(Dataset<T> dataset, int size, long seed, float[] exampleWeights, ImmutableFeatureMap featureIDs, ImmutableOutputInfo<T> outputIDs)

Generates a DatasetView bootstrapped from the supplied Dataset using the supplied example weights.
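
A bootstrapped view of the kind described above can be represented by an array of example indices drawn with replacement from the underlying dataset. The sketch below shows uniform and weighted index sampling with hypothetical helper names. It is not the DatasetView implementation, and the choice of `SplittableRandom` is an assumption.

```java
import java.util.SplittableRandom;

/** Hypothetical sketch of bootstrap index sampling; not Tribuo code. */
final class BootstrapSketch {
    /** Draws `size` indices uniformly with replacement from [0, datasetSize). */
    static int[] bootstrapIndices(int datasetSize, int size, long seed) {
        SplittableRandom rng = new SplittableRandom(seed);
        int[] indices = new int[size];
        for (int i = 0; i < size; i++) {
            indices[i] = rng.nextInt(datasetSize);
        }
        return indices;
    }

    /** Draws `size` indices with replacement, with probability proportional to each weight. */
    static int[] weightedBootstrapIndices(int size, long seed, float[] exampleWeights) {
        // Build the cumulative (unnormalised) distribution over examples.
        double[] cumulative = new double[exampleWeights.length];
        double total = 0.0;
        for (int i = 0; i < exampleWeights.length; i++) {
            total += exampleWeights[i];
            cumulative[i] = total;
        }
        SplittableRandom rng = new SplittableRandom(seed);
        int[] indices = new int[size];
        for (int i = 0; i < size; i++) {
            double u = rng.nextDouble() * total;
            int chosen = cumulative.length - 1; // fallback guards against rounding at the top end
            for (int j = 0; j < cumulative.length; j++) {
                if (u < cumulative[j]) {
                    chosen = j;
                    break;
                }
            }
            indices[i] = chosen;
        }
        return indices;
    }
}
```

Because only indices are stored, the view shares the underlying examples rather than copying them, and the same seed reproduces the same sample.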

Constructors in org.tribuo.dataset with parameters of type Dataset

Modifier

Constructor

Description

DatasetView(Dataset<T> dataset, int[] exampleIndices, String tag)

Creates a DatasetView which includes the supplied indices from the dataset.

DatasetView(Dataset<T> dataset, int[] exampleIndices, ImmutableFeatureMap featureIDs, ImmutableOutputInfo<T> labelIDs, String tag)

Creates a DatasetView which includes the supplied indices from the dataset.

MinimumCardinalityDataset(Dataset<T> dataset, int minCardinality)

SelectedFeatureDataset(Dataset<T> dataset, SelectedFeatureSet featureSet)

Constructs a selected feature dataset using all the features in the supplied feature set.

SelectedFeatureDataset(Dataset<T> dataset, SelectedFeatureSet featureSet, int k)

Constructs a selected feature dataset.
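To make the minimum-cardinality pruning concrete, here is a minimal standard-library sketch. It assumes each example is a map from feature name to value and that a feature's cardinality is the number of examples it appears in. MinCardinalitySketch is a hypothetical name, and keeping examples that end up with no features is a choice made by this sketch, not a statement about MinimumCardinalityDataset.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper, not part of Tribuo: drops rarely occurring features. */
final class MinCardinalitySketch {
    /** Returns copies of the examples holding only features seen in at least minCardinality examples. */
    static List<Map<String, Double>> prune(List<Map<String, Double>> examples, int minCardinality) {
        // First pass: count how many examples contain each feature.
        Map<String, Integer> counts = new HashMap<>();
        for (Map<String, Double> example : examples) {
            for (String name : example.keySet()) {
                counts.merge(name, 1, Integer::sum);
            }
        }
        // Second pass: copy each example, keeping only the frequent features.
        List<Map<String, Double>> pruned = new ArrayList<>();
        for (Map<String, Double> example : examples) {
            Map<String, Double> kept = new HashMap<>();
            for (Map.Entry<String, Double> e : example.entrySet()) {
                if (counts.get(e.getKey()) >= minCardinality) {
                    kept.put(e.getKey(), e.getValue());
                }
            }
            pruned.add(kept);
        }
        return pruned;
    }

    public static void main(String[] args) {
        List<Map<String, Double>> examples = List.of(
            Map.of("a", 1.0, "b", 1.0), Map.of("a", 2.0), Map.of("a", 3.0, "c", 1.0));
        System.out.println(prune(examples, 2));
    }
}
```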
Uses of Dataset in org.tribuo.datasource

Methods in org.tribuo.datasource with parameters of type Dataset

Modifier and Type

Method

Description

static <T extends Output<T>> void

LibSVMDataSource.writeLibSVMFormat(Dataset<T> dataset, PrintStream out, boolean zeroIndexed, Function<T,Number> transformationFunc)

Writes out a dataset in LibSVM format.
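For readers unfamiliar with the LibSVM text format, each line holds a numeric label followed by index:value pairs in ascending index order. The sketch below is a stand-alone illustration and does not call LibSVMDataSource: LibSVMWriterSketch and Row are hypothetical, the label is assumed to be a Number already (the role transformationFunc plays in the method above), and feature ids are assumed to be zero-based internally so that zeroIndexed only controls whether 1 is added on output.

```java
import java.io.PrintStream;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper, not part of Tribuo: writes rows in LibSVM text format. */
final class LibSVMWriterSketch {
    /** One row: a numeric label plus sparse features keyed by zero-based feature id. */
    static final class Row {
        final Number label;
        final TreeMap<Integer, Double> features;

        Row(Number label, Map<Integer, Double> features) {
            this.label = label;
            // TreeMap keeps the ids sorted, as the format expects ascending indices.
            this.features = new TreeMap<>(features);
        }
    }

    static void write(List<Row> rows, PrintStream out, boolean zeroIndexed) {
        int offset = zeroIndexed ? 0 : 1;
        for (Row row : rows) {
            StringBuilder sb = new StringBuilder();
            sb.append(row.label);
            for (Map.Entry<Integer, Double> e : row.features.entrySet()) {
                sb.append(' ').append(e.getKey() + offset).append(':').append(e.getValue());
            }
            sb.append('\n');
            out.print(sb);
        }
        out.flush();
    }

    public static void main(String[] args) {
        List<Row> rows = List.of(
            new Row(1, Map.of(0, 0.5, 2, 2.0)),
            new Row(-1, Map.of(1, 1.5)));
        write(rows, System.out, false);
    }
}
```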
Uses of Dataset in org.tribuo.ensemble

Methods in org.tribuo.ensemble with parameters of type Dataset

Modifier and Type

Method

Description

EnsembleModel<T>

BaggingTrainer.train(Dataset<T> examples)

EnsembleModel<T>

BaggingTrainer.train(Dataset<T> examples, Map<String, com.oracle.labs.mlrg.olcut.provenance.Provenance> runProvenance)

EnsembleModel<T>

BaggingTrainer.train(Dataset<T> examples, Map<String, com.oracle.labs.mlrg.olcut.provenance.Provenance> runProvenance, int invocationCount)

protected Model<T>

BaggingTrainer.trainSingleModel(Dataset<T> examples, ImmutableFeatureMap featureIDs, ImmutableOutputInfo<T> labelIDs, int randInt, Map<String, com.oracle.labs.mlrg.olcut.provenance.Provenance> runProvenance, int invocationCount)

Trains a single model.
Uses of Dataset in org.tribuo.evaluation

Methods in org.tribuo.evaluation with parameters of type Dataset

Modifier and Type

Method

Description

static <T extends Output<T>, C extends MetricContext<T>> com.oracle.labs.mlrg.olcut.util.Pair<Integer,Double>

EvaluationAggregator.argmax(EvaluationMetric<T,C> metric, List<? extends Model<T>> models, Dataset<T> dataset)

Calculates the argmax of a metric across the supplied models (i.e., the index of the model which performed the best).

final E

AbstractEvaluator.evaluate(Model<T> model, Dataset<T> dataset)

Produces an evaluation for the supplied model and dataset, by calling Model.predict(org.tribuo.Example<T>) to create the predictions, then aggregating the appropriate statistics.

E

Evaluator.evaluate(Model<T> model, Dataset<T> dataset)

Evaluates the dataset using the supplied model, returning an immutable Evaluation of the appropriate type.

Iterator<KFoldSplitter.TrainTestFold<T>>

KFoldSplitter.split(Dataset<T> dataset, boolean shuffle)

Splits a dataset into k consecutive folds; for each fold, the remaining k-1 folds form the training set.

static <T extends Output<T>, C extends MetricContext<T>> DescriptiveStats

EvaluationAggregator.summarize(List<? extends EvaluationMetric<T,C>> metrics, Model<T> model, Dataset<T> dataset)

Summarize model performance on dataset across several metrics.

static <T extends Output<T>, R extends Evaluation<T>> Map<MetricID<T>, DescriptiveStats>

EvaluationAggregator.summarize(Evaluator<T,R> evaluator, List<? extends Model<T>> models, Dataset<T> dataset)

Summarize performance using the supplied evaluator across several models on one dataset.

static <T extends Output<T>, C extends MetricContext<T>> DescriptiveStats

EvaluationAggregator.summarize(EvaluationMetric<T,C> metric, List<? extends Model<T>> models, Dataset<T> dataset)

Summarize performance with respect to the supplied metric across several models on one dataset.
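The k-fold description above (k consecutive folds, with the other k-1 folds used for training) can be sketched over example indices as follows. KFoldSketch and Fold are hypothetical names, and the rule for distributing a remainder when k does not divide the dataset size is an assumption of this sketch rather than documented KFoldSplitter behaviour.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

/** Hypothetical helper, not part of Tribuo: splits indices into k consecutive folds. */
final class KFoldSketch {
    static final class Fold {
        final List<Integer> train;
        final List<Integer> test;

        Fold(List<Integer> train, List<Integer> test) {
            this.train = train;
            this.test = test;
        }
    }

    static List<Fold> split(int n, int k, boolean shuffle, long seed) {
        if (k < 2 || k > n) {
            throw new IllegalArgumentException("k must be in [2, n]");
        }
        List<Integer> order = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            order.add(i);
        }
        if (shuffle) {
            Collections.shuffle(order, new Random(seed));
        }
        List<Fold> folds = new ArrayList<>();
        int base = n / k;
        int remainder = n % k;
        int start = 0;
        for (int f = 0; f < k; f++) {
            // The first (n % k) folds take one extra example, so sizes differ by at most one.
            int end = start + base + (f < remainder ? 1 : 0);
            List<Integer> test = new ArrayList<>(order.subList(start, end));
            List<Integer> train = new ArrayList<>(order.subList(0, start));
            train.addAll(order.subList(end, n));
            folds.add(new Fold(train, test));
            start = end;
        }
        return folds;
    }

    public static void main(String[] args) {
        for (Fold fold : split(10, 3, false, 0L)) {
            System.out.println("test=" + fold.test + " train=" + fold.train);
        }
    }
}
```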

Method parameters in org.tribuo.evaluation with type arguments of type Dataset

Modifier and Type

Method

Description

static <T extends Output<T>, C extends MetricContext<T>> com.oracle.labs.mlrg.olcut.util.Pair<Integer,Double>

EvaluationAggregator.argmax(EvaluationMetric<T,C> metric, Model<T> model, List<? extends Dataset<T>> datasets)

Calculates the argmax of a metric across the supplied datasets.

static <T extends Output<T>, R extends Evaluation<T>> Map<MetricID<T>, DescriptiveStats>

EvaluationAggregator.summarize(Evaluator<T,R> evaluator, Model<T> model, List<? extends Dataset<T>> datasets)

Summarize performance according to evaluator for a single model across several datasets.

static <T extends Output<T>, C extends MetricContext<T>> DescriptiveStats

EvaluationAggregator.summarize(EvaluationMetric<T,C> metric, Model<T> model, List<? extends Dataset<T>> datasets)

Summarize a model's performance with respect to the supplied metric across several datasets.
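The argmax and summarize methods reduce a list of metric values (one per model or per dataset) to a best index or to summary statistics. A minimal sketch of those reductions over a plain double array is shown below. AggregatorSketch is hypothetical; it returns only the index where EvaluationAggregator.argmax returns an index and value pair, and the choice of sample (n-1) variance is an assumption, since the statistics held by DescriptiveStats are not listed on this page.

```java
/** Hypothetical helper, not part of Tribuo: reductions over per-model metric values. */
final class AggregatorSketch {
    /** Index of the largest score; ties go to the earliest index. */
    static int argmax(double[] scores) {
        if (scores.length == 0) {
            throw new IllegalArgumentException("scores must not be empty");
        }
        int best = 0;
        for (int i = 1; i < scores.length; i++) {
            if (scores[i] > scores[best]) {
                best = i;
            }
        }
        return best;
    }

    static double mean(double[] values) {
        if (values.length == 0) {
            throw new IllegalArgumentException("values must not be empty");
        }
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        return sum / values.length;
    }

    /** Unbiased sample variance (divides by n - 1). */
    static double sampleVariance(double[] values) {
        if (values.length < 2) {
            throw new IllegalArgumentException("need at least two values");
        }
        double m = mean(values);
        double sumSq = 0.0;
        for (double v : values) {
            sumSq += (v - m) * (v - m);
        }
        return sumSq / (values.length - 1);
    }

    public static void main(String[] args) {
        double[] accuracies = {0.70, 0.90, 0.80};
        System.out.println("best model index: " + argmax(accuracies));
        System.out.println("mean accuracy: " + mean(accuracies));
    }
}
```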

Constructors in org.tribuo.evaluation with parameters of type Dataset

Modifier

Constructor

Description

CrossValidation(Trainer<T> trainer, Dataset<T> data, Evaluator<T,E> evaluator, int k)

Builds a k-fold cross-validation loop.

CrossValidation(Trainer<T> trainer, Dataset<T> data, Evaluator<T,E> evaluator, int k, long seed)

Builds a k-fold cross-validation loop.
Uses of Dataset in org.tribuo.evaluation.metrics

Methods in org.tribuo.evaluation.metrics with parameters of type Dataset

Modifier and Type

Method

Description

default C

EvaluationMetric.createContext(Model<T> model, Dataset<T> dataset)

Creates the metric context used to compute this metric's value, generating Predictions for each Example in the supplied dataset.
Uses of Dataset in org.tribuo.hash

Methods in org.tribuo.hash with parameters of type Dataset

Modifier and Type

Method

Description

Model<T>

HashingTrainer.train(Dataset<T> dataset, Map<String, com.oracle.labs.mlrg.olcut.provenance.Provenance> instanceProvenance)

This clones the Dataset, hashes each of the examples and rewrites their feature ids before passing it to the inner trainer.

Model<T>

HashingTrainer.train(Dataset<T> dataset, Map<String, com.oracle.labs.mlrg.olcut.provenance.Provenance> instanceProvenance, int invocationCount)
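As an illustration of hashing feature names before training, the sketch below replaces each name with a salted SHA-256 hex digest. This is one possible hashing scheme chosen for the example. FeatureHashSketch, its methods and the salt parameter are hypothetical and are not taken from HashingTrainer.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper, not part of Tribuo: hashes feature names in a copy of an example. */
final class FeatureHashSketch {
    /** Returns the lowercase hex SHA-256 digest of salt + name. */
    static String hashName(String name, String salt) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] digest = md.digest((salt + name).getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // Every Java platform is required to provide SHA-256.
            throw new IllegalStateException(e);
        }
    }

    /** Copies the example with hashed feature names; the input map is left untouched. */
    static Map<String, Double> hashExample(Map<String, Double> example, String salt) {
        Map<String, Double> hashed = new HashMap<>();
        for (Map.Entry<String, Double> e : example.entrySet()) {
            // Values are summed if two names collide, which is very unlikely with SHA-256.
            hashed.merge(hashName(e.getKey(), salt), e.getValue(), Double::sum);
        }
        return hashed;
    }

    public static void main(String[] args) {
        System.out.println(hashExample(Map.of("age", 42.0, "height", 1.8), "my-salt"));
    }
}
```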
Uses of Dataset in org.tribuo.interop.tensorflow

Methods in org.tribuo.interop.tensorflow with parameters of type Dataset

Modifier and Type

Method

Description

TensorFlowModel<T>

TensorFlowTrainer.train(Dataset<T> examples)

TensorFlowModel<T>

TensorFlowTrainer.train(Dataset<T> examples, Map<String, com.oracle.labs.mlrg.olcut.provenance.Provenance> runProvenance)

TensorFlowModel<T>

TensorFlowTrainer.train(Dataset<T> examples, Map<String, com.oracle.labs.mlrg.olcut.provenance.Provenance> runProvenance, int invocationCount)
Uses of Dataset in org.tribuo.math.la

Methods in org.tribuo.math.la with parameters of type Dataset

Modifier and Type

Method

Description

static <T extends Output<T>> SparseVector[]

SparseVector.transpose(Dataset<T> dataset)

Converts a dataset of row-major examples into an array of column-major sparse vectors.

static <T extends Output<T>> SparseVector[]

SparseVector.transpose(Dataset<T> dataset, ImmutableFeatureMap fMap)

Converts a dataset of row-major examples into an array of column-major sparse vectors.
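The row-major to column-major conversion can be pictured with a small standard-library sketch. It assumes each example is a sparse map from feature index to value and produces one sparse column per feature, keyed by example index. TransposeSketch is a hypothetical name and the map-based representation stands in for SparseVector, which is not used here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper, not part of Tribuo: transposes sparse rows into sparse columns. */
final class TransposeSketch {
    /**
     * rows.get(i) maps feature index to value for example i.
     * The result's element j maps example index to value for feature j.
     */
    static List<Map<Integer, Double>> transpose(List<Map<Integer, Double>> rows, int numFeatures) {
        List<Map<Integer, Double>> columns = new ArrayList<>();
        for (int j = 0; j < numFeatures; j++) {
            columns.add(new TreeMap<>());
        }
        for (int i = 0; i < rows.size(); i++) {
            for (Map.Entry<Integer, Double> e : rows.get(i).entrySet()) {
                columns.get(e.getKey()).put(i, e.getValue());
            }
        }
        return columns;
    }

    public static void main(String[] args) {
        List<Map<Integer, Double>> rows = List.of(Map.of(0, 1.0, 2, 3.0), Map.of(2, 4.0));
        System.out.println(transpose(rows, 3));
    }
}
```

A column-major layout of this kind lets an algorithm scan all values of one feature without visiting every example's other features.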
Uses of Dataset in org.tribuo.multilabel.baseline

Methods in org.tribuo.multilabel.baseline with parameters of type Dataset

Modifier and Type

Method

Description

ClassifierChainModel

ClassifierChainTrainer.train(Dataset<MultiLabel> examples)

ClassifierChainModel

ClassifierChainTrainer.train(Dataset<MultiLabel> examples, Map<String, com.oracle.labs.mlrg.olcut.provenance.Provenance> runProvenance)

ClassifierChainModel

ClassifierChainTrainer.train(Dataset<MultiLabel> examples, Map<String, com.oracle.labs.mlrg.olcut.provenance.Provenance> runProvenance, int invocationCount)

Model<MultiLabel>

IndependentMultiLabelTrainer.train(Dataset<MultiLabel> examples, Map<String, com.oracle.labs.mlrg.olcut.provenance.Provenance> runProvenance)

Model<MultiLabel>

IndependentMultiLabelTrainer.train(Dataset<MultiLabel> examples, Map<String, com.oracle.labs.mlrg.olcut.provenance.Provenance> runProvenance, int invocationCount)
Uses of Dataset in org.tribuo.multilabel.ensemble

Methods in org.tribuo.multilabel.ensemble with parameters of type Dataset

Modifier and Type

Method

Description

WeightedEnsembleModel<MultiLabel>

CCEnsembleTrainer.train(Dataset<MultiLabel> examples)

WeightedEnsembleModel<MultiLabel>

CCEnsembleTrainer.train(Dataset<MultiLabel> examples, Map<String, com.oracle.labs.mlrg.olcut.provenance.Provenance> runProvenance)

WeightedEnsembleModel<MultiLabel>

CCEnsembleTrainer.train(Dataset<MultiLabel> examples, Map<String, com.oracle.labs.mlrg.olcut.provenance.Provenance> runProvenance, int invocationCount)
Uses of Dataset in org.tribuo.multilabel.example

Methods in org.tribuo.multilabel.example that return Dataset

Modifier and Type

Method

Description

static Dataset<MultiLabel>

MultiLabelDataGenerator.generateTestData()

Simple test data for checking multi-label trainers.

static Dataset<MultiLabel>

MultiLabelDataGenerator.generateTrainData()

Simple training data for checking multi-label trainers.

Methods in org.tribuo.multilabel.example that return types with arguments of type Dataset

Modifier and Type

Method

Description

static com.oracle.labs.mlrg.olcut.util.Pair<Dataset<MultiLabel>, Dataset<MultiLabel>>

MultiLabelDataGenerator.generateDataset()

Generate training and testing datasets.

Uses of Dataset in org.tribuo.provenance

Constructors in org.tribuo.provenance with parameters of type Dataset

Modifier

Constructor

Description

<T extends Output<T>>

DatasetProvenance(DataProvenance sourceProvenance, com.oracle.labs.mlrg.olcut.provenance.ListProvenance<com.oracle.labs.mlrg.olcut.provenance.ObjectProvenance> transformationProvenance, Dataset<T> dataset)

Creates a dataset provenance from the supplied dataset.
Uses of Dataset in org.tribuo.regression.baseline

Methods in org.tribuo.regression.baseline with parameters of type Dataset

Modifier and Type

Method

Description

DummyRegressionModel

DummyRegressionTrainer.train(Dataset<Regressor> examples, Map<String, com.oracle.labs.mlrg.olcut.provenance.Provenance> instanceProvenance)

DummyRegressionModel

DummyRegressionTrainer.train(Dataset<Regressor> examples, Map<String, com.oracle.labs.mlrg.olcut.provenance.Provenance> instanceProvenance, int invocationCount)
Uses of Dataset in org.tribuo.regression.example

Methods in org.tribuo.regression.example that return types with arguments of type Dataset

Modifier and Type

Method

Description

static com.oracle.labs.mlrg.olcut.util.Pair<Dataset<Regressor>, Dataset<Regressor>>

RegressionDataGenerator.denseTrainTest()

Generates a train/test dataset pair which is dense in the features; each example has 4 features, {A,B,C,D}.

static com.oracle.labs.mlrg.olcut.util.Pair<Dataset<Regressor>, Dataset<Regressor>>

RegressionDataGenerator.denseTrainTest(double negate)

Generates a train/test dataset pair which is dense in the features; each example has 4 features, {A,B,C,D}.

static com.oracle.labs.mlrg.olcut.util.Pair<Dataset<Regressor>, Dataset<Regressor>>

RegressionDataGenerator.multiDimDenseTrainTest()

Generates a train/test dataset pair which is dense in the features; each example has 4 features, {A,B,C,D}.

static com.oracle.labs.mlrg.olcut.util.Pair<Dataset<Regressor>, Dataset<Regressor>>

RegressionDataGenerator.multiDimDenseTrainTest(double negate)

Generates a train/test dataset pair which is dense in the features; each example has 4 features, {A,B,C,D}.

static com.oracle.labs.mlrg.olcut.util.Pair<Dataset<Regressor>, Dataset<Regressor>>

RegressionDataGenerator.multiDimSparseTrainTest()

Generates a pair of datasets, where the features are sparse, and unknown features appear in the test data.

static com.oracle.labs.mlrg.olcut.util.Pair<Dataset<Regressor>, Dataset<Regressor>>

RegressionDataGenerator.multiDimSparseTrainTest(double negate)

Generates a pair of datasets, where the features are sparse, and unknown features appear in the test data.

static com.oracle.labs.mlrg.olcut.util.Pair<Dataset<Regressor>, Dataset<Regressor>>

RegressionDataGenerator.sparseTrainTest()

Generates a pair of datasets, where the features are sparse, and unknown features appear in the test data.

static com.oracle.labs.mlrg.olcut.util.Pair<Dataset<Regressor>, Dataset<Regressor>>

RegressionDataGenerator.sparseTrainTest(double negate)

Generates a pair of datasets, where the features are sparse, and unknown features appear in the test data.

static com.oracle.labs.mlrg.olcut.util.Pair<Dataset<Regressor>, Dataset<Regressor>>

RegressionDataGenerator.threeDimDenseTrainTest(double negate, boolean remapIndices)

Generates a train/test dataset pair which is dense in the features; each example has 4 features, {A,B,C,D}.

Uses of Dataset in org.tribuo.regression.impl

Methods in org.tribuo.regression.impl with parameters of type Dataset

Modifier and Type

Method

Description

SkeletalIndependentRegressionSparseModel

SkeletalIndependentRegressionSparseTrainer.train(Dataset<Regressor> examples)

SkeletalIndependentRegressionSparseModel

SkeletalIndependentRegressionSparseTrainer.train(Dataset<Regressor> examples, Map<String, com.oracle.labs.mlrg.olcut.provenance.Provenance> runProvenance)

SkeletalIndependentRegressionSparseModel

SkeletalIndependentRegressionSparseTrainer.train(Dataset<Regressor> examples, Map<String, com.oracle.labs.mlrg.olcut.provenance.Provenance> runProvenance, int invocationCount)

SkeletalIndependentRegressionModel

SkeletalIndependentRegressionTrainer.train(Dataset<Regressor> examples)

SkeletalIndependentRegressionModel

SkeletalIndependentRegressionTrainer.train(Dataset<Regressor> examples, Map<String, com.oracle.labs.mlrg.olcut.provenance.Provenance> runProvenance)

SkeletalIndependentRegressionModel

SkeletalIndependentRegressionTrainer.train(Dataset<Regressor> examples, Map<String, com.oracle.labs.mlrg.olcut.provenance.Provenance> runProvenance, int invocationCount)
Uses of Dataset in org.tribuo.regression.liblinear

Methods in org.tribuo.regression.liblinear with parameters of type Dataset

Modifier and Type

Method

Description

protected com.oracle.labs.mlrg.olcut.util.Pair<de.bwaldvogel.liblinear.FeatureNode[][], double[][]>

LibLinearRegressionTrainer.extractData(Dataset<Regressor> data, ImmutableOutputInfo<Regressor> outputInfo, ImmutableFeatureMap featureMap)
Uses of Dataset in org.tribuo.regression.libsvm

Methods in org.tribuo.regression.libsvm with parameters of type Dataset

Modifier and Type

Method

Description

protected com.oracle.labs.mlrg.olcut.util.Pair<libsvm.svm_node[][], double[][]>

LibSVMRegressionTrainer.extractData(Dataset<Regressor> data, ImmutableOutputInfo<Regressor> outputInfo, ImmutableFeatureMap featureMap)
Uses of Dataset in org.tribuo.regression.rtree

Methods in org.tribuo.regression.rtree with parameters of type Dataset

Modifier and Type

Method

Description

protected AbstractTrainingNode<Regressor>

CARTJointRegressionTrainer.mkTrainingNode(Dataset<Regressor> examples, AbstractTrainingNode.LeafDeterminer leafDeterminer)

protected AbstractTrainingNode<Regressor>

CARTRegressionTrainer.mkTrainingNode(Dataset<Regressor> examples, AbstractTrainingNode.LeafDeterminer leafDeterminer)

TreeModel<Regressor>

CARTRegressionTrainer.train(Dataset<Regressor> examples, Map<String, com.oracle.labs.mlrg.olcut.provenance.Provenance> runProvenance)

TreeModel<Regressor>

CARTRegressionTrainer.train(Dataset<Regressor> examples, Map<String, com.oracle.labs.mlrg.olcut.provenance.Provenance> runProvenance, int invocationCount)
Uses of Dataset in org.tribuo.regression.rtree.impl

Methods in org.tribuo.regression.rtree.impl with parameters of type Dataset

Modifier and Type

Method

Description

static RegressorTrainingNode.InvertedData

RegressorTrainingNode.invertData(Dataset<Regressor> examples)

Inverts a training dataset from row major to column major.

Constructors in org.tribuo.regression.rtree.impl with parameters of type Dataset

Modifier

Constructor

Description

JointRegressorTrainingNode(RegressorImpurity impurity, Dataset<Regressor> examples, boolean normalize, AbstractTrainingNode.LeafDeterminer leafDeterminer)

Constructor which creates the inverted file.
Uses of Dataset in org.tribuo.regression.slm

Methods in org.tribuo.regression.slm with parameters of type Dataset

Modifier and Type

Method

Description

SparseModel<Regressor>

ElasticNetCDTrainer.train(Dataset<Regressor> examples, Map<String, com.oracle.labs.mlrg.olcut.provenance.Provenance> runProvenance)

SparseModel<Regressor>

ElasticNetCDTrainer.train(Dataset<Regressor> examples, Map<String, com.oracle.labs.mlrg.olcut.provenance.Provenance> runProvenance, int invocationCount)

SparseLinearModel

SLMTrainer.train(Dataset<Regressor> examples, Map<String, com.oracle.labs.mlrg.olcut.provenance.Provenance> runProvenance)

Trains a sparse linear model.

SparseLinearModel

SLMTrainer.train(Dataset<Regressor> examples, Map<String, com.oracle.labs.mlrg.olcut.provenance.Provenance> runProvenance, int invocationCount)

Trains a sparse linear model.
Uses of Dataset in org.tribuo.regression.xgboost

Methods in org.tribuo.regression.xgboost with parameters of type Dataset

Modifier and Type

Method

Description

XGBoostModel<Regressor>

XGBoostRegressionTrainer.train(Dataset<Regressor> examples, Map<String, com.oracle.labs.mlrg.olcut.provenance.Provenance> runProvenance)

XGBoostModel<Regressor>

XGBoostRegressionTrainer.train(Dataset<Regressor> examples, Map<String, com.oracle.labs.mlrg.olcut.provenance.Provenance> runProvenance, int invocationCount)
Uses of Dataset in org.tribuo.reproducibility

Methods in org.tribuo.reproducibility that return Dataset

Modifier and Type

Method

Description

Dataset<T>

ReproUtil.recoverDataset()

Returns the Dataset that was used when the model was trained.
Uses of Dataset in org.tribuo.sequence

Methods in org.tribuo.sequence that return Dataset

Modifier and Type

Method

Description

Dataset<T>

SequenceDataset.getFlatDataset()

Returns a view on this SequenceDataset which aggregates all the examples and ignores the sequence structure.
Uses of Dataset in org.tribuo.transform

Methods in org.tribuo.transform with parameters of type Dataset

Modifier and Type

Method

Description

List<Prediction<T>>

TransformedModel.predict(Dataset<T> examples)

TransformedModel<T>

TransformTrainer.train(Dataset<T> examples)

TransformedModel<T>

TransformTrainer.train(Dataset<T> examples, Map<String, com.oracle.labs.mlrg.olcut.provenance.Provenance> instanceProvenance)

TransformedModel<T>

TransformTrainer.train(Dataset<T> examples, Map<String, com.oracle.labs.mlrg.olcut.provenance.Provenance> instanceProvenance, int invocationCount)

<T extends Output<T>> MutableDataset<T>

TransformerMap.transformDataset(Dataset<T> dataset)

Copies the supplied dataset and applies the transformers to each example in it.

<T extends Output<T>> MutableDataset<T>

TransformerMap.transformDataset(Dataset<T> dataset, boolean densify)

Copies the supplied dataset and applies the transformers to each example in it.
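The copy-then-transform behaviour described for transformDataset can be sketched with plain collections. Here each example is assumed to be a map from feature name to value, and each feature may have an ordered list of functions applied to it. TransformSketch is a hypothetical name, the representation is not TransformerMap's, and the densify option is not modelled.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.DoubleUnaryOperator;

/** Hypothetical helper, not part of Tribuo: applies per-feature transformations to a copy. */
final class TransformSketch {
    static List<Map<String, Double>> transform(
            List<Map<String, Double>> dataset,
            Map<String, List<DoubleUnaryOperator>> transformers) {
        List<Map<String, Double>> copy = new ArrayList<>();
        for (Map<String, Double> example : dataset) {
            Map<String, Double> out = new HashMap<>();
            for (Map.Entry<String, Double> e : example.entrySet()) {
                double value = e.getValue();
                // Apply this feature's transformations in order; untransformed features pass through.
                List<DoubleUnaryOperator> ops =
                    transformers.getOrDefault(e.getKey(), Collections.emptyList());
                for (DoubleUnaryOperator op : ops) {
                    value = op.applyAsDouble(value);
                }
                out.put(e.getKey(), value);
            }
            copy.add(out);
        }
        return copy;
    }

    public static void main(String[] args) {
        List<Map<String, Double>> data = List.of(Map.of("x", 2.0, "y", 5.0));
        DoubleUnaryOperator scale = v -> v * 10.0;
        DoubleUnaryOperator shift = v -> v + 1.0;
        System.out.println(transform(data, Map.of("x", List.of(scale, shift))));
    }
}
```

Because the result is built from fresh maps, the input dataset is left unmodified, mirroring the "copies the supplied dataset" wording above.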
