Tribuo 4.2.2 API

Tribuo is a library for building and deploying Machine Learning models. It provides classification, regression, clustering, anomaly detection and multi-label classification algorithms. Tribuo is strongly typed, with most of the relevant types living in the org.tribuo package. Task-specific types live in the root package of that task (e.g., org.tribuo.classification). Tribuo is modular: each package/module is scoped to have minimal dependencies, and there are few cross-cutting packages. Tribuo's development is led by Oracle Labs' Machine Learning Research Group, the source is hosted on GitHub, and more documentation is available on the project website, tribuo.org.
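To make the "strongly typed" point concrete, here is a minimal, self-contained sketch of trainers and models parameterised by their output type. Every name in it (SketchOutput, SketchLabel, SketchRegressor, SketchModel, SketchTrainer and the two trainers) is a hypothetical stand-in written for this illustration. None of them are Tribuo's classes, and the real interfaces in org.tribuo carry considerably more than is shown.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified stand-ins written for this sketch only.
// They illustrate output-typed trainers and models; they are NOT Tribuo's classes.
interface SketchOutput<T extends SketchOutput<T>> { }

final class SketchLabel implements SketchOutput<SketchLabel> {
    final String name;
    SketchLabel(String name) { this.name = name; }
}

final class SketchRegressor implements SketchOutput<SketchRegressor> {
    final double value;
    SketchRegressor(double value) { this.value = value; }
}

interface SketchModel<T extends SketchOutput<T>> {
    T predict(double[] features);
}

interface SketchTrainer<T extends SketchOutput<T>> {
    SketchModel<T> train(List<double[]> features, List<T> outputs);
}

// Baseline classifier: always predicts the most frequent training label.
final class MajorityLabelTrainer implements SketchTrainer<SketchLabel> {
    @Override
    public SketchModel<SketchLabel> train(List<double[]> features, List<SketchLabel> outputs) {
        if (outputs.isEmpty()) {
            throw new IllegalArgumentException("No training outputs supplied");
        }
        Map<String, Integer> counts = new HashMap<>();
        String best = null;
        for (SketchLabel label : outputs) {
            int count = counts.merge(label.name, 1, Integer::sum);
            if (best == null || count > counts.get(best)) {
                best = label.name;
            }
        }
        final String majority = best;
        return x -> new SketchLabel(majority);
    }
}

// Baseline regressor: always predicts the mean of the training targets.
final class MeanRegressorTrainer implements SketchTrainer<SketchRegressor> {
    @Override
    public SketchModel<SketchRegressor> train(List<double[]> features, List<SketchRegressor> outputs) {
        if (outputs.isEmpty()) {
            throw new IllegalArgumentException("No training outputs supplied");
        }
        double sum = 0.0;
        for (SketchRegressor r : outputs) {
            sum += r.value;
        }
        final double mean = sum / outputs.size();
        return x -> new SketchRegressor(mean);
    }
}

final class TypedOutputSketch {
    public static void main(String[] args) {
        List<double[]> xs = List.of(new double[]{0.0}, new double[]{1.0}, new double[]{2.0});

        SketchModel<SketchLabel> classifier = new MajorityLabelTrainer().train(
            xs, List.of(new SketchLabel("a"), new SketchLabel("b"), new SketchLabel("a")));
        SketchModel<SketchRegressor> regressor = new MeanRegressorTrainer().train(
            xs, List.of(new SketchRegressor(1.0), new SketchRegressor(2.0), new SketchRegressor(3.0)));

        System.out.println(classifier.predict(new double[]{5.0}).name);  // prints "a"
        System.out.println(regressor.predict(new double[]{5.0}).value);  // prints "2.0"

        // The next line would be rejected by the compiler, which is the point of the typing:
        // SketchModel<SketchLabel> wrong = new MeanRegressorTrainer().train(xs, List.of());
    }
}
```

Because the output type is part of each trainer's and model's signature, mixing a regression trainer with classification outputs fails at compile time rather than at prediction time.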
Package
Description
 
Provides the core interfaces and classes for using Tribuo.
Provides classes and infrastructure for anomaly detection problems.
Evaluation classes for anomaly detection.
Provides anomaly data generators used for demos and testing implementations.
Provides an interface to LibLinear-java for anomaly detection problems.
Provides an interface to LibSVM for anomaly detection problems.
Provides classes and infrastructure for multiclass classification problems.
Provides simple baseline multiclass classifiers.
Provides implementations of decision trees for classification problems.
Provides internal implementation classes for classification decision trees.
Provides classification impurity metrics for decision trees.
Provides majority vote ensemble combiners for classification along with an implementation of multiclass Adaboost.
Evaluation classes for multi-class classification.
Provides a multiclass data generator used for testing implementations, along with several synthetic data generators for 2d binary classification problems to be used in demos or tutorials.
Provides a set of main methods for interacting with classification tasks.
Provides core infrastructure for local model based explanations.
Provides an implementation of LIME (Local Interpretable Model-agnostic Explanations).
Provides an interface to LibLinear-java for classification problems.
Provides an interface to LibSVM for classification problems.
Provides an implementation of multinomial naive Bayes (i.e., naive Bayes for non-negative count data).
Provides infrastructure for SequenceModels which emit Labels at each step of the sequence.
Provides a classification sequence data generator for smoke testing implementations.
Provides an implementation of Viterbi for generating structured outputs, which can sit on top of any Label based classification model.
Provides infrastructure for Stochastic Gradient Descent for classification problems.
Provides an implementation of a linear chain CRF trained using Stochastic Gradient Descent.
Provides an implementation of a classification factorization machine using Stochastic Gradient Descent.
Provides an SGD implementation of a Kernel SVM using the Pegasos algorithm.
Provides an implementation of a classification linear model using Stochastic Gradient Descent.
Provides classification loss functions for Stochastic Gradient Descent.
Provides an interface to XGBoost for classification problems.
Provides classes and infrastructure for working with clustering problems.
Evaluation classes for clustering.
Provides clustering data generators used for demos and testing implementations.
Provides an implementation of HDBSCAN*.
Provides a multithreaded implementation of K-Means, with a configurable distance function.
Provides base classes for using liblinear from Tribuo.
The base interface to LibSVM.
Provides a K-Nearest Neighbours implementation which works across all Tribuo Output types.
Provides the base classes for models trained with stochastic gradient descent.
Provides common functionality for building decision trees, irrespective of the predicted Output.
Provides internal implementation classes for building decision trees.
Provides abstract classes for interfacing with XGBoost, abstracting away all the Output dependent parts.
Provides classes for loading in data from disk, processing it into examples, and splitting datasets for things like cross-validation and train-test splits.
Provides classes for processing columnar data and generating Examples.
Provides implementations of FieldExtractor.
Provides implementations of FeatureProcessor.
Provides implementations of FieldProcessor.
Provides implementations of ResponseProcessor.
Provides classes which can load columnar data (using a RowProcessor) from a CSV (or other character delimited format) file.
Provides classes which can load columnar data (using a RowProcessor) from a SQL source.
Provides interfaces for converting text inputs into Features and Examples.
Provides implementations of text data processors.
Provides utility datasets which subsample or otherwise transform the wrapped dataset.
Simple data sources for ingesting or aggregating data.
Provides an interface for model prediction combinations, two base classes for ensemble models, a base class for ensemble excuses, and a Bagging implementation.
Evaluation base classes, along with code for train/test splits and cross validation.
This package contains the infrastructure classes for building evaluation metrics.
Provides the base interface and implementations of model hashing, which obscures the feature names stored in a model.
Provides implementations of base classes and interfaces from org.tribuo.
This package contains the abstract implementation of an external model trained by something outside of Tribuo.
Code for uploading models to Oracle Cloud Infrastructure Data Science, and also for scoring models deployed in Oracle Cloud Infrastructure Data Science.
This package contains a Tribuo wrapper around the ONNX Runtime.
Provides feature extraction implementations which use ONNX models.
Provides an interface to TensorFlow, allowing the training of non-sequential models using any supported Tribuo output type.
Example architectures for use with Tribuo's TF interface.
Provides an interface for working with TensorFlow sequence models, using Tribuo's SequenceModel abstraction.
Provides interop with JSON formatted data, along with tools for interacting with JSON provenance objects.
Contains the implementation of Tribuo's math library, its gradient descent optimisers, kernels and a set of math related utils.
Provides a Kernel interface for Mercer kernels, along with implementations of standard kernels.
Provides a linear algebra system used for numerical operations in Tribuo.
 
Provides implementations of StochasticGradientOptimiser.
Provides some utility tensors for use in gradient optimisers.
Provides math related util classes.
Provides classes and infrastructure for working with multi-label classification problems.
Provides implementations of binary relevance based multi-label classification algorithms.
Provides a multi-label ensemble combiner that performs a (possibly weighted) majority vote among each label independently, along with an implementation of classifier chain ensembles.
Evaluation classes for multi-label classification using MultiLabel.
Provides a multi-label data generator for testing implementations and a configurable data source suitable for demos and tests.
Provides infrastructure for Stochastic Gradient Descent for multi-label classification problems.
Provides an implementation of a multi-label classification factorization machine model using Stochastic Gradient Descent.
Provides an implementation of a multi-label classification linear model using Stochastic Gradient Descent.
Provides multi-label classification loss functions for Stochastic Gradient Descent.
Provides Tribuo specific infrastructure for the Provenance system which tracks models and datasets.
Provides internal implementations for empty provenance classes and TrainerProvenance.
Provides classes and infrastructure for regression problems with single or multiple output dimensions.
Provides simple baseline regression predictors.
Provides EnsembleCombiner implementations for working with multi-output regression problems.
Evaluation classes for single or multi-dimensional regression.
Provides some example regression data generators for testing implementations.
Provides skeletal implementations of Regressor Trainer that can wrap a single dimension trainer/model and produce one prediction per dimension independently.
Provides an interface to liblinear for regression problems.
Provides an interface to LibSVM for regression problems.
Provides an implementation of decision trees for regression problems.
Provides internal implementation classes for the regression trees.
Provides implementations of regression tree impurity metrics.
Provides infrastructure for Stochastic Gradient Descent based regression models.
Provides an implementation of factorization machines for regression using Stochastic Gradient Descent.
Provides an implementation of linear regression using Stochastic Gradient Descent.
Provides regression loss functions for Stochastic Gradient Descent.
Provides implementations of sparse linear regression using various forms of regularisation penalty.
Provides an interface to XGBoost for regression problems.
Reproducibility utility based on Tribuo's provenance objects.
Provides core classes for working with sequences of Examples.
This package provides helper classes for Tribuo's unit tests.
Provides infrastructure for applying transformations to a Dataset.
Provides implementations of standard transformations like binning, scaling, taking logs and exponents.
Provides utilities which don't have other Tribuo dependencies.
This package provides static classes of information theoretic functions.
This package provides demos for the information theoretic function classes in org.tribuo.util.infotheory.
This package provides the implementations and helper classes for the information theoretic functions in org.tribuo.util.infotheory.
Interfaces and utilities for writing ONNX models from Java.
Core definitions for tokenization.
Simple fixed rule tokenizers.
Provides an implementation of a Wordpiece tokenizer which implements the Tribuo Tokenizer API.
OLCUT Options implementations which can construct Tokenizers of various types.
An implementation of a "universal" tokenizer which will split on word boundaries or character boundaries for languages where word boundaries are contextual.
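As a rough illustration of splitting on word boundaries, the idea behind the last entry above, the following sketch uses the JDK's java.text.BreakIterator. It is not Tribuo's tokenizer and does not use the Tribuo Tokenizer API. The class name WordBoundarySketch and its tokenize method are hypothetical, and the sketch covers only the word-boundary case, not the per-character handling for languages where word boundaries are contextual.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper for illustration; not part of Tribuo.
final class WordBoundarySketch {

    // Splits text at the JDK's word boundaries and keeps only the pieces
    // that contain at least one letter or digit (dropping spaces and punctuation).
    static List<String> tokenize(String text) {
        BreakIterator boundaries = BreakIterator.getWordInstance(Locale.ROOT);
        boundaries.setText(text);
        List<String> tokens = new ArrayList<>();
        int start = boundaries.first();
        for (int end = boundaries.next(); end != BreakIterator.DONE; start = end, end = boundaries.next()) {
            String piece = text.substring(start, end);
            if (piece.codePoints().anyMatch(Character::isLetterOrDigit)) {
                tokens.add(piece);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("Hello, world! 42")); // prints [Hello, world, 42]
    }
}
```

Filtering on letters and digits is a design choice made for this sketch. A production tokenizer would typically also report token offsets and types.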