public final class XGBoostExternalModel<T extends Output<T>> extends ExternalModel<T,ml.dmlc.xgboost4j.java.DMatrix,float[][]>
A Model which wraps an XGBoost Booster that was trained by a system other than Tribuo.
XGBoost is a fast implementation of gradient boosted decision trees.
Throws IllegalStateException if the XGBoost C++ library fails to load or throws an exception.
See:
Chen T, Guestrin C. "XGBoost: A Scalable Tree Boosting System" Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2016.
and for the original algorithm:
Friedman JH. "Greedy Function Approximation: a Gradient Boosting Machine" Annals of Statistics, 2001.
Note: XGBoost requires a native library. On macOS this library requires libomp (which can be installed via Homebrew). On Windows the native library must be compiled into a jar, as it is not contained in the official XGBoost binary on Maven Central.
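Because the wrapped model was trained outside Tribuo, the factory methods of this class take a featureMapping from Tribuo feature names to XGBoost integer ids. The following is a minimal sketch of building such a map, assuming the XGBoost ids are the zero-based column positions of the training matrix. FeatureMappingSketch and fromColumnOrder are hypothetical names used for illustration and are not part of Tribuo.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper, not part of Tribuo: builds a feature name to column index map. */
final class FeatureMappingSketch {
    private FeatureMappingSketch() {}

    /**
     * Maps each feature name to its zero-based position in the supplied list.
     * Assumption: the list is in the column order used when the model was trained.
     */
    static Map<String, Integer> fromColumnOrder(List<String> columnNames) {
        Map<String, Integer> mapping = new LinkedHashMap<>();
        for (int i = 0; i < columnNames.size(); i++) {
            Integer previous = mapping.put(columnNames.get(i), i);
            if (previous != null) {
                // Two columns with the same name would make the mapping ambiguous.
                throw new IllegalArgumentException("Duplicate feature name: " + columnNames.get(i));
            }
        }
        return mapping;
    }
}
```

The resulting map could then be passed as the featureMapping argument. If the external training system numbered its columns differently, the ids have to be supplied explicitly instead.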
Modifier and Type | Field and Description |
---|---|
protected ml.dmlc.xgboost4j.java.Booster | model - Transient as we rely upon the native serialisation mechanism to bytes rather than Java serializing the Booster. |
Fields inherited from class ExternalModel: DEFAULT_BATCH_SIZE, featureBackwardMapping, featureForwardMapping
Fields inherited from class Model: ALL_OUTPUTS, BIAS_FEATURE, featureIDMap, generatesProbabilities, name, outputIDInfo, provenance, provenanceOutput
Modifier and Type | Method and Description |
---|---|
protected ml.dmlc.xgboost4j.java.DMatrix | convertFeatures(SparseVector input) - Converts from a SparseVector using the external model's indices into the ingestion format for the external model. |
protected ml.dmlc.xgboost4j.java.DMatrix | convertFeaturesList(List<SparseVector> input) - Converts from a list of SparseVector using the external model's indices into the ingestion format for the external model. |
protected List<Prediction<T>> | convertOutput(float[][] output, int[] numValidFeatures, List<Example<T>> examples) - Converts the output of the external model into a list of Predictions. |
protected Prediction<T> | convertOutput(float[][] output, int numValidFeatures, Example<T> example) - Converts the output of the external model into a Prediction. |
protected XGBoostExternalModel<T> | copy(String newName, ModelProvenance newProvenance) - Copies a model, replacing its provenance and name with the supplied values. |
static <T extends Output<T>> XGBoostExternalModel<T> | createXGBoostModel(OutputFactory<T> factory, Map<String,Integer> featureMapping, Map<T,Integer> outputMapping, XGBoostOutputConverter<T> outputFunc, ml.dmlc.xgboost4j.java.Booster model, URL provenanceLocation) - Creates an XGBoostExternalModel from the supplied model. |
static <T extends Output<T>> XGBoostExternalModel<T> | createXGBoostModel(OutputFactory<T> factory, Map<String,Integer> featureMapping, Map<T,Integer> outputMapping, XGBoostOutputConverter<T> outputFunc, Path path) - Creates an XGBoostExternalModel from the supplied model on disk. |
static <T extends Output<T>> XGBoostExternalModel<T> | createXGBoostModel(OutputFactory<T> factory, Map<String,Integer> featureMapping, Map<T,Integer> outputMapping, XGBoostOutputConverter<T> outputFunc, String path) - Creates an XGBoostExternalModel from the supplied model on disk. |
protected float[][] | externalPrediction(ml.dmlc.xgboost4j.java.DMatrix input) - Runs the external model's prediction function. |
Map<String,List<com.oracle.labs.mlrg.olcut.util.Pair<String,Double>>> | getTopFeatures(int n) - Gets the top n features associated with this model. |
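The convertFeaturesList method turns a list of sparse vectors into the ingestion format of the external model. The following is a minimal sketch of one common ingestion layout for sparse data, compressed sparse row (CSR) arrays. It is an illustration only and makes no claim about how this class builds its DMatrix. CsrSketch is a hypothetical name, and each sparse row is represented here as a sorted map from external feature index to value.

```java
import java.util.List;
import java.util.Map;
import java.util.SortedMap;

/** Hypothetical holder for CSR arrays, not part of Tribuo or XGBoost4J. */
final class CsrSketch {
    final long[] rowHeaders; // rowHeaders[i] .. rowHeaders[i + 1] delimit the entries of row i
    final int[] colIndices;  // external feature index of each stored value
    final float[] values;    // the stored feature values

    private CsrSketch(long[] rowHeaders, int[] colIndices, float[] values) {
        this.rowHeaders = rowHeaders;
        this.colIndices = colIndices;
        this.values = values;
    }

    /** Flattens sparse rows (feature index to value) into CSR arrays. */
    static CsrSketch fromRows(List<SortedMap<Integer, Float>> rows) {
        int nonZeros = 0;
        for (SortedMap<Integer, Float> row : rows) {
            nonZeros += row.size();
        }
        long[] headers = new long[rows.size() + 1];
        int[] indices = new int[nonZeros];
        float[] data = new float[nonZeros];
        int pos = 0;
        for (int r = 0; r < rows.size(); r++) {
            headers[r] = pos; // row r starts at the current write position
            for (Map.Entry<Integer, Float> e : rows.get(r).entrySet()) {
                indices[pos] = e.getKey();
                data[pos] = e.getValue();
                pos++;
            }
        }
        headers[rows.size()] = pos; // final header marks the end of the last row
        return new CsrSketch(headers, indices, data);
    }
}
```

An empty row contributes no values, so its start and end headers are equal.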
Methods inherited from class ExternalModel: createFeatureMap, createOutputInfo, getBatchSize, getExcuse, innerPredict, predict, setBatchSize
Methods inherited from class Model: copy, generatesProbabilities, getExcuses, getFeatureIDMap, getName, getOutputIDInfo, getProvenance, predict, predict, setName, toString, validate
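The convertOutput methods receive the raw float[][] produced by the external model, one row per example. The following is a minimal sketch of one possible conversion step, assuming each row holds one score per class (as a multi-class model might emit) and that the highest score wins. ArgmaxSketch is a hypothetical name. This is not the XGBoostOutputConverter implementation that ships with Tribuo.

```java
/** Hypothetical helper, not part of Tribuo: picks the highest-scoring column per row. */
final class ArgmaxSketch {
    private ArgmaxSketch() {}

    /** Returns the index of the largest score, the first one on ties. */
    static int argmax(float[] scores) {
        if (scores.length == 0) {
            throw new IllegalArgumentException("Empty score row");
        }
        int best = 0;
        for (int i = 1; i < scores.length; i++) {
            if (scores[i] > scores[best]) {
                best = i;
            }
        }
        return best;
    }

    /** Applies argmax to every row of the external model's output. */
    static int[] argmaxRows(float[][] output) {
        int[] winners = new int[output.length];
        for (int r = 0; r < output.length; r++) {
            winners[r] = argmax(output[r]);
        }
        return winners;
    }
}
```

Each winning index is an XGBoost output id, so turning it back into a Tribuo output would need the inverse of the outputMapping supplied when the model was created.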
protected transient ml.dmlc.xgboost4j.java.Booster model
protected ml.dmlc.xgboost4j.java.DMatrix convertFeatures(SparseVector input)
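Description copied from class: ExternalModel
Converts from a SparseVector using the external model's indices into the ingestion format for the external model.
Specified by: convertFeatures in class ExternalModel<T extends Output<T>,ml.dmlc.xgboost4j.java.DMatrix,float[][]>
Parameters:
input - The features using external indices.
protected ml.dmlc.xgboost4j.java.DMatrix convertFeaturesList(List<SparseVector> input)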
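Description copied from class: ExternalModel
Converts from a list of SparseVector using the external model's indices into the ingestion format for the external model.
Specified by: convertFeaturesList in class ExternalModel<T extends Output<T>,ml.dmlc.xgboost4j.java.DMatrix,float[][]>
Parameters:
input - The features using external indices.
protected float[][] externalPrediction(ml.dmlc.xgboost4j.java.DMatrix input)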
Description copied from class: ExternalModel
Runs the external model's prediction function.
Specified by: externalPrediction in class ExternalModel<T extends Output<T>,ml.dmlc.xgboost4j.java.DMatrix,float[][]>
Parameters:
input - The input in the external model's format.
protected Prediction<T> convertOutput(float[][] output, int numValidFeatures, Example<T> example)
Description copied from class: ExternalModel
Converts the output of the external model into a Prediction.
Specified by: convertOutput in class ExternalModel<T extends Output<T>,ml.dmlc.xgboost4j.java.DMatrix,float[][]>
Parameters:
output - The output of the external model.
numValidFeatures - The number of valid features in the input.
example - The input example, used to construct the Prediction.
protected List<Prediction<T>> convertOutput(float[][] output, int[] numValidFeatures, List<Example<T>> examples)
Description copied from class: ExternalModel
Converts the output of the external model into a list of Predictions.
Specified by: convertOutput in class ExternalModel<T extends Output<T>,ml.dmlc.xgboost4j.java.DMatrix,float[][]>
Parameters:
output - The output of the external model.
numValidFeatures - An array with the number of valid features in each example.
examples - The input examples, used to construct the Predictions.
public Map<String,List<com.oracle.labs.mlrg.olcut.util.Pair<String,Double>>> getTopFeatures(int n)
Description copied from class: Model
Gets the top n features associated with this model.
If the model does not produce per output feature lists, it returns a map with a single element with key Model.ALL_OUTPUTS.
If the model cannot describe its top features then it returns Collections.emptyMap().
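The contract described above can be sketched as follows for a model that has a single score per feature rather than per output lists. This is an illustration under stated assumptions: TopFeaturesSketch is a hypothetical class, the scores are arbitrary example values, the ALL_OUTPUTS constant is a stand-in for Model.ALL_OUTPUTS, and Map.Entry is used in place of the OLCUT Pair type to keep the sketch dependency free.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

/** Hypothetical helper, not part of Tribuo: selects the n highest-scoring features. */
final class TopFeaturesSketch {
    // Stand-in for Model.ALL_OUTPUTS; the literal value here is an assumption.
    static final String ALL_OUTPUTS = "ALL_OUTPUTS";

    private TopFeaturesSketch() {}

    /**
     * Returns the n highest-scoring features under the single ALL_OUTPUTS key.
     * A negative n returns every feature, and an empty score map yields an empty result.
     */
    static Map<String, List<Map.Entry<String, Double>>> topFeatures(Map<String, Double> scores, int n) {
        if (scores.isEmpty()) {
            return Collections.emptyMap();
        }
        List<Map.Entry<String, Double>> sorted = new ArrayList<>();
        for (Map.Entry<String, Double> e : scores.entrySet()) {
            sorted.add(new AbstractMap.SimpleImmutableEntry<>(e.getKey(), e.getValue()));
        }
        // Highest score first.
        sorted.sort((a, b) -> Double.compare(b.getValue(), a.getValue()));
        int limit = n < 0 ? sorted.size() : Math.min(n, sorted.size());
        List<Map.Entry<String, Double>> top = new ArrayList<>(sorted.subList(0, limit));
        return Collections.singletonMap(ALL_OUTPUTS, top);
    }
}
```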
Specified by: getTopFeatures in class Model<T extends Output<T>>
Parameters:
n - the number of features to return. If this value is less than 0, all features should be returned for each class, unless the model cannot score its features.
protected XGBoostExternalModel<T> copy(String newName, ModelProvenance newProvenance)
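Description copied from class: Model
Copies a model, replacing its provenance and name with the supplied values.
Used to provide the provenance removal functionality.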
public static <T extends Output<T>> XGBoostExternalModel<T> createXGBoostModel(OutputFactory<T> factory, Map<String,Integer> featureMapping, Map<T,Integer> outputMapping, XGBoostOutputConverter<T> outputFunc, String path)
Creates an XGBoostExternalModel from the supplied model on disk.
Type Parameters:
T - The type of the output.
Parameters:
factory - The output factory to use.
featureMapping - The feature mapping between Tribuo names and XGBoost integer ids.
outputMapping - The output mapping between Tribuo outputs and XGBoost integer ids.
outputFunc - The XGBoostOutputConverter function for the output type.
path - The path to the model on disk.
public static <T extends Output<T>> XGBoostExternalModel<T> createXGBoostModel(OutputFactory<T> factory, Map<String,Integer> featureMapping, Map<T,Integer> outputMapping, XGBoostOutputConverter<T> outputFunc, Path path)
Creates an XGBoostExternalModel from the supplied model on disk.
Type Parameters:
T - The type of the output.
Parameters:
factory - The output factory to use.
featureMapping - The feature mapping between Tribuo names and XGBoost integer ids.
outputMapping - The output mapping between Tribuo outputs and XGBoost integer ids.
outputFunc - The XGBoostOutputConverter function for the output type.
path - The path to the model on disk.
public static <T extends Output<T>> XGBoostExternalModel<T> createXGBoostModel(OutputFactory<T> factory, Map<String,Integer> featureMapping, Map<T,Integer> outputMapping, XGBoostOutputConverter<T> outputFunc, ml.dmlc.xgboost4j.java.Booster model, URL provenanceLocation)
Creates an XGBoostExternalModel from the supplied model.
Note: the provenance system requires that the URL point to a valid local file and will throw an exception if it does not. However, it doesn't check that the file is where the Booster was created from.
We will replace this entry point with one that accepts useful provenance information for an in-memory Booster object in a future release, and deprecate this endpoint at that time.
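Since the provenance URL must point to a valid local file, a caller may want to derive it from the path the Booster was loaded from and check it up front. The following is a minimal sketch using only the JDK. ProvenanceUrlSketch and localFileUrl are hypothetical names, and the check shown is an assumption about what counts as valid, not the check performed by the provenance system.

```java
import java.net.MalformedURLException;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper, not part of Tribuo: derives a file URL for a model on disk. */
final class ProvenanceUrlSketch {
    private ProvenanceUrlSketch() {}

    /**
     * Returns a file: URL for the supplied path.
     * Rejects paths that do not refer to an existing regular file.
     */
    static URL localFileUrl(Path modelPath) throws MalformedURLException {
        if (!Files.isRegularFile(modelPath)) {
            throw new IllegalArgumentException("Not a regular file: " + modelPath);
        }
        return modelPath.toAbsolutePath().toUri().toURL();
    }
}
```

The returned URL could then be supplied as the provenanceLocation argument alongside the in-memory Booster.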
Type Parameters:
T - The type of the output.
Parameters:
factory - The output factory to use.
featureMapping - The feature mapping between Tribuo names and XGBoost integer ids.
outputMapping - The output mapping between Tribuo outputs and XGBoost integer ids.
outputFunc - The XGBoostOutputConverter function for the output type.
model - The XGBoost model to wrap.
provenanceLocation - The location where the model was loaded from.
Copyright © 2015–2021 Oracle and/or its affiliates. All rights reserved.