Class XGBoostExternalModel<T extends Output<T>>
- All Implemented Interfaces:
com.oracle.labs.mlrg.olcut.provenance.Provenancable<ModelProvenance>, Serializable
A model which wraps an XGBoost Booster that was trained by a system other than Tribuo.
XGBoost is a fast implementation of gradient boosted decision trees.
Throws IllegalStateException if the XGBoost C++ library fails to load or throws an exception.
See:
Chen T, Guestrin C. "XGBoost: A Scalable Tree Boosting System" Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2016.
and for the original algorithm:
Friedman JH. "Greedy Function Approximation: a Gradient Boosting Machine" Annals of Statistics, 2001.
Note: XGBoost requires a native library. On macOS this library requires libomp (which can be installed via Homebrew). On Windows the native library must be compiled into a jar, as it is not contained in the official XGBoost binary on Maven Central.
-
Field Summary
Fields (modifier and type, field, description)
protected ml.dmlc.xgboost4j.java.Booster model
    Transient as we rely upon the native serialisation mechanism to bytes rather than Java serializing the Booster.
Fields inherited from class org.tribuo.interop.ExternalModel
DEFAULT_BATCH_SIZE, featureBackwardMapping, featureForwardMapping
Fields inherited from class org.tribuo.Model
ALL_OUTPUTS, BIAS_FEATURE, featureIDMap, generatesProbabilities, name, outputIDInfo, provenance, provenanceOutput
-
Method Summary
Methods (modifier and type, method, description)
protected ml.dmlc.xgboost4j.java.DMatrix convertFeatures(SparseVector input)
    Converts from a SparseVector using the external model's indices into the ingestion format for the external model.
protected ml.dmlc.xgboost4j.java.DMatrix convertFeaturesList(List<SparseVector> input)
    Converts from a list of SparseVector using the external model's indices into the ingestion format for the external model.
protected List<Prediction<T>> convertOutput(float[][] output, int[] numValidFeatures, List<Example<T>> examples)
    Converts the output of the external model into a list of Predictions.
protected Prediction<T> convertOutput(float[][] output, int numValidFeatures, Example<T> example)
    Converts the output of the external model into a Prediction.
protected XGBoostExternalModel<T> copy(String newName, ModelProvenance newProvenance)
    Copies a model, replacing its provenance and name with the supplied values.
static <T extends Output<T>> XGBoostExternalModel<T> createXGBoostModel(OutputFactory<T> factory, Map<String, Integer> featureMapping, Map<T, Integer> outputMapping, XGBoostOutputConverter<T> outputFunc, String path)
    Creates an XGBoostExternalModel from the supplied model on disk.
static <T extends Output<T>> XGBoostExternalModel<T> createXGBoostModel(OutputFactory<T> factory, Map<String, Integer> featureMapping, Map<T, Integer> outputMapping, XGBoostOutputConverter<T> outputFunc, Path path)
    Creates an XGBoostExternalModel from the supplied model on disk.
static <T extends Output<T>> XGBoostExternalModel<T> createXGBoostModel(OutputFactory<T> factory, Map<String, Integer> featureMapping, Map<T, Integer> outputMapping, XGBoostOutputConverter<T> outputFunc, ml.dmlc.xgboost4j.java.Booster model, URL provenanceLocation)
    Creates an XGBoostExternalModel from the supplied model.
protected float[][] externalPrediction(ml.dmlc.xgboost4j.java.DMatrix input)
    Runs the external model's prediction function.
Map<String, List<Pair<String, Double>>> getTopFeatures(int n)
    Gets the top n features associated with this model.
Methods inherited from class org.tribuo.interop.ExternalModel
createFeatureMap, createOutputInfo, getBatchSize, getExcuse, innerPredict, predict, setBatchSize
Methods inherited from class org.tribuo.Model
copy, generatesProbabilities, getExcuses, getFeatureIDMap, getName, getOutputIDInfo, getProvenance, predict, predict, setName, toString, validate
-
Field Details
-
model
Transient as we rely upon the native serialisation mechanism to bytes rather than Java serializing the Booster.
-
-
Method Details
-
convertFeatures
Description copied from class: ExternalModel
Converts from a SparseVector using the external model's indices into the ingestion format for the external model.
- Specified by:
convertFeatures in class ExternalModel<T extends Output<T>, ml.dmlc.xgboost4j.java.DMatrix, float[][]>
- Parameters:
input - The features using external indices.
- Returns:
The ingestion format for the external model.
-
convertFeaturesList
Description copied from class: ExternalModel
Converts from a list of SparseVector using the external model's indices into the ingestion format for the external model.
- Specified by:
convertFeaturesList in class ExternalModel<T extends Output<T>, ml.dmlc.xgboost4j.java.DMatrix, float[][]>
- Parameters:
input - The features using external indices.
- Returns:
The ingestion format for the external model.
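As a rough illustration of what packing a list of sparse vectors into an ingestion format can involve, the sketch below flattens sparse rows into compressed sparse row (CSR) arrays, a layout that XGBoost's DMatrix can be built from. This is not Tribuo's implementation: SparseRow, Csr, and toCsr are hypothetical stand-ins written with plain Java arrays, and no XGBoost call is made.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class CsrSketch {
    /** Hypothetical stand-in for a sparse vector: parallel arrays of column ids and values. */
    static final class SparseRow {
        final int[] indices;
        final float[] values;

        SparseRow(int[] indices, float[] values) {
            if (indices.length != values.length) {
                throw new IllegalArgumentException("indices and values must have the same length");
            }
            this.indices = indices;
            this.values = values;
        }
    }

    /** Hypothetical holder for the three CSR arrays. */
    static final class Csr {
        final long[] rowHeaders; // offset of each row's first entry, plus a final end offset
        final int[] indices;     // column id of every stored value
        final float[] data;      // the stored values themselves

        Csr(long[] rowHeaders, int[] indices, float[] data) {
            this.rowHeaders = rowHeaders;
            this.indices = indices;
            this.data = data;
        }
    }

    /** Concatenates the rows into CSR arrays, recording where each row starts. */
    static Csr toCsr(List<SparseRow> rows) {
        int total = 0;
        for (SparseRow row : rows) {
            total += row.indices.length;
        }
        long[] rowHeaders = new long[rows.size() + 1];
        int[] indices = new int[total];
        float[] data = new float[total];
        int position = 0;
        for (int i = 0; i < rows.size(); i++) {
            SparseRow row = rows.get(i);
            rowHeaders[i] = position;
            System.arraycopy(row.indices, 0, indices, position, row.indices.length);
            System.arraycopy(row.values, 0, data, position, row.values.length);
            position += row.indices.length;
        }
        rowHeaders[rows.size()] = position;
        return new Csr(rowHeaders, indices, data);
    }

    public static void main(String[] args) {
        List<SparseRow> rows = new ArrayList<>();
        rows.add(new SparseRow(new int[]{0, 3}, new float[]{1.0f, 2.5f}));
        rows.add(new SparseRow(new int[]{2}, new float[]{4.0f}));
        Csr csr = toCsr(rows);
        System.out.println(Arrays.toString(csr.rowHeaders)); // prints [0, 2, 3]
    }
}
```

Batching rows this way lets a single native prediction call score many examples at once, which is the reason a list variant exists alongside the single-vector conversion.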
-
externalPrediction
Description copied from class: ExternalModel
Runs the external model's prediction function.
- Specified by:
externalPrediction in class ExternalModel<T extends Output<T>, ml.dmlc.xgboost4j.java.DMatrix, float[][]>
- Parameters:
input - The input in the external model's format.
- Returns:
The output in the external model's format.
-
convertOutput
Description copied from class: ExternalModel
Converts the output of the external model into a Prediction.
- Specified by:
convertOutput in class ExternalModel<T extends Output<T>, ml.dmlc.xgboost4j.java.DMatrix, float[][]>
- Parameters:
output - The output of the external model.
numValidFeatures - The number of valid features in the input.
example - The input example, used to construct the Prediction.
- Returns:
A Tribuo Prediction.
-
convertOutput
protected List<Prediction<T>> convertOutput(float[][] output, int[] numValidFeatures, List<Example<T>> examples)
Description copied from class: ExternalModel
Converts the output of the external model into a list of Predictions.
- Specified by:
convertOutput in class ExternalModel<T extends Output<T>, ml.dmlc.xgboost4j.java.DMatrix, float[][]>
- Parameters:
output - The output of the external model.
numValidFeatures - An array with the number of valid features in each example.
examples - The input examples, used to construct the Predictions.
- Returns:
A list of Tribuo Predictions.
-
getTopFeatures
Description copied from class: Model
Gets the top n features associated with this model.
If the model does not produce per output feature lists, it returns a map with a single element with key Model.ALL_OUTPUTS.
If the model cannot describe its top features then it returns Collections.emptyMap().
- Specified by:
getTopFeatures in class Model<T extends Output<T>>
- Parameters:
n - the number of features to return. If this value is less than 0, all features should be returned for each class, unless the model cannot score its features.
- Returns:
a map from string outputs to an ordered list of pairs of feature names and weights associated with that feature in the model
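The contract above can be sketched with plain collections. This is an illustration of the documented behaviour only, not the method's actual implementation: the topFeatures helper, the ALL_OUTPUTS placeholder constant, the use of Map.Entry in place of a pair type, and the descending-score ordering are all assumptions made for the example.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class TopFeaturesSketch {
    /** Placeholder key standing in for Model.ALL_OUTPUTS (an assumption for this sketch). */
    static final String ALL_OUTPUTS = "ALL_OUTPUTS";

    /**
     * Returns a single-entry map keyed by ALL_OUTPUTS holding the n highest scoring features.
     * A negative n returns every feature; an empty score table yields an empty map.
     */
    static Map<String, List<Map.Entry<String, Double>>> topFeatures(Map<String, Double> scores, int n) {
        if (scores.isEmpty()) {
            // Mirrors "cannot describe its top features".
            return Collections.emptyMap();
        }
        List<Map.Entry<String, Double>> ranked = new ArrayList<>();
        for (Map.Entry<String, Double> e : scores.entrySet()) {
            ranked.add(new SimpleImmutableEntry<>(e.getKey(), e.getValue()));
        }
        // Assumed ordering: largest score first.
        ranked.sort((a, b) -> Double.compare(b.getValue(), a.getValue()));
        int limit = n < 0 ? ranked.size() : Math.min(n, ranked.size());
        List<Map.Entry<String, Double>> top = new ArrayList<>(ranked.subList(0, limit));
        return Collections.singletonMap(ALL_OUTPUTS, top);
    }

    public static void main(String[] args) {
        Map<String, Double> scores = new LinkedHashMap<>();
        scores.put("f0", 0.5);
        scores.put("f1", 2.0);
        scores.put("f2", 1.0);
        System.out.println(topFeatures(scores, 2));
        System.out.println(topFeatures(scores, -1));
    }
}
```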
-
copy
Description copied from class: Model
Copies a model, replacing its provenance and name with the supplied values.
Used to provide the provenance removal functionality.
-
createXGBoostModel
public static <T extends Output<T>> XGBoostExternalModel<T> createXGBoostModel(OutputFactory<T> factory, Map<String, Integer> featureMapping, Map<T, Integer> outputMapping, XGBoostOutputConverter<T> outputFunc, String path)
Creates an XGBoostExternalModel from the supplied model on disk.
- Type Parameters:
T - The type of the output.
- Parameters:
factory - The output factory to use.
featureMapping - The feature mapping between Tribuo names and XGBoost integer ids.
outputMapping - The output mapping between Tribuo outputs and XGBoost integer ids.
outputFunc - The XGBoostOutputConverter function for the output type.
path - The path to the model on disk.
- Returns:
An XGBoostExternalModel ready to score new inputs.
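The featureMapping argument ties each Tribuo feature name to the integer id the XGBoost model uses for that feature. A minimal sketch of building such a map follows, assuming the external model numbered its columns from zero in the order of its training columns. The buildFeatureMapping helper and the column names are hypothetical, and the sketch stops short of calling createXGBoostModel.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class FeatureMappingSketch {
    /**
     * Assigns ids 0..n-1 to the column names in order (an assumed numbering scheme),
     * rejecting duplicates because two names cannot share one meaning in the mapping.
     */
    static Map<String, Integer> buildFeatureMapping(List<String> columnNames) {
        Map<String, Integer> mapping = new LinkedHashMap<>();
        for (int i = 0; i < columnNames.size(); i++) {
            String name = columnNames.get(i);
            if (mapping.put(name, i) != null) {
                throw new IllegalArgumentException("Duplicate feature name: " + name);
            }
        }
        return mapping;
    }

    public static void main(String[] args) {
        // Hypothetical column names from the system that trained the model.
        List<String> columns = Arrays.asList("age", "height", "weight");
        System.out.println(buildFeatureMapping(columns)); // prints {age=0, height=1, weight=2}
    }
}
```

The outputMapping parameter follows the same idea, with Tribuo outputs rather than strings as the keys.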
-
createXGBoostModel
public static <T extends Output<T>> XGBoostExternalModel<T> createXGBoostModel(OutputFactory<T> factory, Map<String, Integer> featureMapping, Map<T, Integer> outputMapping, XGBoostOutputConverter<T> outputFunc, Path path)
Creates an XGBoostExternalModel from the supplied model on disk.
- Type Parameters:
T - The type of the output.
- Parameters:
factory - The output factory to use.
featureMapping - The feature mapping between Tribuo names and XGBoost integer ids.
outputMapping - The output mapping between Tribuo outputs and XGBoost integer ids.
outputFunc - The XGBoostOutputConverter function for the output type.
path - The path to the model on disk.
- Returns:
An XGBoostExternalModel ready to score new inputs.
-
createXGBoostModel
public static <T extends Output<T>> XGBoostExternalModel<T> createXGBoostModel(OutputFactory<T> factory, Map<String, Integer> featureMapping, Map<T, Integer> outputMapping, XGBoostOutputConverter<T> outputFunc, ml.dmlc.xgboost4j.java.Booster model, URL provenanceLocation)
Creates an XGBoostExternalModel from the supplied model.
Note: the provenance system requires that the URL point to a valid local file and will throw an exception if it does not. However, it doesn't check that the file is where the Booster was created from. We will replace this entry point with one that accepts useful provenance information for an in-memory Booster object in a future release, and deprecate this endpoint at that time.
- Type Parameters:
T - The type of the output.
- Parameters:
factory - The output factory to use.
featureMapping - The feature mapping between Tribuo names and XGBoost integer ids.
outputMapping - The output mapping between Tribuo outputs and XGBoost integer ids.
outputFunc - The XGBoostOutputConverter function for the output type.
model - The XGBoost model to wrap.
provenanceLocation - The location where the model was loaded from.
- Returns:
An XGBoostExternalModel ready to score new inputs.
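Because the provenance URL must point to a valid local file, it can help to derive it from a checked Path before passing it in. The sketch below shows one way to do that with the standard library only. The toLocalFileUrl helper and the temporary file are hypothetical, and the up-front check is a convenience for the caller, not a description of what the provenance system does internally.

```java
import java.io.IOException;
import java.net.MalformedURLException;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;

final class ProvenanceUrlSketch {
    /** Converts a path to a file: URL, failing early if no regular file exists there. */
    static URL toLocalFileUrl(Path modelPath) throws MalformedURLException {
        if (!Files.isRegularFile(modelPath)) {
            throw new IllegalArgumentException("Not a regular file: " + modelPath);
        }
        return modelPath.toAbsolutePath().toUri().toURL();
    }

    public static void main(String[] args) throws IOException {
        // A temporary file stands in for a saved model file in this sketch.
        Path tmp = Files.createTempFile("booster", ".bin");
        try {
            URL url = toLocalFileUrl(tmp);
            System.out.println(url.getProtocol()); // prints file
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```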
-