Class XGBoostModel<T extends Output<T>>
- All Implemented Interfaces:
com.oracle.labs.mlrg.olcut.provenance.Provenancable<ModelProvenance>, Serializable, ProtoSerializable<org.tribuo.protos.core.ModelProto>
A Model which wraps around an XGBoost.Booster.
XGBoost is a fast implementation of gradient boosted decision trees.
Throws IllegalStateException if the XGBoost C++ library fails to load or throws an exception.
See:
Chen T, Guestrin C. "XGBoost: A Scalable Tree Boosting System" Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2016.
and for the original algorithm:
Friedman JH. "Greedy Function Approximation: a Gradient Boosting Machine" Annals of statistics, 2001.
N.B.: XGBoost4J wraps the native C implementation of xgboost, which links to various C libraries, including libgomp and glibc (on Linux). If you're running on Alpine, which does not natively use glibc, you'll need to install glibc into the container. The macOS binary on Maven Central is compiled without OpenMP support, meaning that XGBoost is single threaded on macOS. If necessary, you can recompile the macOS binary with OpenMP support after installing libomp from Homebrew.
-
Field Summary
| Modifier and Type | Field | Description |
| static final int | CURRENT_VERSION | Protobuf serialization version. |
| protected List<ml.dmlc.xgboost4j.java.Booster> | models | The XGBoost4J Boosters. |
Fields inherited from class org.tribuo.Model
ALL_OUTPUTS, BIAS_FEATURE, featureIDMap, generatesProbabilities, name, outputIDInfo, provenance, provenanceOutput
Fields inherited from interface org.tribuo.protos.ProtoSerializable
DESERIALIZATION_METHOD_NAME, PROVENANCE_SERIALIZER
-
Method Summary
| Modifier and Type | Method | Description |
| | copy(String newName, ModelProvenance newProvenance) | Copies a model, replacing its provenance and name with the supplied values. |
| static XGBoostModel<?> | deserializeFromProto(int version, String className, com.google.protobuf.Any message) | Deserialization factory. |
| | getExcuse | Generates an excuse for an example. |
| | getFeatureImportance | Creates objects to report feature importance metrics for XGBoost. |
| List<ml.dmlc.xgboost4j.java.Booster> | getInnerModels | Returns an unmodifiable list containing a copy of each model. |
| | getModelDump | Returns the string model dumps from each Booster. |
| | getTopFeatures(int n) | Gets the top n features associated with this model. |
| List<Prediction<T>> | predict | Uses the model to predict the label for multiple examples. |
| List<Prediction<T>> | predict | Uses the model to predict the labels for multiple examples contained in a data set. |
| | predict | Uses the model to predict the output for a single example. |
| org.tribuo.protos.core.ModelProto | serialize | Serializes this object to a protobuf. |
| void | setNumThreads(int threads) | Sets the number of threads to use at prediction time. |
Methods inherited from class org.tribuo.Model
castModel, copy, createDataCarrier, deserialize, deserializeFromFile, deserializeFromStream, generatesProbabilities, getExcuses, getFeatureIDMap, getName, getOutputIDInfo, getProvenance, innerPredict, serializeToFile, serializeToStream, setName, toString, validate
-
Field Details
-
CURRENT_VERSION
public static final int CURRENT_VERSION
Protobuf serialization version.
-
models
The XGBoost4J Boosters.
-
-
Method Details
-
deserializeFromProto
public static XGBoostModel<?> deserializeFromProto(int version, String className, com.google.protobuf.Any message) throws com.google.protobuf.InvalidProtocolBufferException, ml.dmlc.xgboost4j.java.XGBoostError, IOException
Deserialization factory.
- Parameters:
version
- The serialized object version.
className
- The class name.
message
- The serialized data.
- Returns:
- The deserialized object.
- Throws:
com.google.protobuf.InvalidProtocolBufferException
- If the protobuf could not be parsed from the message.
ml.dmlc.xgboost4j.java.XGBoostError
- If the XGBoost byte array failed to parse.
IOException
- If the XGBoost byte array failed to parse.
-
getInnerModels
Returns an unmodifiable list containing a copy of each model.
As XGBoost4J models don't expose a copy constructor, this requires serializing each model to a byte array and rebuilding it, and is thus quite expensive.
- Returns:
- A copy of all of the models.
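The copy-by-serialization idea described for getInnerModels can be illustrated with a minimal, JDK-only sketch. It is not the XGBoost4J or Tribuo implementation: `FakeBooster` and `CopyBySerializationSketch` are hypothetical stand-ins, and Java object serialization is used here in place of XGBoost's own byte format. The sketch only shows why each copy costs a full write and re-read of the model.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for a model object that has no copy constructor.
final class FakeBooster implements Serializable {
    private static final long serialVersionUID = 1L;
    final double[] weights;

    FakeBooster(double[] weights) {
        this.weights = weights;
    }
}

final class CopyBySerializationSketch {
    // Deep-copies one object by writing it to a byte array and reading it back.
    static FakeBooster copy(FakeBooster original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (FakeBooster) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("Failed to copy model", e);
        }
    }

    // Returns an unmodifiable list holding an independent copy of each element.
    static List<FakeBooster> copyAll(List<FakeBooster> models) {
        List<FakeBooster> copies = new ArrayList<>();
        for (FakeBooster model : models) {
            copies.add(copy(model));
        }
        return Collections.unmodifiableList(copies);
    }
}
```

Because every call rebuilds each model from bytes, callers that need the copies more than once would likely want to hold on to the returned list rather than request it repeatedly.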
-
setNumThreads
public void setNumThreads(int threads)
Sets the number of threads to use at prediction time.
If set to 0, the number of threads is set to the number of hardware threads.
- Parameters:
threads
- The new number of threads.
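The rule that 0 means "use all hardware threads" can be written out as a small JDK-only sketch. `ThreadCountSketch.resolveThreads` is a hypothetical helper, not part of Tribuo or XGBoost4J, and it does not claim to show how XGBoost resolves the value internally. Rejecting negative values is an assumption of this sketch, since the documentation above does not describe them.

```java
final class ThreadCountSketch {
    // Hypothetical helper: turns the documented "0 means all hardware threads" rule into a concrete count.
    static int resolveThreads(int requested) {
        if (requested < 0) {
            // Assumption of this sketch only; negative values are not described in the documentation above.
            throw new IllegalArgumentException("threads must be non-negative, found " + requested);
        }
        // 0 is treated as a request for every hardware thread the JVM reports.
        return requested == 0 ? Runtime.getRuntime().availableProcessors() : requested;
    }
}
```

Under these assumptions, `ThreadCountSketch.resolveThreads(4)` returns 4, while `ThreadCountSketch.resolveThreads(0)` returns whatever `Runtime.getRuntime().availableProcessors()` reports on the current machine.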
-
predict
Uses the model to predict the labels for multiple examples contained in a data set.
-
predict
Uses the model to predict the label for multiple examples.
-
predict
Description copied from class: Model
Uses the model to predict the output for a single example.
predict does not mutate the example.
Throws IllegalArgumentException if the example has no features or no feature overlap with the model.
-
getFeatureImportance
Creates objects to report feature importance metrics for XGBoost. See the documentation of XGBoostFeatureImportance for more information on what those metrics mean. Typically this list will contain a single instance for the entire model. For multidimensional regression the list will have one entry per dimension, in dimension order.
- Returns:
- The feature importance object(s).
-
getTopFeatures
Description copied from class: Model
Gets the top n features associated with this model.
If the model does not produce per output feature lists, it returns a map with a single element with key Model.ALL_OUTPUTS.
If the model cannot describe its top features then it returns Collections.emptyMap().
- Specified by:
getTopFeatures in class Model<T extends Output<T>>
- Parameters:
n
- the number of features to return. If this value is less than 0, all features should be returned for each class, unless the model cannot score its features.
- Returns:
- a map from string outputs to an ordered list of pairs of feature names and weights associated with that feature in the model
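The shape of the value returned by getTopFeatures can be mimicked with a short JDK-only sketch: a map with a single key, holding an ordered list of feature name and weight pairs, where a negative n returns every feature. Everything here is hypothetical. `TopFeaturesSketch` is not a Tribuo class, the string value of the local `ALL_OUTPUTS` constant is an assumption, `Map.Entry` stands in for the pair type, and ordering by largest absolute weight is an assumed ranking, not one stated above.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

final class TopFeaturesSketch {
    // Local stand-in for the Model.ALL_OUTPUTS key; the string value is an assumption of this sketch.
    static final String ALL_OUTPUTS = "ALL_OUTPUTS";

    // Builds a single-entry map from ALL_OUTPUTS to the top n (feature name, weight) pairs.
    static Map<String, List<Map.Entry<String, Double>>> topFeatures(Map<String, Double> weights, int n) {
        List<Map.Entry<String, Double>> ranked = new ArrayList<>();
        for (Map.Entry<String, Double> e : weights.entrySet()) {
            ranked.add(new SimpleImmutableEntry<>(e.getKey(), e.getValue()));
        }
        // Assumed ordering: largest absolute weight first.
        ranked.sort(Comparator.comparingDouble((Map.Entry<String, Double> e) -> Math.abs(e.getValue())).reversed());
        // A negative n means "return every feature".
        int limit = n < 0 ? ranked.size() : Math.min(n, ranked.size());
        List<Map.Entry<String, Double>> top = new ArrayList<>(ranked.subList(0, limit));
        return Collections.singletonMap(ALL_OUTPUTS, top);
    }
}
```

A caller written against this shape would look up the single ALL_OUTPUTS entry first and fall back to iterating per output keys only when that entry is absent.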
-
getModelDump
Returns the string model dumps from each Booster.
- Returns:
- The model dumps.
-
getExcuse
Description copied from class: Model
Generates an excuse for an example.
This attempts to explain a classification result. Generating an excuse may be quite an expensive operation.
This excuse either contains per class information or an entry with key Model.ALL_OUTPUTS.
The optional is empty if the model does not provide excuses.
-
copy
Description copied from class: Model
Copies a model, replacing its provenance and name with the supplied values.
Used to provide the provenance removal functionality.
-
serialize
public org.tribuo.protos.core.ModelProto serialize()
Description copied from interface: ProtoSerializable
Serializes this object to a protobuf.
-