public abstract class LibLinearTrainer<T extends Output<T>> extends Object implements Trainer<T>
A Trainer which wraps a liblinear-java trainer.
Note the train method is synchronized on LibLinearTrainer.class due to a global RNG in liblinear-java. This is insufficient to ensure reproducibility if liblinear-java is used directly in the same JVM as Tribuo, but avoids locking on classes Tribuo does not control.
See:
Fan RE, Chang KW, Hsieh CJ, Wang XR, Lin CJ. "LIBLINEAR: A library for Large Linear Classification", Journal of Machine Learning Research, 2008.
and for the original algorithm:
Cortes C, Vapnik V. "Support-Vector Networks" Machine Learning, 1995.
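A minimal usage sketch follows. It is illustrative only: it assumes the concrete classification subclass LibLinearClassificationTrainer (from the tribuo-classification-liblinear module) and a hypothetical loadTrainingData() helper standing in for a real data source; neither is part of this class's API.

```java
import org.tribuo.Dataset;
import org.tribuo.classification.Label;
import org.tribuo.classification.liblinear.LibLinearClassificationTrainer;
import org.tribuo.common.liblinear.LibLinearModel;

public class LibLinearUsageSketch {
    public static void main(String[] args) {
        // Assumed helper: builds or loads a labelled training dataset.
        Dataset<Label> trainData = loadTrainingData();

        // The no-argument constructor of the concrete subclass uses its
        // default algorithm, cost and termination criterion.
        LibLinearClassificationTrainer trainer = new LibLinearClassificationTrainer();

        // train is synchronized on LibLinearTrainer.class because of the
        // global RNG in liblinear-java (see the note above).
        LibLinearModel<Label> model = trainer.train(trainData);
        System.out.println(model);
    }

    // Hypothetical placeholder; replace with a real Tribuo data source.
    private static Dataset<Label> loadTrainingData() {
        throw new UnsupportedOperationException("supply a real Dataset<Label> here");
    }
}
```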
Modifier and Type | Field and Description
---|---
protected double | cost
protected double | epsilon
protected de.bwaldvogel.liblinear.Parameter | libLinearParams
protected int | maxIterations
protected double | terminationCriterion
protected LibLinearType<T> | trainerType

Fields inherited from interface org.tribuo.Trainer: DEFAULT_SEED
Modifier | Constructor and Description
---|---
protected | LibLinearTrainer()
protected | LibLinearTrainer(LibLinearType<T> trainerType, double cost, int maxIterations, double terminationCriterion) Creates a trainer for a LibLinear model.
protected | LibLinearTrainer(LibLinearType<T> trainerType, double cost, int maxIterations, double terminationCriterion, double epsilon) Creates a trainer for a LibLinear model.

Modifier and Type | Method and Description
---|---
protected abstract LibLinearModel<T> | createModel(ModelProvenance provenance, ImmutableFeatureMap featureIDMap, ImmutableOutputInfo<T> outputIDInfo, List<de.bwaldvogel.liblinear.Model> models) Construct the appropriate subtype of LibLinearModel for the prediction task.
static <T extends Output<T>> de.bwaldvogel.liblinear.FeatureNode[] | exampleToNodes(Example<T> example, ImmutableFeatureMap featureIDMap, List<de.bwaldvogel.liblinear.FeatureNode> features) Converts a Tribuo Example into a liblinear FeatureNode array, including a bias feature.
protected abstract com.oracle.labs.mlrg.olcut.util.Pair<de.bwaldvogel.liblinear.FeatureNode[][],double[][]> | extractData(Dataset<T> data, ImmutableOutputInfo<T> outputInfo, ImmutableFeatureMap featureMap) Extracts the features and Outputs in LibLinear's format.
int | getInvocationCount() The number of times this trainer instance has had its train method invoked.
TrainerProvenance | getProvenance()
void | postConfig() Used by the OLCUT configuration system, and should not be called by external code.
protected de.bwaldvogel.liblinear.Parameter | setupParameters(ImmutableOutputInfo<T> info) Constructs the parameters.
String | toString()
LibLinearModel<T> | train(Dataset<T> examples) Trains a predictive model using the examples in the given data set.
LibLinearModel<T> | train(Dataset<T> examples, Map<String,com.oracle.labs.mlrg.olcut.provenance.Provenance> runProvenance) Trains a predictive model using the examples in the given data set.
protected abstract List<de.bwaldvogel.liblinear.Model> | trainModels(de.bwaldvogel.liblinear.Parameter curParams, int numFeatures, de.bwaldvogel.liblinear.FeatureNode[][] features, double[][] outputs) Train all the liblinear instances necessary for this dataset.

protected de.bwaldvogel.liblinear.Parameter libLinearParams
@Config(description="Algorithm to use.") protected LibLinearType<T extends Output<T>> trainerType
@Config(description="Cost penalty for misclassifications.") protected double cost
@Config(description="Maximum number of iterations before terminating.") protected int maxIterations
@Config(description="Stop iterating when the loss score decreases by less than this value.") protected double terminationCriterion
@Config(description="Epsilon insensitivity in the regression cost function.") protected double epsilon
protected LibLinearTrainer()
protected LibLinearTrainer(LibLinearType<T> trainerType, double cost, int maxIterations, double terminationCriterion)
trainerType
- Loss function and optimisation method combination.cost
- Cost penalty for each incorrectly classified training point.maxIterations
- The maximum number of dataset iterations.terminationCriterion
- How close does the optimisation function need to be before terminating that subproblem (usually set to 0.1).protected LibLinearTrainer(LibLinearType<T> trainerType, double cost, int maxIterations, double terminationCriterion, double epsilon)
trainerType
- Loss function and optimisation method combination.cost
- Cost penalty for each incorrectly classified training point.maxIterations
- The maximum number of dataset iterations.terminationCriterion
- How close does the optimisation function need to be before terminating that subproblem (usually set to 0.1).epsilon
- The insensitivity of the regression loss to small differences.public void postConfig()
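A hedged construction sketch for these parameters, shown as a fragment. LibLinearClassificationTrainer, LinearClassificationType, and the L2R_L2LOSS_SVC algorithm choice are assumptions about the classification module; the numeric values are illustrative rather than prescribed defaults.

```java
import org.tribuo.classification.liblinear.LibLinearClassificationTrainer;
import org.tribuo.classification.liblinear.LinearClassificationType;

// trainerType, cost, maxIterations and terminationCriterion map directly
// onto the four-argument protected constructor documented above.
LibLinearClassificationTrainer trainer = new LibLinearClassificationTrainer(
        new LinearClassificationType(LinearClassificationType.LinearType.L2R_L2LOSS_SVC),
        1.0,    // cost - penalty for each incorrectly classified training point
        1000,   // maxIterations - maximum number of dataset iterations
        0.1);   // terminationCriterion - subproblem tolerance, usually 0.1
```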
public void postConfig()
Used by the OLCUT configuration system, and should not be called by external code.
Specified by:
postConfig in interface com.oracle.labs.mlrg.olcut.config.Configurable
public LibLinearModel<T> train(Dataset<T> examples)
Description copied from interface: Trainer
Trains a predictive model using the examples in the given data set.
public LibLinearModel<T> train(Dataset<T> examples, Map<String,com.oracle.labs.mlrg.olcut.provenance.Provenance> runProvenance)
Description copied from interface: Trainer
Trains a predictive model using the examples in the given data set.
public int getInvocationCount()
Description copied from interface: Trainer
The number of times this trainer instance has had its train method invoked. This is used to determine how many times the trainer's RNG has been accessed, to ensure replicability in the random number stream.
Specified by:
getInvocationCount in interface Trainer<T extends Output<T>>
protected abstract List<de.bwaldvogel.liblinear.Model> trainModels(de.bwaldvogel.liblinear.Parameter curParams, int numFeatures, de.bwaldvogel.liblinear.FeatureNode[][] features, double[][] outputs)
Train all the liblinear instances necessary for this dataset (an illustrative subclass sketch follows the parameter list).
Parameters:
curParams - The LibLinear parameters.
numFeatures - The number of features in this dataset.
features - The features themselves.
outputs - The outputs.
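As an illustration only, and not Tribuo's actual implementation, a single-model subclass might fill a liblinear Problem and delegate to Linear.train. The bias value and single-output assumption below are placeholders; the sketch uses de.bwaldvogel.liblinear.{Problem, Linear, Model, Parameter, FeatureNode} and java.util.Collections.

```java
@Override
protected List<Model> trainModels(Parameter curParams, int numFeatures,
                                  FeatureNode[][] features, double[][] outputs) {
    Problem problem = new Problem();
    problem.l = features.length;   // number of training examples
    problem.n = numFeatures;       // number of features; liblinear counts the bias feature here when bias >= 0
    problem.x = features;          // sparse feature vectors in liblinear format
    problem.y = outputs[0];        // single output dimension assumed for this sketch
    problem.bias = 1.0;            // matches the bias feature appended by exampleToNodes
    return Collections.singletonList(Linear.train(problem, curParams));
}
```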
protected abstract LibLinearModel<T> createModel(ModelProvenance provenance, ImmutableFeatureMap featureIDMap, ImmutableOutputInfo<T> outputIDInfo, List<de.bwaldvogel.liblinear.Model> models)
Construct the appropriate subtype of LibLinearModel for the prediction task.
Parameters:
provenance - The model provenance.
featureIDMap - The feature id map.
outputIDInfo - The output id info.
models - The list of linear models.
protected abstract com.oracle.labs.mlrg.olcut.util.Pair<de.bwaldvogel.liblinear.FeatureNode[][],double[][]> extractData(Dataset<T> data, ImmutableOutputInfo<T> outputInfo, ImmutableFeatureMap featureMap)
Extracts the features and Outputs in LibLinear's format.
Parameters:
data - The input data.
outputInfo - The output info.
featureMap - The feature info.
protected de.bwaldvogel.liblinear.Parameter setupParameters(ImmutableOutputInfo<T> info)
Constructs the parameters.
Parameters:
info - The output info.
public static <T extends Output<T>> de.bwaldvogel.liblinear.FeatureNode[] exampleToNodes(Example<T> example, ImmutableFeatureMap featureIDMap, List<de.bwaldvogel.liblinear.FeatureNode> features)
Converts a Tribuo Example into a liblinear FeatureNode array, including a bias feature. A usage sketch follows the parameter list below.
If there is a collision between feature ids (i.e., if there is feature hashing or some other mechanism changing the feature ids) then the feature values are summed.
Type Parameters:
T - The output type.
Parameters:
example - The input example.
featureIDMap - The feature id map which contains the example's indices.
features - A buffer. If null then an array list is created and used internally.
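A small, hedged sketch of calling this helper. It assumes an already-built Dataset<Label> whose feature map covers the example, and passes null so the method allocates its own buffer; the wrapper class and method name are invented for illustration.

```java
import java.util.Arrays;
import de.bwaldvogel.liblinear.FeatureNode;
import org.tribuo.Dataset;
import org.tribuo.Example;
import org.tribuo.ImmutableFeatureMap;
import org.tribuo.classification.Label;
import org.tribuo.common.liblinear.LibLinearTrainer;

public final class ExampleToNodesSketch {
    // Converts the first example of an assumed, already-built dataset.
    public static FeatureNode[] firstExampleAsNodes(Dataset<Label> data) {
        ImmutableFeatureMap fMap = data.getFeatureIDMap(); // ids used for the liblinear indices
        Example<Label> example = data.getExample(0);       // first example, purely for illustration
        // Passing null for the buffer makes the method allocate a list internally.
        FeatureNode[] nodes = LibLinearTrainer.exampleToNodes(example, fMap, null);
        // nodes is the example in liblinear's sparse format, with a bias feature appended;
        // colliding feature ids (e.g., from hashing) have had their values summed.
        System.out.println(Arrays.toString(nodes));
        return nodes;
    }
}
```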
public TrainerProvenance getProvenance()
Specified by:
getProvenance in interface com.oracle.labs.mlrg.olcut.provenance.Provenancable<TrainerProvenance>