Class LibLinearRegressionTrainer

java.lang.Object
org.tribuo.common.liblinear.LibLinearTrainer<Regressor>
org.tribuo.regression.liblinear.LibLinearRegressionTrainer
All Implemented Interfaces:
com.oracle.labs.mlrg.olcut.config.Configurable, com.oracle.labs.mlrg.olcut.provenance.Provenancable<TrainerProvenance>, Trainer<Regressor>

public class LibLinearRegressionTrainer extends LibLinearTrainer<Regressor>
A Trainer which wraps a liblinear-java regression trainer.

This generates an independent liblinear model for each regression dimension.

Note: the train method is synchronized on LibLinearTrainer.class because liblinear-java uses a global RNG. This does not guarantee reproducibility if liblinear-java is also used directly in the same JVM as Tribuo, but it avoids locking on classes Tribuo does not control.
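The locking pattern described in the note can be sketched in isolation. The following is a minimal, self-contained illustration and not Tribuo's or liblinear-java's actual code: the class GlobalRngLockSketch, the field SHARED_RNG, and the method seededDraws are all hypothetical names standing in for a library-wide RNG and a train call that reseeds it.

```java
import java.util.Random;

public final class GlobalRngLockSketch {
    // Hypothetical stand-in for a global RNG shared by every caller in the JVM.
    private static final Random SHARED_RNG = new Random();

    private GlobalRngLockSketch() {}

    /**
     * Reseeds the shared RNG and draws n values while holding a class-level lock,
     * so two callers that go through this method cannot interleave their draws.
     */
    public static double[] seededDraws(long seed, int n) {
        synchronized (GlobalRngLockSketch.class) {
            SHARED_RNG.setSeed(seed);
            double[] draws = new double[n];
            for (int i = 0; i < n; i++) {
                draws[i] = SHARED_RNG.nextDouble();
            }
            return draws;
        }
    }
}
```

Any code that touches the shared RNG without taking the same lock can still interleave with these draws, which is the limitation the note points out for direct use of liblinear-java.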

See:

 Fan RE, Chang KW, Hsieh CJ, Wang XR, Lin CJ.
 "LIBLINEAR: A Library for Large Linear Classification"
 Journal of Machine Learning Research, 2008.
 
and for the original algorithm:
 Cortes C, Vapnik V.
 "Support-Vector Networks"
 Machine Learning, 1995.
 
  • Constructor Details

    • LibLinearRegressionTrainer

      public LibLinearRegressionTrainer()
      Creates a trainer using the default values (trainerType = L2R_L2LOSS_SVR, cost = 1.0, maxIterations = 1000, terminationCriterion = 0.1, epsilon = 0.1).
    • LibLinearRegressionTrainer

      public LibLinearRegressionTrainer(LinearRegressionType trainerType)
      Creates a trainer for a LibLinear regression model.

      Uses the default values of cost = 1.0, maxIterations = 1000, terminationCriterion = 0.1, epsilon = 0.1.

      Parameters:
      trainerType - Loss function and optimisation method combination.
    • LibLinearRegressionTrainer

      public LibLinearRegressionTrainer(LinearRegressionType trainerType, double cost, int maxIterations, double terminationCriterion, double epsilon)
      Creates a trainer for a LibLinear regression model.
      Parameters:
      trainerType - Loss function and optimisation method combination.
      cost - Cost penalty for each training point whose prediction error exceeds the epsilon-insensitive margin.
      maxIterations - The maximum number of dataset iterations.
      terminationCriterion - How close the optimisation function needs to be to its optimum before that subproblem terminates (usually set to 0.1).
      epsilon - The insensitivity of the regression loss to small differences.
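To make the roles of cost and epsilon concrete, here is a minimal sketch of the epsilon-insensitive loss used in support vector regression. It is illustrative only and not liblinear-java's implementation: the class and method names are hypothetical, and the regularisation term of the full objective is omitted. The squared variant corresponds to the L2-loss family (such as L2R_L2LOSS_SVR), the unsquared one to the L1-loss family.

```java
public final class EpsilonInsensitiveLossSketch {
    private EpsilonInsensitiveLossSketch() {}

    /** L1 epsilon-insensitive loss: max(0, |target - prediction| - epsilon). */
    public static double l1Loss(double target, double prediction, double epsilon) {
        return Math.max(0.0, Math.abs(target - prediction) - epsilon);
    }

    /** L2 epsilon-insensitive loss: the square of the L1 variant. */
    public static double l2Loss(double target, double prediction, double epsilon) {
        double loss = l1Loss(target, prediction, epsilon);
        return loss * loss;
    }

    /** Cost-weighted sum of L2 losses over a dataset (regularisation term omitted). */
    public static double weightedL2Loss(double cost, double[] targets, double[] predictions, double epsilon) {
        if (targets.length != predictions.length) {
            throw new IllegalArgumentException("targets and predictions must have the same length");
        }
        double total = 0.0;
        for (int i = 0; i < targets.length; i++) {
            total += l2Loss(targets[i], predictions[i], epsilon);
        }
        return cost * total;
    }
}
```

Errors smaller than epsilon contribute nothing to the loss, so a larger epsilon tolerates more deviation, while a larger cost penalises the remaining errors more heavily relative to the regulariser.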
  • Method Details

    • postConfig

      public void postConfig()
      Used by the OLCUT configuration system, and should not be called by external code.
      Specified by:
      postConfig in interface com.oracle.labs.mlrg.olcut.config.Configurable
      Overrides:
      postConfig in class LibLinearTrainer<Regressor>
    • trainModels

      protected List<de.bwaldvogel.liblinear.Model> trainModels(de.bwaldvogel.liblinear.Parameter curParams, int numFeatures, de.bwaldvogel.liblinear.FeatureNode[][] features, double[][] outputs)
      Description copied from class: LibLinearTrainer
      Train all the liblinear instances necessary for this dataset.
      Specified by:
      trainModels in class LibLinearTrainer<Regressor>
      Parameters:
      curParams - The LibLinear parameters.
      numFeatures - The number of features in this dataset.
      features - The features themselves.
      outputs - The outputs.
      Returns:
      A list of liblinear models.
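The idea of training one independent model per regression dimension can be sketched without liblinear. The example below is a hypothetical illustration, not the body of trainModels: it assumes the outputs array is laid out as outputs[dimension][example], and it substitutes a trivial mean predictor (ConstantModel) for the liblinear solver so that it stays self-contained.

```java
import java.util.ArrayList;
import java.util.List;

public final class PerDimensionTrainingSketch {
    private PerDimensionTrainingSketch() {}

    /** Hypothetical single-output model that always predicts a constant. */
    public static final class ConstantModel {
        public final double value;

        public ConstantModel(double value) {
            this.value = value;
        }
    }

    /**
     * Fits one model per output dimension, assuming outputs[dimension][example].
     * Each model sees only its own row of targets, so the fits are independent.
     */
    public static List<ConstantModel> trainModels(double[][] outputs) {
        List<ConstantModel> models = new ArrayList<>();
        for (double[] targets : outputs) {
            double sum = 0.0;
            for (double target : targets) {
                sum += target;
            }
            // Placeholder fit: the mean of the targets stands in for a real solver.
            double mean = targets.length == 0 ? 0.0 : sum / targets.length;
            models.add(new ConstantModel(mean));
        }
        return models;
    }
}
```

The returned list has one entry per dimension, mirroring the list of liblinear models described above.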
    • createModel

      protected LibLinearModel<Regressor> createModel(ModelProvenance provenance, ImmutableFeatureMap featureIDMap, ImmutableOutputInfo<Regressor> outputIDInfo, List<de.bwaldvogel.liblinear.Model> models)
      Description copied from class: LibLinearTrainer
      Construct the appropriate subtype of LibLinearModel for the prediction task.
      Specified by:
      createModel in class LibLinearTrainer<Regressor>
      Parameters:
      provenance - The model provenance.
      featureIDMap - The feature id map.
      outputIDInfo - The output id info.
      models - The list of linear models.
      Returns:
      An implementation of LibLinearModel.
    • extractData

      protected com.oracle.labs.mlrg.olcut.util.Pair<de.bwaldvogel.liblinear.FeatureNode[][],double[][]> extractData(Dataset<Regressor> data, ImmutableOutputInfo<Regressor> outputInfo, ImmutableFeatureMap featureMap)
      Description copied from class: LibLinearTrainer
      Extracts the features and Outputs in LibLinear's format.
      Specified by:
      extractData in class LibLinearTrainer<Regressor>
      Parameters:
      data - The input data.
      outputInfo - The output info.
      featureMap - The feature info.
      Returns:
      The features and outputs.