Class HashingTrainer<T extends Output<T>>

java.lang.Object
org.tribuo.hash.HashingTrainer<T>
Type Parameters:
T - The type of Output this trainer works with.
All Implemented Interfaces:
com.oracle.labs.mlrg.olcut.config.Configurable, com.oracle.labs.mlrg.olcut.provenance.Provenancable<TrainerProvenance>, Trainer<T>

public final class HashingTrainer<T extends Output<T>> extends Object implements Trainer<T>
A Trainer which hashes the Dataset before the Model is produced. This means the model does not contain any feature names, only one-way hashes of the names.

It wraps another Trainer which actually performs the training.
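To illustrate what "one-way hashes of the names" means, here is a minimal, self-contained sketch using only the JDK. It does not use Tribuo's own classes: the class `FeatureNameHashSketch`, the method `hashName`, and the salt value are hypothetical names chosen for this illustration, and SHA-256 is an assumed choice of hash function.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;

// Hypothetical illustration, not part of the Tribuo API.
public class FeatureNameHashSketch {

    // Returns a salted SHA-256 digest of the feature name, Base64 encoded.
    // The original name cannot be read back from the returned string.
    static String hashName(String name, String salt) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            md.update(salt.getBytes(StandardCharsets.UTF_8));
            byte[] digest = md.digest(name.getBytes(StandardCharsets.UTF_8));
            return Base64.getEncoder().encodeToString(digest);
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required to be available on every Java platform.
            throw new IllegalStateException("SHA-256 unavailable", e);
        }
    }

    public static void main(String[] args) {
        // A model trained on hashed data would store only strings like these.
        System.out.println(hashName("customer_age", "example-salt"));
        System.out.println(hashName("customer_income", "example-salt"));
    }
}
```

The same name and salt always produce the same hash, so data seen at prediction time can be mapped into the same feature space as long as the same hashing configuration is used.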

  • Constructor Details

  • Method Details

    • train

      public Model<T> train(Dataset<T> dataset, Map<String, com.oracle.labs.mlrg.olcut.provenance.Provenance> instanceProvenance)
      This clones the Dataset, hashes the feature names in each example, and rewrites the feature ids before passing the hashed copy to the inner trainer.

      This ensures the inner Trainer sees the data after any hash collisions have occurred, and thus builds correctly sized data structures.

      Specified by:
      train in interface Trainer<T extends Output<T>>
      Parameters:
      dataset - The input dataset.
      instanceProvenance - Provenance information specific to this execution of train (e.g., cross validation fold number).
      Returns:
      A trained Model.
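The point about collisions and data structure sizes can be sketched as follows. This is a self-contained illustration under stated assumptions, not Tribuo's implementation: the class `HashedFeatureSpaceSketch`, the methods `bucket` and `assignIds`, and the modular bucket hash are all hypothetical. It shows that when two names hash to the same value, the feature space the inner trainer sees is smaller than the number of original names.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

// Hypothetical illustration, not part of the Tribuo API.
public class HashedFeatureSpaceSketch {

    // Assumed hash: maps a feature name into one of `buckets` slots.
    static int bucket(String name, int buckets) {
        return Math.floorMod(name.hashCode(), buckets);
    }

    // Hashes every name first, then assigns dense ids to the distinct hashes.
    // The size of the returned map is the feature count after collisions.
    static Map<Integer, Integer> assignIds(List<String> names, int buckets) {
        TreeSet<Integer> distinct = new TreeSet<>();
        for (String name : names) {
            distinct.add(bucket(name, buckets));
        }
        Map<Integer, Integer> ids = new LinkedHashMap<>();
        for (int hashed : distinct) {
            ids.put(hashed, ids.size());
        }
        return ids;
    }

    public static void main(String[] args) {
        List<String> names = List.of("a", "b", "c");
        // With only two buckets, at least two of the three names must collide.
        System.out.println("features before hashing: " + names.size());
        System.out.println("features after hashing:  " + assignIds(names, 2).size());
    }
}
```

Sizing a weight vector from the original names would over-allocate here; sizing it from the hashed ids gives the dimension the model actually needs.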
    • getInvocationCount

      public int getInvocationCount()
      Description copied from interface: Trainer
      The number of times this trainer instance has had its train method invoked.

      This is used to determine how many times the trainer's RNG has been accessed to ensure replicability in the random number stream.

      Specified by:
      getInvocationCount in interface Trainer<T extends Output<T>>
      Returns:
      The number of train invocations.
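Why an invocation count helps replicability can be shown with a small, self-contained sketch. It assumes a trainer that draws a fresh per-call RNG from a seeded `java.util.SplittableRandom` on every `train` call; the class `InvocationCountSketch` and its `replay` helper are hypothetical and only stand in for a real trainer.

```java
import java.util.SplittableRandom;

// Hypothetical illustration, not part of the Tribuo API.
public class InvocationCountSketch {
    private final SplittableRandom rng;
    private int invocationCount = 0;

    InvocationCountSketch(long seed) {
        this.rng = new SplittableRandom(seed);
    }

    // Stand-in for train: advances the shared RNG once per call and
    // returns a value drawn from the per-call RNG.
    synchronized long train() {
        SplittableRandom local = rng.split();
        invocationCount++;
        return local.nextLong();
    }

    synchronized int getInvocationCount() {
        return invocationCount;
    }

    // Reproduces the result of the n-th train call from the seed alone
    // by replaying the earlier calls on a fresh instance.
    static long replay(long seed, int invocation) {
        if (invocation < 1) {
            throw new IllegalArgumentException("invocation must be >= 1");
        }
        InvocationCountSketch fresh = new InvocationCountSketch(seed);
        long result = 0L;
        for (int i = 0; i < invocation; i++) {
            result = fresh.train();
        }
        return result;
    }

    public static void main(String[] args) {
        InvocationCountSketch trainer = new InvocationCountSketch(42L);
        trainer.train();
        long second = trainer.train();
        // The seed plus the invocation count identifies the RNG state used.
        System.out.println(second == replay(42L, trainer.getInvocationCount()));
    }
}
```

Recording the seed together with the invocation count is enough to identify which point in the random number stream produced a given model.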
    • getProvenance

      Specified by:
      getProvenance in interface com.oracle.labs.mlrg.olcut.provenance.Provenancable<TrainerProvenance>