Class JMI

java.lang.Object
org.tribuo.classification.fs.JMI
All Implemented Interfaces:
com.oracle.labs.mlrg.olcut.config.Configurable, com.oracle.labs.mlrg.olcut.provenance.Provenancable<FeatureSelectorProvenance>, FeatureSelector<Label>

public final class JMI extends Object implements FeatureSelector<Label>
Selects features according to the Joint Mutual Information algorithm.

Uses equal-width binning for the feature values.
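
As an illustration of equal-width binning, here is a minimal sketch in plain Java. It is not Tribuo's implementation: the class name EqualWidthBinning, the method discretize, and the handling of constant features and of the maximum value are assumptions made for this example.

```java
import java.util.Arrays;

final class EqualWidthBinning {
    /**
     * Maps each value to a bin index in [0, numBins - 1], using bins of equal
     * width that span the observed [min, max] range of the values.
     */
    static int[] discretize(double[] values, int numBins) {
        if (numBins < 2) {
            throw new IllegalArgumentException("numBins must be greater than 1, found " + numBins);
        }
        double min = Arrays.stream(values).min().orElse(0.0);
        double max = Arrays.stream(values).max().orElse(0.0);
        double width = (max - min) / numBins;
        int[] bins = new int[values.length];
        for (int i = 0; i < values.length; i++) {
            if (width == 0.0) {
                // Assumption: a constant feature falls entirely into bin 0.
                bins[i] = 0;
            } else {
                int bin = (int) ((values[i] - min) / width);
                // Assumption: the maximum value is clamped into the last bin.
                bins[i] = Math.min(bin, numBins - 1);
            }
        }
        return bins;
    }

    public static void main(String[] args) {
        double[] feature = {0.0, 1.0, 2.5, 5.0, 7.5, 10.0};
        // Four bins of width 2.5 over [0, 10]; prints [0, 0, 1, 2, 3, 3]
        System.out.println(Arrays.toString(discretize(feature, 4)));
    }
}
```

Every bin covers the same range of values, so bins can hold very different numbers of examples when the feature distribution is skewed.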

See:

 Yang H, Moody J.
 "Data Visualization and Feature Selection: New Algorithms for Non-Gaussian Data"
 Advances in Neural Information Processing Systems (NIPS), 1999.
 
and
 Brown G, Pocock A, Zhao M-J, Lujan M.
 "Conditional Likelihood Maximisation: A Unifying Framework for Information Theoretic Feature Selection"
 Journal of Machine Learning Research (JMLR), 2012.
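
The JMI criterion described in Brown et al. scores a candidate feature X_k by summing, over every already selected feature X_j, the mutual information I(X_k, X_j; Y) between the feature pair and the label. The following is a minimal single-threaded sketch of that greedy procedure on already discretised data. It is not Tribuo's implementation: the class JmiSketch, its method names, the count-based (plug-in) mutual information estimate and the lowest-index tie-breaking are all assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class JmiSketch {
    /** Plug-in estimate of I(A;Y) in nats from co-occurrence counts of discrete states. */
    static double mutualInformation(int[] a, int[] y) {
        int n = a.length;
        Map<Integer, Integer> countA = new HashMap<>();
        Map<Integer, Integer> countY = new HashMap<>();
        Map<Long, Integer> countAY = new HashMap<>();
        for (int i = 0; i < n; i++) {
            countA.merge(a[i], 1, Integer::sum);
            countY.merge(y[i], 1, Integer::sum);
            // Pack the (a, y) pair into a single long key.
            countAY.merge((((long) a[i]) << 32) | (y[i] & 0xffffffffL), 1, Integer::sum);
        }
        double mi = 0.0;
        for (Map.Entry<Long, Integer> e : countAY.entrySet()) {
            long key = e.getKey();
            double pAY = e.getValue() / (double) n;
            double pA = countA.get((int) (key >> 32)) / (double) n;
            double pY = countY.get((int) key) / (double) n;
            mi += pAY * Math.log(pAY / (pA * pY));
        }
        return mi;
    }

    /** Combines two discrete variables into one joint variable (assumes non-negative states). */
    static int[] join(int[] a, int[] b) {
        int base = 1;
        for (int v : b) {
            base = Math.max(base, v + 1);
        }
        int[] joint = new int[a.length];
        for (int i = 0; i < a.length; i++) {
            joint[i] = a[i] * base + b[i];
        }
        return joint;
    }

    /** Greedy forward selection; features[f][i] is the bin of feature f for example i. */
    static List<Integer> select(int[][] features, int[] labels, int k) {
        int numFeatures = features.length;
        int limit = Math.min(k, numFeatures);
        List<Integer> selected = new ArrayList<>();
        if (limit <= 0) {
            return selected;
        }
        // Step 1: start with the feature that has the highest I(X;Y).
        int first = 0;
        double bestMi = Double.NEGATIVE_INFINITY;
        for (int f = 0; f < numFeatures; f++) {
            double mi = mutualInformation(features[f], labels);
            if (mi > bestMi) {
                bestMi = mi;
                first = f;
            }
        }
        boolean[] chosen = new boolean[numFeatures];
        chosen[first] = true;
        selected.add(first);
        // Step 2: repeatedly add the feature with the largest accumulated joint MI.
        double[] score = new double[numFeatures];
        while (selected.size() < limit) {
            int last = selected.get(selected.size() - 1);
            int best = -1;
            for (int f = 0; f < numFeatures; f++) {
                if (chosen[f]) {
                    continue;
                }
                score[f] += mutualInformation(join(features[f], features[last]), labels);
                if (best == -1 || score[f] > score[best]) {
                    best = f;
                }
            }
            chosen[best] = true;
            selected.add(best);
        }
        return selected;
    }

    public static void main(String[] args) {
        int[] labels = {0, 0, 1, 1, 2, 2, 3, 3};
        int[][] features = {
            {0, 0, 0, 0, 1, 1, 1, 1}, // high bit of the label
            {0, 0, 0, 0, 1, 1, 1, 1}, // redundant copy of feature 0
            {0, 0, 1, 1, 0, 0, 1, 1}, // low bit of the label
            {0, 0, 0, 0, 0, 0, 0, 0}  // constant, uninformative
        };
        // Feature 2 complements feature 0, the copy does not; prints [0, 2]
        System.out.println(select(features, labels, 2));
    }
}
```

In this toy data the redundant copy (feature 1) is passed over in favour of feature 2, because only the pair of features 0 and 2 determines the label completely.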
 
  • Constructor Details

    • JMI

      public JMI(int k, int numBins, int numThreads)
      Constructs a JMI feature selector that ranks the top k features.

      Continuous features are binned into numBins equal-width bins.

      Parameters:
      k - The number of features to rank.
      numBins - The number of bins, must be greater than 1.
      numThreads - The number of computation threads to use.
  • Method Details

    • postConfig

      public void postConfig()
      Used by the OLCUT configuration system, and should not be called by external code.
      Specified by:
      postConfig in interface com.oracle.labs.mlrg.olcut.config.Configurable
    • isOrdered

      public boolean isOrdered()
      Description copied from interface: FeatureSelector
      Does this feature selection algorithm return an ordered feature set?
      Specified by:
      isOrdered in interface FeatureSelector<Label>
      Returns:
      True if the set is ordered.
    • select

      public SelectedFeatureSet select(Dataset<Label> dataset)
      Description copied from interface: FeatureSelector
      Selects features according to this selection algorithm from the specified dataset.
      Specified by:
      select in interface FeatureSelector<Label>
      Parameters:
      dataset - The dataset to use.
      Returns:
      A selected feature set.
    • getProvenance

      public FeatureSelectorProvenance getProvenance()
      Specified by:
      getProvenance in interface com.oracle.labs.mlrg.olcut.provenance.Provenancable<FeatureSelectorProvenance>