Interface LabelImpurity

All Superinterfaces:
com.oracle.labs.mlrg.olcut.config.Configurable, com.oracle.labs.mlrg.olcut.provenance.Provenancable<com.oracle.labs.mlrg.olcut.provenance.ConfiguredObjectProvenance>
All Known Implementing Classes:
Entropy, GiniIndex

public interface LabelImpurity extends com.oracle.labs.mlrg.olcut.config.Configurable, com.oracle.labs.mlrg.olcut.provenance.Provenancable<com.oracle.labs.mlrg.olcut.provenance.ConfiguredObjectProvenance>
Calculates a tree impurity score based on label counts, weighted label counts or a probability distribution.

Impurity scores should be non-negative, with zero signifying that the node is pure (i.e., contains only a single label), and larger values indicating a more uniform distribution over the possible labels.
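As an illustration of the contract above, the following is a minimal standalone sketch of two common impurity measures computed over a normalized probability distribution. The class name ImpuritySketch and its methods are hypothetical and are not part of this library; the known implementing classes (Entropy, GiniIndex) may differ in details such as the logarithm base.

```java
// Hypothetical standalone sketch; not the library's implementation.
final class ImpuritySketch {

    /** Gini impurity: 1 - sum(p_i^2). Zero when one label has probability 1. */
    static double gini(double[] probs) {
        double sumOfSquares = 0.0;
        for (double p : probs) {
            sumOfSquares += p * p;
        }
        return 1.0 - sumOfSquares;
    }

    /** Shannon entropy in bits: -sum(p_i * log2(p_i)), treating 0 * log(0) as 0. */
    static double entropy(double[] probs) {
        double h = 0.0;
        for (double p : probs) {
            if (p > 0.0) {
                h -= p * (Math.log(p) / Math.log(2.0));
            }
        }
        return h;
    }

    public static void main(String[] args) {
        double[] pure = {1.0, 0.0};
        double[] skewed = {0.9, 0.1};
        double[] uniform = {0.5, 0.5};
        // Both measures are zero for the pure node and grow toward the uniform case.
        System.out.println(gini(pure) + " " + gini(skewed) + " " + gini(uniform));
        System.out.println(entropy(pure) + " " + entropy(skewed) + " " + entropy(uniform));
    }
}
```

Both functions return zero for a pure node and reach their maximum for a uniform distribution, which matches the ordering described above.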

  • Method Summary

    Modifier and Type
    Method
    Description
    default double
    impurity(double[] input)
    Calculates the impurity assuming the inputs are counts.
    default double
    impurity(float[] input)
    Calculates the impurity assuming the inputs are fractional counts.
    default double
    impurity(int[] input)
    Calculates the impurity assuming the inputs are counts.
    default double
    impurity(Map<String,Double> counts)
    Takes a Map for weighted counts.
    double
    impurityNormed(double[] input)
    Calculates the impurity, assuming its input is a normalized probability distribution.
    default double
    impurityWeighted(double[] input)
    Calculates the impurity assuming the inputs are weighted counts, normalizing by their sum.
    default double
    impurityWeighted(float[] input)
    Calculates the impurity by assuming the inputs are weighted counts and converting them into a probability distribution by dividing by their sum.

    Methods inherited from interface com.oracle.labs.mlrg.olcut.config.Configurable

    postConfig

    Methods inherited from interface com.oracle.labs.mlrg.olcut.provenance.Provenancable

    getProvenance
  • Method Details

    • impurityNormed

      double impurityNormed(double[] input)
      Calculates the impurity, assuming its input is a normalized probability distribution.
      Parameters:
      input - The input probability distribution.
      Returns:
      The impurity.
    • impurityWeighted

      default double impurityWeighted(double[] input)
      Calculates the impurity assuming the inputs are weighted counts, normalizing by their sum.
      Parameters:
      input - The input counts.
      Returns:
      The impurity.
    • impurity

      default double impurity(double[] input)
      Calculates the impurity assuming the inputs are counts.
      Parameters:
      input - The input counts.
      Returns:
      The impurity.
    • impurityWeighted

      default double impurityWeighted(float[] input)
      Calculates the impurity by assuming the inputs are weighted counts and converting them into a probability distribution by dividing by their sum. The resulting impurity is then rescaled by multiplying by the sum.
      Parameters:
      input - The input counts.
      Returns:
      The impurity.
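The normalize-then-rescale behavior described for impurityWeighted(float[]) can be sketched roughly as follows. This is an illustrative approximation, not the library's code: the class WeightedImpuritySketch, the functional parameter standing in for impurityNormed(double[]), and the handling of a zero sum are all assumptions.

```java
import java.util.function.ToDoubleFunction;

// Hypothetical sketch of "divide by the sum, score, multiply by the sum".
final class WeightedImpuritySketch {

    static double impurityWeighted(float[] weightedCounts,
                                   ToDoubleFunction<double[]> impurityNormed) {
        double sum = 0.0;
        for (float c : weightedCounts) {
            sum += c;
        }
        // Assumption: a node with no weight is treated as pure.
        if (sum <= 0.0) {
            return 0.0;
        }
        // Convert the weighted counts into a probability distribution.
        double[] probs = new double[weightedCounts.length];
        for (int i = 0; i < weightedCounts.length; i++) {
            probs[i] = weightedCounts[i] / sum;
        }
        // Rescale the normalized impurity by the total weight.
        return sum * impurityNormed.applyAsDouble(probs);
    }

    /** Stand-in for an impurityNormed implementation: Gini impurity. */
    static double gini(double[] probs) {
        double sumOfSquares = 0.0;
        for (double p : probs) {
            sumOfSquares += p * p;
        }
        return 1.0 - sumOfSquares;
    }
}
```

For example, WeightedImpuritySketch.impurityWeighted(new float[]{2f, 2f}, WeightedImpuritySketch::gini) normalizes to {0.5, 0.5}, scores 0.5, and rescales by the total weight of 4. Rescaling by the sum makes scores from nodes of different sizes additive, which is convenient when comparing candidate splits.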
    • impurity

      default double impurity(float[] input)
      Calculates the impurity assuming the inputs are fractional counts.
      Parameters:
      input - The input counts.
      Returns:
      The impurity.
    • impurity

      default double impurity(int[] input)
      Calculates the impurity assuming the inputs are counts.
      Parameters:
      input - The input counts.
      Returns:
      The impurity.
    • impurity

      default double impurity(Map<String,Double> counts)
      Takes a Map for weighted counts. Normalizes into a probability distribution before calling impurityNormed(double[]).
      Parameters:
      counts - A map from instances to weighted counts.
      Returns:
      The impurity score.
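The normalization step that the Map overload performs before delegating to impurityNormed(double[]) might look something like the sketch below. The class MapImpuritySketch and its normalize method are hypothetical names used only for illustration, and the sketch assumes the weighted counts are non-negative with a positive sum.

```java
import java.util.Map;

// Hypothetical sketch of turning a map of weighted counts into a distribution.
final class MapImpuritySketch {

    /**
     * Divides each weighted count by the total so the result sums to 1.
     * Assumes a positive total; a zero total would produce NaN entries.
     * The output follows the iteration order of the map's values.
     */
    static double[] normalize(Map<String, Double> counts) {
        double sum = 0.0;
        for (double value : counts.values()) {
            sum += value;
        }
        double[] probs = new double[counts.size()];
        int i = 0;
        for (double value : counts.values()) {
            probs[i++] = value / sum;
        }
        return probs;
    }
}
```

The resulting array could then be passed to any implementation of impurityNormed(double[]). The order of the entries depends on the map implementation, which does not matter for measures that are symmetric in the labels.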