Class WeightedInformationTheory
java.lang.Object
org.tribuo.util.infotheory.WeightedInformationTheory
A class of (discrete) weighted information theoretic functions. Gives warnings if
there are insufficient samples to estimate the quantities accurately.
Defaults to log_2, so returns values in bits.
All functions expect that the element types have well-defined equals and hashCode methods, and that equals is consistent with hashCode. The behaviour is undefined if this is not true.
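The equals/hashCode requirement matters because histogram estimators count states by using the elements as hash map keys. The following is a minimal sketch of that idea, not library code; the `Reading` type and the `countStates` helper are hypothetical names introduced only for illustration.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Hypothetical element type. equals and hashCode use the same field,
// so two readings with the same label are treated as one state.
final class Reading {
    private final String label;

    Reading(String label) {
        this.label = label;
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof Reading && label.equals(((Reading) o).label);
    }

    @Override
    public int hashCode() {
        return Objects.hash(label);
    }
}

class StateCountSketch {
    // Counts how often each distinct state occurs, relying on equals/hashCode.
    static <T> Map<T, Long> countStates(List<T> values) {
        Map<T, Long> counts = new HashMap<>();
        for (T v : values) {
            counts.merge(v, 1L, Long::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<Reading> data = List.of(new Reading("a"), new Reading("a"), new Reading("b"));
        // Two distinct states are found: "a" twice and "b" once.
        System.out.println(countStates(data).size());
    }
}
```

If `Reading` did not override equals and hashCode, each instance would count as its own state and every estimate built on the histogram would be wrong.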
-
Nested Class Summary
Nested Classes (Modifier and Type, Class, Description):
static enum WeightedInformationTheory.VariableSelector
    Chooses which variable is the one with associated weights.
-
Field Summary
Fields (Modifier and Type, Field, Description):
static final int DEFAULT_MAP_SIZE
static final double LOG_2
static double LOG_BASE
    Sets the base of the logarithm used in the information theoretic calculations.
static final double LOG_E
static final double SAMPLES_RATIO
-
Method Summary
Methods (Modifier and Type, Method, Description):
static <T> Map<T, WeightCountTuple> calculateWeightedCountDist(ArrayList<T> vector, ArrayList<Double> weights)
    Generate the counts for a single vector.
static <T1, T2, T3> double conditionalMI(List<T1> first, List<T2> second, List<T3> condition, List<Double> weights)
    Calculates the discrete weighted conditional mutual information, using histogram probability estimators.
static <T1, T2, T3> double conditionalMI(TripleDistribution<T1, T2, T3> rv, Map<?, Double> weights, WeightedInformationTheory.VariableSelector vs)
static <T1, T2, T3> double conditionalMI(WeightedTripleDistribution<T1, T2, T3> tripleRV)
static <T1, T2> double jointEntropy(ArrayList<T1> first, ArrayList<T2> second, ArrayList<Double> weights)
    Calculates the Shannon/Guiasu weighted joint entropy of two arrays, using histogram probability estimators.
static <T1, T2, T3> double jointMI(List<T1> first, List<T2> second, List<T3> target, List<Double> weights)
    Calculates the discrete weighted joint mutual information, using histogram probability estimators.
static <T1, T2, T3> double jointMI(TripleDistribution<T1, T2, T3> rv, Map<?, Double> weights, WeightedInformationTheory.VariableSelector vs)
static <T1, T2, T3> double jointMI(WeightedTripleDistribution<T1, T2, T3> tripleRV)
static <T1, T2> double mi(ArrayList<T1> first, ArrayList<T2> second, ArrayList<Double> weights)
    Calculates the discrete weighted mutual information, using histogram probability estimators.
static <T1, T2> double mi(PairDistribution<T1, T2> pairDist, Map<?, Double> weights, WeightedInformationTheory.VariableSelector vs)
static <T1, T2> double mi(WeightedPairDistribution<T1, T2> jointDist)
static <T> void normaliseWeights(Map<T, WeightCountTuple> map)
    Normalizes the weights in the map, i.e., divides each weight by its count.
static <T1, T2> double weightedConditionalEntropy(ArrayList<T1> vector, ArrayList<T2> condition, ArrayList<Double> weights)
    Calculates the discrete Shannon/Guiasu Weighted Conditional Entropy of two arrays, using histogram probability estimators.
static <T> double weightedEntropy(ArrayList<T> vector, ArrayList<Double> weights)
    Calculates the discrete Shannon/Guiasu Weighted Entropy, using histogram probability estimators.
-
Field Details
-
SAMPLES_RATIO
-
DEFAULT_MAP_SIZE
-
LOG_2
-
LOG_E
-
LOG_BASE
Sets the base of the logarithm used in the information theoretic calculations. For LOG_2 the unit is "bit", for LOG_E the unit is "nat".
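The effect of the logarithm base can be sketched as follows. This example assumes the constants hold the natural logarithm of the base (for example, LOG_2 equal to ln 2), so that a result computed in nats is divided by the chosen constant; the local constants and the `binaryEntropy` helper are stand-ins written for this illustration, not the library's fields or methods.

```java
class LogBaseSketch {
    // Stand-ins for the library constants, ASSUMED to be the natural log of each base.
    static final double LOG_2 = Math.log(2.0);
    static final double LOG_E = Math.log(Math.E);

    // Entropy of a two-outcome variable with P(outcome) = p.
    // Computed with natural logs, then rescaled by dividing by logBase.
    static double binaryEntropy(double p, double logBase) {
        double h = 0.0;
        if (p > 0.0) {
            h -= p * Math.log(p);
        }
        if (p < 1.0) {
            h -= (1.0 - p) * Math.log(1.0 - p);
        }
        return h / logBase;
    }

    public static void main(String[] args) {
        System.out.println(binaryEntropy(0.5, LOG_2)); // fair coin, in bits
        System.out.println(binaryEntropy(0.5, LOG_E)); // fair coin, in nats
    }
}
```

A fair coin comes out at about 1 bit, or about 0.693 nats, which is the same quantity in two units.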
-
-
Method Details
-
jointMI
public static <T1, T2, T3> double jointMI(List<T1> first, List<T2> second, List<T3> target, List<Double> weights)
Calculates the discrete weighted joint mutual information, using histogram probability estimators. Arrays must be the same length.
- Type Parameters:
T1 - Type contained in the first array.
T2 - Type contained in the second array.
T3 - Type contained in the target array.
- Parameters:
first - An array of values.
second - Another array of values.
target - Target array of values.
weights - Array of weight values.
- Returns:
The mutual information I(first,second;target).
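For orientation, the quantity can be written out as below. This is a sketch under an assumption, not a statement of the implementation: it supposes the Guiasu-style weighting named elsewhere in this class, where each joint state (x, y, z) carries a weight w(x, y, z) and every probability is a histogram estimate. X, Y and Z stand for first, second and target; the symbols are illustrative.

```latex
I_w(X, Y; Z) = \sum_{x, y, z} w(x, y, z)\, p(x, y, z)\,
    \log_2 \frac{p(x, y, z)}{p(x, y)\, p(z)}
```

With all weights equal to 1 this reduces to the ordinary mutual information between the pair (X, Y) and Z. The base 2 logarithm reflects the default LOG_BASE.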
-
jointMI
public static <T1, T2, T3> double jointMI(WeightedTripleDistribution<T1, T2, T3> tripleRV)
-
jointMI
public static <T1, T2, T3> double jointMI(TripleDistribution<T1, T2, T3> rv, Map<?, Double> weights, WeightedInformationTheory.VariableSelector vs)
-
conditionalMI
public static <T1, T2, T3> double conditionalMI(List<T1> first, List<T2> second, List<T3> condition, List<Double> weights)
Calculates the discrete weighted conditional mutual information, using histogram probability estimators. Arrays must be the same length.
- Type Parameters:
T1 - Type contained in the first array.
T2 - Type contained in the second array.
T3 - Type contained in the condition array.
- Parameters:
first - An array of values.
second - Another array of values.
condition - Array to condition upon.
weights - Array of weight values.
- Returns:
The conditional mutual information I(first;second|condition).
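One way to write the conditional quantity is shown below. It is a sketch that assumes Guiasu-style weighting, with a weight w(x, y, z) attached to each joint state and histogram estimates for all probabilities; it is not taken from the implementation. X, Y and Z stand for first, second and condition.

```latex
I_w(X; Y \mid Z) = \sum_{x, y, z} w(x, y, z)\, p(x, y, z)\,
    \log_2 \frac{p(z)\, p(x, y, z)}{p(x, z)\, p(y, z)}
```

With all weights equal to 1 this is the standard conditional mutual information.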
-
conditionalMI
public static <T1, T2, T3> double conditionalMI(WeightedTripleDistribution<T1, T2, T3> tripleRV)
-
conditionalMI
public static <T1, T2, T3> double conditionalMI(TripleDistribution<T1, T2, T3> rv, Map<?, Double> weights, WeightedInformationTheory.VariableSelector vs)
-
mi
public static <T1, T2> double mi(ArrayList<T1> first, ArrayList<T2> second, ArrayList<Double> weights)
Calculates the discrete weighted mutual information, using histogram probability estimators. Arrays must be the same length.
- Type Parameters:
T1 - Type of the first array.
T2 - Type of the second array.
- Parameters:
first - An array of values.
second - Another array of values.
weights - Array of weight values.
- Returns:
The mutual information I(first;second).
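A histogram-based weighted mutual information estimate can be sketched in plain Java as below. This is an independent illustration, not the library's code. It assumes that each joint state's term is scaled by the mean weight of the samples that fall in that state, and that the result is reported in bits; `WeightedMiSketch` and `weightedMi` are hypothetical names.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class WeightedMiSketch {
    // Weighted MI in bits. ASSUMPTION: w(x, y) is the mean weight of the
    // samples in joint state (x, y); probabilities are count / n.
    static <A, B> double weightedMi(List<A> first, List<B> second, List<Double> weights) {
        int n = first.size();
        if (second.size() != n || weights.size() != n) {
            throw new IllegalArgumentException("All lists must be the same length");
        }
        Map<A, Integer> firstCount = new HashMap<>();
        Map<B, Integer> secondCount = new HashMap<>();
        Map<List<Object>, Integer> jointCount = new HashMap<>();
        Map<List<Object>, Double> jointWeight = new HashMap<>();
        for (int i = 0; i < n; i++) {
            // A two-element immutable list serves as the composite key.
            List<Object> key = List.<Object>of(first.get(i), second.get(i));
            firstCount.merge(first.get(i), 1, Integer::sum);
            secondCount.merge(second.get(i), 1, Integer::sum);
            jointCount.merge(key, 1, Integer::sum);
            jointWeight.merge(key, weights.get(i), Double::sum);
        }
        double mi = 0.0;
        for (Map.Entry<List<Object>, Integer> e : jointCount.entrySet()) {
            List<Object> key = e.getKey();
            double joint = e.getValue();
            double meanWeight = jointWeight.get(key) / joint;
            double cx = firstCount.get(key.get(0));
            double cy = secondCount.get(key.get(1));
            // p(x,y) / (p(x) p(y)) simplifies to n * c(x,y) / (c(x) c(y)).
            mi += meanWeight * (joint / n) * Math.log((n * joint) / (cx * cy));
        }
        return mi / Math.log(2.0);
    }

    public static void main(String[] args) {
        List<Integer> x = List.of(0, 0, 1, 1);
        List<Double> w = List.of(1.0, 1.0, 1.0, 1.0);
        System.out.println(weightedMi(x, x, w)); // identical variables, unit weights
    }
}
```

With unit weights and two identical balanced binary lists, the estimate is about 1 bit; two independent balanced lists give 0.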
-
mi
public static <T1, T2> double mi(WeightedPairDistribution<T1, T2> jointDist)
-
mi
public static <T1, T2> double mi(PairDistribution<T1, T2> pairDist, Map<?, Double> weights, WeightedInformationTheory.VariableSelector vs)
-
jointEntropy
public static <T1, T2> double jointEntropy(ArrayList<T1> first, ArrayList<T2> second, ArrayList<Double> weights)
Calculates the Shannon/Guiasu weighted joint entropy of two arrays, using histogram probability estimators. Arrays must be the same length.
- Type Parameters:
T1 - Type of the first array.
T2 - Type of the second array.
- Parameters:
first - An array of values.
second - Another array of values.
weights - Array of weight values.
- Returns:
The entropy H(first,second).
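The joint entropy treats each (first, second) pair as a single state. A self-contained sketch of that idea follows. It is not the library's code; it assumes each joint state is weighted by the mean weight of its samples and that the result is in bits, and `WeightedJointEntropySketch` is a hypothetical name.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class WeightedJointEntropySketch {
    // Weighted joint entropy in bits. ASSUMPTION: w(x, y) is the mean weight
    // of the samples in state (x, y); p(x, y) is count / n.
    static <A, B> double jointEntropy(List<A> first, List<B> second, List<Double> weights) {
        int n = first.size();
        if (second.size() != n || weights.size() != n) {
            throw new IllegalArgumentException("All lists must be the same length");
        }
        // Each value is {count, summed weight} for one joint state.
        Map<List<Object>, double[]> tally = new HashMap<>();
        for (int i = 0; i < n; i++) {
            List<Object> key = List.<Object>of(first.get(i), second.get(i));
            double[] t = tally.computeIfAbsent(key, k -> new double[2]);
            t[0] += 1.0;
            t[1] += weights.get(i);
        }
        double h = 0.0;
        for (double[] t : tally.values()) {
            double p = t[0] / n;
            double meanWeight = t[1] / t[0];
            h -= meanWeight * p * Math.log(p);
        }
        return h / Math.log(2.0);
    }

    public static void main(String[] args) {
        System.out.println(jointEntropy(
            List.of(0, 0, 1, 1), List.of(0, 1, 0, 1), List.of(1.0, 1.0, 1.0, 1.0)));
    }
}
```

Four equally likely joint states with unit weights give about 2 bits.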
-
weightedConditionalEntropy
public static <T1, T2> double weightedConditionalEntropy(ArrayList<T1> vector, ArrayList<T2> condition, ArrayList<Double> weights)
Calculates the discrete Shannon/Guiasu Weighted Conditional Entropy of two arrays, using histogram probability estimators. Arrays must be the same length.
- Type Parameters:
T1 - Type of the first array.
T2 - Type of the second array.
- Parameters:
vector - The main array of values.
condition - The array to condition on.
weights - Array of weight values.
- Returns:
The weighted conditional entropy H_w(vector|condition).
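Written out, the quantity may be taken as below. This is a sketch under the assumption of Guiasu-style weighting with one weight w(x, y) per joint state and histogram probability estimates; it is not derived from the implementation. X stands for vector and Y for condition.

```latex
H_w(X \mid Y) = -\sum_{x, y} w(x, y)\, p(x, y)\, \log_2 \frac{p(x, y)}{p(y)}
```

With all weights equal to 1 this is the standard conditional entropy H(X|Y).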
-
weightedEntropy
public static <T> double weightedEntropy(ArrayList<T> vector, ArrayList<Double> weights)
Calculates the discrete Shannon/Guiasu Weighted Entropy, using histogram probability estimators.
- Type Parameters:
T - Type of the array.
- Parameters:
vector - The array of values.
weights - Array of weight values.
- Returns:
The weighted entropy H_w(vector).
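A weighted entropy estimate over a single list can be sketched as follows. The code is an independent illustration rather than the library's implementation. It assumes the Guiasu form H_w = -sum of w(x) p(x) log p(x), where w(x) is the mean weight of the samples in state x, and it reports bits; `WeightedEntropySketch` is a hypothetical name.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class WeightedEntropySketch {
    // Weighted entropy in bits. ASSUMPTION: w(x) is the mean weight of the
    // samples in state x; p(x) is the histogram estimate count / n.
    static <T> double weightedEntropy(List<T> vector, List<Double> weights) {
        int n = vector.size();
        if (weights.size() != n) {
            throw new IllegalArgumentException("Lists must be the same length");
        }
        Map<T, Integer> count = new HashMap<>();
        Map<T, Double> weightSum = new HashMap<>();
        for (int i = 0; i < n; i++) {
            count.merge(vector.get(i), 1, Integer::sum);
            weightSum.merge(vector.get(i), weights.get(i), Double::sum);
        }
        double h = 0.0;
        for (Map.Entry<T, Integer> e : count.entrySet()) {
            double c = e.getValue();
            double p = c / n;
            double meanWeight = weightSum.get(e.getKey()) / c;
            h -= meanWeight * p * Math.log(p);
        }
        return h / Math.log(2.0);
    }

    public static void main(String[] args) {
        List<String> v = List.of("a", "a", "b", "b");
        System.out.println(weightedEntropy(v, List.of(1.0, 1.0, 1.0, 1.0)));
        System.out.println(weightedEntropy(v, List.of(2.0, 2.0, 1.0, 1.0)));
    }
}
```

With unit weights the balanced two-state list gives about 1 bit. Doubling the weight of the "a" samples raises that state's contribution from 0.5 to 1.0, for a total of about 1.5.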
-
calculateWeightedCountDist
public static <T> Map<T, WeightCountTuple> calculateWeightedCountDist(ArrayList<T> vector, ArrayList<Double> weights)
Generate the counts for a single vector.
- Type Parameters:
T - The type inside the vector.
- Parameters:
vector - An array of values.
weights - The array of weight values.
- Returns:
A HashMap from states of T to pairs of count and total weight for that state.
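The shape of the returned map can be illustrated with a small sketch. It follows the description above (a count and a total weight per state) and is not the library's code; the `Tally` class is a hypothetical stand-in for WeightCountTuple, whose actual members are not shown on this page.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class WeightedCountSketch {
    // Hypothetical stand-in for a count/weight pair.
    static final class Tally {
        long count;
        double weight;
    }

    // Accumulates, per distinct state, the number of occurrences and the summed weight.
    static <T> Map<T, Tally> countWithWeights(List<T> vector, List<Double> weights) {
        if (vector.size() != weights.size()) {
            throw new IllegalArgumentException("Lists must be the same length");
        }
        Map<T, Tally> out = new HashMap<>();
        for (int i = 0; i < vector.size(); i++) {
            Tally t = out.computeIfAbsent(vector.get(i), k -> new Tally());
            t.count++;
            t.weight += weights.get(i);
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, Tally> m = countWithWeights(List.of("x", "y", "x"), List.of(0.5, 1.0, 1.5));
        // State "x" occurs twice with total weight 2.0.
        System.out.println(m.get("x").count + " " + m.get("x").weight);
    }
}
```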
-
normaliseWeights
public static <T> void normaliseWeights(Map<T, WeightCountTuple> map)
Normalizes the weights in the map, i.e., divides each weight by its count.
- Type Parameters:
T - The type of the variable that was counted.
- Parameters:
map - The map to normalize.
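The normalization step turns a summed weight into a mean weight per state. A minimal sketch follows; the `Tally` class is a hypothetical stand-in for WeightCountTuple, and the guard against a zero count is an assumption of this example rather than documented behaviour.

```java
import java.util.HashMap;
import java.util.Map;

class NormaliseSketch {
    // Hypothetical stand-in for a count/weight pair.
    static final class Tally {
        long count;
        double weight;

        Tally(long count, double weight) {
            this.count = count;
            this.weight = weight;
        }
    }

    // Divides each summed weight by its count, in place.
    static <T> void normalise(Map<T, Tally> map) {
        for (Tally t : map.values()) {
            if (t.count > 0) { // ASSUMED guard, avoids dividing by zero
                t.weight /= t.count;
            }
        }
    }

    public static void main(String[] args) {
        Map<String, Tally> m = new HashMap<>();
        m.put("x", new Tally(2, 3.0));
        normalise(m);
        System.out.println(m.get("x").weight); // mean weight of the two "x" samples
    }
}
```

After the call, the entry for "x" holds a weight of 1.5 and its count is unchanged.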
-