Class EvaluationAggregator

java.lang.Object
org.tribuo.evaluation.EvaluationAggregator

public final class EvaluationAggregator extends Object
Aggregates metrics from a list of evaluations, or a list of models and datasets.
  • Method Details

    • summarize

      public static <T extends Output<T>, C extends MetricContext<T>> DescriptiveStats summarize(EvaluationMetric<T,C> metric, List<? extends Model<T>> models, Dataset<T> dataset)
      Summarize performance w.r.t. metric across several models on a single dataset.
      Type Parameters:
      T - The output type.
      C - The context type used for this metric.
      Parameters:
metric - The metric to summarize.
      models - The models to evaluate.
      dataset - The dataset to evaluate.
      Returns:
      The descriptive statistics for this metric summary.
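      For example, to compare several trained classifiers on a shared test set. This is a minimal sketch using the classification module; the models and testData variables are assumed to be supplied by the caller, and getMean()/getStandardDeviation() are assumed accessors on DescriptiveStats:

          import java.util.List;

          import org.tribuo.Dataset;
          import org.tribuo.Model;
          import org.tribuo.classification.Label;
          import org.tribuo.classification.evaluation.LabelMetric;
          import org.tribuo.classification.evaluation.LabelMetrics;
          import org.tribuo.evaluation.DescriptiveStats;
          import org.tribuo.evaluation.EvaluationAggregator;
          import org.tribuo.evaluation.metrics.EvaluationMetric.Average;
          import org.tribuo.evaluation.metrics.MetricTarget;

          public static void compareModels(List<Model<Label>> models, Dataset<Label> testData) {
              // Micro-averaged accuracy as the comparison metric.
              LabelMetric accuracy = LabelMetrics.ACCURACY.forTarget(new MetricTarget<>(Average.MICRO));
              // One accuracy value per model, collapsed into summary statistics.
              DescriptiveStats stats = EvaluationAggregator.summarize(accuracy, models, testData);
              System.out.println("accuracy mean = " + stats.getMean()
                  + ", std dev = " + stats.getStandardDeviation());
          }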
    • summarize

      public static <T extends Output<T>, R extends Evaluation<T>> Map<MetricID<T>,DescriptiveStats> summarize(Evaluator<T,R> evaluator, List<? extends Model<T>> models, Dataset<T> dataset)
      Summarize performance using the supplied evaluator across several models on one dataset.
      Type Parameters:
      T - The output type.
      R - The evaluation type.
      Parameters:
      evaluator - The evaluator to use.
      models - The models to evaluate.
      dataset - The dataset to evaluate.
      Returns:
      Descriptive statistics for each metric in the evaluator.
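      A sketch of the evaluator form, reusing the models and testData variables from the previous example and assuming the classification module's LabelEvaluator; it yields one summary per metric the evaluator computes:

          import java.util.Map;

          import org.tribuo.classification.evaluation.LabelEvaluator;
          import org.tribuo.evaluation.metrics.MetricID;

          Map<MetricID<Label>, DescriptiveStats> perMetric =
              EvaluationAggregator.summarize(new LabelEvaluator(), models, testData);
          // Each MetricID maps to the distribution of that metric's value across the models.
          perMetric.forEach((id, stats) -> System.out.println(id + " -> " + stats));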
    • summarize

      public static <T extends Output<T>, C extends MetricContext<T>> DescriptiveStats summarize(EvaluationMetric<T,C> metric, Model<T> model, List<? extends Dataset<T>> datasets)
      Summarize a model's performance w.r.t. a metric across several datasets.
      Type Parameters:
      T - The output type.
      C - The metric context type.
      Parameters:
      metric - The metric to evaluate.
      model - The model to evaluate.
      datasets - The datasets to evaluate.
      Returns:
      Descriptive statistics for the metric across the datasets.
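      A sketch of the single-model, many-datasets form, e.g. to check how stable a model's accuracy is across test splits. The model and datasets variables are assumed supplied, with accuracy as defined in the first sketch:

          // One metric value per dataset, collapsed into summary statistics.
          DescriptiveStats acrossSets = EvaluationAggregator.summarize(accuracy, model, datasets);
          System.out.println("accuracy across datasets: " + acrossSets);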
    • summarize

      public static <T extends Output<T>, C extends MetricContext<T>> DescriptiveStats summarize(List<? extends EvaluationMetric<T,C>> metrics, Model<T> model, Dataset<T> dataset)
      Summarize a model's performance on a dataset across several metrics.
      Type Parameters:
      T - The output type.
      C - The metric context type.
      Parameters:
      metrics - The metrics to evaluate.
      model - The model to evaluate them on.
      dataset - The dataset to evaluate them on.
      Returns:
      The descriptive statistics for the metrics.
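      A sketch of the many-metrics form; note that it collapses the values of the different metrics into a single summary. The "positive" label is hypothetical, LabelMetrics.F1 is assumed to exist, and model and dataset are assumed supplied:

          import java.util.Arrays;

          List<LabelMetric> metrics = Arrays.asList(
              LabelMetrics.ACCURACY.forTarget(new MetricTarget<>(Average.MICRO)),
              LabelMetrics.F1.forTarget(new MetricTarget<>(new Label("positive"))));
          DescriptiveStats acrossMetrics = EvaluationAggregator.summarize(metrics, model, dataset);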
    • summarize

      public static <T extends Output<T>, C extends MetricContext<T>> DescriptiveStats summarize(List<? extends EvaluationMetric<T,C>> metrics, Model<T> model, List<Prediction<T>> predictions)
      Summarize a model's performance across several metrics using a precomputed list of predictions.
      Type Parameters:
      T - The output type.
      C - The metric context type.
      Parameters:
      metrics - The metrics to evaluate.
      model - The model which produced the predictions.
      predictions - The predictions to evaluate.
      Returns:
      The descriptive statistics for the metrics.
    • summarize

      public static <T extends Output<T>, R extends Evaluation<T>> Map<MetricID<T>,DescriptiveStats> summarize(Evaluator<T,R> evaluator, Model<T> model, List<? extends Dataset<T>> datasets)
      Summarize performance according to evaluator for a single model across several datasets.
      Type Parameters:
      T - The output type.
      R - The evaluation type.
      Parameters:
      evaluator - The evaluator to use.
      model - The model to evaluate.
      datasets - The datasets to evaluate across.
      Returns:
      The descriptive statistics for each metric.
    • summarize

      public static <T extends Output<T>, R extends Evaluation<T>> Map<MetricID<T>,DescriptiveStats> summarize(List<R> evaluations)
      Summarize all fields of a list of evaluations.
      Type Parameters:
      T - The output type.
      R - The evaluation type.
      Parameters:
      evaluations - The evaluations to summarize.
      Returns:
      The descriptive statistics for each metric.
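      The list-of-evaluations form is a natural fit for cross-validation. A sketch assuming Tribuo's CrossValidation helper with its (trainer, data, evaluator, k) constructor; the trainer and data variables are assumed supplied:

          import java.util.stream.Collectors;

          import com.oracle.labs.mlrg.olcut.util.Pair;
          import org.tribuo.classification.evaluation.LabelEvaluation;
          import org.tribuo.classification.evaluation.LabelEvaluator;
          import org.tribuo.evaluation.CrossValidation;

          CrossValidation<Label, LabelEvaluation> cv =
              new CrossValidation<>(trainer, data, new LabelEvaluator(), 5);
          List<LabelEvaluation> evals = cv.evaluate().stream()
              .map(Pair::getA) // keep the evaluation, drop the fold's model
              .collect(Collectors.toList());
          // Summary statistics for every metric across the five folds.
          Map<MetricID<Label>, DescriptiveStats> cvSummary = EvaluationAggregator.summarize(evals);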
    • summarize

      public static <T extends Output<T>, R extends Evaluation<T>> DescriptiveStats summarize(List<R> evaluations, ToDoubleFunction<R> fieldGetter)
      Summarize a single field of an evaluation across several evaluations.
      Type Parameters:
      T - The output type.
      R - The evaluation type.
      Parameters:
      evaluations - The evaluations to summarize.
      fieldGetter - The getter for the field to summarize.
      Returns:
      The descriptive statistics summary of the extracted field.
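      A sketch reusing the evals list from the cross-validation example above; any double-valued getter on the evaluation works as the field extractor:

          DescriptiveStats accuracyStats =
              EvaluationAggregator.summarize(evals, LabelEvaluation::accuracy);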
    • argmax

      public static <T extends Output<T>, C extends MetricContext<T>> com.oracle.labs.mlrg.olcut.util.Pair<Integer,Double> argmax(EvaluationMetric<T,C> metric, List<? extends Model<T>> models, Dataset<T> dataset)
      Calculates the argmax of a metric across the supplied models (i.e., the index of the best performing model).
      Type Parameters:
      T - The output type.
      C - The metric context.
      Parameters:
      metric - The metric to evaluate.
      models - The models to evaluate across.
      dataset - The dataset to evaluate on.
      Returns:
      The maximum value and its index in the models list.
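      A sketch for model selection, reusing the accuracy metric and the models and testData variables from the first example; Pair is com.oracle.labs.mlrg.olcut.util.Pair:

          Pair<Integer, Double> best = EvaluationAggregator.argmax(accuracy, models, testData);
          Model<Label> bestModel = models.get(best.getA());
          System.out.println("best accuracy " + best.getB() + " at index " + best.getA());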
    • argmax

      public static <T extends Output<T>, C extends MetricContext<T>> com.oracle.labs.mlrg.olcut.util.Pair<Integer,Double> argmax(EvaluationMetric<T,C> metric, Model<T> model, List<? extends Dataset<T>> datasets)
      Calculates the argmax of a metric across the supplied datasets.
      Type Parameters:
      T - The output type.
      C - The metric context.
      Parameters:
      metric - The metric to evaluate.
      model - The model to evaluate on.
      datasets - The datasets to evaluate across.
      Returns:
      The maximum value and its index in the datasets list.
    • argmax

      public static <T extends Output<T>, R extends Evaluation<T>> com.oracle.labs.mlrg.olcut.util.Pair<Integer,Double> argmax(List<R> evaluations, Function<R,Double> getter)
      Calculates the argmax of a metric across the supplied evaluations.
      Type Parameters:
      T - The output type.
      R - The evaluation type.
      Parameters:
      evaluations - The evaluations.
      getter - The function to extract a value from the evaluation.
      Returns:
      The maximum value and its index in the evaluations list.
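      A sketch reusing the cross-validation evals list, picking the fold with the highest accuracy:

          Pair<Integer, Double> bestFold =
              EvaluationAggregator.argmax(evals, LabelEvaluation::accuracy);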