Package org.tribuo.evaluation
Class EvaluationAggregator
java.lang.Object
org.tribuo.evaluation.EvaluationAggregator
Aggregates metrics from a list of evaluations, or a list of models and datasets.
-
Method Summary
- static <T extends Output<T>, R extends Evaluation<T>> com.oracle.labs.mlrg.olcut.util.Pair<Integer,Double>
  argmax(List<R> evaluations, Function<R,Double> getter)
  Calculates the argmax of a metric across the supplied evaluations.
- static <T extends Output<T>, C extends MetricContext<T>> com.oracle.labs.mlrg.olcut.util.Pair<Integer,Double>
  argmax(EvaluationMetric<T,C> metric, List<? extends Model<T>> models, Dataset<T> dataset)
  Calculates the argmax of a metric across the supplied models (i.e., the index of the model which performed the best).
- static <T extends Output<T>, C extends MetricContext<T>> com.oracle.labs.mlrg.olcut.util.Pair<Integer,Double>
  argmax(EvaluationMetric<T,C> metric, Model<T> model, List<? extends Dataset<T>> datasets)
  Calculates the argmax of a metric across the supplied datasets.
- static <T extends Output<T>, C extends MetricContext<T>> DescriptiveStats
  summarize(List<? extends EvaluationMetric<T,C>> metrics, Model<T> model, List<Prediction<T>> predictions)
  Summarize model performance across several metrics, using the supplied predictions.
- static <T extends Output<T>, C extends MetricContext<T>> DescriptiveStats
  summarize(List<? extends EvaluationMetric<T,C>> metrics, Model<T> model, Dataset<T> dataset)
  Summarize model performance on a dataset across several metrics.
- static <T extends Output<T>, R extends Evaluation<T>> Map<MetricID<T>,DescriptiveStats>
  summarize(List<R> evaluations)
  Summarize all fields of a list of evaluations.
- static <T extends Output<T>, R extends Evaluation<T>> DescriptiveStats
  summarize(List<R> evaluations, ToDoubleFunction<R> fieldGetter)
  Summarize a single field of an evaluation across several evaluations.
- static <T extends Output<T>, R extends Evaluation<T>> Map<MetricID<T>,DescriptiveStats>
  summarize(Evaluator<T,R> evaluator, List<? extends Model<T>> models, Dataset<T> dataset)
  Summarize performance using the supplied evaluator across several models on one dataset.
- static <T extends Output<T>, R extends Evaluation<T>> Map<MetricID<T>,DescriptiveStats>
  summarize(Evaluator<T,R> evaluator, Model<T> model, List<? extends Dataset<T>> datasets)
  Summarize performance according to the evaluator for a single model across several datasets.
- static <T extends Output<T>, C extends MetricContext<T>> DescriptiveStats
  summarize(EvaluationMetric<T,C> metric, List<? extends Model<T>> models, Dataset<T> dataset)
  Summarize performance w.r.t. the metric across several models on a single dataset.
- static <T extends Output<T>, C extends MetricContext<T>> DescriptiveStats
  summarize(EvaluationMetric<T,C> metric, Model<T> model, List<? extends Dataset<T>> datasets)
  Summarize a model's performance w.r.t. a metric across several datasets.
- static <T extends Output<T>, R extends Evaluation<T>> Map<MetricID<T>,DescriptiveStats>
  summarizeCrossValidation(List<com.oracle.labs.mlrg.olcut.util.Pair<R,Model<T>>> evaluations)
  Summarize all fields of a list of evaluations produced by CrossValidation.
-
Method Details
-
summarize
public static <T extends Output<T>, C extends MetricContext<T>> DescriptiveStats summarize(EvaluationMetric<T,C> metric, List<? extends Model<T>> models, Dataset<T> dataset)

Summarize performance w.r.t. the metric across several models on a single dataset.

Type Parameters:
    T - The output type.
    C - The context type used for this metric.
Parameters:
    metric - The metric to summarize.
    models - The models to evaluate.
    dataset - The dataset to evaluate.
Returns:
    The descriptive statistics for this metric summary.
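For example, a minimal sketch of comparing candidate models with a single metric (not from the Tribuo docs; the class and method names are hypothetical, and it assumes the metric, models, and test set are built elsewhere):

    import java.util.List;
    import org.tribuo.Dataset;
    import org.tribuo.Model;
    import org.tribuo.Output;
    import org.tribuo.evaluation.DescriptiveStats;
    import org.tribuo.evaluation.EvaluationAggregator;
    import org.tribuo.evaluation.metrics.EvaluationMetric;
    import org.tribuo.evaluation.metrics.MetricContext;

    public final class SummarizeAcrossModels {
        // Computes one DescriptiveStats over the per-model values of a single metric.
        public static <T extends Output<T>, C extends MetricContext<T>> void report(
                EvaluationMetric<T,C> metric, List<? extends Model<T>> models, Dataset<T> test) {
            DescriptiveStats stats = EvaluationAggregator.summarize(metric, models, test);
            System.out.println(stats);
        }
    }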
-
summarize
public static <T extends Output<T>, R extends Evaluation<T>> Map<MetricID<T>,DescriptiveStats> summarize(Evaluator<T,R> evaluator, List<? extends Model<T>> models, Dataset<T> dataset)

Summarize performance using the supplied evaluator across several models on one dataset.

Type Parameters:
    T - The output type.
    R - The evaluation type.
Parameters:
    evaluator - The evaluator to use.
    models - The models to evaluate.
    dataset - The dataset to evaluate.
Returns:
    Descriptive statistics for each metric in the evaluator.
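A sketch of consuming the returned map (hypothetical helper; imports as in the sketch above, plus java.util.Map, org.tribuo.evaluation.Evaluation, org.tribuo.evaluation.Evaluator, and org.tribuo.evaluation.metrics.MetricID):

    // Summarizes every metric the evaluator computes, keyed by MetricID.
    public static <T extends Output<T>, R extends Evaluation<T>> void reportAllMetrics(
            Evaluator<T,R> evaluator, List<? extends Model<T>> models, Dataset<T> test) {
        Map<MetricID<T>,DescriptiveStats> summary =
                EvaluationAggregator.summarize(evaluator, models, test);
        for (Map.Entry<MetricID<T>,DescriptiveStats> e : summary.entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue());
        }
    }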
-
summarize
public static <T extends Output<T>, C extends MetricContext<T>> DescriptiveStats summarize(EvaluationMetric<T,C> metric, Model<T> model, List<? extends Dataset<T>> datasets)

Summarize a model's performance w.r.t. a metric across several datasets.

Type Parameters:
    T - The output type.
    C - The metric context type.
Parameters:
    metric - The metric to evaluate.
    model - The model to evaluate.
    datasets - The datasets to evaluate.
Returns:
    Descriptive statistics for the metric across the datasets.
-
summarize
public static <T extends Output<T>, C extends MetricContext<T>> DescriptiveStats summarize(List<? extends EvaluationMetric<T,C>> metrics, Model<T> model, Dataset<T> dataset)

Summarize model performance on a dataset across several metrics.

Type Parameters:
    T - The output type.
    C - The metric context type.
Parameters:
    metrics - The metrics to evaluate.
    model - The model to evaluate them on.
    dataset - The dataset to evaluate them on.
Returns:
    The descriptive statistics for the metrics.
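Note that this overload returns a single DescriptiveStats computed over the metric values themselves (one value per metric), rather than a per-metric map. A sketch (hypothetical helper, same imports as above):

    // One model, one dataset, several metrics: the returned DescriptiveStats
    // describes the distribution of the individual metric values.
    public static <T extends Output<T>, C extends MetricContext<T>> DescriptiveStats metricSpread(
            List<? extends EvaluationMetric<T,C>> metrics, Model<T> model, Dataset<T> test) {
        return EvaluationAggregator.summarize(metrics, model, test);
    }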
-
summarize
public static <T extends Output<T>, C extends MetricContext<T>> DescriptiveStats summarize(List<? extends EvaluationMetric<T,C>> metrics, Model<T> model, List<Prediction<T>> predictions)

Summarize model performance across several metrics, using the supplied predictions.

Type Parameters:
    T - The output type.
    C - The metric context type.
Parameters:
    metrics - The metrics to evaluate.
    model - The model to evaluate them on.
    predictions - The predictions to evaluate.
Returns:
    The descriptive statistics for the metrics.
-
summarize
public static <T extends Output<T>, R extends Evaluation<T>> Map<MetricID<T>,DescriptiveStats> summarize(Evaluator<T,R> evaluator, Model<T> model, List<? extends Dataset<T>> datasets)

Summarize performance according to the evaluator for a single model across several datasets.

Type Parameters:
    T - The output type.
    R - The evaluation type.
Parameters:
    evaluator - The evaluator to use.
    model - The model to evaluate.
    datasets - The datasets to evaluate across.
Returns:
    The descriptive statistics for each metric.
-
summarize
public static <T extends Output<T>, R extends Evaluation<T>> Map<MetricID<T>,DescriptiveStats> summarize(List<R> evaluations)

Summarize all fields of a list of evaluations.

Type Parameters:
    T - The output type.
    R - The evaluation type.
Parameters:
    evaluations - The evaluations to summarize.
Returns:
    The descriptive statistics for each metric.
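A sketch (hypothetical helper; the evaluations might come from repeated train/test splits of the same task):

    // Aggregates every metric field across the supplied evaluations and prints the summary.
    public static <T extends Output<T>, R extends Evaluation<T>> void printSummary(List<R> evaluations) {
        Map<MetricID<T>,DescriptiveStats> summary = EvaluationAggregator.summarize(evaluations);
        summary.forEach((id, stats) -> System.out.println(id + " -> " + stats));
    }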
-
summarizeCrossValidation
public static <T extends Output<T>, R extends Evaluation<T>> Map<MetricID<T>,DescriptiveStats> summarizeCrossValidation(List<com.oracle.labs.mlrg.olcut.util.Pair<R,Model<T>>> evaluations)

Summarize all fields of a list of evaluations produced by CrossValidation.

Type Parameters:
    T - The output type.
    R - The evaluation type.
Parameters:
    evaluations - The evaluations to summarize.
Returns:
    The descriptive statistics for each metric.
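A sketch of the intended pairing with CrossValidation (hypothetical helper; it assumes CrossValidation's (trainer, data, evaluator, k) constructor, and the fold count of 5 is arbitrary):

    import com.oracle.labs.mlrg.olcut.util.Pair;
    import org.tribuo.Trainer;
    import org.tribuo.evaluation.CrossValidation;

    // Runs k-fold cross-validation, then aggregates each metric over the folds.
    public static <T extends Output<T>, R extends Evaluation<T>>
            Map<MetricID<T>,DescriptiveStats> crossValidate(
                Trainer<T> trainer, Dataset<T> data, Evaluator<T,R> evaluator) {
        CrossValidation<T,R> cv = new CrossValidation<>(trainer, data, evaluator, 5);
        List<Pair<R,Model<T>>> folds = cv.evaluate();
        return EvaluationAggregator.summarizeCrossValidation(folds);
    }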
-
summarize
public static <T extends Output<T>, R extends Evaluation<T>> DescriptiveStats summarize(List<R> evaluations, ToDoubleFunction<R> fieldGetter)

Summarize a single field of an evaluation across several evaluations.

Type Parameters:
    T - The output type.
    R - The evaluation type.
Parameters:
    evaluations - The evaluations to summarize.
    fieldGetter - The getter for the field to summarize.
Returns:
    A descriptive statistics summary of the field.
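For classification, the field getter can be a method reference on the evaluation; a sketch (assumes org.tribuo.classification.evaluation.LabelEvaluation, whose accuracy() accessor fits ToDoubleFunction):

    // Summarizes overall accuracy across several LabelEvaluations, e.g. one per split.
    public static DescriptiveStats accuracySpread(List<LabelEvaluation> evaluations) {
        return EvaluationAggregator.summarize(evaluations, LabelEvaluation::accuracy);
    }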
-
argmax
public static <T extends Output<T>, C extends MetricContext<T>> com.oracle.labs.mlrg.olcut.util.Pair<Integer,Double> argmax(EvaluationMetric<T,C> metric, List<? extends Model<T>> models, Dataset<T> dataset)

Calculates the argmax of a metric across the supplied models (i.e., the index of the model which performed the best).

Type Parameters:
    T - The output type.
    C - The metric context.
Parameters:
    metric - The metric to evaluate.
    models - The models to evaluate across.
    dataset - The dataset to evaluate on.
Returns:
    The maximum value and its index in the models list.
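A model-selection sketch (hypothetical helper; note that argmax picks the largest value, so it suits metrics where higher is better):

    // Returns the best-scoring model under the metric; Pair.getA() is the index,
    // Pair.getB() the winning metric value.
    public static <T extends Output<T>, C extends MetricContext<T>> Model<T> pickBest(
            EvaluationMetric<T,C> metric, List<? extends Model<T>> models, Dataset<T> test) {
        Pair<Integer,Double> best = EvaluationAggregator.argmax(metric, models, test);
        return models.get(best.getA());
    }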
-
argmax
public static <T extends Output<T>, C extends MetricContext<T>> com.oracle.labs.mlrg.olcut.util.Pair<Integer,Double> argmax(EvaluationMetric<T,C> metric, Model<T> model, List<? extends Dataset<T>> datasets)

Calculates the argmax of a metric across the supplied datasets.

Type Parameters:
    T - The output type.
    C - The metric context.
Parameters:
    metric - The metric to evaluate.
    model - The model to evaluate on.
    datasets - The datasets to evaluate across.
Returns:
    The maximum value and its index in the datasets list.
-
argmax
public static <T extends Output<T>, R extends Evaluation<T>> com.oracle.labs.mlrg.olcut.util.Pair<Integer,Double> argmax(List<R> evaluations, Function<R,Double> getter)

Calculates the argmax of a metric across the supplied evaluations.

Type Parameters:
    T - The output type.
    R - The evaluation type.
Parameters:
    evaluations - The evaluations.
    getter - The function to extract a value from the evaluation.
Returns:
    The maximum value and its index in the evaluations list.
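A sketch using an accessor as the getter (classification flavour; assumes LabelEvaluation as above and com.oracle.labs.mlrg.olcut.util.Pair):

    // Finds the index and value of the evaluation with the highest accuracy.
    public static Pair<Integer,Double> bestByAccuracy(List<LabelEvaluation> evaluations) {
        return EvaluationAggregator.argmax(evaluations, LabelEvaluation::accuracy);
    }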
-