Class Regressor

java.lang.Object
org.tribuo.regression.Regressor
All Implemented Interfaces:
Serializable, Iterable<Regressor.DimensionTuple>, Output<Regressor>, ProtoSerializable<org.tribuo.protos.core.OutputProto>
Direct Known Subclasses:
Regressor.DimensionTuple

public class Regressor extends Object implements Output<Regressor>, Iterable<Regressor.DimensionTuple>
An Output for n-dimensional real valued regression.

In addition to the regressed values, a Regressor may optionally contain variances. When no variances are supplied, they are set to Double.NaN.

Within a DataSource or Dataset each Regressor must contain the same set of named dimensions. The dimensions stored in a Regressor are sorted by the natural ordering of their names (i.e., using the String comparator). This allows the use of direct indexing into the elements.
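
The name-sorted layout described above can be sketched with plain Java arrays. This is a minimal illustration only, not Tribuo's implementation: the class SortedDimensionsSketch and its fields are hypothetical stand-ins, and it assumes values are reordered in step with their names.

```java
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical stand-in for illustration only; this is not Tribuo's implementation.
final class SortedDimensionsSketch {
    final String[] names;
    final double[] values;

    SortedDimensionsSketch(String[] names, double[] values) {
        if (names.length != values.length) {
            throw new IllegalArgumentException("names and values must be the same length");
        }
        // Sort an index permutation by name so the values can be reordered in step.
        Integer[] order = new Integer[names.length];
        for (int i = 0; i < order.length; i++) {
            order[i] = i;
        }
        Arrays.sort(order, Comparator.comparing((Integer i) -> names[i]));
        this.names = new String[names.length];
        this.values = new double[values.length];
        for (int i = 0; i < order.length; i++) {
            this.names[i] = names[order[i]];
            this.values[i] = values[order[i]];
        }
    }

    public static void main(String[] args) {
        SortedDimensionsSketch s = new SortedDimensionsSketch(
                new String[]{"width", "height"}, new double[]{2.0, 5.0});
        // "height" sorts before "width", so it occupies index 0 along with its value.
        System.out.println(Arrays.toString(s.names) + " " + Arrays.toString(s.values));
    }
}
```

Because every instance in a dataset shares the same sorted names, index i refers to the same dimension in each of them, which is what makes direct indexing safe.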

Note that fullEquals(org.tribuo.regression.Regressor) compares the dimensions, the regressed values and the variances. However, unlike a primitive double comparison with ==, two variances that are both set to the sentinel value Double.NaN are considered equal.
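
The sentinel handling described above might look roughly like the following. The helper variancesMatch is hypothetical and not part of Tribuo, and the use of an inclusive absolute tolerance is an assumption made for the sketch.

```java
// Hypothetical helper for illustration; the name and signature are not part of Tribuo.
final class VarianceComparisonSketch {
    /** Treats two NaN sentinels as equal; otherwise compares within an absolute tolerance. */
    static boolean variancesMatch(double a, double b, double tolerance) {
        if (Double.isNaN(a) && Double.isNaN(b)) {
            return true;
        }
        // If only one side is NaN the subtraction yields NaN and the comparison is false.
        return Math.abs(a - b) <= tolerance;
    }

    public static void main(String[] args) {
        System.out.println(Double.NaN == Double.NaN);                      // false
        System.out.println(variancesMatch(Double.NaN, Double.NaN, 1e-12)); // true
        System.out.println(variancesMatch(Double.NaN, 0.5, 1e-12));        // false
    }
}
```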

  • Field Details

    • TOLERANCE

      public static final double TOLERANCE
      The tolerance value for determining if two regressed values are equal.
    • DEFAULT_NAME

      public static final String DEFAULT_NAME
      Default name used for dimensions which are unnamed when parsed from Strings.
  • Constructor Details

    • Regressor

      public Regressor(String[] names, double[] values, double[] variances)
      Constructs a regressor from the supplied named values. Throws IllegalArgumentException if the arrays are not all the same size.
      Parameters:
      names - The names of the dimensions.
      values - The values of the dimensions.
      variances - The variances of the specified values.
    • Regressor

      public Regressor(String[] names, double[] values)
      Constructs a regressor from the supplied named values. Uses Double.NaN as the variances.
      Parameters:
      names - The names of the dimensions.
      values - The values of the dimensions.
    • Regressor

      public Regressor(Regressor.DimensionTuple[] dimensions)
      Constructs a regressor from the supplied dimension tuples.
      Parameters:
      dimensions - The named values to use.
    • Regressor

      public Regressor(String name, double value)
      Constructs a regressor containing a single dimension, using Double.NaN as the variance.
      Parameters:
      name - The name of the dimension.
      value - The value of the dimension.
    • Regressor

      public Regressor(String name, double value, double variance)
      Constructs a regressor containing a single dimension.
      Parameters:
      name - The name of the dimension.
      value - The value of the dimension.
      variance - The variance of this value.
  • Method Details

    • deserializeFromProto

      public static Regressor deserializeFromProto(int version, String className, com.google.protobuf.Any message) throws com.google.protobuf.InvalidProtocolBufferException
      Deserialization factory.
      Parameters:
      version - The serialized object version.
      className - The class name.
      message - The serialized data.
      Returns:
      The deserialized object.
      Throws:
      com.google.protobuf.InvalidProtocolBufferException - If the protobuf could not be parsed from the message.
    • serialize

      public org.tribuo.protos.core.OutputProto serialize()
      Description copied from interface: ProtoSerializable
      Serializes this object to a protobuf.
      Specified by:
      serialize in interface ProtoSerializable<org.tribuo.protos.core.OutputProto>
      Returns:
      The protobuf.
    • size

      public int size()
      Returns the number of dimensions in this regressor.
      Returns:
      The number of dimensions.
    • getNames

      public String[] getNames()
      Returns the names of the dimensions, always sorted by their natural ordering.
      Returns:
      The names of the dimensions.
    • getValues

      public double[] getValues()
      Returns the regression values.

      The index corresponds to the index of the name.

      In a single dimensional regression this is a single element array.

      Returns:
      The regression values.
    • getVariances

      public double[] getVariances()
      The variances of the regressed values, if known. If the variances are unknown (or not set by the model) they are filled with Double.NaN.

      The index corresponds to the index of the name.

      In a single dimensional regression this is a single element array.

      Returns:
      The variance of the regressed values.
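
      Since unknown variances are reported as Double.NaN, callers need Double.isNaN rather than == to detect them. The sketch below uses hypothetical helper names (unknownVariances, allKnown) and plain arrays in place of a real Regressor.

```java
import java.util.Arrays;

// Hypothetical helpers for illustration; they are not part of Tribuo.
final class UnknownVarianceSketch {
    /** Builds a variance array of the requested size filled with the NaN sentinel. */
    static double[] unknownVariances(int size) {
        double[] variances = new double[size];
        Arrays.fill(variances, Double.NaN);
        return variances;
    }

    /** Returns true only if every variance is a real number rather than the sentinel. */
    static boolean allKnown(double[] variances) {
        for (double v : variances) {
            if (Double.isNaN(v)) {
                return false;
            }
        }
        return true;
    }
}
```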
    • toString

      public String toString()
      Overrides:
      toString in class Object
    • getDimension

      public Optional<Regressor.DimensionTuple> getDimension(String name)
      Returns a dimension tuple for the requested dimension, or an empty Optional if no dimension has that name.
      Parameters:
      name - The dimension name.
      Returns:
      A tuple representing that dimension.
    • getDimension

      public Regressor.DimensionTuple getDimension(int idx)
      Returns a dimension tuple for the requested dimension index.
      Parameters:
      idx - The dimension index.
      Returns:
      A tuple representing that dimension.
      Throws:
      IndexOutOfBoundsException - if the index is outside the range.
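
      The two lookup styles above differ in how they report a miss: the name lookup returns an empty Optional, while the index lookup throws. A minimal sketch follows, assuming a binary search over the sorted names (an assumption about the approach, not a statement of Tribuo's code) and using the hypothetical class DimensionLookupSketch.

```java
import java.util.Arrays;
import java.util.Optional;

// Hypothetical stand-in for illustration only; this is not Tribuo's implementation.
final class DimensionLookupSketch {
    private final String[] names;   // must already be sorted by natural ordering
    private final double[] values;  // values[i] belongs to names[i]

    DimensionLookupSketch(String[] sortedNames, double[] values) {
        this.names = sortedNames;
        this.values = values;
    }

    /** Name lookup: an absent name produces Optional.empty() rather than an exception. */
    Optional<Double> valueOf(String name) {
        int idx = Arrays.binarySearch(names, name);
        return idx >= 0 ? Optional.of(values[idx]) : Optional.empty();
    }

    /** Index lookup: an out-of-range index throws IndexOutOfBoundsException. */
    double valueAt(int idx) {
        if (idx < 0 || idx >= values.length) {
            throw new IndexOutOfBoundsException(
                    "Index " + idx + " is outside [0, " + values.length + ")");
        }
        return values[idx];
    }
}
```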
    • iterator

      public Iterator<Regressor.DimensionTuple> iterator()
      Specified by:
      iterator in interface Iterable<Regressor.DimensionTuple>
    • copy

      public Regressor copy()
      Description copied from interface: Output
      Deep copy of the output up to its immutable state.
      Specified by:
      copy in interface Output<Regressor>
      Returns:
      A copy of the output.
    • getSerializableForm

      public String getSerializableForm(boolean includeConfidence)
      Description copied from interface: Output
      Generates a String suitable for writing to a csv or json file.
      Specified by:
      getSerializableForm in interface Output<Regressor>
      Parameters:
      includeConfidence - Include whatever confidence score the label contains, if known.
      Returns:
      A String representation of this Output.
    • fullEquals

      public boolean fullEquals(Regressor other)
      Description copied from interface: Output
      Compares other to this output. Uses all score values and the strings.
      Specified by:
      fullEquals in interface Output<Regressor>
      Parameters:
      other - Another output instance.
      Returns:
      True if the other instance has value equality to this instance. False otherwise.
    • fullEquals

      public boolean fullEquals(Regressor other, double tolerance)
      Description copied from interface: Output
      Compares other to this output. Uses all score values and the strings.

      The default implementation of this method ignores the tolerance for compatibility reasons; it is overridden in all output classes in Tribuo.

      Specified by:
      fullEquals in interface Output<Regressor>
      Parameters:
      other - Another output instance.
      tolerance - The tolerance level for an absolute value comparison.
      Returns:
      True if the other instance has value equality to this instance. False otherwise.
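
      An absolute value comparison with a tolerance can be sketched as below. The helper valuesMatch is hypothetical, and treating a NaN difference as a mismatch is a choice made for this sketch rather than documented behaviour.

```java
// Hypothetical helper for illustration; the name and signature are not part of Tribuo.
final class ToleranceEqualsSketch {
    /** True if both arrays have the same length and each pair differs by at most tolerance. */
    static boolean valuesMatch(double[] a, double[] b, double tolerance) {
        if (a.length != b.length) {
            return false;
        }
        for (int i = 0; i < a.length; i++) {
            // Written as a negated <= so that a NaN difference counts as a mismatch.
            if (!(Math.abs(a[i] - b[i]) <= tolerance)) {
                return false;
            }
        }
        return true;
    }
}
```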
    • equals

      public boolean equals(Object o)
      Regressors are equal if they have the same number of dimensions and equal dimension names.
      Overrides:
      equals in class Object
      Parameters:
      o - An object.
      Returns:
      True if o is a Regressor with the same dimension names, false otherwise.
    • hashCode

      public int hashCode()
      Regressor's hashcode is based on the hash of the dimension names.

      It's cached on first access.

      Overrides:
      hashCode in class Object
      Returns:
      A hashcode.
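
      The name-only equality and cached hash described above could be sketched as follows. NameEqualitySketch is a hypothetical class, and the caching fields are an assumption about one way to cache on first access.

```java
import java.util.Arrays;

// Hypothetical stand-in for illustration only; this is not Tribuo's implementation.
final class NameEqualitySketch {
    private final String[] names;
    private final double[] values;
    private boolean hashCached = false;
    private int hashCache;

    NameEqualitySketch(String[] names, double[] values) {
        this.names = names;
        this.values = values;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof NameEqualitySketch)) {
            return false;
        }
        // Only the dimension names take part; the values are deliberately ignored.
        return Arrays.equals(names, ((NameEqualitySketch) o).names);
    }

    @Override
    public int hashCode() {
        // Computed from the names on first access, then reused.
        if (!hashCached) {
            hashCache = Arrays.hashCode(names);
            hashCached = true;
        }
        return hashCache;
    }
}
```

      Two instances with the same names but different values are equal under this scheme, which is why a value-aware comparison such as fullEquals is needed separately.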
    • getDimensionNamesString

      public String getDimensionNamesString()
      Returns a comma separated list of the dimension names.
      Returns:
      The dimension names comma separated.
    • getDimensionNamesString

      public String getDimensionNamesString(char separator)
      Returns a delimiter separated list of the dimension names.
      Parameters:
      separator - The separator to use.
      Returns:
      The dimension names.
    • extractNames

      public static String[] extractNames(OutputInfo<Regressor> info)
      Extracts the names from the supplied Regressor domain in their canonical order.
      Parameters:
      info - The OutputInfo to use.
      Returns:
      The dimension names from this domain.
    • parseString

      public static Regressor parseString(String s)
      Parses a string of the form:
       dimension-name=output,...,dimension-name=output
       
      or
       output,...,output
       
      where output must be readable by Double.parseDouble(java.lang.String). If there are no dimension names specified then they are automatically generated based on the position in the string.
      Parameters:
      s - The string form of a multiple regressor.
      Returns:
      A regressor parsed from the input string.
    • parseString

      public static Regressor parseString(String s, char splitChar)
      Parses a string of the form:
       dimension-name=output<splitChar>...<splitChar>dimension-name=output
       
      or
       output<splitChar>...<splitChar>output
       
      where output must be readable by Double.parseDouble(java.lang.String). If there are no dimension names specified then they are automatically generated based on the position in the string.
      Parameters:
      s - The string form of a regressor.
      splitChar - The char to split on.
      Returns:
      A regressor parsed from the input string.
    • parseElement

      public static com.oracle.labs.mlrg.olcut.util.Pair<String,Double> parseElement(int idx, String s)
      Parses a string of the form:
       dimension-name=output-double
       
      or
       output-double
       
      where the output must be readable by Double.parseDouble(java.lang.String). If there is no dimension name then one is constructed by appending idx to DEFAULT_NAME.
      Parameters:
      idx - The index of this string in a list.
      s - The string form of a single dimension from a regressor.
      Returns:
      A tuple representing the dimension name and the value.
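
      The parsing rules above can be approximated with the standard library alone. In this sketch the value "DIM" for DEFAULT_NAME and the "-" joiner between the name and idx are assumptions, Map.Entry stands in for the OLCUT Pair type, and RegressorParseSketch is a hypothetical class. It does not sort the result by name.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical stand-in for illustration only; this is not Tribuo's implementation.
final class RegressorParseSketch {
    // Assumed placeholder; the real value of DEFAULT_NAME is not stated above.
    static final String DEFAULT_NAME = "DIM";

    /** Parses "name=value" or a bare "value"; bare values get a generated name. */
    static Map.Entry<String, Double> parseElement(int idx, String s) {
        String[] parts = s.split("=");
        if (parts.length == 2) {
            return new SimpleImmutableEntry<String, Double>(
                    parts[0].trim(), Double.parseDouble(parts[1]));
        } else if (parts.length == 1) {
            // The "-" joiner is an assumption; the text only says idx is appended.
            return new SimpleImmutableEntry<String, Double>(
                    DEFAULT_NAME + "-" + idx, Double.parseDouble(parts[0]));
        }
        throw new IllegalArgumentException("Failed to parse element " + s);
    }

    /** Splits on splitChar and parses each token, passing its position as idx. */
    static List<Map.Entry<String, Double>> parseString(String s, char splitChar) {
        List<Map.Entry<String, Double>> out = new ArrayList<>();
        // Pattern.quote stops characters such as '|' or '.' being read as regex syntax.
        String[] tokens = s.split(Pattern.quote(String.valueOf(splitChar)));
        for (int i = 0; i < tokens.length; i++) {
            out.add(parseElement(i, tokens[i]));
        }
        return out;
    }
}
```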
    • createFromPairList

      public static Regressor createFromPairList(List<com.oracle.labs.mlrg.olcut.util.Pair<String,Double>> dimensions)
      Creates a Regressor from a list of dimension tuples.
      Parameters:
      dimensions - The dimensions to use.
      Returns:
      A Regressor representing these dimensions.