Class CRFParameters

java.lang.Object
org.tribuo.classification.sgd.crf.CRFParameters
All Implemented Interfaces:
Serializable, Parameters, ProtoSerializable<org.tribuo.math.protos.ParametersProto>

public class CRFParameters extends Object implements Parameters, Serializable
An implementation of Parameters for training a CRF using SGD.
  • Field Details

    • CURRENT_VERSION

      public static final int CURRENT_VERSION
      Protobuf serialization version.
  • Method Details

    • deserializeFromProto

      public static CRFParameters deserializeFromProto(int version, String className, com.google.protobuf.Any message) throws com.google.protobuf.InvalidProtocolBufferException
      Deserialization factory.
      Parameters:
      version - The serialized object version.
      className - The class name.
      message - The serialized data.
      Returns:
      The deserialized object.
      Throws:
      com.google.protobuf.InvalidProtocolBufferException - If the protobuf could not be parsed from the message.
    • serialize

      public org.tribuo.math.protos.ParametersProto serialize()
      Description copied from interface: ProtoSerializable
      Serializes this object to a protobuf.
      Specified by:
      serialize in interface ProtoSerializable<org.tribuo.math.protos.ParametersProto>
      Returns:
      The protobuf.
    • getFeatureWeights

      public DenseVector getFeatureWeights(int id)
      Gets a copy of the weights for the specified label id.
      Parameters:
      id - The label id.
      Returns:
      The feature weights.
    • getBias

      public double getBias(int id)
      Returns the bias for the specified label id.
      Parameters:
      id - The label id.
      Returns:
      The bias.
    • getWeight

      public double getWeight(int labelID, int featureID)
      Returns the feature/label weight for the specified feature and label id.
      Parameters:
      labelID - The label id.
      featureID - The feature id.
      Returns:
      The feature/label weight.
    • getLocalScores

      public DenseVector[] getLocalScores(SGDVector[] features)
      Generates the local scores (i.e., the output of the linear classifier for each token).
      Parameters:
      features - An array of SGDVectors, one per token.
      Returns:
      An array of DenseVectors representing the scores per label for each token.
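
      The per-token linear scoring described above can be illustrated with plain arrays. This is a minimal sketch, not Tribuo's implementation: it assumes each score is the label bias plus the dot product of that label's weight row with the token's features, and the class LocalScoresSketch and its method are hypothetical names that use double[] in place of SGDVector and DenseVector.

```java
import java.util.Arrays;

/** Hypothetical illustration; not part of Tribuo. */
final class LocalScoresSketch {

    /**
     * Computes scores[t][l] = biases[l] + dot(weights[l], features[t]).
     * Assumes dense features; rows of weights are indexed by label id.
     */
    static double[][] localScores(double[] biases, double[][] weights, double[][] features) {
        double[][] scores = new double[features.length][biases.length];
        for (int t = 0; t < features.length; t++) {
            for (int l = 0; l < biases.length; l++) {
                double sum = biases[l];
                for (int f = 0; f < features[t].length; f++) {
                    sum += weights[l][f] * features[t][f];
                }
                scores[t][l] = sum;
            }
        }
        return scores;
    }

    public static void main(String[] args) {
        double[] biases = {0.5, -0.5};
        double[][] weights = {{1.0, 0.0}, {0.0, 2.0}};
        double[][] features = {{1.0, 1.0}, {0.0, 3.0}};
        // One row of label scores per token.
        System.out.println(Arrays.deepToString(localScores(biases, weights, features)));
    }
}
```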
    • getCliqueValues

      public ChainHelper.ChainCliqueValues getCliqueValues(SGDVector[] features)
      Generates the local scores and pairs them with the label-label transition weights.
      Parameters:
      features - The per token SGDVector of features.
      Returns:
      A tuple containing the array of DenseVector scores and the label-label transition weights.
    • predict

      public int[] predict(SGDVector[] features)
      Generate a prediction using Viterbi.
      Parameters:
      features - The per token SGDVector of features.
      Returns:
      An int array giving the predicted label per token.
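
      For readers unfamiliar with Viterbi decoding, the following standalone sketch shows the general technique on a linear chain. It is not Tribuo's code: ViterbiSketch is a hypothetical class, scores are held in plain double arrays, and transitions[prev][cur] is assumed to hold the label-label transition weight.

```java
import java.util.Arrays;

/** Hypothetical illustration of Viterbi decoding; not part of Tribuo. */
final class ViterbiSketch {

    /** Returns the highest scoring label sequence for a linear chain. */
    static int[] viterbi(double[][] localScores, double[][] transitions) {
        int numTokens = localScores.length;
        if (numTokens == 0) {
            return new int[0];
        }
        int numLabels = localScores[0].length;
        double[][] best = new double[numTokens][numLabels];
        int[][] backPointers = new int[numTokens][numLabels];
        System.arraycopy(localScores[0], 0, best[0], 0, numLabels);

        // Forward pass: best score of any path ending in label cur at token t.
        for (int t = 1; t < numTokens; t++) {
            for (int cur = 0; cur < numLabels; cur++) {
                double bestScore = Double.NEGATIVE_INFINITY;
                int bestPrev = 0;
                for (int prev = 0; prev < numLabels; prev++) {
                    double score = best[t - 1][prev] + transitions[prev][cur];
                    if (score > bestScore) {
                        bestScore = score;
                        bestPrev = prev;
                    }
                }
                best[t][cur] = bestScore + localScores[t][cur];
                backPointers[t][cur] = bestPrev;
            }
        }

        // Backward pass: follow the back pointers from the best final label.
        int[] path = new int[numTokens];
        int last = 0;
        for (int l = 1; l < numLabels; l++) {
            if (best[numTokens - 1][l] > best[numTokens - 1][last]) {
                last = l;
            }
        }
        path[numTokens - 1] = last;
        for (int t = numTokens - 1; t > 0; t--) {
            path[t - 1] = backPointers[t][path[t]];
        }
        return path;
    }

    public static void main(String[] args) {
        double[][] localScores = {{3.0, 0.0}, {0.0, 1.0}, {0.0, 1.0}};
        double[][] stickyTransitions = {{0.0, -5.0}, {-5.0, 0.0}};
        System.out.println(Arrays.toString(viterbi(localScores, stickyTransitions)));
    }
}
```

      In this toy input the transition penalty outweighs the local preference for label 1 on the later tokens, so the decoded path differs from a per-token argmax.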
    • predictMarginals

      public DenseVector[] predictMarginals(SGDVector[] features)
      Generate a prediction using Belief Propagation.
      Parameters:
      features - The per token SGDVector of features.
      Returns:
      A DenseVector per token containing the marginal distribution over labels.
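
      On a linear chain, belief propagation reduces to the forward-backward algorithm. The sketch below illustrates that general technique in log space with plain arrays; it is an assumption-laden illustration rather than Tribuo's implementation, and MarginalsSketch, logSumExp, and marginals are hypothetical names.

```java
/** Hypothetical illustration of forward-backward marginals; not part of Tribuo. */
final class MarginalsSketch {

    /** Numerically stable log(sum(exp(values))). */
    static double logSumExp(double[] values) {
        double max = Double.NEGATIVE_INFINITY;
        for (double v : values) {
            max = Math.max(max, v);
        }
        if (max == Double.NEGATIVE_INFINITY) {
            return max;
        }
        double sum = 0.0;
        for (double v : values) {
            sum += Math.exp(v - max);
        }
        return max + Math.log(sum);
    }

    /** Returns marginals[t][l] = P(label at token t is l) under the chain model. */
    static double[][] marginals(double[][] localScores, double[][] transitions) {
        int numTokens = localScores.length;
        if (numTokens == 0) {
            return new double[0][];
        }
        int numLabels = transitions.length;
        double[][] alpha = new double[numTokens][numLabels];
        double[][] beta = new double[numTokens][numLabels]; // last row stays at log(1) = 0
        double[] tmp = new double[numLabels];

        // Forward messages.
        alpha[0] = localScores[0].clone();
        for (int t = 1; t < numTokens; t++) {
            for (int cur = 0; cur < numLabels; cur++) {
                for (int prev = 0; prev < numLabels; prev++) {
                    tmp[prev] = alpha[t - 1][prev] + transitions[prev][cur];
                }
                alpha[t][cur] = localScores[t][cur] + logSumExp(tmp);
            }
        }

        // Backward messages.
        for (int t = numTokens - 2; t >= 0; t--) {
            for (int cur = 0; cur < numLabels; cur++) {
                for (int next = 0; next < numLabels; next++) {
                    tmp[next] = transitions[cur][next] + localScores[t + 1][next] + beta[t + 1][next];
                }
                beta[t][cur] = logSumExp(tmp);
            }
        }

        // Normalize by the log partition function.
        double logZ = logSumExp(alpha[numTokens - 1]);
        double[][] result = new double[numTokens][numLabels];
        for (int t = 0; t < numTokens; t++) {
            for (int l = 0; l < numLabels; l++) {
                result[t][l] = Math.exp(alpha[t][l] + beta[t][l] - logZ);
            }
        }
        return result;
    }
}
```

      With all transition weights set to zero, each row of the result is simply the softmax of that token's local scores, which is a convenient sanity check.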
    • predictConfidenceUsingCBP

      public List<Double> predictConfidenceUsingCBP(SGDVector[] features, List<Chunk> chunks)
      Predicts per-chunk confidence using the constrained forward-backward algorithm from Culotta and McCallum (2004).

      Runs one pass of BP to get the normalizing constant, and then a further chunks.size() passes of constrained forward-backward.

      Parameters:
      features - The per token SGDVector of features.
      chunks - A list of extracted chunks to constrain the labels to.
      Returns:
      A list containing the confidence value for each chunk.
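
      The idea behind a constrained pass can be sketched as the ratio of two partition functions: one summed over label sequences that agree with the chunk, and one summed over all sequences. The code below is a hedged illustration of that general idea, not Tribuo's implementation. ConstrainedForwardSketch is a hypothetical class, and the pinned array (a required label id per token, or -1 for unconstrained) is an assumed stand-in for the Chunk type.

```java
import java.util.Arrays;

/** Hypothetical illustration of constrained forward confidence; not part of Tribuo. */
final class ConstrainedForwardSketch {

    /** Numerically stable log(sum(exp(values))). */
    static double logSumExp(double[] values) {
        double max = Double.NEGATIVE_INFINITY;
        for (double v : values) {
            max = Math.max(max, v);
        }
        if (max == Double.NEGATIVE_INFINITY) {
            return max;
        }
        double sum = 0.0;
        for (double v : values) {
            sum += Math.exp(v - max);
        }
        return max + Math.log(sum);
    }

    /** A label is allowed when the token is unconstrained or pinned to that label. */
    static boolean allowed(int pinnedLabel, int label) {
        return pinnedLabel < 0 || pinnedLabel == label;
    }

    /** Log partition function over sequences consistent with pinned (requires at least one token). */
    static double logPartition(double[][] localScores, double[][] transitions, int[] pinned) {
        int numLabels = transitions.length;
        double[] alpha = new double[numLabels];
        for (int l = 0; l < numLabels; l++) {
            alpha[l] = allowed(pinned[0], l) ? localScores[0][l] : Double.NEGATIVE_INFINITY;
        }
        double[] tmp = new double[numLabels];
        for (int t = 1; t < localScores.length; t++) {
            double[] next = new double[numLabels];
            for (int cur = 0; cur < numLabels; cur++) {
                if (!allowed(pinned[t], cur)) {
                    next[cur] = Double.NEGATIVE_INFINITY; // path forbidden by the constraint
                    continue;
                }
                for (int prev = 0; prev < numLabels; prev++) {
                    tmp[prev] = alpha[prev] + transitions[prev][cur];
                }
                next[cur] = localScores[t][cur] + logSumExp(tmp);
            }
            alpha = next;
        }
        return logSumExp(alpha);
    }

    /** Confidence = constrained partition function / unconstrained partition function. */
    static double chunkConfidence(double[][] localScores, double[][] transitions, int[] pinned) {
        int[] free = new int[localScores.length];
        Arrays.fill(free, -1);
        return Math.exp(logPartition(localScores, transitions, pinned)
                - logPartition(localScores, transitions, free));
    }
}
```

      The unconstrained log partition function does not depend on the chunk, which is consistent with the description above of computing the normalizing constant once and then running one constrained pass per chunk.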
    • valueAndGradient

      public com.oracle.labs.mlrg.olcut.util.Pair<Double,Tensor[]> valueAndGradient(SGDVector[] features, int[] labels)
      Generates predictions based on the input features and labels, then scores those predictions to produce a loss for the example and a gradient update.

      Assumes all the features in this example are either SparseVector or DenseVector. Mixing the two will cause undefined behaviour.

      Parameters:
      features - The per token SGDVector of features.
      labels - The per token ground truth labels.
      Returns:
      A Pair containing the loss for this example and the associated gradient.
    • getEmptyCopy

      public Tensor[] getEmptyCopy()
      Returns a three-element Tensor array. The first element is a DenseVector of label biases. The second element is a DenseMatrix of feature-label weights. The third element is a DenseMatrix of label-label transition weights.
      Specified by:
      getEmptyCopy in interface Parameters
      Returns:
      A Tensor array.
    • get

      public Tensor[] get()
      Description copied from interface: Parameters
      Get a reference to the underlying Tensor array.
      Specified by:
      get in interface Parameters
      Returns:
      The parameters.
    • set

      public void set(Tensor[] newWeights)
      Description copied from interface: Parameters
      Set the underlying Tensor array to newWeights.
      Specified by:
      set in interface Parameters
      Parameters:
      newWeights - New parameters to store in this object.
    • update

      public void update(Tensor[] gradients)
      Description copied from interface: Parameters
      Apply gradients to the parameters. Assumes that gradients is the same length as the parameters, and each Tensor is the same size as the corresponding one from the parameters.

      The gradients are added to the parameters.

      Specified by:
      update in interface Parameters
      Parameters:
      gradients - A Tensor array of updates, with the length equal to Parameters.get().length.
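
      The additive update contract can be illustrated with flattened arrays. This is a minimal sketch under stated assumptions, not Tribuo's code: UpdateSketch is a hypothetical class, and each double[] stands in for one Tensor.

```java
/** Hypothetical illustration of an additive parameter update; not part of Tribuo. */
final class UpdateSketch {

    /** Adds each gradient element to the matching parameter element, in place. */
    static void update(double[][] parameters, double[][] gradients) {
        if (parameters.length != gradients.length) {
            throw new IllegalArgumentException("Expected " + parameters.length
                    + " gradient tensors, found " + gradients.length);
        }
        for (int i = 0; i < parameters.length; i++) {
            if (parameters[i].length != gradients[i].length) {
                throw new IllegalArgumentException("Size mismatch in tensor " + i);
            }
            for (int j = 0; j < parameters[i].length; j++) {
                parameters[i][j] += gradients[i][j];
            }
        }
    }
}
```

      Because the gradients are added rather than subtracted, a caller doing gradient descent on a loss would need to pass in an already negated and scaled step.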
    • merge

      public Tensor[] merge(Tensor[][] gradients, int size)
      Description copied from interface: Parameters
      Merge together an array of gradient arrays. Assumes the first dimension is the number of gradient arrays and the second dimension is the number of parameter Tensors.

      For performance reasons this call may mutate the input gradient array, and may return a subset of those elements as the merge output.

      Specified by:
      merge in interface Parameters
      Parameters:
      gradients - An array of gradient update arrays.
      size - The number of elements of gradients to merge. Allows gradients to have unused elements.
      Returns:
      A single Tensor array of the summed gradients.
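
      One way a merge with these semantics could behave is shown below: the first gradient array is used as the accumulator and returned, so the input is mutated and no new storage is allocated. This is a sketch of the documented contract, not Tribuo's implementation; MergeSketch is a hypothetical class and each double[] stands in for one Tensor.

```java
/** Hypothetical illustration of summing gradient arrays; not part of Tribuo. */
final class MergeSketch {

    /**
     * Sums the first size gradient arrays element-wise.
     * Mutates and returns gradients[0]; entries at index size and beyond are ignored.
     */
    static double[][] merge(double[][][] gradients, int size) {
        if (size < 1 || size > gradients.length) {
            throw new IllegalArgumentException("size must be in [1, " + gradients.length + "]");
        }
        double[][] merged = gradients[0];
        for (int g = 1; g < size; g++) {
            for (int i = 0; i < merged.length; i++) {
                for (int j = 0; j < merged[i].length; j++) {
                    merged[i][j] += gradients[g][i][j];
                }
            }
        }
        return merged;
    }
}
```

      Callers that still need the original per-example gradients after such a merge would have to copy them first.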