Class CRFModel

All Implemented Interfaces: Provenancable<ModelProvenance>, Serializable, ProtoSerializable<org.tribuo.protos.core.SequenceModelProto>

public class CRFModel extends ConfidencePredictingSequenceModel
An inference time model for a linear chain CRF trained using SGD.

Can be switched to use Viterbi, belief propagation, or constrained BP at test time. By default it uses Viterbi.


 Lafferty J, McCallum A, Pereira FC.
 "Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data"
 Proceedings of the 18th International Conference on Machine Learning 2001 (ICML 2001).
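The default Viterbi inference mentioned above can be illustrated with a small standalone sketch. This is not Tribuo's implementation: the class name `ViterbiSketch`, the `decode` method, and the dense `local` and `trans` score arrays are all hypothetical, chosen only to show how the highest scoring label sequence of a linear chain is recovered.

```java
import java.util.Arrays;

// Hypothetical helper, not part of Tribuo: Viterbi decoding over additive (log-space) scores.
final class ViterbiSketch {
    /**
     * @param local local[t][y] is the score of label y at token t.
     * @param trans trans[a][b] is the score of moving from label a to label b.
     * @return the highest scoring label sequence.
     */
    static int[] decode(double[][] local, double[][] trans) {
        int length = local.length;
        if (length == 0) {
            return new int[0];
        }
        int numLabels = local[0].length;
        double[][] best = new double[length][numLabels];
        int[][] backPointer = new int[length][numLabels];
        best[0] = Arrays.copyOf(local[0], numLabels);
        for (int t = 1; t < length; t++) {
            for (int cur = 0; cur < numLabels; cur++) {
                double max = Double.NEGATIVE_INFINITY;
                int arg = 0;
                // Find the best previous label for the current label.
                for (int prev = 0; prev < numLabels; prev++) {
                    double score = best[t - 1][prev] + trans[prev][cur];
                    if (score > max) {
                        max = score;
                        arg = prev;
                    }
                }
                best[t][cur] = max + local[t][cur];
                backPointer[t][cur] = arg;
            }
        }
        // Pick the best final label, then walk the back pointers to the start.
        int last = 0;
        for (int y = 1; y < numLabels; y++) {
            if (best[length - 1][y] > best[length - 1][last]) {
                last = y;
            }
        }
        int[] path = new int[length];
        path[length - 1] = last;
        for (int t = length - 1; t > 0; t--) {
            path[t - 1] = backPointer[t][path[t]];
        }
        return path;
    }
}
```

With zero transition scores the decoder simply picks the best label per token; a strong penalty on switching labels makes it prefer a constant sequence.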
  • Field Details


    • CURRENT_VERSION

      public static final int CURRENT_VERSION
      Protobuf serialization version.
  • Method Details

    • deserializeFromProto

      public static CRFModel deserializeFromProto(int version, String className, com.google.protobuf.Any message) throws com.google.protobuf.InvalidProtocolBufferException
      Deserialization factory.
      version - The serialized object version.
      className - The class name.
      message - The serialized data.
      The deserialized object.
      Throws: com.google.protobuf.InvalidProtocolBufferException - If the protobuf could not be parsed from the message.
    • setConfidenceType

      public void setConfidenceType(CRFModel.ConfidenceType type)
      Sets the inference method used for confidence prediction. If CONSTRAINED_BP, it uses the constrained belief propagation algorithm from Culotta and McCallum 2004; if MULTIPLY, it multiplies the maximum marginal for each token; if NONE, it uses Viterbi.
      type - Enum specifying the confidence type.
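The MULTIPLY strategy described above can be sketched in a few lines. The class `MultiplyConfidenceSketch`, its `score` method, and the half-open token span `[start, end)` are assumptions made for illustration; the source only states that the maximum marginal for each token is multiplied.

```java
// Hypothetical helper, not part of Tribuo: confidence as a product of per-token maximum marginals.
final class MultiplyConfidenceSketch {
    /**
     * @param marginals marginals[t][y] is the marginal probability of label y at token t.
     * @param start first token of the span (inclusive).
     * @param end last token of the span (exclusive).
     * @return the product of the largest marginal at each token in the span.
     */
    static double score(double[][] marginals, int start, int end) {
        double product = 1.0;
        for (int t = start; t < end; t++) {
            double max = 0.0;
            for (double p : marginals[t]) {
                max = Math.max(max, p);
            }
            product *= max;
        }
        return product;
    }
}
```

Because each factor is at most 1, longer spans can only lower this score, which is one reason a constrained inference method may be preferred for multi-token chunks.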
    • getFeatureWeights

      public DenseVector getFeatureWeights(int featureID)
      Get a copy of the weights for feature featureID.
      featureID - The feature ID.
      The per class weights.
    • getFeatureWeights

      public DenseVector getFeatureWeights(String featureName)
      Get a copy of the weights for feature named featureName.
      featureName - The feature name.
      The per class weights.
    • predict

      public List<Prediction<Label>> predict(SequenceExample<Label> example)
      Description copied from class: SequenceModel
      Uses the model to predict the output for a single example.
      Specified by:
      predict in class SequenceModel<Label>
      example - the example to predict.
      the result of the prediction.
    • getTopFeatures

      public Map<String,List<Pair<String,Double>>> getTopFeatures(int n)
      Description copied from class: SequenceModel
      Gets the top n features associated with this model.

      If the model does not produce per output feature lists, it returns a map with a single element with key Model.ALL_OUTPUTS.

      If the model cannot describe its top features then it returns Collections.emptyMap().

      Specified by:
      getTopFeatures in class SequenceModel<Label>
      n - the number of features to return. If this value is less than 0, all features should be returned for each class, unless the model cannot score its features.
      a map from string outputs to an ordered list of pairs of feature names and weights associated with that feature in the model
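The shape of the returned map can be illustrated with a standalone sketch. Everything here is hypothetical: the class `TopFeaturesSketch`, the dense `weights[label][feature]` layout, the use of `Map.Entry` in place of a pair type, and in particular the ordering by absolute weight, which is an assumption rather than something the description above states.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Tribuo: per-label top-n feature lists.
final class TopFeaturesSketch {
    static Map<String, List<Map.Entry<String, Double>>> topFeatures(
            String[] labels, String[] featureNames, double[][] weights, int n) {
        Map<String, List<Map.Entry<String, Double>>> output = new LinkedHashMap<>();
        for (int l = 0; l < labels.length; l++) {
            List<Map.Entry<String, Double>> scored = new ArrayList<>();
            for (int f = 0; f < featureNames.length; f++) {
                scored.add(new AbstractMap.SimpleImmutableEntry<>(featureNames[f], weights[l][f]));
            }
            // Assumed ordering: largest absolute weight first.
            scored.sort((a, b) -> Double.compare(Math.abs(b.getValue()), Math.abs(a.getValue())));
            // A negative n means "return every feature", as in the parameter description.
            int limit = n < 0 ? scored.size() : Math.min(n, scored.size());
            output.put(labels[l], new ArrayList<>(scored.subList(0, limit)));
        }
        return output;
    }
}
```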
    • scoreSubsequences

      public <SUB extends ConfidencePredictingSequenceModel.Subsequence> List<Double> scoreSubsequences(SequenceExample<Label> example, List<Prediction<Label>> predictions, List<SUB> subsequences)
      Description copied from class: ConfidencePredictingSequenceModel
      The scoring function for the subsequences. Provides the scores which should be assigned to each subsequence.
      Specified by:
      scoreSubsequences in class ConfidencePredictingSequenceModel
      Type Parameters:
      SUB - The subsequence type.
      example - The input sequence example.
      predictions - The predictions produced by this model.
      subsequences - The subsequences to score.
      The scores for the subsequences.
    • scoreChunks

      public List<Double> scoreChunks(SequenceExample<Label> example, List<Chunk> chunks)
      Scores the chunks using constrained belief propagation.
      example - The example to score.
      chunks - The predicted chunks.
      The scores.
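The idea behind scoring a chunk with constrained inference can be sketched as a ratio of two forward passes: one restricted to label paths that agree with the chunk, and one over all paths. This is a simplified illustration under assumed inputs (dense `local` and `trans` log-space scores, and a `constraint` array with -1 for unconstrained tokens), not the code Tribuo runs; the class and method names are hypothetical.

```java
import java.util.Arrays;

// Hypothetical helper, not part of Tribuo: chunk probability from constrained forward passes.
final class ConstrainedForwardSketch {
    /** Log-sum-exp of two log-space values. */
    static double logAdd(double a, double b) {
        if (a == Double.NEGATIVE_INFINITY) return b;
        if (b == Double.NEGATIVE_INFINITY) return a;
        double max = Math.max(a, b);
        return max + Math.log(Math.exp(a - max) + Math.exp(b - max));
    }

    /** True if the label is permitted at token t; constraint[t] < 0 means the token is free. */
    static boolean allowed(int[] constraint, int t, int label) {
        return constraint[t] < 0 || constraint[t] == label;
    }

    /** Log partition function over all label paths that agree with the constraints. */
    static double logZ(double[][] local, double[][] trans, int[] constraint) {
        int numLabels = local[0].length;
        double[] alpha = new double[numLabels];
        for (int y = 0; y < numLabels; y++) {
            alpha[y] = allowed(constraint, 0, y) ? local[0][y] : Double.NEGATIVE_INFINITY;
        }
        for (int t = 1; t < local.length; t++) {
            double[] next = new double[numLabels];
            for (int cur = 0; cur < numLabels; cur++) {
                double sum = Double.NEGATIVE_INFINITY;
                if (allowed(constraint, t, cur)) {
                    for (int prev = 0; prev < numLabels; prev++) {
                        sum = logAdd(sum, alpha[prev] + trans[prev][cur]);
                    }
                    sum += local[t][cur];
                }
                next[cur] = sum;
            }
            alpha = next;
        }
        double total = Double.NEGATIVE_INFINITY;
        for (double a : alpha) {
            total = logAdd(total, a);
        }
        return total;
    }

    /** Probability that the constrained tokens take their required labels. */
    static double chunkProbability(double[][] local, double[][] trans, int[] constraint) {
        int[] free = new int[constraint.length];
        Arrays.fill(free, -1);
        return Math.exp(logZ(local, trans, constraint) - logZ(local, trans, free));
    }
}
```

Unlike multiplying per-token marginals, this ratio accounts for the transition scores inside the chunk, so the result is the joint probability of the whole chunk labelling under the model.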
    • generateWeightsString

      public String generateWeightsString()
      Generates a human readable string containing all the weights in this model.
      A string containing all the weight values.
    • serialize

      public org.tribuo.protos.core.SequenceModelProto serialize()
      Description copied from interface: ProtoSerializable
      Serializes this object to a protobuf.
      Specified by:
      serialize in interface ProtoSerializable<org.tribuo.protos.core.SequenceModelProto>
      serialize in class SequenceModel<Label>
      The protobuf.
    • convert

      @Deprecated public static <T extends Output<T>> SparseVector[] convert(SequenceExample<T> example, ImmutableFeatureMap featureIDMap)
      Converts a SequenceExample into an array of SparseVectors suitable for CRF prediction.
      Type Parameters:
      T - The type parameter of the sequence example.
      example - The sequence example to convert.
      featureIDMap - The feature id map, used to discover the number of features.
      An array of SparseVector.
    • convert

      @Deprecated public static Pair<int[],SparseVector[]> convert(SequenceExample<Label> example, ImmutableFeatureMap featureIDMap, ImmutableOutputInfo<Label> labelIDMap)
      Converts a SequenceExample into an array of SparseVectors and labels suitable for CRF prediction.
      example - The sequence example to convert.
      featureIDMap - The feature id map, used to discover the number of features.
      labelIDMap - The label id map, used to get the index of the labels.
      A Pair of an int array of labels and an array of SparseVector.
    • convertToVector

      public static <T extends Output<T>> SGDVector[] convertToVector(SequenceExample<T> example, ImmutableFeatureMap featureIDMap)
      Converts a SequenceExample into an array of SGDVectors suitable for CRF prediction.
      Type Parameters:
      T - The type parameter of the sequence example.
      example - The sequence example to convert.
      featureIDMap - The feature id map, used to discover the number of features.
      An array of SGDVector.
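The conversion described above, from a sequence of named features to one vector per token, can be sketched without any library types. The `Sparse` holder, the `convert` method, and the plain `Map<String, Integer>` standing in for the feature ID map are all hypothetical; skipping features that are absent from the map is an assumption, not documented behaviour.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of Tribuo: named features to sparse index/value arrays.
final class SparseConversionSketch {
    /** A minimal sparse vector: sorted indices plus matching values. */
    static final class Sparse {
        final int dimension;
        final int[] indices;
        final double[] values;

        Sparse(int dimension, int[] indices, double[] values) {
            this.dimension = dimension;
            this.indices = indices;
            this.values = values;
        }
    }

    /** Produces one sparse vector per token; the dimension comes from the size of the ID map. */
    static Sparse[] convert(List<Map<String, Double>> sequence, Map<String, Integer> featureIds) {
        Sparse[] output = new Sparse[sequence.size()];
        for (int t = 0; t < output.length; t++) {
            // TreeMap keeps the feature indices in ascending order.
            TreeMap<Integer, Double> sorted = new TreeMap<>();
            for (Map.Entry<String, Double> e : sequence.get(t).entrySet()) {
                Integer id = featureIds.get(e.getKey());
                if (id != null) { // assumption: unknown features are dropped
                    sorted.put(id, e.getValue());
                }
            }
            int[] indices = new int[sorted.size()];
            double[] values = new double[sorted.size()];
            int i = 0;
            for (Map.Entry<Integer, Double> e : sorted.entrySet()) {
                indices[i] = e.getKey();
                values[i] = e.getValue();
                i++;
            }
            output[t] = new Sparse(featureIds.size(), indices, values);
        }
        return output;
    }
}
```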
    • convertToVector

      public static Pair<int[],SGDVector[]> convertToVector(SequenceExample<Label> example, ImmutableFeatureMap featureIDMap, ImmutableOutputInfo<Label> labelIDMap)
      Converts a SequenceExample into an array of SGDVectors and labels suitable for CRF prediction.
      example - The sequence example to convert.
      featureIDMap - The feature id map, used to discover the number of features.
      labelIDMap - The label id map, used to get the index of the labels.
      A Pair of an int array of labels and an array of SGDVector.