org.tribuo.classification.sgd.crf.CRFModel

All Implemented Interfaces: com.oracle.labs.mlrg.olcut.provenance.Provenancable<ModelProvenance>, Serializable

public class CRFModel extends ConfidencePredictingSequenceModel

An inference time model for a linear chain CRF trained using SGD.

Can be switched to use Viterbi, belief propagation, or constrained BP at test time. By default it uses Viterbi.

See:

 Lafferty J, McCallum A, Pereira FC.
 "Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data"
 Proceedings of the 18th International Conference on Machine Learning 2001 (ICML 2001).

Nested Class Summary

Nested Classes

| Modifier and Type | Class | Description |
| --- | --- | --- |
| static enum | CRFModel.ConfidenceType | The type of subsequence level confidence to predict. |

Nested classes/interfaces inherited from class org.tribuo.classification.sequence.ConfidencePredictingSequenceModel
ConfidencePredictingSequenceModel.Subsequence

Field Summary

Fields inherited from class org.tribuo.sequence.SequenceModel
featureIDMap, name, outputIDMap, provenanceOutput

Method Summary

| Modifier and Type | Method | Description |
| --- | --- | --- |
| static com.oracle.labs.mlrg.olcut.util.Pair<int[], SparseVector[]> | convert(SequenceExample<Label> example, ImmutableFeatureMap featureIDMap, ImmutableOutputInfo<Label> labelIDMap) | Deprecated. Replaced by convertToVector(org.tribuo.sequence.SequenceExample<T>, org.tribuo.ImmutableFeatureMap), which is more flexible. |
| static <T extends Output<T>> SparseVector[] | convert(SequenceExample<T> example, ImmutableFeatureMap featureIDMap) | Deprecated. Replaced by convertToVector(org.tribuo.sequence.SequenceExample<T>, org.tribuo.ImmutableFeatureMap), which is more flexible. |
| static com.oracle.labs.mlrg.olcut.util.Pair<int[], SGDVector[]> | convertToVector(SequenceExample<Label> example, ImmutableFeatureMap featureIDMap, ImmutableOutputInfo<Label> labelIDMap) | Converts a SequenceExample into an array of SGDVectors and labels suitable for CRF prediction. |
| static <T extends Output<T>> SGDVector[] | convertToVector(SequenceExample<T> example, ImmutableFeatureMap featureIDMap) | Converts a SequenceExample into an array of SGDVectors suitable for CRF prediction. |
| String | generateWeightsString() | Generates a human readable string containing all the weights in this model. |
| DenseVector | getFeatureWeights(int featureID) | Get a copy of the weights for feature featureID. |
| DenseVector | getFeatureWeights(String featureName) | Get a copy of the weights for feature named featureName. |
| Map<String, List<com.oracle.labs.mlrg.olcut.util.Pair<String,Double>>> | getTopFeatures(int n) | Gets the top n features associated with this model. |
| List<Prediction<Label>> | predict(SequenceExample<Label> example) | Uses the model to predict the output for a single example. |
| List<Double> | scoreChunks(SequenceExample<Label> example, List<Chunk> chunks) | Scores the chunks using constrained belief propagation. |
| <SUB extends ConfidencePredictingSequenceModel.Subsequence> List<Double> | scoreSubsequences(SequenceExample<Label> example, List<Prediction<Label>> predictions, List<SUB> subsequences) | The scoring function for the subsequences. |
| void | setConfidenceType(CRFModel.ConfidenceType type) | Sets the inference method used for confidence prediction. |

Methods inherited from class org.tribuo.classification.sequence.ConfidencePredictingSequenceModel
multiplyWeights

Methods inherited from class org.tribuo.sequence.SequenceModel
getFeatureIDMap, getName, getOutputIDInfo, getProvenance, predict, predict, setName, toMaxLabels, toString, validate

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait

Method Details
- setConfidenceType
  
  public void setConfidenceType(CRFModel.ConfidenceType type)
  
  Sets the inference method used for confidence prediction. CONSTRAINED_BP uses the constrained belief propagation algorithm from Culotta and McCallum 2004, MULTIPLY multiplies the maximum marginal for each token, and NONE uses Viterbi.
  
  Parameters:
  
  type - Enum specifying the confidence type.
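
As a rough illustration of what the MULTIPLY option describes, the sketch below multiplies the largest marginal probability at each token of a subsequence. It uses plain Java arrays rather than Tribuo types; the class name MultiplyConfidenceSketch, the method name, and the marginals[token][label] layout are assumptions made for this example, not the library's implementation.

```java
/** Illustrative sketch only; not Tribuo's implementation. */
final class MultiplyConfidenceSketch {

    /**
     * Multiplies the maximum marginal of each token in [start, end).
     * marginals[t][k] is assumed to hold P(label k at token t).
     */
    static double multiplyMaxMarginals(double[][] marginals, int start, int end) {
        double score = 1.0;
        for (int t = start; t < end; t++) {
            double max = 0.0;
            for (double p : marginals[t]) {
                max = Math.max(max, p);
            }
            score *= max;
        }
        return score;
    }

    public static void main(String[] args) {
        // Three tokens, two labels (hypothetical marginals).
        double[][] marginals = {
            {0.9, 0.1},
            {0.5, 0.5},
            {0.2, 0.8}
        };
        System.out.println(multiplyMaxMarginals(marginals, 0, 3));
    }
}
```

Because each factor is at most 1, longer subsequences tend to receive lower scores under this scheme.
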
- getFeatureWeights
  
  public DenseVector getFeatureWeights(int featureID)
  
  Get a copy of the weights for feature featureID.
  
  Parameters:
  
  featureID - The feature ID.
  
  Returns:
  
  The per class weights.
- getFeatureWeights
  
  public DenseVector getFeatureWeights(String featureName)
  
  Get a copy of the weights for feature named featureName.
  
  Parameters:
  
  featureName - The feature name.
  
  Returns:
  
  The per class weights.
- predict
  
  public List<Prediction<Label>> predict(SequenceExample<Label> example)
  
  Description copied from class: SequenceModel
  
  Uses the model to predict the output for a single example.
  
  Specified by:
  
  predict in class SequenceModel<Label>
  
  Parameters:
  
  example - the example to predict.
  
  Returns:
  
  the result of the prediction.
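
The class description names Viterbi as the default inference method. The following is a minimal, generic Viterbi decoder for a linear chain, written against plain arrays. The class ViterbiSketch, the log-score layout of local[token][label], and transition[from][to] are assumptions for illustration; the actual CRFModel works on its own internal parameter and vector types.

```java
/** Illustrative sketch only; not Tribuo's implementation. */
final class ViterbiSketch {

    /**
     * Returns the highest scoring label sequence.
     * local[t][k] is the log-score of label k at token t,
     * transition[i][j] is the log-score of moving from label i to label j.
     */
    static int[] decode(double[][] local, double[][] transition) {
        int length = local.length;
        if (length == 0) {
            return new int[0];
        }
        int numLabels = local[0].length;
        double[][] best = new double[length][numLabels];
        int[][] backPointer = new int[length][numLabels];
        System.arraycopy(local[0], 0, best[0], 0, numLabels);

        // Forward pass: best score of any path ending in label j at token t.
        for (int t = 1; t < length; t++) {
            for (int j = 0; j < numLabels; j++) {
                double bestScore = Double.NEGATIVE_INFINITY;
                int bestPrev = 0;
                for (int i = 0; i < numLabels; i++) {
                    double s = best[t - 1][i] + transition[i][j];
                    if (s > bestScore) {
                        bestScore = s;
                        bestPrev = i;
                    }
                }
                best[t][j] = bestScore + local[t][j];
                backPointer[t][j] = bestPrev;
            }
        }

        // Backward pass: follow the back pointers from the best final label.
        int[] path = new int[length];
        int last = 0;
        for (int k = 1; k < numLabels; k++) {
            if (best[length - 1][k] > best[length - 1][last]) {
                last = k;
            }
        }
        path[length - 1] = last;
        for (int t = length - 1; t > 0; t--) {
            path[t - 1] = backPointer[t][path[t]];
        }
        return path;
    }

    public static void main(String[] args) {
        double[][] local = {{2, 0}, {0, 1}, {2, 0}};
        double[][] transition = {{0, -5}, {-5, 0}}; // switching labels is penalised
        System.out.println(java.util.Arrays.toString(decode(local, transition)));
    }
}
```
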
- getTopFeatures
  
  public Map<String, List<com.oracle.labs.mlrg.olcut.util.Pair<String,Double>>> getTopFeatures(int n)
  
  Description copied from class: SequenceModel
  
  Gets the top n features associated with this model.
  If the model does not produce per output feature lists, it returns a map with a single element with key Model.ALL_OUTPUTS.
  
  If the model cannot describe its top features then it returns Collections.emptyMap().
  
  Specified by:
  
  getTopFeatures in class SequenceModel<Label>
  
  Parameters:
  
  n - the number of features to return. If this value is less than 0, all features should be returned for each class, unless the model cannot score its features.
  
  Returns:
  
  a map from string outputs to an ordered list of pairs of feature names and weights associated with that feature in the model
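
To make the shape of the returned map concrete, here is a small sketch that builds a per-label list of the n highest ranked features from a weight matrix, returning every feature when n is negative. It assumes features are ranked by absolute weight and uses Map.Entry in place of the OLCUT Pair type; the class TopFeaturesSketch and the weights[label][feature] layout are hypothetical and chosen only for this example.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative sketch only; not Tribuo's implementation. */
final class TopFeaturesSketch {

    /**
     * Returns, for each label, up to n (feature name, weight) entries.
     * Ranking by absolute weight is an assumption of this sketch.
     * A negative n returns all features for each label.
     */
    static Map<String, List<Map.Entry<String, Double>>> topFeatures(
            String[] labels, String[] features, double[][] weights, int n) {
        Map<String, List<Map.Entry<String, Double>>> result = new LinkedHashMap<>();
        for (int l = 0; l < labels.length; l++) {
            List<Map.Entry<String, Double>> scored = new ArrayList<>();
            for (int f = 0; f < features.length; f++) {
                scored.add(new AbstractMap.SimpleImmutableEntry<String, Double>(
                        features[f], weights[l][f]));
            }
            // Largest magnitude first.
            scored.sort((a, b) -> Double.compare(Math.abs(b.getValue()), Math.abs(a.getValue())));
            int limit = n < 0 ? scored.size() : Math.min(n, scored.size());
            result.put(labels[l], new ArrayList<>(scored.subList(0, limit)));
        }
        return result;
    }

    public static void main(String[] args) {
        String[] labels = {"A", "B"};
        String[] features = {"f0", "f1", "f2"};
        double[][] weights = {{0.1, -2.0, 1.0}, {3.0, 0.0, -0.5}};
        System.out.println(topFeatures(labels, features, weights, 2));
    }
}
```
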
- scoreSubsequences
  
  public <SUB extends ConfidencePredictingSequenceModel.Subsequence> List<Double> scoreSubsequences(SequenceExample<Label> example, List<Prediction<Label>> predictions, List<SUB> subsequences)
  
  Description copied from class: ConfidencePredictingSequenceModel
  
  The scoring function for the subsequences. Provides the scores which should be assigned to each subsequence.
  
  Specified by:
  
  scoreSubsequences in class ConfidencePredictingSequenceModel
  
  Type Parameters:
  
  SUB - The subsequence type.
  
  Parameters:
  
  example - The input sequence example.
  
  predictions - The predictions produced by this model.
  
  subsequences - The subsequences to score.
  
  Returns:
  
  The scores for the subsequences.
- scoreChunks
  
  public List<Double> scoreChunks(SequenceExample<Label> example, List<Chunk> chunks)
  
  Scores the chunks using constrained belief propagation.
  
  Parameters:
  
  example - The example to score.
  
  chunks - The predicted chunks.
  
  Returns:
  
  The scores.
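
One common way to read "constrained belief propagation" on a linear chain is as a constrained forward pass: the confidence of a chunk is the total score of all label paths that agree with the chunk's labels, divided by the total score of all paths. The sketch below implements that ratio in log space with plain arrays. The class ConstrainedForwardSketch, its method names, and the array layouts are assumptions for illustration; it does not use Tribuo's Chunk type and is not the library's code.

```java
import java.util.Arrays;

/** Illustrative sketch only; not Tribuo's implementation. */
final class ConstrainedForwardSketch {

    /** Numerically stable log(sum(exp(values))). */
    static double logSumExp(double[] values) {
        double max = Double.NEGATIVE_INFINITY;
        for (double v : values) {
            max = Math.max(max, v);
        }
        if (max == Double.NEGATIVE_INFINITY) {
            return max; // every path was ruled out
        }
        double sum = 0.0;
        for (double v : values) {
            sum += Math.exp(v - max);
        }
        return max + Math.log(sum);
    }

    static boolean allowed(int constraint, int label) {
        return constraint < 0 || constraint == label;
    }

    /**
     * Log partition function over label paths of a non-empty sequence.
     * constraint[t] fixes the label at token t; -1 leaves the token free.
     */
    static double logPartition(double[][] local, double[][] transition, int[] constraint) {
        int numLabels = transition.length;
        double[] alpha = new double[numLabels];
        for (int k = 0; k < numLabels; k++) {
            alpha[k] = allowed(constraint[0], k) ? local[0][k] : Double.NEGATIVE_INFINITY;
        }
        double[] scratch = new double[numLabels];
        for (int t = 1; t < local.length; t++) {
            double[] next = new double[numLabels];
            for (int j = 0; j < numLabels; j++) {
                if (!allowed(constraint[t], j)) {
                    next[j] = Double.NEGATIVE_INFINITY;
                    continue;
                }
                for (int i = 0; i < numLabels; i++) {
                    scratch[i] = alpha[i] + transition[i][j];
                }
                next[j] = logSumExp(scratch) + local[t][j];
            }
            alpha = next;
        }
        return logSumExp(alpha);
    }

    /** Probability that tokens start..start+chunkLabels.length-1 carry exactly chunkLabels. */
    static double chunkConfidence(double[][] local, double[][] transition,
                                  int start, int[] chunkLabels) {
        int[] free = new int[local.length];
        Arrays.fill(free, -1);
        int[] constrained = free.clone();
        System.arraycopy(chunkLabels, 0, constrained, start, chunkLabels.length);
        return Math.exp(logPartition(local, transition, constrained)
                - logPartition(local, transition, free));
    }

    public static void main(String[] args) {
        double[][] local = new double[4][3];      // four tokens, three labels, all scores zero
        double[][] transition = new double[3][3];
        System.out.println(chunkConfidence(local, transition, 1, new int[] {0, 2}));
    }
}
```

With all scores equal, every label path is equally likely, so fixing two tokens out of three labels each gives a confidence close to 1/9.
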
- generateWeightsString
  
  public String generateWeightsString()
  
  Generates a human readable string containing all the weights in this model.
  
  Returns:
  
  A string containing all the weight values.
- convert
  
  @Deprecated public static <T extends Output<T>> SparseVector[] convert(SequenceExample<T> example, ImmutableFeatureMap featureIDMap)
  
  Deprecated.
  Replaced by convertToVector(org.tribuo.sequence.SequenceExample<T>, org.tribuo.ImmutableFeatureMap), which is more flexible.
  
  Converts a SequenceExample into an array of SparseVectors suitable for CRF prediction.
  
  Type Parameters:
  
  T - The type parameter of the sequence example.
  
  Parameters:
  
  example - The sequence example to convert
  
  featureIDMap - The feature id map, used to discover the number of features.
  
  Returns:
  
  An array of SparseVector.
- convert
  
  @Deprecated public static com.oracle.labs.mlrg.olcut.util.Pair<int[], SparseVector[]> convert(SequenceExample<Label> example, ImmutableFeatureMap featureIDMap, ImmutableOutputInfo<Label> labelIDMap)
  
  Deprecated.
  Replaced by convertToVector(org.tribuo.sequence.SequenceExample<T>, org.tribuo.ImmutableFeatureMap), which is more flexible.
  
  Converts a SequenceExample into an array of SparseVectors and labels suitable for CRF prediction.
  
  Parameters:
  
  example - The sequence example to convert
  
  featureIDMap - The feature id map, used to discover the number of features.
  
  labelIDMap - The label id map, used to get the index of the labels.
  
  Returns:
  
  A Pair of an int array of labels and an array of SparseVector.
- convertToVector
  
  public static <T extends Output<T>> SGDVector[] convertToVector(SequenceExample<T> example, ImmutableFeatureMap featureIDMap)
  
  Converts a SequenceExample into an array of SGDVectors suitable for CRF prediction.
  
  Type Parameters:
  
  T - The type parameter of the sequence example.
  
  Parameters:
  
  example - The sequence example to convert
  
  featureIDMap - The feature id map, used to discover the number of features.
  
  Returns:
  
  An array of SGDVector.
- convertToVector
  
  public static com.oracle.labs.mlrg.olcut.util.Pair<int[], SGDVector[]> convertToVector(SequenceExample<Label> example, ImmutableFeatureMap featureIDMap, ImmutableOutputInfo<Label> labelIDMap)
  
  Converts a SequenceExample into an array of SGDVectors and labels suitable for CRF prediction.
  
  Parameters:
  
  example - The sequence example to convert
  
  featureIDMap - The feature id map, used to discover the number of features.
  
  labelIDMap - The label id map, used to get the index of the labels.
  
  Returns:
  
  A Pair of an int array of labels and an array of SGDVector.
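
The conversion methods above turn each example in a sequence into a vector indexed by feature ID. As a rough picture of that step, the sketch below maps named feature values onto integer IDs and emits parallel index and value arrays sorted by ID. Everything here is hypothetical: the SparseConversionSketch class, the SparseRow holder, and the choice to skip features missing from the ID map are assumptions for illustration and do not reproduce Tribuo's SGDVector classes.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

/** Illustrative sketch only; not Tribuo's implementation. */
final class SparseConversionSketch {

    /** Parallel index/value arrays, sorted by ascending feature index. */
    static final class SparseRow {
        final int[] indices;
        final double[] values;

        SparseRow(int[] indices, double[] values) {
            this.indices = indices;
            this.values = values;
        }
    }

    /**
     * Looks up each feature name in featureIds and keeps the known ones.
     * Skipping unknown features is an assumption of this sketch.
     */
    static SparseRow toSparseRow(Map<String, Double> features, Map<String, Integer> featureIds) {
        TreeMap<Integer, Double> sorted = new TreeMap<>();
        for (Map.Entry<String, Double> e : features.entrySet()) {
            Integer id = featureIds.get(e.getKey());
            if (id != null) {
                sorted.put(id, e.getValue());
            }
        }
        int[] indices = new int[sorted.size()];
        double[] values = new double[sorted.size()];
        int i = 0;
        for (Map.Entry<Integer, Double> e : sorted.entrySet()) {
            indices[i] = e.getKey();
            values[i] = e.getValue();
            i++;
        }
        return new SparseRow(indices, values);
    }

    public static void main(String[] args) {
        Map<String, Integer> ids = new HashMap<>();
        ids.put("a", 0);
        ids.put("b", 1);
        Map<String, Double> token = new HashMap<>();
        token.put("b", 2.0);
        token.put("a", 1.0);
        token.put("unseen", 9.0); // not in the ID map, so it is skipped
        SparseRow row = toSparseRow(token, ids);
        System.out.println(java.util.Arrays.toString(row.indices)
                + " " + java.util.Arrays.toString(row.values));
    }
}
```
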
