Class HashedFeatureMap

All Implemented Interfaces:
Serializable, Iterable<VariableInfo>, ProtoSerializable<org.tribuo.protos.core.FeatureDomainProto>

public final class HashedFeatureMap extends ImmutableFeatureMap
A FeatureMap used by the HashingTrainer to provide feature name hashing. It guarantees that the Model does not contain feature name information, but still works with unhashed feature names.

The salt must be set after this object has been deserialized.
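
The idea of storing only hashed names while still accepting unhashed names at lookup time can be sketched as follows. This is a minimal illustration under stated assumptions, not Tribuo's implementation: the class `SaltedNameIndex`, its methods, the use of SHA-256, and the salt-prefix scheme are all hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical illustration; not Tribuo's HashedFeatureMap. */
final class SaltedNameIndex {
    // Keys are hashed feature names; the raw names are never stored.
    private final Map<String, Integer> hashedNameToId = new HashMap<>();
    private final String salt;

    SaltedNameIndex(String salt) {
        this.salt = salt;
    }

    /** Hashes salt + name with SHA-256 and encodes the digest as Base64 (assumed scheme). */
    static String hash(String salt, String name) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] digest = md.digest((salt + name).getBytes(StandardCharsets.UTF_8));
            return Base64.getEncoder().encodeToString(digest);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 should be available on every Java platform", e);
        }
    }

    /** Registers a feature, keeping only the hashed form of its name. */
    void add(String name, int id) {
        hashedNameToId.put(hash(salt, name), id);
    }

    /** Accepts an unhashed name, hashes it, and returns its id or -1 if unknown. */
    int getID(String name) {
        return hashedNameToId.getOrDefault(hash(salt, name), -1);
    }

    /** Returns true only if the raw, unhashed name is itself stored as a key. */
    boolean storesRawName(String name) {
        return hashedNameToId.containsKey(name);
    }
}
```

Callers keep passing plain feature names, and a lookup only succeeds when the same salt is in place as when the names were added.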

  • Field Details

    • CURRENT_VERSION

      public static final int CURRENT_VERSION
      Protobuf serialization version.
  • Method Details

    • deserializeFromProto

      public static HashedFeatureMap deserializeFromProto(int version, String className, com.google.protobuf.Any message) throws com.google.protobuf.InvalidProtocolBufferException
      Deserialization factory.
      Parameters:
      version - The serialized object version.
      className - The class name.
      message - The serialized data.
      Returns:
      The deserialized object.
      Throws:
      com.google.protobuf.InvalidProtocolBufferException - If the protobuf could not be parsed from the message.
    • get

      public VariableIDInfo get(String name)
      Description copied from class: ImmutableFeatureMap
      Gets the VariableIDInfo for this name. Returns null if it's unknown.
      Overrides:
      get in class ImmutableFeatureMap
      Parameters:
      name - The name to lookup.
      Returns:
      The VariableIDInfo, or null if the name is unknown.
    • getID

      public int getID(String name)
      Gets the id number for this feature. Returns -1 if it's unknown.
      Overrides:
      getID in class ImmutableFeatureMap
      Parameters:
      name - The name of the feature.
      Returns:
      A non-negative integer if the feature is known, -1 otherwise.
    • setSalt

      public void setSalt(String salt)
      The salt is not serialised with the Model. It must be set after deserialisation to the same value that was used at training time.

      If the salt is invalid, this method throws IllegalArgumentException.

      Parameters:
      salt - The salt value. Must be the same as the one from training time.
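
The pattern of a salt that is left out of the serialised form and restored afterwards can be sketched with plain Java serialization. This is an analogy under stated assumptions rather than Tribuo's code: the class `TransientSaltSketch`, the `hasSalt` and `roundTrip` helpers, and the minimum-length validation rule are hypothetical, and the real validity check may differ.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

/** Hypothetical illustration of a salt that is not written out with the object. */
final class TransientSaltSketch implements Serializable {
    private static final long serialVersionUID = 1L;

    /** Assumed validation rule; the real rule may differ. */
    static final int MIN_SALT_LENGTH = 8;

    // transient fields are skipped by Java serialization, so this is null after a round trip.
    private transient String salt;

    /** Sets the salt, rejecting values that fail the assumed validation rule. */
    void setSalt(String salt) {
        if (salt == null || salt.length() < MIN_SALT_LENGTH) {
            throw new IllegalArgumentException(
                "Salt must have at least " + MIN_SALT_LENGTH + " characters");
        }
        this.salt = salt;
    }

    boolean hasSalt() {
        return salt != null;
    }

    /** Serializes and deserializes the object in memory. */
    static TransientSaltSketch roundTrip(TransientSaltSketch in) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(in);
            }
            try (ObjectInputStream ois =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (TransientSaltSketch) ois.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("In-memory round trip failed", e);
        }
    }
}
```

In this sketch the restored object has no salt until `setSalt` is called again, which mirrors the requirement to supply the training-time salt after loading.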
    • generateHashedFeatureMap

      public static HashedFeatureMap generateHashedFeatureMap(FeatureMap map, Hasher hasher)
      Converts a standard FeatureMap by hashing each entry using the supplied Hasher.

      This preserves the index ordering of the original feature names, which is important for good test-time performance.

      It guarantees that any collision produces a feature id number lower than the previous feature's number, so collisions can be easily removed.

      Parameters:
      map - The FeatureMap to hash.
      hasher - The hashing function.
      Returns:
      A HashedFeatureMap.
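
One way to picture hashing entries in order while merging collisions is the toy sketch below. It is an assumption-laden illustration, not the method's actual algorithm: `HashedIdAssignmentSketch`, `assignIds`, and the `UnaryOperator<String>` standing in for a Hasher are hypothetical. Names are visited in their original order, each new hash receives the next id, and a colliding name reuses the id already assigned to that hash.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.UnaryOperator;

/** Hypothetical illustration of order-preserving id assignment over hashed names. */
final class HashedIdAssignmentSketch {

    /** Returns one id per input name, in input order. */
    static List<Integer> assignIds(List<String> names, UnaryOperator<String> hasher) {
        Map<String, Integer> hashToId = new LinkedHashMap<>();
        List<Integer> ids = new ArrayList<>();
        for (String name : names) {
            String hashed = hasher.apply(name);
            Integer id = hashToId.get(hashed);
            if (id == null) {
                // First time this hash is seen: hand out the next unused id.
                id = hashToId.size();
                hashToId.put(hashed, id);
            }
            // On a collision the existing id is reused, so it never exceeds ids handed out so far.
            ids.add(id);
        }
        return ids;
    }
}
```

With a deliberately weak stand-in hasher that keeps only the first letter, the names "apple", "avocado", "banana" map to ids 0, 0, 1: the second name collides with the first and shares its id.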