Package org.tribuo.hash
Provides the base class and implementations of Model hashing, which obscures the feature names stored in a model.
The base class is Hasher, which has three implementations: HashCodeHasher, which uses String.hashCode(); ModHashCodeHasher, which uses String.hashCode() and remaps the output into a specific range; and MessageDigestHasher, which uses a MessageDigest implementation to perform the hashing. Only MessageDigestHasher provides security guarantees suitable for production usage.
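The three hashing strategies described above can be sketched with JDK classes alone. This is an illustrative sketch, not the Tribuo implementation: the class HashStrategiesSketch, its method names, the salt value, and the choice to prepend the salt to the feature name are all assumptions made for the example.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;

/** Hypothetical illustration class; it is not part of org.tribuo.hash. */
final class HashStrategiesSketch {

    /** Strategy 1: use String.hashCode() of the salted name directly. */
    static String hashCodeHash(String name, String salt) {
        // Assumption: the salt is prepended to the name before hashing.
        return Integer.toString((salt + name).hashCode());
    }

    /** Strategy 2: String.hashCode() remapped into the range [0, dimension). */
    static String modHashCodeHash(String name, String salt, int dimension) {
        // floorMod keeps the result non-negative even for negative hash codes.
        return Integer.toString(Math.floorMod((salt + name).hashCode(), dimension));
    }

    /** Strategy 3: a MessageDigest of the salted name, Base64 encoded. */
    static String messageDigestHash(String name, String salt, String algorithm) {
        try {
            MessageDigest md = MessageDigest.getInstance(algorithm);
            byte[] digest = md.digest((salt + name).getBytes(StandardCharsets.UTF_8));
            return Base64.getEncoder().encodeToString(digest);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalArgumentException("Unknown digest algorithm: " + algorithm, e);
        }
    }

    public static void main(String[] args) {
        String salt = "example-salt"; // hypothetical salt value
        System.out.println(hashCodeHash("feature-A", salt));
        System.out.println(modHashCodeHash("feature-A", salt, 1000));
        System.out.println(messageDigestHash("feature-A", salt, "SHA-256"));
    }
}
```

The first two strategies rely on a 32-bit, non-cryptographic hash, so hashed names are cheap to brute-force and collide easily. That is consistent with the statement that only the MessageDigest-based approach is suitable for production usage.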
Note: Hasher implementations require a salt, which is serialized separately from the Hasher object. Without supplying this salt, the hash methods will throw IllegalStateException.
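The salt requirement can be pictured with a small stand-alone sketch. The class SaltedHasherSketch, its setSalt method, and the use of a transient field to keep the salt out of the serialized form are assumptions made for illustration. They are not taken from the Tribuo source.

```java
import java.io.Serializable;

/** Hypothetical illustration class; it is not part of org.tribuo.hash. */
final class SaltedHasherSketch implements Serializable {
    private static final long serialVersionUID = 1L;

    // Assumption: marking the salt transient keeps it out of the serialized object,
    // so it has to be stored and supplied separately.
    private transient String salt = null;

    /** Supplies the salt, for example after the object has been deserialized. */
    void setSalt(String salt) {
        if (salt == null || salt.isEmpty()) {
            throw new IllegalArgumentException("Salt must be non-empty");
        }
        this.salt = salt;
    }

    /** Hashes a feature name, refusing to run until a salt has been supplied. */
    String hash(String name) {
        if (salt == null) {
            throw new IllegalStateException("Salt not set");
        }
        return Integer.toString((salt + name).hashCode());
    }

    public static void main(String[] args) {
        SaltedHasherSketch hasher = new SaltedHasherSketch();
        try {
            hasher.hash("feature-A");
        } catch (IllegalStateException e) {
            System.out.println("Rejected before salt was supplied: " + e.getMessage());
        }
        hasher.setSalt("example-salt"); // hypothetical salt value
        System.out.println(hasher.hash("feature-A"));
    }
}
```

Keeping the salt out of the serialized object means that someone who obtains only the serialized model cannot recompute hashes for candidate feature names without also obtaining the salt.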
Class                                             Description
HashCodeHasher                                    Hashes names using String.hashCode().
HashCodeHasher.HashCodeHasherProvenance           Provenance for the HashCodeHasher.
HashedFeatureMap                                  A FeatureMap used by the HashingTrainer to provide feature name hashing and guarantee that the Model does not contain feature name information, but still works with unhashed feature names.
Hasher                                            An abstract base class for hash functions used to hash the names of features.
HashingOptions                                    An Options implementation which provides CLI arguments for the model hashing functionality.
HashingOptions.ModelHashingType                   Supported types of hashes in CLI programs.
HashingTrainer<T extends Output<T>>
MessageDigestHasher                               Hashes Strings using the supplied MessageDigest type.
MessageDigestHasher.MessageDigestHasherProvenance Provenance for MessageDigestHasher.
ModHashCodeHasher                                 Hashes names using String.hashCode(), then reduces the dimension.
ModHashCodeHasher.ModHashCodeHasherProvenance     Provenance for the ModHashCodeHasher.
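The behaviour described for the hashed FeatureMap, storing no feature names while still accepting unhashed names at lookup time, can be sketched as follows. HashedLookupSketch and its methods are hypothetical names chosen for this example, and the hash function is passed in as a plain UnaryOperator rather than a Tribuo Hasher.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.UnaryOperator;

/** Hypothetical illustration class; it is not part of org.tribuo.hash. */
final class HashedLookupSketch {
    // Only hashed names are stored; raw feature names never enter the map.
    private final Map<String, Integer> idsByHashedName = new HashMap<>();
    private final UnaryOperator<String> hasher;

    HashedLookupSketch(UnaryOperator<String> hasher) {
        this.hasher = hasher;
    }

    /** Registers a feature under its hashed name and returns its id. */
    int add(String rawName) {
        String hashed = hasher.apply(rawName);
        Integer existing = idsByHashedName.get(hashed);
        if (existing != null) {
            return existing;
        }
        int id = idsByHashedName.size();
        idsByHashedName.put(hashed, id);
        return id;
    }

    /** Looks up a feature by its raw name by hashing the query first; -1 if unknown. */
    int getId(String rawName) {
        return idsByHashedName.getOrDefault(hasher.apply(rawName), -1);
    }

    /** Returns true only if the raw name itself is stored as a key. */
    boolean storesRawName(String rawName) {
        return idsByHashedName.containsKey(rawName);
    }

    public static void main(String[] args) {
        // Hypothetical salted hash function used only for this demonstration.
        HashedLookupSketch map =
            new HashedLookupSketch(s -> Integer.toString(("example-salt" + s).hashCode()));
        map.add("age");
        System.out.println(map.getId("age"));
        System.out.println(map.storesRawName("age"));
    }
}
```

Because the query is hashed before the lookup, callers keep using the original feature names, while the stored structure holds only hashed keys.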