Package org.tribuo.clustering
Class ClusteringFactory
java.lang.Object
org.tribuo.clustering.ClusteringFactory
- All Implemented Interfaces:
com.oracle.labs.mlrg.olcut.config.Configurable, com.oracle.labs.mlrg.olcut.provenance.Provenancable<OutputFactoryProvenance>, Serializable, OutputFactory<ClusterID>, ProtoSerializable<org.tribuo.protos.core.OutputFactoryProto>
A factory for making ClusterID related classes.
Parses the ClusterID by calling toString on the input then parsing it as an int.
Field Summary
- static final ClusterID UNASSIGNED_CLUSTER_ID
  The sentinel unassigned cluster id, used when there is no ground truth clustering.
Fields inherited from interface org.tribuo.protos.ProtoSerializable
DESERIALIZATION_METHOD_NAME, PROVENANCE_SERIALIZER
-
Constructor Summary
- ClusteringFactory()
  ClusteringFactory is stateless and immutable, but it must be constructible via the config system.
Method Summary
- constructInfoForExternalModel(Map<ClusterID, Integer> mapping)
  Unlike the other info types, clustering directly uses the integer IDs as the stored value, so this mapping discards the cluster IDs and just uses the supplied integers.
- static ClusteringFactory deserializeFromProto(int version, String className, com.google.protobuf.Any message)
  Deserialization factory.
- boolean equals(Object)
- generateInfo()
  Generates the appropriate MutableOutputInfo so the output values can be tracked by a Dataset or other aggregate.
- <V> ClusterID generateOutput(V label)
  Generates a ClusterID by calling toString on the input, then calling Integer.parseInt.
- getEvaluator()
  Gets an Evaluator suitable for measuring performance of predictions for the Output subclass.
- getProvenance()
- getTypeWitness()
  Gets the output class that this factory supports.
- getUnknownOutput()
  Returns the singleton unknown output of type T which can be used for prediction time examples.
- int hashCode()
- org.tribuo.protos.core.OutputFactoryProto serialize()
  Serializes this object to a protobuf.
Methods inherited from class java.lang.Object
clone, finalize, getClass, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface com.oracle.labs.mlrg.olcut.config.Configurable
postConfig
Methods inherited from interface org.tribuo.OutputFactory
generateOutputs
-
Field Details
-
UNASSIGNED_CLUSTER_ID
The sentinel unassigned cluster id, used when there is no ground truth clustering.
-
-
Constructor Details
-
ClusteringFactory
public ClusteringFactory()
ClusteringFactory is stateless and immutable, but it must be constructible via the config system.
-
-
Method Details
-
deserializeFromProto
public static ClusteringFactory deserializeFromProto(int version, String className, com.google.protobuf.Any message)
Deserialization factory.
- Parameters:
version - The serialized object version.
className - The class name.
message - The serialized data.
- Returns:
- The deserialized object.
-
serialize
public org.tribuo.protos.core.OutputFactoryProto serialize()
Description copied from interface: ProtoSerializable
Serializes this object to a protobuf.
- Specified by:
serialize in interface ProtoSerializable<org.tribuo.protos.core.OutputFactoryProto>
- Returns:
- The protobuf.
-
generateOutput
Generates a ClusterID by calling toString on the input, then calling Integer.parseInt.
- Specified by:
generateOutput in interface OutputFactory<ClusterID>
- Type Parameters:
V - The type of the input.
- Parameters:
label - An input value.
- Returns:
- A ClusterID representing the data.
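The parsing rule above can be illustrated without Tribuo on the classpath. The following is a minimal sketch, not Tribuo's implementation: ParseSketch and SketchClusterId are hypothetical stand-ins for the factory and ClusterID, and the sketch assumes that any NumberFormatException raised by Integer.parseInt reaches the caller unchanged.

```java
public class ParseSketch {
    /** Hypothetical stand-in for ClusterID: wraps a single int id. */
    public static final class SketchClusterId {
        private final int id;

        public SketchClusterId(int id) {
            this.id = id;
        }

        public int getID() {
            return id;
        }
    }

    /** Mirrors the documented rule: call toString on the input, then Integer.parseInt. */
    public static <V> SketchClusterId generateOutput(V label) {
        return new SketchClusterId(Integer.parseInt(label.toString()));
    }

    public static void main(String[] args) {
        // A String and a boxed Integer both reduce to a parseable string.
        System.out.println(generateOutput("3").getID());  // prints 3
        System.out.println(generateOutput(42).getID());   // prints 42

        // A Double's toString is "3.0", which Integer.parseInt rejects.
        try {
            generateOutput(3.0);
        } catch (NumberFormatException e) {
            System.out.println("not an int: " + e.getMessage());
        }
    }
}
```

Under this rule the input type matters less than its string form: anything whose toString is not a plain integer, including floating point values and strings with surrounding whitespace, fails to parse.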
-
getUnknownOutput
Description copied from interface: OutputFactory
Returns the singleton unknown output of type T which can be used for prediction time examples.
- Specified by:
getUnknownOutput in interface OutputFactory<ClusterID>
- Returns:
- An unknown output.
-
generateInfo
Description copied from interface: OutputFactory
Generates the appropriate MutableOutputInfo so the output values can be tracked by a Dataset or other aggregate.
- Specified by:
generateInfo in interface OutputFactory<ClusterID>
- Returns:
- The appropriate subclass of MutableOutputInfo initialised to zero.
-
constructInfoForExternalModel
Unlike the other info types, clustering directly uses the integer IDs as the stored value, so this mapping discards the cluster IDs and just uses the supplied integers.
- Specified by:
constructInfoForExternalModel in interface OutputFactory<ClusterID>
- Parameters:
mapping - The mapping to use.
- Returns:
- An ImmutableOutputInfo for the clustering.
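To make the "discards the cluster IDs" remark concrete, here is a small sketch under stated assumptions. It does not build a Tribuo ImmutableOutputInfo: ExternalMappingSketch and idsFromMapping are hypothetical names, String keys stand in for ClusterID objects, and the result is simply the sorted set of supplied integers.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeSet;

public class ExternalMappingSketch {
    /** Keeps only the integer values of the mapping; the keys are ignored. */
    public static <K> SortedSet<Integer> idsFromMapping(Map<K, Integer> mapping) {
        return new TreeSet<>(mapping.values());
    }

    public static void main(String[] args) {
        // Hypothetical mapping from an external model's cluster labels to integer ids.
        Map<String, Integer> mapping = new LinkedHashMap<>();
        mapping.put("external-a", 0);
        mapping.put("external-b", 1);
        mapping.put("external-c", 5);

        System.out.println(idsFromMapping(mapping)); // prints [0, 1, 5]
    }
}
```

Only the values survive in this sketch, which is the behaviour the description attributes to the clustering info: the integers supplied by the caller become the stored identifiers.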
-
getEvaluator
Description copied from interface: OutputFactory
Gets an Evaluator suitable for measuring performance of predictions for the Output subclass.
Evaluator instances are thread safe and immutable, and commonly this is a singleton stored in the OutputFactory implementation.
- Specified by:
getEvaluator in interface OutputFactory<ClusterID>
- Returns:
- An evaluator.
-
getTypeWitness
Description copied from interface: OutputFactory
Gets the output class that this factory supports.
- Specified by:
getTypeWitness in interface OutputFactory<ClusterID>
- Returns:
- The output class.
-
hashCode
public int hashCode()
equals
-
getProvenance
- Specified by:
getProvenance in interface com.oracle.labs.mlrg.olcut.provenance.Provenancable<OutputFactoryProvenance>
-