public final class BinningTransformation extends Object implements Transformation
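Three binning types are implemented: equal width, equal frequency, and standard deviation based (see equalWidth, equalFrequency and stdDevs).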
The equal frequency TransformStatistics needs to store all the observed feature values, and thus has much higher memory usage than the other binning types.
The binned values are in the range [1, numBins].
Modifier and Type | Class and Description |
---|---|
static class | BinningTransformation.BinningTransformationProvenance - Provenance for BinningTransformation. |
static class | BinningTransformation.BinningType - The allowed binning types. |
Modifier and Type | Method and Description |
---|---|
TransformStatistics | createStats() - Creates the statistics object for this Transformation. |
static BinningTransformation | equalFrequency(int numBins) - Returns a BinningTransformation which generates bins which contain the same amount of training data; that is, each bin has an equal probability of occurrence in the training data. |
static BinningTransformation | equalWidth(int numBins) - Returns a BinningTransformation which generates fixed equal width bins between the observed min and max values. |
TransformationProvenance | getProvenance() |
void | postConfig() - Used by the OLCUT configuration system, and should not be called by external code. |
static BinningTransformation | stdDevs(int numDeviations) - Returns a BinningTransformation which generates bins based on the observed standard deviation of the training data. |
String | toString() |
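
public void postConfig()
Used by the OLCUT configuration system, and should not be called by external code.
Specified by: postConfig in interface com.oracle.labs.mlrg.olcut.config.Configurable

public TransformStatistics createStats()
Description copied from interface Transformation: Creates the statistics object for this Transformation.
Specified by: createStats in interface Transformation

public TransformationProvenance getProvenance()
Specified by: getProvenance in interface com.oracle.labs.mlrg.olcut.provenance.Provenancable<TransformationProvenance>
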
public static BinningTransformation equalWidth(int numBins)
Values outside the observed range are clamped to either the minimum or maximum bin. Bins are numbered in the range [1,numBins].
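The clamping and numbering described above can be illustrated with a small standalone sketch. It is not the library's implementation: the class EqualWidthSketch and its bin method are hypothetical names, and treating each bin's lower edge as inclusive is an assumption made only for this example.

```java
// Illustrative sketch only; EqualWidthSketch is a hypothetical class, not part of the library.
final class EqualWidthSketch {

    /**
     * Maps a value to a bin in [1, numBins] using equal width bins over [min, max].
     * Values at or outside the observed range are clamped to the first or last bin.
     * Assumption: the lower edge of each bin is inclusive.
     */
    static int bin(double value, double min, double max, int numBins) {
        if (numBins < 1) {
            throw new IllegalArgumentException("numBins must be positive");
        }
        if (!(max > min)) {
            throw new IllegalArgumentException("max must be greater than min");
        }
        if (value <= min) {
            return 1;        // clamp to the minimum bin
        }
        if (value >= max) {
            return numBins;  // clamp to the maximum bin
        }
        double width = (max - min) / numBins;
        int index = (int) ((value - min) / width) + 1;
        // Guard against floating point rounding at the top edge.
        return Math.min(numBins, index);
    }

    public static void main(String[] args) {
        // Observed range [0, 10] split into 5 bins of width 2.
        double[] inputs = {-3.0, 0.0, 1.9, 2.0, 9.9, 10.0, 42.0};
        for (double x : inputs) {
            System.out.println(x + " -> bin " + bin(x, 0.0, 10.0, 5));
        }
    }
}
```

Only the observed min and max are needed to compute a bin here, which is why this binning type is cheap in memory.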
Parameters: numBins - The number of bins to generate.

public static BinningTransformation equalFrequency(int numBins)
Values outside the observed range are clamped to either the minimum or maximum bin. Bins are numbered in the range [1,numBins].
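As a rough illustration of why equal frequency binning has to keep every observed value, the sketch below sorts the observations and reads the bin edges off the sorted array. It is an assumption-laden example, not the library's code: EqualFrequencySketch, fitEdges and bin are hypothetical names, and the choice of inclusive upper edges is made only for this sketch.

```java
import java.util.Arrays;

// Illustrative sketch only; EqualFrequencySketch is a hypothetical class, not part of the library.
final class EqualFrequencySketch {

    /**
     * Computes numBins upper bin edges so that each bin holds roughly the
     * same number of observed values. All observations must be available,
     * because the edges are read from the sorted data.
     */
    static double[] fitEdges(double[] observed, int numBins) {
        if (numBins < 1 || observed.length < numBins) {
            throw new IllegalArgumentException("need at least one observation per bin");
        }
        double[] sorted = observed.clone();
        Arrays.sort(sorted);
        double[] edges = new double[numBins];
        for (int i = 0; i < numBins; i++) {
            // Index of the last sorted observation assigned to bin i + 1.
            int last = (int) ((long) (i + 1) * sorted.length / numBins) - 1;
            edges[i] = sorted[last];
        }
        return edges;
    }

    /**
     * Maps a value to a bin in [1, edges.length].
     * Assumption: upper edges are inclusive; values outside the observed
     * range are clamped to the first or last bin.
     */
    static int bin(double value, double[] edges) {
        for (int i = 0; i < edges.length; i++) {
            if (value <= edges[i]) {
                return i + 1;
            }
        }
        return edges.length; // clamp values above the observed maximum
    }

    public static void main(String[] args) {
        double[] observed = {8, 1, 7, 2, 6, 3, 5, 4};
        double[] edges = fitEdges(observed, 4);
        System.out.println("edges = " + Arrays.toString(edges));
        System.out.println("3.0 -> bin " + bin(3.0, edges));
    }
}
```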
Parameters: numBins - The number of bins to generate.

public static BinningTransformation stdDevs(int numDeviations)
Bins are numbered in the range [1,numDeviations*2]. The middle two bins are either side of the mean; the lowest bin is the mean minus numDeviations * observed standard deviation, and the highest bin is the mean plus numDeviations * observed standard deviation.
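One possible reading of this numbering is sketched below. It is a hypothetical illustration, not the library's implementation: StdDevBinSketch and bin are invented names, each bin is assumed to be one standard deviation wide with an inclusive lower edge, and values further than numDeviations standard deviations from the mean are assumed to be clamped to the outermost bins.

```java
// Illustrative sketch only; StdDevBinSketch is a hypothetical class, not part of the library.
final class StdDevBinSketch {

    /**
     * Maps a value to a bin in [1, 2 * numDeviations], where each bin spans
     * one standard deviation and the middle two bins sit either side of the mean.
     * Assumption: values beyond numDeviations standard deviations are clamped.
     */
    static int bin(double value, double mean, double stdDev, int numDeviations) {
        if (numDeviations < 1) {
            throw new IllegalArgumentException("numDeviations must be positive");
        }
        if (!(stdDev > 0.0)) {
            throw new IllegalArgumentException("stdDev must be positive");
        }
        int numBins = 2 * numDeviations;
        double lower = mean - numDeviations * stdDev;
        double upper = mean + numDeviations * stdDev;
        if (value <= lower) {
            return 1;        // clamp to the lowest bin
        }
        if (value >= upper) {
            return numBins;  // clamp to the highest bin
        }
        int index = (int) ((value - lower) / stdDev) + 1;
        // Guard against floating point rounding at the top edge.
        return Math.min(numBins, index);
    }

    public static void main(String[] args) {
        // mean = 10, stdDev = 2, numDeviations = 2 gives four bins over [6, 14].
        double[] inputs = {-100.0, 6.5, 9.9, 10.0, 13.0, 100.0};
        for (double x : inputs) {
            System.out.println(x + " -> bin " + bin(x, 10.0, 2.0, 2));
        }
    }
}
```

With numDeviations = 2, values just below the mean fall in bin 2 and values at or just above it fall in bin 3, matching the description of the middle two bins.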
Parameters: numDeviations - The number of standard deviations to bin.

Copyright © 2015–2021 Oracle and/or its affiliates. All rights reserved.