Package org.tribuo.data
Class DatasetExplorer
java.lang.Object
org.tribuo.data.DatasetExplorer
- All Implemented Interfaces:
com.oracle.labs.mlrg.olcut.command.CommandGroup
public final class DatasetExplorer
extends Object
implements com.oracle.labs.mlrg.olcut.command.CommandGroup
A CLI for exploring a serialised Dataset.
Nested Class Summary
Modifier and Type: static class
Description: Command line options.
Constructor Summary
DatasetExplorer() - Constructs a dataset explorer.
Method Summary
String featureInfo(com.oracle.labs.mlrg.olcut.command.CommandInterpreter ci, String featureName) - Shows information on a particular feature.
org.jline.reader.Completer[] fileCompleter() - The filename completer.
String getDescription()
String getName()
String loadDataset(com.oracle.labs.mlrg.olcut.command.CommandInterpreter ci, File path, boolean protobuf) - Loads a serialized dataset.
static void main(String[] args) - Runs a dataset explorer.
String minCount(com.oracle.labs.mlrg.olcut.command.CommandInterpreter ci, int minCount) - Shows the number of features which occurred more than minCount times in the dataset.
String numExamples(com.oracle.labs.mlrg.olcut.command.CommandInterpreter ci) - Shows the number of examples in this dataset.
String numFeatures(com.oracle.labs.mlrg.olcut.command.CommandInterpreter ci) - Shows the number of features in this dataset.
String outputInfo(com.oracle.labs.mlrg.olcut.command.CommandInterpreter ci) - Shows the output information.
String saveCSV(com.oracle.labs.mlrg.olcut.command.CommandInterpreter ci, String path) - Saves out the dataset as a CSV file.
String showOutputStats(com.oracle.labs.mlrg.olcut.command.CommandInterpreter ci) - Shows the output statistics.
String showProvenance(com.oracle.labs.mlrg.olcut.command.CommandInterpreter ci) - Shows the dataset provenance.
void startShell() - Starts the command shell.
-
Constructor Details
-
DatasetExplorer
public DatasetExplorer()Constructs a dataset explorer.
-
-
Method Details
-
getName
- Specified by: getName in interface com.oracle.labs.mlrg.olcut.command.CommandGroup
-
getDescription
- Specified by: getDescription in interface com.oracle.labs.mlrg.olcut.command.CommandGroup
-
fileCompleter
public org.jline.reader.Completer[] fileCompleter()
The filename completer.
- Returns:
The completer array.
-
startShell
public void startShell()
Starts the command shell.
-
loadDataset
@Command(usage="<filename> <is-protobuf> - Load a dataset from disk.", completers="fileCompleter")
public String loadDataset(com.oracle.labs.mlrg.olcut.command.CommandInterpreter ci, File path, boolean protobuf)
Loads a serialized dataset.
- Parameters:
ci - The command interpreter.
path - The path to load.
protobuf - Load the dataset from protobuf?
- Returns:
A status string.
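The `@Command` annotation marks a method as a command that the shell can call by name. As a rough illustration of that general pattern only, the sketch below uses reflection to find and invoke an annotated method. `ShellCommand`, `ToyExplorer`, and `dispatch` are hypothetical names invented for this example; this is not OLCUT's implementation, and it omits the `CommandInterpreter` argument that the real command methods take.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;

public class CommandDispatchSketch {

    /** Hypothetical stand-in for a command annotation; not the OLCUT @Command type. */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    public @interface ShellCommand {
        String usage();
    }

    /** Hypothetical command group exposing a single command. */
    public static class ToyExplorer {
        private final int rows;

        public ToyExplorer(int rows) {
            this.rows = rows;
        }

        @ShellCommand(usage = "Shows the number of rows in the dataset")
        public String numExamples() {
            return Integer.toString(rows);
        }

        // Not annotated, so the dispatcher below does not expose it.
        public String hidden() {
            return "not a command";
        }
    }

    /** Finds an annotated, zero-argument public method by name and invokes it. */
    public static String dispatch(Object group, String commandName)
            throws ReflectiveOperationException {
        for (Method m : group.getClass().getMethods()) {
            if (m.getName().equals(commandName)
                    && m.isAnnotationPresent(ShellCommand.class)
                    && m.getParameterCount() == 0) {
                return (String) m.invoke(group);
            }
        }
        return "Unknown command: " + commandName;
    }

    public static void main(String[] args) throws ReflectiveOperationException {
        ToyExplorer explorer = new ToyExplorer(150);
        System.out.println(dispatch(explorer, "numExamples")); // prints "150"
        System.out.println(dispatch(explorer, "hidden"));      // prints "Unknown command: hidden"
    }
}
```

Each command returns a `String`, mirroring the status and information strings returned by the methods documented here.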
-
featureInfo
@Command(usage="Shows the information on a particular feature")
public String featureInfo(com.oracle.labs.mlrg.olcut.command.CommandInterpreter ci, String featureName)
Shows information on a particular feature.
- Parameters:
ci - The command interpreter.
featureName - The feature name.
- Returns:
The feature information.
-
outputInfo
@Command(usage="Shows the output information.")
public String outputInfo(com.oracle.labs.mlrg.olcut.command.CommandInterpreter ci)
Shows the output information.
- Parameters:
ci - The command interpreter.
- Returns:
The output information.
-
numExamples
@Command(usage="Shows the number of rows in the dataset")
public String numExamples(com.oracle.labs.mlrg.olcut.command.CommandInterpreter ci)
Shows the number of examples in this dataset.
- Parameters:
ci - The command interpreter.
- Returns:
The number of examples.
-
numFeatures
@Command(usage="Shows the number of features in the dataset")
public String numFeatures(com.oracle.labs.mlrg.olcut.command.CommandInterpreter ci)
Shows the number of features in this dataset.
- Parameters:
ci - The command interpreter.
- Returns:
The number of features.
-
minCount
@Command(usage="<min count> - Shows the number of features that occurred more than min count times.")
public String minCount(com.oracle.labs.mlrg.olcut.command.CommandInterpreter ci, int minCount)
Shows the number of features which occurred more than minCount times in the dataset.
- Parameters:
ci - The command interpreter.
minCount - The minimum occurrence count.
- Returns:
The number of features which occurred more than minCount times.
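To make the threshold concrete, here is a minimal sketch of the counting rule described above, reading "more than minCount times" literally as a strict comparison. The class `MinCountSketch`, the method `featuresAbove`, and the plain `Map` of feature names to counts are assumptions made for illustration; they are not Tribuo types, and the real method may differ in detail.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class MinCountSketch {

    /** Counts the features whose occurrence count is strictly greater than minCount. */
    public static int featuresAbove(Map<String, Integer> featureCounts, int minCount) {
        int total = 0;
        for (int count : featureCounts.values()) {
            if (count > minCount) {
                total++;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        // Hypothetical feature occurrence counts.
        Map<String, Integer> counts = new LinkedHashMap<>();
        counts.put("height", 5);
        counts.put("width", 2);
        counts.put("colour", 1);

        System.out.println(featuresAbove(counts, 1)); // prints 2 (height, width)
        System.out.println(featuresAbove(counts, 2)); // prints 1 (height)
    }
}
```

Under this reading, a feature that occurs exactly minCount times is not counted.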
-
showOutputStats
@Command(usage="Shows the output statistics")
public String showOutputStats(com.oracle.labs.mlrg.olcut.command.CommandInterpreter ci)
Shows the output statistics.
- Parameters:
ci - The command interpreter.
- Returns:
The output statistics string.
-
saveCSV
@Command(usage="Saves out the data as a CSV.")
public String saveCSV(com.oracle.labs.mlrg.olcut.command.CommandInterpreter ci, String path)
Saves out the dataset as a CSV file.
- Parameters:
ci - The command interpreter.
path - The path to save to.
- Returns:
A status message.
-
showProvenance
@Command(usage="Shows the dataset provenance")
public String showProvenance(com.oracle.labs.mlrg.olcut.command.CommandInterpreter ci)
Shows the dataset provenance.
- Parameters:
ci - The command interpreter.
- Returns:
The dataset provenance string.
-
main
public static void main(String[] args)
Runs a dataset explorer.
- Parameters:
args - CLI arguments.
-