Class CSVDataSource<T extends Output<T>>

java.lang.Object
org.tribuo.data.columnar.ColumnarDataSource<T>
org.tribuo.data.csv.CSVDataSource<T>
All Implemented Interfaces:
com.oracle.labs.mlrg.olcut.config.Configurable, com.oracle.labs.mlrg.olcut.provenance.Provenancable<DataSourceProvenance>, Iterable<Example<T>>, ConfigurableDataSource<T>, DataSource<T>

public class CSVDataSource<T extends Output<T>> extends ColumnarDataSource<T>
A DataSource for loading delimiter-separated data from a text file (e.g., CSV, TSV) and applying FieldProcessors to it.
  • Constructor Details

    • CSVDataSource

      public CSVDataSource(Path dataPath, RowProcessor<T> rowProcessor, boolean outputRequired)
      Creates a CSVDataSource using the specified RowProcessor to process the data.

      Uses ',' as the separator, '"' as the quote character, and '\' as the escape character.

      Parameters:
      dataPath - The Path to the data file.
      rowProcessor - The row processor which converts a row into an Example.
      outputRequired - Whether the output must be present in the data file.
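
To make the roles of the separator, quote, and escape characters concrete, here is a minimal sketch of a line splitter that takes the same three characters. It is a simplified stand-in written for illustration only: the class `DelimiterSketch` and its `split` method are hypothetical, they are not part of Tribuo, and the parser that CSVDataSource actually uses may treat edge cases (for example, escapes outside quoted fields) differently.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration class; not part of Tribuo.
class DelimiterSketch {

    /**
     * Splits one line into fields using the given separator, quote and
     * escape characters. Simplified: no multi-line fields, no doubled
     * quotes, and escapes are honoured both inside and outside quotes.
     */
    static List<String> split(String line, char separator, char quote, char escape) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == escape && i + 1 < line.length()) {
                // The escape character makes the next character literal.
                current.append(line.charAt(++i));
            } else if (c == quote) {
                // Quote characters toggle quoted mode and are dropped from the field.
                inQuotes = !inQuotes;
            } else if (c == separator && !inQuotes) {
                // A separator outside quotes ends the current field.
                fields.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        // The last field has no trailing separator.
        fields.add(current.toString());
        return fields;
    }

    public static void main(String[] args) {
        // The defaults described above: ',' separator, '"' quote, '\' escape.
        System.out.println(split("a,\"b,c\",d", ',', '"', '\\'));
        // A tab-separated line, as a TSV file would contain.
        System.out.println(split("x\ty\tz", '\t', '"', '\\'));
    }
}
```

In this sketch the quoted field `"b,c"` stays in one piece because the comma inside the quotes is not treated as a separator, which is the behaviour the quote character exists to provide.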
    • CSVDataSource

      public CSVDataSource(URI dataFile, RowProcessor<T> rowProcessor, boolean outputRequired)
      Creates a CSVDataSource using the specified RowProcessor to process the data.

      Uses ',' as the separator, '"' as the quote character, and '\' as the escape character.

      Parameters:
      dataFile - A URI for the data file.
      rowProcessor - The row processor which converts a row into an Example.
      outputRequired - Whether the output must be present in the data file.
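
The constructors come in Path and URI variants, so a caller holding one form may need to convert to the other. The sketch below uses only the JDK to show that conversion for a local file. The class `PathUriSketch` and the file name `data/train.csv` are hypothetical, and the commented-out constructor calls assume Tribuo is on the classpath and that a `rowProcessor` has already been built. Whether the URI constructors accept schemes other than `file` is not stated in this page.

```java
import java.net.URI;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical illustration class; not part of Tribuo.
class PathUriSketch {

    /** Converts a local Path into an absolute file: URI. */
    static URI toUri(Path dataPath) {
        return dataPath.toAbsolutePath().toUri();
    }

    /** Converts a file: URI back into a Path on the default file system. */
    static Path toPath(URI dataFile) {
        return Paths.get(dataFile);
    }

    public static void main(String[] args) {
        // Hypothetical location; the file does not need to exist for the conversion.
        Path dataPath = Paths.get("data", "train.csv");
        URI dataFile = toUri(dataPath);

        System.out.println(dataFile.getScheme());
        System.out.println(toPath(dataFile).equals(dataPath.toAbsolutePath()));

        // With Tribuo available and a RowProcessor<T> named rowProcessor (assumed),
        // either form could then be passed to the matching constructor:
        //   new CSVDataSource<>(dataPath, rowProcessor, true);
        //   new CSVDataSource<>(dataFile, rowProcessor, true);
    }
}
```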
    • CSVDataSource

      public CSVDataSource(Path dataPath, RowProcessor<T> rowProcessor, boolean outputRequired, char separator)
      Creates a CSVDataSource using the specified RowProcessor to process the data.

      Uses '"' as the quote character, and '\' as the escape character.

      Parameters:
      dataPath - The Path to the data file.
      rowProcessor - The row processor which converts a row into an Example.
      outputRequired - Whether the output must be present in the data file.
      separator - The separator character in the data file.
    • CSVDataSource

      public CSVDataSource(URI dataFile, RowProcessor<T> rowProcessor, boolean outputRequired, char separator)
      Creates a CSVDataSource using the specified RowProcessor to process the data.

      Uses '"' as the quote character, and '\' as the escape character.

      Parameters:
      dataFile - A URI for the data file.
      rowProcessor - The row processor which converts a row into an Example.
      outputRequired - Whether the output must be present in the data file.
      separator - The separator character in the data file.
    • CSVDataSource

      public CSVDataSource(URI dataFile, RowProcessor<T> rowProcessor, boolean outputRequired, char separator, char quote)
      Creates a CSVDataSource using the specified RowProcessor to process the data, and the supplied separator and quote characters to read the input data file.
      Parameters:
      dataFile - A URI for the data file.
      rowProcessor - The row processor which converts a row into an Example.
      outputRequired - Whether the output must be present in the data file.
      separator - The separator character in the data file.
      quote - The quote character in the data file.
    • CSVDataSource

      public CSVDataSource(Path dataPath, RowProcessor<T> rowProcessor, boolean outputRequired, char separator, char quote)
      Creates a CSVDataSource using the specified RowProcessor to process the data, and the supplied separator and quote characters to read the input data file.
      Parameters:
      dataPath - The Path to the data file.
      rowProcessor - The row processor which converts a row into an Example.
      outputRequired - Whether the output must be present in the data file.
      separator - The separator character in the data file.
      quote - The quote character in the data file.
  • Method Details