@InterfaceAudience.User @InterfaceStability.Unstable public class CarbonWriterBuilder extends Object

Builder for CarbonWriter.

| Constructor and Description |
|---|
| CarbonWriterBuilder() |

| Modifier and Type | Method and Description |
|---|---|
| CarbonWriter | build(): Build a CarbonWriter. This writer is not thread safe; use the withThreadSafe() configuration in a multi-threaded environment. |
| org.apache.carbondata.processing.loading.model.CarbonLoadModel | buildLoadModel(Schema carbonSchema) |
| CarbonWriterBuilder | enableLocalDictionary(boolean enableLocalDictionary) |
| CarbonWriterBuilder | invertedIndexFor(String[] invertedIndexColumns): Sets the list of columns for which an inverted index needs to be generated. |
| CarbonWriterBuilder | localDictionaryThreshold(int localDictionaryThreshold) |
| CarbonWriterBuilder | outputPath(String path): Sets the output path of the writer builder. |
| CarbonWriterBuilder | sortBy(String[] sortColumns): Sets the list of columns that need to be in sorted order. |
| CarbonWriterBuilder | taskNo(long taskNo): Sets the taskNo for the writer. |
| CarbonWriterBuilder | uniqueIdentifier(long timestamp): Sets the timestamp in the carbondata and carbonindex files. |
| CarbonWriterBuilder | withAvroInput(org.apache.avro.Schema avroSchema): To build a CarbonWriter that accepts Avro objects. |
| CarbonWriterBuilder | withBlockletSize(int blockletSize): Sets the blocklet size of the CarbonData file. |
| CarbonWriterBuilder | withBlockSize(int blockSize): Sets the carbondata file size in MB, between 1 MB and 2048 MB. |
| CarbonWriterBuilder | withCsvInput(Schema schema): To build a CarbonWriter that accepts rows in CSV format. |
| CarbonWriterBuilder | withCsvInput(String jsonSchema): To build a CarbonWriter that accepts rows in CSV format. |
| CarbonWriterBuilder | withHadoopConf(org.apache.hadoop.conf.Configuration conf): Sets the Hadoop configuration. |
| CarbonWriterBuilder | withHadoopConf(String key, String value): Updates the Hadoop configuration with the given key and value. |
| CarbonWriterBuilder | withJsonInput(Schema carbonSchema): To build a CarbonWriter that accepts JSON objects. |
| CarbonWriterBuilder | withLoadOption(String key, String value): Sets a load option for the SDK writer. |
| CarbonWriterBuilder | withLoadOptions(Map<String,String> options): Sets the load options for the SDK writer. |
| CarbonWriterBuilder | withTableProperties(Map<String,String> options): Sets the table properties for the SDK writer. |
| CarbonWriterBuilder | withTableProperty(String key, String value): Sets a table property for the SDK writer. |
| CarbonWriterBuilder | withThreadSafe(short numOfThreads): Makes the SDK writer thread safe. |
| CarbonWriterBuilder | writtenBy(String appName) |
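
Several of the methods above take plain string maps or key/value pairs (withLoadOptions, withLoadOption, withTableProperties, withTableProperty). As a rough illustration, the following JDK-only sketch assembles a load-options map that starts from the defaults documented for withLoadOptions and applies caller overrides, and checks a block size against the documented 1 to 2048 MB range. The class BuilderOptionsSketch and its methods are hypothetical helpers, not part of the CarbonData SDK. Rejecting unknown keys and reading the documented "\001" and "\002" as control characters are assumptions of this sketch.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration; not part of the CarbonData SDK.
class BuilderOptionsSketch {

    // Load-option defaults as listed in the withLoadOptions documentation.
    static Map<String, String> loadOptionDefaults() {
        Map<String, String> m = new LinkedHashMap<>();
        m.put("bad_records_logger_enable", "false");
        m.put("bad_records_action", "FAIL");
        m.put("bad_record_path", "");
        m.put("dateformat", "");       // empty: value is taken from carbon.properties
        m.put("timestampformat", "");  // empty: value is taken from carbon.properties
        // Assumption: the documented "\001" and "\002" denote control characters.
        m.put("complex_delimiter_level_1", "\001");
        m.put("complex_delimiter_level_2", "\002");
        m.put("quotechar", "\"");
        m.put("escapechar", "\\");
        return m;
    }

    // Returns a fresh copy of the defaults with the caller's overrides applied.
    // Assumption of this sketch: keys outside the documented list are rejected.
    static Map<String, String> withOverrides(Map<String, String> overrides) {
        Map<String, String> result = loadOptionDefaults();
        for (Map.Entry<String, String> e : overrides.entrySet()) {
            if (!result.containsKey(e.getKey())) {
                throw new IllegalArgumentException("Unsupported load option: " + e.getKey());
            }
            result.put(e.getKey(), e.getValue());
        }
        return result;
    }

    // table_blocksize / withBlockSize is documented as a value in MB in [1, 2048].
    static boolean isValidBlockSizeMb(int blockSizeMb) {
        return blockSizeMb >= 1 && blockSizeMb <= 2048;
    }

    public static void main(String[] args) {
        Map<String, String> overrides = new LinkedHashMap<>();
        overrides.put("bad_records_action", "IGNORE");
        overrides.put("timestampformat", "yyyy-MM-dd HH:mm:ss");
        Map<String, String> options = withOverrides(overrides);

        // With the SDK on the classpath, a map like this is what
        // withLoadOptions(Map<String,String>) would receive.
        System.out.println(options.get("bad_records_action"));        // prints IGNORE
        System.out.println(options.get("bad_records_logger_enable")); // prints false
        System.out.println(isValidBlockSizeMb(4096));                 // prints false
    }
}
```

Each call builds a new map from the defaults, so callers can change their copy without affecting later calls. The same pattern would apply to a table-properties map passed to withTableProperties.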
public CarbonWriterBuilder outputPath(String path)
path - the absolute path where output files are written.
This method must be called when building CarbonWriterBuilder.

public CarbonWriterBuilder sortBy(String[] sortColumns)
sortColumns - a string array of the columns that need to be sorted.
If it is null (the default), all dimensions are selected for sorting.
If it is an empty array, no columns are sorted.

public CarbonWriterBuilder invertedIndexFor(String[] invertedIndexColumns)
invertedIndexColumns - a string array of the columns for which an inverted index needs to be generated.
If it is null or an empty array, no inverted index is generated for any column.

public CarbonWriterBuilder taskNo(long taskNo)
taskNo - the taskNo the user wants to specify.
By default it is the system time in nanoseconds.

public CarbonWriterBuilder uniqueIdentifier(long timestamp)
timestamp - a timestamp to be used in the carbondata and carbonindex files.
By default it is set to zero.

public CarbonWriterBuilder withLoadOptions(Map<String,String> options)
options - key/value pairs of load options.
Supported keys and values are:
a. bad_records_logger_enable -- true (write into separate logs), false
b. bad_records_action -- FAIL, FORCE, IGNORE, REDIRECT
c. bad_record_path -- path
d. dateformat -- same as Java SimpleDateFormat
e. timestampformat -- same as Java SimpleDateFormat
f. complex_delimiter_level_1 -- value used to split the complexTypeData
g. complex_delimiter_level_2 -- value used to split the nested complexTypeData
h. quotechar
i. escapechar
Default values are as follows.
a. bad_records_logger_enable -- "false"
b. bad_records_action -- "FAIL"
c. bad_record_path -- ""
d. dateformat -- "", uses the value from the carbon.properties file
e. timestampformat -- "", uses the value from the carbon.properties file
f. complex_delimiter_level_1 -- "\001"
g. complex_delimiter_level_2 -- "\002"
h. quotechar -- "\""
i. escapechar -- "\\"

public CarbonWriterBuilder withLoadOption(String key, String value)
key - the key of the load option
value - the value of the load option

public CarbonWriterBuilder withTableProperties(Map<String,String> options)
options - key/value pairs of create table properties.
Supported keys and values are:
a. table_blocksize -- value in MB in the range [1-2048]. Default value is 1024.
b. table_blocklet_size -- value in MB. Default value is 64 MB.
c. local_dictionary_threshold -- positive value. Default is 10000.
d. local_dictionary_enable -- true / false. Default is false.
e. sort_columns -- comma-separated columns, for example "c1,c2". By default all dimensions are sorted. If the empty string "" is passed, no columns are sorted.
f. sort_scope -- "local_sort", "no_sort", "batch_sort". Default value is "local_sort".
g. long_string_columns -- comma-separated string columns which are more than 32k in length. Default value is null.
h. inverted_index -- comma-separated string columns for which an inverted index needs to be generated.

public CarbonWriterBuilder withTableProperty(String key, String value)
key - property key
value - property value

public CarbonWriterBuilder withThreadSafe(short numOfThreads)
numOfThreads - the number of threads in which the writer is called in a multi-threaded scenario.
By default the SDK writer is not thread safe; one writer instance can be used in one thread only.

public CarbonWriterBuilder withHadoopConf(org.apache.hadoop.conf.Configuration conf)
conf - the Hadoop configuration; it can be used to set the s3a access key (AK), secret key (SK), endpoint, and other configuration

public CarbonWriterBuilder withHadoopConf(String key, String value)
key - the configuration key
value - the configuration value

public CarbonWriterBuilder withBlockSize(int blockSize)
blockSize - the block size in MB, between 1 MB and 2048 MB.
The default value is 1024 MB.

public CarbonWriterBuilder localDictionaryThreshold(int localDictionaryThreshold)
localDictionaryThreshold - the local dictionary threshold; default is 10000

public CarbonWriterBuilder writtenBy(String appName)
appName - the name of the application that is writing the carbondata files

public CarbonWriterBuilder enableLocalDictionary(boolean enableLocalDictionary)
enableLocalDictionary - whether to enable the local dictionary; default is false

public CarbonWriterBuilder withBlockletSize(int blockletSize)
blockletSize - the blocklet size in MB.
The default value is 64 MB.

public CarbonWriterBuilder withCsvInput(Schema schema)
To build a CarbonWriter that accepts rows in CSV format.
schema - carbon Schema object {org.apache.carbondata.sdk.file.Schema}

public CarbonWriterBuilder withCsvInput(String jsonSchema)
To build a CarbonWriter that accepts rows in CSV format.
jsonSchema - JSON schema string

public CarbonWriterBuilder withAvroInput(org.apache.avro.Schema avroSchema)
To build a CarbonWriter that accepts Avro objects.
avroSchema - Avro Schema object {org.apache.avro.Schema}

public CarbonWriterBuilder withJsonInput(Schema carbonSchema)
To build a CarbonWriter that accepts JSON objects.
carbonSchema - carbon Schema object

public CarbonWriter build() throws IOException, org.apache.carbondata.common.exceptions.sql.InvalidLoadOptionException
Build a CarbonWriter. This writer is not thread safe; use the withThreadSafe() configuration in a multi-threaded environment.
Throws: IOException, org.apache.carbondata.common.exceptions.sql.InvalidLoadOptionException

public org.apache.carbondata.processing.loading.model.CarbonLoadModel buildLoadModel(Schema carbonSchema) throws IOException, org.apache.carbondata.common.exceptions.sql.InvalidLoadOptionException
Throws: IOException, org.apache.carbondata.common.exceptions.sql.InvalidLoadOptionException

Copyright © 2016–2019 The Apache Software Foundation. All rights reserved.