`@Beta public interface SparkExecutionPluginContext extends DatasetContext, TransformContext`
| Modifier and Type | Method and Description |
|---|---|
| `SparkInterpreter` | `createSparkInterpreter()`<br>Creates a new instance of `SparkInterpreter` for Scala code compilation and interpretation. |
| `<K,V> org.apache.spark.api.java.JavaPairRDD<K,V>` | `fromDataset(String datasetName)`<br>Creates a `JavaPairRDD` from the given Dataset. |
| `<K,V> org.apache.spark.api.java.JavaPairRDD<K,V>` | `fromDataset(String datasetName, Map<String,String> arguments)`<br>Creates a `JavaPairRDD` from the given Dataset with the given set of Dataset arguments. |
| `<K,V> org.apache.spark.api.java.JavaPairRDD<K,V>` | `fromDataset(String datasetName, Map<String,String> arguments, Iterable<? extends Split> splits)`<br>Creates a `JavaPairRDD` from the given Dataset with the given set of Dataset arguments and a custom list of `Split`s. |
| `org.apache.spark.api.java.JavaRDD<StreamEvent>` | `fromStream(String streamName)`<br>Creates a `JavaRDD` that represents all events from the given stream. |
| `<V> org.apache.spark.api.java.JavaPairRDD<Long,V>` | `fromStream(String streamName, Class<V> valueType)`<br>Creates a `JavaPairRDD` that represents all events from the given stream. |
| `org.apache.spark.api.java.JavaRDD<StreamEvent>` | `fromStream(String streamName, long startTime, long endTime)`<br>Creates a `JavaRDD` that represents events from the given stream in the given time range. |
| `<K,V> org.apache.spark.api.java.JavaPairRDD<K,V>` | `fromStream(String streamName, long startTime, long endTime, Class<? extends co.cask.cdap.api.stream.StreamEventDecoder<K,V>> decoderClass, Class<K> keyType, Class<V> valueType)`<br>Creates a `JavaPairRDD` that represents events from the given stream in the given time range. |
| `<V> org.apache.spark.api.java.JavaPairRDD<Long,V>` | `fromStream(String streamName, long startTime, long endTime, Class<V> valueType)`<br>Creates a `JavaPairRDD` that represents events from the given stream in the given time range. |
| `long` | `getLogicalStartTime()`<br>Returns the logical start time of the Batch Job. |
| `PluginContext` | `getPluginContext()`<br>Returns a `Serializable` `PluginContext` which can be used to request plugin instances. |
| `Map<String,String>` | `getRuntimeArguments()`<br>Returns the runtime arguments of the Batch Job. |
| `org.apache.spark.api.java.JavaSparkContext` | `getSparkContext()`<br>Returns the `JavaSparkContext` used during the execution. |
| `<K,V> void` | `saveAsDataset(org.apache.spark.api.java.JavaPairRDD<K,V> rdd, String datasetName)`<br>Saves the given `JavaPairRDD` to the given Dataset. |
| `<K,V> void` | `saveAsDataset(org.apache.spark.api.java.JavaPairRDD<K,V> rdd, String datasetName, Map<String,String> arguments)`<br>Saves the given `JavaPairRDD` to the given Dataset with the given set of Dataset arguments. |
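The typical pattern is to read a Dataset as an RDD, apply ordinary Spark transformations, and save the result back through the context. Below is a minimal hedged sketch of that flow; the Dataset names `"purchases"` and `"totals"`, the `String`/`Long` key and value types, and the surrounding class are hypothetical (CDAP imports omitted), and the actual types must match what the Datasets are configured to read and write:

```java
import org.apache.spark.api.java.JavaPairRDD;

public class PurchaseTotalsSketch {
  // Hypothetical Spark program body receiving the execution context.
  public void run(SparkExecutionPluginContext context) throws Exception {
    // Read the source Dataset as an RDD of (customer, amount) pairs;
    // throws DatasetInstantiationException if "purchases" doesn't exist.
    JavaPairRDD<String, Long> purchases = context.fromDataset("purchases");

    // Ordinary Spark transformations apply to the returned RDD.
    JavaPairRDD<String, Long> totals = purchases.reduceByKey(Long::sum);

    // Persist the aggregated RDD back to another Dataset.
    context.saveAsDataset(totals, "totals");
  }
}
```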
Methods inherited from interface `DatasetContext`:
`discardDataset`, `getDataset`, `getDataset`, `getDataset`, `getDataset`, `releaseDataset`

Methods inherited from interface `TransformContext`:
`getArguments`, `getInputSchema`, `getInputSchemas`, `getMetrics`, `getNamespace`, `getOutputPortSchemas`, `getOutputSchema`, `getPipelineName`, `getPluginProperties`, `getPluginProperties`, `getStageName`, `loadPluginClass`, `newPluginInstance`, `getServiceURL`, `getServiceURL`, `provide`

`long getLogicalStartTime()`

Returns the logical start time of the Batch Job.

- Specified by: `getLogicalStartTime` in interface `StageContext`

`Map<String,String> getRuntimeArguments()`

Returns the runtime arguments of the Batch Job.
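The logical start time and runtime arguments are commonly combined to derive a deterministic read window for a run. A hedged sketch of that idea, written as a fragment inside a Spark program's run method; the stream name `"events"` and the `"window.ms"` argument key are hypothetical:

```java
// Sketch: derive a fromStream() time range from the run's logical start time.
long endTime = context.getLogicalStartTime();  // logical run time, in milliseconds
Map<String, String> args = context.getRuntimeArguments();
long windowMs = Long.parseLong(args.getOrDefault("window.ms", "3600000"));  // default: 1 hour

// Read events in [endTime - windowMs, endTime): start inclusive, end exclusive.
JavaRDD<StreamEvent> recentEvents = context.fromStream("events", endTime - windowMs, endTime);
```

Because the window is anchored to the logical start time rather than the wall clock, re-running the same batch run reads the same slice of the stream.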
`<K,V> org.apache.spark.api.java.JavaPairRDD<K,V> fromDataset(String datasetName)`

Creates a `JavaPairRDD` from the given Dataset.

- Type Parameters: `K` - key type; `V` - value type
- Parameters: `datasetName` - name of the Dataset
- Returns: a `JavaPairRDD` instance that reads from the given Dataset
- Throws: `DatasetInstantiationException` - if the Dataset doesn't exist

`<K,V> org.apache.spark.api.java.JavaPairRDD<K,V> fromDataset(String datasetName, Map<String,String> arguments)`

Creates a `JavaPairRDD` from the given Dataset with the given set of Dataset arguments.

- Type Parameters: `K` - key type; `V` - value type
- Parameters: `datasetName` - name of the Dataset; `arguments` - arguments for the Dataset
- Returns: a `JavaPairRDD` instance that reads from the given Dataset
- Throws: `DatasetInstantiationException` - if the Dataset doesn't exist

`<K,V> org.apache.spark.api.java.JavaPairRDD<K,V> fromDataset(String datasetName, Map<String,String> arguments, @Nullable Iterable<? extends Split> splits)`

Creates a `JavaPairRDD` from the given Dataset with the given set of Dataset arguments and a custom list of `Split`s. Each `Split` will create a `Partition` in the `JavaPairRDD`.

- Type Parameters: `K` - key type; `V` - value type
- Parameters: `datasetName` - name of the Dataset; `arguments` - arguments for the Dataset; `splits` - list of `Split`, or `null` to use the default splits provided by the Dataset
- Returns: a `JavaPairRDD` instance that reads from the given Dataset
- Throws: `DatasetInstantiationException` - if the Dataset doesn't exist

`org.apache.spark.api.java.JavaRDD<StreamEvent> fromStream(String streamName)`

Creates a `JavaRDD` that represents all events from the given stream.

- Parameters: `streamName` - name of the stream
- Returns: a `JavaRDD` instance that reads from the given stream
- Throws: `DatasetInstantiationException` - if the Stream doesn't exist

`org.apache.spark.api.java.JavaRDD<StreamEvent> fromStream(String streamName, long startTime, long endTime)`

Creates a `JavaRDD` that represents events from the given stream in the given time range.

- Parameters: `streamName` - name of the stream; `startTime` - the starting time of the stream to be read in milliseconds (inclusive); `endTime` - the ending time of the stream to be read in milliseconds (exclusive)
- Returns: a `JavaRDD` instance that reads from the given stream
- Throws: `DatasetInstantiationException` - if the Stream doesn't exist

`<V> org.apache.spark.api.java.JavaPairRDD<Long,V> fromStream(String streamName, Class<V> valueType)`

Creates a `JavaPairRDD` that represents all events from the given stream. The key in the resulting `JavaPairRDD` is the event timestamp. The stream body will be decoded as the given value type. Currently it supports `Text`, `String`, and `ByteWritable`.

- Type Parameters: `V` - value type
- Parameters: `streamName` - name of the stream; `valueType` - type of the stream body to decode to
- Returns: a `JavaPairRDD` instance that reads from the given stream
- Throws: `DatasetInstantiationException` - if the Stream doesn't exist

`<V> org.apache.spark.api.java.JavaPairRDD<Long,V> fromStream(String streamName, long startTime, long endTime, Class<V> valueType)`

Creates a `JavaPairRDD` that represents events from the given stream in the given time range. The key in the resulting `JavaPairRDD` is the event timestamp. The stream body will be decoded as the given value type. Currently it supports `Text`, `String`, and `ByteWritable`.

- Type Parameters: `V` - value type
- Parameters: `streamName` - name of the stream; `startTime` - the starting time of the stream to be read in milliseconds (inclusive); `endTime` - the ending time of the stream to be read in milliseconds (exclusive); `valueType` - type of the stream body to decode to
- Returns: a `JavaPairRDD` instance that reads from the given stream
- Throws: `DatasetInstantiationException` - if the Stream doesn't exist

`<K,V> org.apache.spark.api.java.JavaPairRDD<K,V> fromStream(String streamName, long startTime, long endTime, Class<? extends co.cask.cdap.api.stream.StreamEventDecoder<K,V>> decoderClass, Class<K> keyType, Class<V> valueType)`

Creates a `JavaPairRDD` that represents events from the given stream in the given time range. Each stream event will be decoded by an instance of the given `StreamEventDecoder` class.

- Type Parameters: `K` - key type; `V` - value type
- Parameters: `streamName` - name of the stream; `startTime` - the starting time of the stream to be read in milliseconds (inclusive); `endTime` - the ending time of the stream to be read in milliseconds (exclusive); `decoderClass` - the `StreamEventDecoder` for decoding `StreamEvent`; `keyType` - the type of the decoded key; `valueType` - the type of the decoded value
- Returns: a `JavaPairRDD` instance that reads from the given stream
- Throws: `DatasetInstantiationException` - if the Stream doesn't exist

`<K,V> void saveAsDataset(org.apache.spark.api.java.JavaPairRDD<K,V> rdd, String datasetName)`

Saves the given `JavaPairRDD` to the given Dataset.

- Parameters: `rdd` - the `JavaPairRDD` to be saved; `datasetName` - name of the Dataset
- Throws: `DatasetInstantiationException` - if the Dataset doesn't exist

`<K,V> void saveAsDataset(org.apache.spark.api.java.JavaPairRDD<K,V> rdd, String datasetName, Map<String,String> arguments)`

Saves the given `JavaPairRDD` to the given Dataset with the given set of Dataset arguments.

- Parameters: `rdd` - the `JavaPairRDD` to be saved; `datasetName` - name of the Dataset; `arguments` - arguments for the Dataset
- Throws: `DatasetInstantiationException` - if the Dataset doesn't exist

`org.apache.spark.api.java.JavaSparkContext getSparkContext()`

Returns the `JavaSparkContext` used during the execution.

`PluginContext getPluginContext()`

Returns a `Serializable` `PluginContext` which can be used to request plugin instances. The instance returned can also be used in the Spark program's closures.

- Returns: a `Serializable` `PluginContext`

`SparkInterpreter createSparkInterpreter() throws IOException`

Creates a new instance of `SparkInterpreter` for Scala code compilation and interpretation.

- Returns: a new `SparkInterpreter`
- Throws: `IOException` - if it failed to create a local directory for storing the compiled class files

Copyright © 2017 Cask Data, Inc. Licensed under the Apache License, Version 2.0.