public class TopWikipediaSessions extends Object
Concepts: Using Windowing to perform time-based aggregations of data.
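As a rough illustration of time-based aggregation, the sketch below groups one user's edit timestamps into sessions that close after a period of inactivity, then counts the edits in each session. It is plain Java and does not use the Beam API. The class name `SessionWindowSketch`, the method `sessionSizes`, and the one-hour gap are assumptions made for this example; they are not taken from the pipeline's source.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical helper for illustration only; not part of the Beam examples.
public class SessionWindowSketch {

    /**
     * Returns the number of events in each session, in time order.
     * A new session starts when an event arrives at least gapMillis
     * after the previous event.
     */
    public static List<Integer> sessionSizes(List<Long> timestampsMillis, long gapMillis) {
        // Sort a copy so the caller's list is left untouched.
        List<Long> sorted = new ArrayList<>(timestampsMillis);
        Collections.sort(sorted);

        List<Integer> sizes = new ArrayList<>();
        int current = 0;     // events seen in the session being built
        long previous = 0L;  // timestamp of the previous event

        for (long ts : sorted) {
            if (current > 0 && ts - previous >= gapMillis) {
                // Inactivity gap reached: close the session and start a new one.
                sizes.add(current);
                current = 0;
            }
            current++;
            previous = ts;
        }
        if (current > 0) {
            sizes.add(current); // close the final session
        }
        return sizes;
    }

    public static void main(String[] args) {
        long minute = 60_000L;
        long gap = 60 * minute; // assumed one-hour inactivity gap

        // Edits at minutes 0, 10, 50, 200 and 230 (given out of order on purpose).
        List<Long> edits = Arrays.asList(
                200 * minute, 0L, 10 * minute, 230 * minute, 50 * minute);

        System.out.println(sessionSizes(edits, gap)); // prints [3, 2]
    }
}
```

In a real pipeline the runner performs this grouping per key and across workers; the sketch only shows the gap rule applied to a single user's events held in memory.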
It is not recommended to execute this pipeline locally, given the size of the default input data.
To execute this pipeline using a selected runner and an output prefix on GCS, specify:
--runner=YOUR_SELECTED_RUNNER
--output=gs://YOUR_OUTPUT_PREFIX
See examples/java/README.md for instructions about how to configure different runners.
The default input is gs://apache-beam-samples/wikipedia_edits/*.json and can be
overridden with --input.
The input for this example is large enough that it's a good place to enable (experimental) autoscaling:
--autoscalingAlgorithm=BASIC
--maxNumWorkers=20
This will automatically scale the number of workers up over time until the job completes.

| Constructor and Description |
|---|
| TopWikipediaSessions() |
public static void main(String[] args)