public class LeaderBoard extends HourlyTeamScore
This pipeline follows UserScore and HourlyTeamScore in the series. Concepts include: processing unbounded
data using fixed windows; use of custom timestamps and event-time processing; generation of
early/speculative results; using .accumulatingFiredPanes() to do cumulative processing of
late-arriving data.
This pipeline processes an unbounded stream of 'game events'. The calculation of the team scores uses fixed windowing based on event time (the time of the game play event), not processing time (the time that an event is processed by the pipeline). The pipeline calculates the sum of scores per team, for each window. By default, the team scores are calculated using one-hour windows.
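The event-time windowing described above can be illustrated without any pipeline framework. The following is a minimal plain-Java sketch, not the pipeline's actual code: the `GameEvent` class, its field names, and the helper methods are hypothetical, and timestamps are assumed to be epoch milliseconds. It assigns each event to a one-hour window based on the event's own timestamp and sums scores per team within each window.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class FixedWindowSketch {
    // One-hour windows, matching the default described in the text.
    static final long WINDOW_MILLIS = 60L * 60L * 1000L;

    /** Hypothetical game event: a team, a score, and the time the play happened. */
    static final class GameEvent {
        final String team;
        final int score;
        final long eventTimeMillis;

        GameEvent(String team, int score, long eventTimeMillis) {
            this.team = team;
            this.score = score;
            this.eventTimeMillis = eventTimeMillis;
        }
    }

    /** Start of the fixed window that contains the given event time. */
    static long windowStart(long eventTimeMillis) {
        return Math.floorDiv(eventTimeMillis, WINDOW_MILLIS) * WINDOW_MILLIS;
    }

    /** Sums scores per team within each window, keyed by window start time. */
    static Map<Long, Map<String, Integer>> sumPerTeamPerWindow(List<GameEvent> events) {
        Map<Long, Map<String, Integer>> result = new TreeMap<>();
        for (GameEvent e : events) {
            result.computeIfAbsent(windowStart(e.eventTimeMillis), k -> new TreeMap<>())
                  .merge(e.team, e.score, Integer::sum);
        }
        return result;
    }

    public static void main(String[] args) {
        // Arrival order does not matter: only the event timestamp picks the window.
        List<GameEvent> events = Arrays.asList(
            new GameEvent("red", 2, 3_600_000L),
            new GameEvent("red", 5, 10L),
            new GameEvent("blue", 4, 1_000L),
            new GameEvent("red", 7, 3_599_999L));
        System.out.println(sumPerTeamPerWindow(events));
    }
}
```

Because the window is chosen from the event timestamp alone, an event that is processed late still lands in the window in which the play occurred.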
In contrast, to demonstrate another windowing option, the user scores are calculated using a global window, which periodically (every ten minutes) emits cumulative user score sums.
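As a rough illustration of the global-window behaviour, the plain-Java sketch below keeps one running total per user and emits a cumulative snapshot once ten minutes have passed since the first element that followed the previous emission. The class name, the `add` method, and the caller-supplied clock value are assumptions made for this sketch. It also simplifies by only checking the timer when an element arrives, which a real runner would not need to do.

```java
import java.util.Map;
import java.util.TreeMap;

class GlobalWindowSketch {
    // Emit interval of ten minutes, as described in the text.
    static final long EMIT_INTERVAL_MILLIS = 10L * 60L * 1000L;

    // Totals are never reset: the single global window covers all time.
    private final Map<String, Integer> totals = new TreeMap<>();

    // Clock value of the first element seen since the last emission, or -1 if none.
    private long paneStartMillis = -1L;

    /**
     * Adds a score for a user. Returns a copy of the cumulative totals when the
     * emit interval has elapsed, or null when nothing is emitted for this element.
     */
    Map<String, Integer> add(String user, int score, long nowMillis) {
        totals.merge(user, score, Integer::sum);
        if (paneStartMillis < 0) {
            paneStartMillis = nowMillis;
        }
        if (nowMillis - paneStartMillis >= EMIT_INTERVAL_MILLIS) {
            paneStartMillis = -1L;
            return new TreeMap<>(totals);
        }
        return null;
    }

    public static void main(String[] args) {
        GlobalWindowSketch sketch = new GlobalWindowSketch();
        sketch.add("amy", 3, 0L);
        sketch.add("bob", 4, 300_000L);
        // Ten minutes after the first element: a cumulative snapshot is emitted.
        System.out.println(sketch.add("amy", 5, 600_000L));
    }
}
```

Each snapshot contains the full totals so far rather than the change since the last emission, which is what makes the sums cumulative.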
Unlike the previous pipelines in the series, which used static, finite input data, this pipeline reads from an unbounded data source, which lets us provide speculative results and handle late data at much lower latency. We can use the early/speculative results to keep a 'leaderboard' updated in near-realtime. Our handling of late data lets us generate correct results, e.g. for 'team prizes'. We're now outputting window results as they're calculated, giving us much lower latency than with the previous batch examples.
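To make the relationship between speculative, on-time, and late results concrete, here is a small plain-Java sketch of a single window whose result is emitted several times in accumulating fashion. It is only an illustration under stated assumptions: the class, the `fire` method, and the `EARLY`, `ON_TIME`, and `LATE` labels are hypothetical names chosen for this sketch, and the caller decides when each firing happens.

```java
import java.util.ArrayList;
import java.util.List;

class AccumulatingPanesSketch {
    // Running sum for one team in one window; it is kept across firings.
    private int runningTotal = 0;

    // Record of every result emitted for this window, in order.
    private final List<String> emitted = new ArrayList<>();

    /** Adds a score to the window, whether it arrived early, on time, or late. */
    void add(int score) {
        runningTotal += score;
    }

    /** Emits the cumulative total so far, labelled with the (hypothetical) timing. */
    int fire(String timing) {
        emitted.add(timing + "=" + runningTotal);
        return runningTotal;
    }

    List<String> emitted() {
        return emitted;
    }

    public static void main(String[] args) {
        AccumulatingPanesSketch window = new AccumulatingPanesSketch();
        window.add(5);
        window.add(3);
        window.fire("EARLY");   // speculative result for the leaderboard
        window.add(4);
        window.fire("ON_TIME"); // result when the window is considered complete
        window.add(2);          // late-arriving score
        window.fire("LATE");    // corrected total that includes the late score
        System.out.println(window.emitted());
    }
}
```

Since every firing carries the full total, a consumer can overwrite the previous value for the window rather than adding increments together.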
Run Injector to generate pubsub data for this pipeline. The Injector
documentation provides more detail on how to do this.
To execute this pipeline using the Dataflow service, specify the pipeline configuration like this:
--project=YOUR_PROJECT_ID
--tempLocation=gs://YOUR_TEMP_DIRECTORY
--runner=BlockingDataflowRunner
--dataset=YOUR-DATASET
--topic=projects/YOUR-PROJECT/topics/YOUR-TOPIC
where the BigQuery dataset you specify must already exist.
The PubSub topic you specify should be the same topic to which the Injector is publishing.

Nested classes inherited from UserScore: UserScore.ExtractAndSumScore

| Constructor and Description |
|---|
| LeaderBoard() |

| Modifier and Type | Method and Description |
|---|---|
| protected static Map<String,WriteToBigQuery.FieldInfo<KV<String,Integer>>> | configureGlobalWindowBigQueryWrite() Create a map of information that describes how to write pipeline output to BigQuery. |
| protected static Map<String,WriteToBigQuery.FieldInfo<KV<String,Integer>>> | configureWindowedTableWrite() Create a map of information that describes how to write pipeline output to BigQuery. |
| static void | main(String[] args) |

Inherited methods: configureBigQueryWrite

protected static Map<String,WriteToBigQuery.FieldInfo<KV<String,Integer>>> configureWindowedTableWrite()

protected static Map<String,WriteToBigQuery.FieldInfo<KV<String,Integer>>> configureGlobalWindowBigQueryWrite()

Copyright © 2016 The Apache Software Foundation. All rights reserved.