public class GameStats extends LeaderBoard
This class follows the UserScore, HourlyTeamScore, and LeaderBoard pipelines in the same series of examples.
New concepts: session windows and finding session duration; use of both
singleton and non-singleton side inputs.
This pipeline builds on the LeaderBoard functionality and adds some "business
intelligence" analysis: abuse detection and usage patterns. The pipeline derives the mean user
score sum for a window and uses that information to identify likely spammers/robots. (The robots
have a higher click rate than the human users.) The 'robot' users are then filtered out when
calculating the team scores.
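The abuse-detection idea described above can be sketched in plain Java, without the Beam SDK. This is only an illustration under stated assumptions: the class `SpammyUserSketch`, its `Event` type, and the multiplier `SCORE_WEIGHT = 2.5` are hypothetical names and values chosen for the example, not the pipeline's actual code. It sums scores per user for one window, flags users whose sum is well above the mean, and leaves those users out of the team totals.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

/** Plain-Java sketch of a mean-based spam heuristic; not the Beam implementation. */
class SpammyUserSketch {
  // Hypothetical multiplier, chosen for illustration only.
  static final double SCORE_WEIGHT = 2.5;

  /** One scoring event inside a single window (hypothetical type). */
  static final class Event {
    final String user;
    final String team;
    final int score;

    Event(String user, String team, int score) {
      this.user = user;
      this.team = team;
      this.score = score;
    }
  }

  /** Sums the scores of each user across the window's events. */
  static Map<String, Integer> sumPerUser(List<Event> events) {
    Map<String, Integer> sums = new TreeMap<>();
    for (Event e : events) {
      sums.merge(e.user, e.score, Integer::sum);
    }
    return sums;
  }

  /** Returns the users whose score sum exceeds SCORE_WEIGHT times the mean score sum. */
  static Set<String> findSpammyUsers(Map<String, Integer> scoreSumPerUser) {
    Set<String> spammy = new TreeSet<>();
    double mean =
        scoreSumPerUser.values().stream().mapToInt(Integer::intValue).average().orElse(0.0);
    for (Map.Entry<String, Integer> entry : scoreSumPerUser.entrySet()) {
      if (entry.getValue() > mean * SCORE_WEIGHT) {
        spammy.add(entry.getKey());
      }
    }
    return spammy;
  }

  /** Sums team scores while skipping events from flagged users. */
  static Map<String, Integer> teamScoresWithoutSpammers(List<Event> events, Set<String> spammy) {
    Map<String, Integer> teams = new TreeMap<>();
    for (Event e : events) {
      if (!spammy.contains(e.user)) {
        teams.merge(e.team, e.score, Integer::sum);
      }
    }
    return teams;
  }
}
```

For example, with per-user sums of 10, 12, 8, 11, and 200, the mean is 48.2 and the threshold is 120.5, so only the user with 200 is flagged. In a real pipeline the set of flagged users would be supplied to the team-score step as a side input rather than as a method argument.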
Additionally, user sessions are tracked: that is, we find bursts of user activity using session windows. The mean session duration is then recorded in the context of subsequent fixed windowing. (This could be used to tell us which games are giving us greater user retention.)
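The session-duration idea described above can be illustrated with a small plain-Java sketch that does not use the Beam SDK. It assumes a simplified model of session windowing: each event opens an interval from its timestamp to its timestamp plus a gap, and an event that arrives before the current session's end extends that session. The class `SessionSketch`, its method names, and the gap value in the usage note are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Plain-Java sketch of gap-based sessions for one user; not the Beam implementation. */
class SessionSketch {

  /**
   * Groups one user's event timestamps into sessions and returns each session's
   * duration in milliseconds, measured from the first event to the last event plus the gap.
   */
  static List<Long> sessionDurations(long[] timestampsMillis, long gapMillis) {
    long[] sorted = timestampsMillis.clone();
    Arrays.sort(sorted);
    List<Long> durations = new ArrayList<>();
    if (sorted.length == 0) {
      return durations;
    }
    long start = sorted[0];
    long end = sorted[0] + gapMillis; // each event covers [t, t + gap)
    for (int i = 1; i < sorted.length; i++) {
      if (sorted[i] < end) {
        // The event falls inside the current session, so the session is extended.
        end = sorted[i] + gapMillis;
      } else {
        // The gap elapsed with no activity: close the session and start a new one.
        durations.add(end - start);
        start = sorted[i];
        end = sorted[i] + gapMillis;
      }
    }
    durations.add(end - start);
    return durations;
  }

  /** Mean of the session durations, or 0.0 when there are no sessions. */
  static double meanDuration(List<Long> durationsMillis) {
    return durationsMillis.stream().mapToLong(Long::longValue).average().orElse(0.0);
  }
}
```

With a 5-minute gap and events at minutes 0, 2, 4, 20, and 21, this sketch yields two sessions lasting 9 and 6 minutes, for a mean of 7.5 minutes. In the pipeline, such per-session durations would be re-windowed into fixed windows before the mean is taken.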
Run org.apache.beam.examples.complete.game.injector.Injector to generate
pubsub data for this pipeline. The Injector documentation provides more detail.
To execute this pipeline using the Dataflow service, specify the pipeline configuration like this:
--project=YOUR_PROJECT_ID
--tempLocation=gs://YOUR_TEMP_DIRECTORY
--runner=BlockingDataflowRunner
--dataset=YOUR-DATASET
--topic=projects/YOUR-PROJECT/topics/YOUR-TOPIC
where the BigQuery dataset you specify must already exist. The PubSub topic you specify should
be the same topic to which the Injector is publishing.

| Modifier and Type | Class and Description |
|---|---|
| static class | GameStats.CalculateSpammyUsers Filter out all but those users with a high clickrate, which we will consider as 'spammy' users. |
Nested classes inherited from UserScore: UserScore.ExtractAndSumScore

| Constructor and Description |
|---|
| GameStats() |
| Modifier and Type | Method and Description |
|---|---|
| protected static Map<String,WriteToBigQuery.FieldInfo<Double>> | configureSessionWindowWrite() Create a map of information that describes how to write pipeline output to BigQuery. |
| protected static Map<String,WriteToBigQuery.FieldInfo<KV<String,Integer>>> | configureWindowedWrite() Create a map of information that describes how to write pipeline output to BigQuery. |
| static void | main(String[] args) |
Inherited methods: configureGlobalWindowBigQueryWrite, configureWindowedTableWrite, configureBigQueryWrite

protected static Map<String,WriteToBigQuery.FieldInfo<KV<String,Integer>>> configureWindowedWrite()
protected static Map<String,WriteToBigQuery.FieldInfo<Double>> configureSessionWindowWrite()
Copyright © 2016 The Apache Software Foundation. All rights reserved.