- All Implemented Interfaces:
- Serializable, org.apache.beam.sdk.transforms.display.HasDisplayData
- Enclosing class:
- TFIDF
public static class TFIDF.ComputeTfIdf
extends org.apache.beam.sdk.transforms.PTransform<org.apache.beam.sdk.values.PCollection<org.apache.beam.sdk.values.KV<URI,String>>,org.apache.beam.sdk.values.PCollection<org.apache.beam.sdk.values.KV<String,org.apache.beam.sdk.values.KV<URI,Double>>>>
A transform containing a basic TF-IDF pipeline. The input consists of KV objects
where the key is the document's URI and the value is a piece
of the document's content. The output is a mapping from each term to
its TF-IDF score for each document URI.
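
The shape of this computation can be illustrated without Beam. The following is a minimal plain-Java sketch, not the transform's actual implementation: the class name TfIdfSketch, the method computeTfIdf, the tokenization on non-word characters, and the scoring formula (term frequency times the natural log of total documents over documents containing the term) are all assumptions made for illustration. It accepts (URI, piece of content) pairs, mirroring the input described above, and returns a map from each term to its per-URI score.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; not part of the Beam SDK or the TFIDF example.
final class TfIdfSketch {

    // Input: (document URI, piece of that document's content) pairs.
    // Output: term -> (document URI -> score).
    static Map<String, Map<URI, Double>> computeTfIdf(List<Map.Entry<URI, String>> pieces) {
        // Per-document word counts, accumulated across all pieces of a document.
        Map<URI, Map<String, Integer>> wordCounts = new HashMap<>();
        // Total number of words seen in each document.
        Map<URI, Integer> totalWords = new HashMap<>();

        for (Map.Entry<URI, String> piece : pieces) {
            Map<String, Integer> counts =
                wordCounts.computeIfAbsent(piece.getKey(), k -> new HashMap<>());
            // Assumed tokenization: lowercase, split on runs of non-word characters.
            for (String word : piece.getValue().toLowerCase(Locale.ROOT).split("\\W+")) {
                if (word.isEmpty()) {
                    continue;
                }
                counts.merge(word, 1, Integer::sum);
                totalWords.merge(piece.getKey(), 1, Integer::sum);
            }
        }

        // Number of distinct documents in which each term appears.
        Map<String, Integer> docFrequency = new HashMap<>();
        for (Map<String, Integer> counts : wordCounts.values()) {
            for (String word : counts.keySet()) {
                docFrequency.merge(word, 1, Integer::sum);
            }
        }

        int totalDocs = wordCounts.size();
        Map<String, Map<URI, Double>> scores = new TreeMap<>();
        for (Map.Entry<URI, Map<String, Integer>> doc : wordCounts.entrySet()) {
            URI uri = doc.getKey();
            for (Map.Entry<String, Integer> wc : doc.getValue().entrySet()) {
                // Assumed formula: tf = count / words in document, idf = ln(N / df).
                double tf = (double) wc.getValue() / totalWords.get(uri);
                double idf = Math.log((double) totalDocs / docFrequency.get(wc.getKey()));
                scores.computeIfAbsent(wc.getKey(), k -> new HashMap<>()).put(uri, tf * idf);
            }
        }
        return scores;
    }

    public static void main(String[] args) {
        // Example URIs are placeholders.
        URI a = URI.create("file:///docs/a.txt");
        URI b = URI.create("file:///docs/b.txt");
        Map<String, Map<URI, Double>> scores = computeTfIdf(List.of(
            Map.entry(a, "apple banana"),
            Map.entry(b, "apple cherry")));
        scores.forEach((term, perDoc) -> System.out.println(term + " -> " + perDoc));
    }
}
```

Under this assumed formula, a term that occurs in every document scores zero everywhere, while a term confined to one document scores highest there. In the real transform the same term-to-(URI, score) mapping is emitted as KV<String, KV<URI, Double>> elements of a PCollection rather than as a nested map.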
- See Also:
- Serialized Form