Examtopics

Professional Data Engineer

Topic 1 Question 137
You have a data pipeline with a Dataflow job that aggregates and writes time series metrics to Bigtable. You notice that data is slow to update in Bigtable. This data feeds a dashboard used by thousands of users across the organization. You need to support additional concurrent users and reduce the amount of time required to write the data. Which two actions should you take?

Select two
- A. Configure your Dataflow pipeline to use local execution
- B. Increase the maximum number of Dataflow workers by setting maxNumWorkers in PipelineOptions
- C. Increase the number of nodes in the Bigtable cluster
- D. Modify your Dataflow pipeline to use the Flatten transform before writing to Bigtable
- E. Modify your Dataflow pipeline to use the CoGroupByKey transform before writing to Bigtable
Explanation
References:
- https://cloud.google.com/bigtable/docs/performance#performance-write-throughput
- https://cloud.google.com/dataflow/docs/guides/specifying-exec-params#setting-other-cloud-pipeline-options
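
The two scaling levers discussed here, raising the Dataflow worker ceiling and adding Bigtable nodes, can be sketched in Python as follows. This is a minimal illustration, not a definitive setup: the helper names are hypothetical, the flags are the Beam Python SDK spellings (the Java SDK spells the worker option `maxNumWorkers`), and the per-node write rate is a caller-supplied assumption that should be taken from the Bigtable performance page for the actual storage type.

```python
import math


def dataflow_autoscaling_flags(max_num_workers):
    """Build Beam Python SDK flags that raise the Dataflow worker ceiling.

    Hypothetical helper: the returned list is meant to be passed to
    apache_beam's PipelineOptions, which is not imported here.
    """
    if max_num_workers < 1:
        raise ValueError("max_num_workers must be at least 1")
    return [
        "--runner=DataflowRunner",
        "--autoscaling_algorithm=THROUGHPUT_BASED",
        f"--max_num_workers={max_num_workers}",
    ]


def bigtable_nodes_needed(target_writes_per_sec, writes_per_node_per_sec):
    """Estimate the node count for a target write rate.

    writes_per_node_per_sec is an assumption supplied by the caller;
    real throughput depends on row size, schema, and storage type.
    """
    if writes_per_node_per_sec <= 0:
        raise ValueError("writes_per_node_per_sec must be positive")
    return max(1, math.ceil(target_writes_per_sec / writes_per_node_per_sec))


# Illustrative numbers only: 45,000 writes/s at an assumed 10,000 writes/s per node.
flags = dataflow_autoscaling_flags(20)
nodes = bigtable_nodes_needed(45_000, 10_000)
```

Both levers are needed together: extra Dataflow workers only help if the Bigtable cluster can absorb the additional write load, and extra nodes also add capacity for the dashboard's concurrent reads.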

User votes
Comments (8)
- Selected answer: BC
  It should be B and C
  
  👍 7
  arpitagrawal 2022/09/06
- Selected answer: BC
  BC is correct
  
  Why were the comments deleted?
  
  👍 6
  ducc 2022/09/02
- Selected answer: BC
  Increasing the maximum number of workers improves pipeline performance in Dataflow. Increasing the number of nodes in the Bigtable cluster increases write throughput.
  
  👍 2
  ovokpus 2022/11/22