Topic 1 Question 181
You need to give new website users a globally unique identifier (GUID) using a service that takes in data points and returns a GUID. This data is sourced from both internal and external systems via HTTP calls that you will make through microservices within your pipeline. The pipeline will process tens of thousands of messages per second, it can be multi-threaded, and you are worried about backpressure on the system. How should you design your pipeline to minimize that backpressure?
A. Call out to the service via HTTP.
B. Create the pipeline statically in the class definition.
C. Create a new object in the startBundle method of DoFn.
D. Batch the job into ten-second increments.
User votes
Comments (17)
- Selected answer: D
D: I have insisted on this choice all along. Please read the article and find the keyword "massive backpressure": https://cloud.google.com/blog/products/data-analytics/guide-to-common-cloud-dataflow-use-case-patterns-part-1
"If the call takes on average 1 sec, that would cause massive backpressure on the pipeline. In these circumstances, you should consider batching these requests, instead."
👍 11 · John_Pongthorn · 2022/09/29 · Selected answer: D
D. Everyone, please read carefully the section on "Pattern: Calling external services for data enrichment": https://cloud.google.com/blog/products/data-analytics/guide-to-common-cloud-dataflow-use-case-patterns-part-1
A, B, and C are all solutions for the normal case, but if you need to withstand backpressure, the note at the end of that section applies: "Note: When using this pattern, be sure to plan for the load that's placed on the external service and any associated backpressure. For example, imagine a pipeline that's processing tens of thousands of messages per second in steady state. If you made a callout per element, you would need the system to deal with the same number of API calls per second. Also, if the call takes on average 1 sec, that would cause massive backpressure on the pipeline. In these circumstances, you should consider batching these requests, instead." Anyone is welcome to share ideas and debate this with me.
👍 5 · John_Pongthorn · 2022/09/26 · Selected answer: D
D is the answer.
https://cloud.google.com/blog/products/data-analytics/guide-to-common-cloud-dataflow-use-case-patterns-part-1 For example, imagine a pipeline that's processing tens of thousands of messages per second in steady state. If you made a callout per element, you would need the system to deal with the same number of API calls per second. Also, if the call takes on average 1 sec, that would cause massive backpressure on the pipeline. In these circumstances you should consider batching these requests, instead.
👍 3 · zellck · 2022/11/29
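The batching pattern quoted above can be sketched in plain Python (without the Beam SDK, to keep it self-contained). The idea matches Beam's DoFn lifecycle: buffer elements in `process`, then issue one bulk HTTP request per batch in `finish_bundle`, so a 1-second call is amortized over hundreds of elements instead of paid per element. `BatchingEnricher`, `fetch_guids_bulk`, and `stub_bulk_service` are hypothetical names for illustration, not part of any real API.

```python
# Minimal sketch of the "batch the callout" pattern: buffer incoming
# elements and make one bulk request per batch rather than one HTTP
# call per element, reducing backpressure from a slow external service.

class BatchingEnricher:
    def __init__(self, fetch_guids_bulk, max_batch_size=500):
        self._fetch = fetch_guids_bulk   # one call enriches many users at once
        self._max = max_batch_size
        self._buffer = []
        self.results = []

    def process(self, element):
        # Analogous to DoFn.process: buffer instead of calling out per element.
        self._buffer.append(element)
        if len(self._buffer) >= self._max:
            self.flush()

    def flush(self):
        # Analogous to DoFn.finish_bundle: one bulk call for the whole batch.
        if self._buffer:
            guids = self._fetch(self._buffer)
            self.results.extend(zip(self._buffer, guids))
            self._buffer = []

# Stand-in for the real GUID service: returns one id per user in the batch.
def stub_bulk_service(users):
    return [f"guid-{u}" for u in users]

enricher = BatchingEnricher(stub_bulk_service, max_batch_size=3)
for user in ["u1", "u2", "u3", "u4"]:
    enricher.process(user)
enricher.flush()
print(enricher.results)  # 4 users enriched with only 2 bulk calls
```

In a real Dataflow pipeline the same buffering would live inside a DoFn (or use `GroupIntoBatches`), and time-based flushing (e.g. the ten-second increments from option D) would bound latency for partially filled batches.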