Topic 1 Question 41
Your organization needs to implement near real-time analytics for thousands of events arriving each second in Pub/Sub. The incoming messages require transformations. You need to configure a pipeline that processes, transforms, and loads the data into BigQuery while minimizing development time. What should you do?
A. Use a Google-provided Dataflow template to process the Pub/Sub messages, perform transformations, and write the results to BigQuery.
B. Create a Cloud Data Fusion instance and configure Pub/Sub as a source. Use Data Fusion to process the Pub/Sub messages, perform transformations, and write the results to BigQuery.
C. Load the data from Pub/Sub into Cloud Storage using a Cloud Storage subscription. Create a Dataproc cluster, use PySpark to perform transformations in Cloud Storage, and write the results to BigQuery.
D. Use Cloud Run functions to process the Pub/Sub messages, perform transformations, and write the results to BigQuery.
Community votes
Comments (2)
- Suggested answer: A
The best way to minimize development time while implementing near real-time analytics for high-volume Pub/Sub events with transformations and BigQuery loading is A: use a Google-provided Dataflow template. Dataflow templates offer pre-built, optimized streaming pipelines, drastically reducing development effort and time. Option B, Cloud Data Fusion, is a reasonable visual alternative, but provisioning an instance and building a pipeline requires more initial setup than launching a template. Option C, Dataproc with PySpark, adds cluster management and batch staging through Cloud Storage, making it significantly more complex and no longer near real-time. Option D, Cloud Run functions, while serverless, becomes harder to manage and requires more development effort for complex, high-volume streaming pipelines than a dedicated pipeline service. Option A is therefore the most efficient and fastest path to implementation.
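As a concrete sketch of option A: Google provides a classic "Pub/Sub to BigQuery" Dataflow template that accepts an optional JavaScript UDF for per-message transformation, so the only code you write is the transform function. The project, bucket, topic, and table names below are placeholders.

```shell
# Illustrative deployment sketch (all resource names are placeholders).

# 1. Write a minimal JavaScript UDF that transforms each Pub/Sub message;
#    here it parses the JSON payload and adds a processing timestamp.
cat > transform.js <<'EOF'
function transform(messageJson) {
  var event = JSON.parse(messageJson);
  event.processed_at = new Date().toISOString();
  return JSON.stringify(event);
}
EOF
gsutil cp transform.js gs://my-bucket/udf/transform.js

# 2. Launch the Google-provided template; Dataflow manages autoscaling
#    and streams the transformed rows into BigQuery.
gcloud dataflow jobs run pubsub-to-bq-events \
  --region=us-central1 \
  --gcs-location=gs://dataflow-templates-us-central1/latest/PubSub_to_BigQuery \
  --parameters \
inputTopic=projects/my-project/topics/events,\
outputTableSpec=my-project:analytics.events,\
javascriptTextTransformGcsPath=gs://my-bucket/udf/transform.js,\
javascriptTextTransformFunctionName=transform
```

Because the pipeline logic lives in the template, the UDF is the entire development surface, which is what makes this the fastest option.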
👍 1 | n2183712847 | 2025/02/27 - Suggested answer: A
Dataflow is real-time.
👍 1 | n2183712847 | 2025/03/08