Examtopics

Professional Data Engineer

Topic 1 Question 296
Your infrastructure team has set up an interconnect link between Google Cloud and the on-premises network. You are designing a high-throughput streaming pipeline to ingest data from an Apache Kafka cluster hosted on-premises. You want to store the data in BigQuery with as little latency as possible. What should you do?
- A. Set up a Kafka Connect bridge between Kafka and Pub/Sub. Use a Google-provided Dataflow template to read the data from Pub/Sub, and write the data to BigQuery.
- B. Use a proxy host in the VPC in Google Cloud connecting to Kafka. Write a Dataflow pipeline, read data from the proxy host, and write the data to BigQuery.
- C. Use Dataflow, write a pipeline that reads the data from Kafka, and writes the data to BigQuery.
- D. Set up a Kafka Connect bridge between Kafka and Pub/Sub. Write a Dataflow pipeline, read the data from Pub/Sub, and write the data to BigQuery.
User votes
Comments (2)
- Selected answer: C
  Dataflow has templates to read from Kafka. The other options are too complicated. See https://cloud.google.com/dataflow/docs/kafka-dataflow
  
  👍 1
  rahulvin, 2023/12/30
- Selected answer: C
  C. Use Dataflow, write a pipeline that reads the data from Kafka, and writes the data to BigQuery.
  
  👍 1
  scaenruy, 2024/01/04
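
To make option C concrete, here is a minimal sketch of a streaming Apache Beam pipeline (the SDK that Dataflow runs) that reads directly from Kafka and writes to BigQuery. It is an illustration under stated assumptions, not the only way to build this: the broker address, topic name, table name, and the two-column schema are hypothetical, the Kafka message values are assumed to be UTF-8 JSON, and the Dataflow workers are assumed to reach the on-premises brokers over the interconnect. In the Python SDK, `ReadFromKafka` is a cross-language transform, so a Java runtime is also needed when the pipeline is launched.

```python
import json


def kafka_record_to_row(record):
    """Convert a Kafka (key, value) pair of bytes into a BigQuery row dict.

    Assumes the value is UTF-8 encoded JSON (an assumption for this sketch).
    """
    key, value = record
    payload = json.loads(value.decode("utf-8"))
    return {
        "event_key": key.decode("utf-8") if key is not None else None,
        # Store the payload as a normalized JSON string (hypothetical schema).
        "payload": json.dumps(payload, sort_keys=True),
    }


def run(bootstrap_servers, topic, table, pipeline_args=None):
    """Build and run the streaming pipeline: Kafka -> Dataflow -> BigQuery."""
    # Imported here so the helper above stays usable without Beam installed.
    import apache_beam as beam
    from apache_beam.io.kafka import ReadFromKafka
    from apache_beam.options.pipeline_options import PipelineOptions

    options = PipelineOptions(pipeline_args, streaming=True)
    with beam.Pipeline(options=options) as p:
        (
            p
            # With the default deserializers, elements are (key, value) byte pairs.
            | "ReadFromKafka" >> ReadFromKafka(
                consumer_config={"bootstrap.servers": bootstrap_servers},
                topics=[topic],
            )
            | "ToBigQueryRow" >> beam.Map(kafka_record_to_row)
            | "WriteToBigQuery" >> beam.io.WriteToBigQuery(
                table,
                schema="event_key:STRING,payload:STRING",
                method=beam.io.WriteToBigQuery.Method.STREAMING_INSERTS,
                write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
                create_disposition=beam.io.BigQueryDisposition.CREATE_IF_NEEDED,
            )
        )


# Hypothetical invocation (all names below are placeholders):
# run("10.0.0.5:9092", "events", "my-project:my_dataset.events",
#     ["--runner=DataflowRunner", "--project=my-project", "--region=us-central1"])
```

Reading from Kafka directly keeps the data path to a single hop, which is the reasoning behind the commenters' choice; the Pub/Sub bridge in options A and D adds an extra service between Kafka and BigQuery.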