Topic 1 Question 40
You are working on a data pipeline that will validate and clean incoming data before loading it into BigQuery for real-time analysis. You want to ensure that the data validation and cleaning are performed efficiently and can handle high volumes of data. What should you do?
A. Write custom scripts in Python to validate and clean the data outside of Google Cloud. Load the cleaned data into BigQuery.
B. Use Cloud Run functions to trigger data validation and cleaning routines when new data arrives in Cloud Storage.
C. Use Dataflow to create a streaming pipeline that includes validation and transformation steps.
D. Load the raw data into BigQuery using Cloud Storage as a staging area, and use SQL queries in BigQuery to validate and clean the data.
Comments (1)
- Voted answer: C
C. Dataflow is real-time
👍 1 · n2183712847 · 2025/02/27
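To illustrate why C fits the streaming requirement, here is a minimal, hypothetical sketch of an Apache Beam pipeline of the kind Dataflow runs: it reads from an assumed Pub/Sub topic, applies a validation-and-cleaning step, and streams the results into BigQuery. The project, topic, table, schema, and field names are placeholders for illustration only and are not part of the question.

```python
import json

import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions


def validate_and_clean(raw_message):
    """Parse a JSON message, drop invalid records, and normalize field values."""
    record = json.loads(raw_message.decode("utf-8"))
    # Validation: require an id and a numeric amount (assumed schema).
    if "id" not in record or not isinstance(record.get("amount"), (int, float)):
        return  # yield nothing, i.e. drop the invalid record
    # Cleaning: trim whitespace and normalize casing on a text field.
    record["category"] = record.get("category", "unknown").strip().lower()
    yield record


options = PipelineOptions(streaming=True)

with beam.Pipeline(options=options) as pipeline:
    (
        pipeline
        | "ReadFromPubSub" >> beam.io.ReadFromPubSub(
            topic="projects/my-project/topics/incoming-data")
        | "ValidateAndClean" >> beam.FlatMap(validate_and_clean)
        | "WriteToBigQuery" >> beam.io.WriteToBigQuery(
            "my-project:analytics.clean_events",
            schema="id:STRING,amount:FLOAT,category:STRING",
            write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
        )
    )
```

The same validation and transformation steps scale automatically with Dataflow workers, which is what distinguishes option C from per-file triggers (B) or after-the-fact SQL cleanup in BigQuery (D) for high-volume, real-time ingestion.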