Topic 1 Question 31
You are a data analyst at your organization. You have been given a BigQuery dataset that includes customer information. The dataset contains inconsistencies and errors, such as missing values, duplicates, and formatting issues. You need to effectively and quickly clean the data. What should you do?
A. Develop a Dataflow pipeline to read the data from BigQuery, perform data quality rules and transformations, and write the cleaned data back to BigQuery.
B. Use Cloud Data Fusion to create a data pipeline to read the data from BigQuery, perform data quality transformations, and write the clean data back to BigQuery.
C. Export the data from BigQuery to CSV files. Resolve the errors using a spreadsheet editor, and re-import the cleaned data into BigQuery.
D. Use BigQuery's built-in functions to perform data quality transformations.
Comments (2)
- Selected answer: D
The best solution for effective and quick data cleaning is D. Use BigQuery's built-in functions. This is the most efficient and quickest approach as it leverages the power of BigQuery SQL for data transformations directly within the BigQuery environment. Option B (Cloud Data Fusion) is a good visual alternative but slower to set up than direct SQL. Option A (Dataflow) is powerful but more complex and time-consuming for initial cleaning. Option C (Spreadsheet Editor) is manual, inefficient, and not scalable for millions of records. Therefore, Option D offers the optimal balance of effectiveness and speed for cleaning data within BigQuery.
👍 1 n2183712847 2025/02/27
- Selected answer: D
It's already in BigQuery, so just perform the transformations in place with SQL.
👍 1 n2183712847 2025/03/08
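
To make option D concrete, here is a minimal sketch of what cleaning with BigQuery's built-in functions might look like. It is written as a small Python helper that only builds the SQL text. The table names and every column (`customer_id`, `full_name`, `email`, `phone`, `country`, `signup_date`, `updated_at`) are hypothetical, since the question does not describe the schema, and the default value `'UNKNOWN'` is likewise an assumption.

```python
def build_cleaning_query(source_table: str, target_table: str) -> str:
    """Return a BigQuery Standard SQL statement that cleans a customer table.

    All table and column names are hypothetical placeholders; adapt them
    to the real schema before running the statement.
    """
    return f"""
CREATE OR REPLACE TABLE `{target_table}` AS
SELECT
  customer_id,
  -- Formatting issues: strip whitespace and normalise case.
  INITCAP(TRIM(full_name)) AS full_name,
  LOWER(TRIM(email)) AS email,
  -- Formatting issues: keep only the digits of a phone number.
  REGEXP_REPLACE(phone, r'[^0-9]', '') AS phone,
  -- Missing values: substitute a default for NULL countries.
  IFNULL(country, 'UNKNOWN') AS country,
  -- Unparseable dates become NULL instead of failing the query.
  SAFE_CAST(signup_date AS DATE) AS signup_date
FROM `{source_table}`
WHERE customer_id IS NOT NULL
-- Duplicates: keep the most recently updated row per customer.
QUALIFY ROW_NUMBER() OVER (
  PARTITION BY customer_id ORDER BY updated_at DESC
) = 1
""".strip()


if __name__ == "__main__":
    # Hypothetical fully qualified table names.
    print(build_cleaning_query(
        "my_project.crm.customers_raw",
        "my_project.crm.customers_clean",
    ))
```

The generated statement can be pasted into the BigQuery console, or submitted from Python with the `google-cloud-bigquery` client (`bigquery.Client().query(sql).result()`), assuming credentials are configured. Writing to a separate target table rather than overwriting the source is a design choice of this sketch, so the raw data stays available for comparison.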