Topic 6 Question 4
For this question, refer to the TerramEarth case study. A new architecture that writes all incoming data to BigQuery has been introduced. You notice that the data is dirty, and want to ensure data quality on an automated daily basis while managing cost. What should you do?
A. Set up a streaming Cloud Dataflow job, receiving data from the ingestion process. Clean the data in a Cloud Dataflow pipeline.
B. Create a Cloud Function that reads data from BigQuery and cleans it. Trigger the Cloud Function from a Compute Engine instance.
C. Create a SQL statement on the data in BigQuery, and save it as a view. Run the view daily, and save the result to a new table.
D. Use Cloud Dataprep and configure the BigQuery tables as the source. Schedule a daily job to clean the data.
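To make concrete what "cleaning dirty data" means in this context, here is a minimal pure-Python sketch of the kind of row-level rules a scheduled daily cleaning job might apply (the field names, rules, and sample rows are illustrative assumptions, not part of the case study):

```python
# Sketch of row-level cleaning rules a daily job might apply.
# Field names ("vehicle_id", "engine_hours") are hypothetical examples.
def clean_rows(rows):
    seen = set()
    cleaned = []
    for row in rows:
        vehicle_id = (row.get("vehicle_id") or "").strip()
        if not vehicle_id:        # drop rows missing a key field
            continue
        if vehicle_id in seen:    # drop duplicate readings
            continue
        seen.add(vehicle_id)
        try:
            hours = float(row.get("engine_hours", ""))
        except ValueError:
            continue              # drop non-numeric readings
        if hours < 0:
            continue              # drop impossible values
        cleaned.append({"vehicle_id": vehicle_id, "engine_hours": hours})
    return cleaned

dirty = [
    {"vehicle_id": " T100 ", "engine_hours": "12.5"},
    {"vehicle_id": "", "engine_hours": "3"},       # missing key
    {"vehicle_id": "T101", "engine_hours": "abc"}, # non-numeric
]
print(clean_rows(dirty))  # → [{'vehicle_id': 'T100', 'engine_hours': 12.5}]
```

In option D, Cloud Dataprep expresses rules like these visually and runs them on a schedule without code; in option C, the equivalent logic would live in the view's SQL.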
Comments (2)
D is the answer. The question asks how to clean the data, and Dataprep is the only tool among the options built for cleaning data; the tools in the other answers are wrong for this purpose.
Dataprep by Trifacta is an intelligent data service for visually exploring, cleaning, and preparing structured and unstructured data for analysis, reporting, and machine learning. Because Dataprep is serverless and works at any scale, there is no infrastructure to deploy or manage. Your next ideal data transformation is suggested and predicted with each UI input, so you don’t have to write code.
👍 4 · areza · 2021/06/11

Answer D:
Key requirements: "automated daily basis while managing cost".
👍 2 · Yogikant · 2021/06/04