Topic 1 Question 30
You are predicting customer churn for a subscription-based service. You have a 50 PB historical customer dataset in BigQuery that includes demographics, subscription information, and engagement metrics. You want to build a churn prediction model with minimal overhead. You want to follow the Google-recommended approach. What should you do?
A. Export the data from BigQuery to a local machine. Use scikit-learn in a Jupyter notebook to build the churn prediction model.
B. Use Dataproc to create a Spark cluster. Use Spark MLlib within the cluster to build the churn prediction model.
C. Create a Looker dashboard that is connected to BigQuery. Use LookML to predict churn.
D. Use the BigQuery Python client library in a Jupyter notebook to query and preprocess the data in BigQuery. Use the CREATE MODEL statement in BigQuery ML to train the churn prediction model.
Community vote
Comments (1)
- Selected answer: D
Option D is the Google-recommended solution for building a churn model on a 50 PB BigQuery dataset with minimal overhead: use the BigQuery Python client library together with BigQuery ML. BigQuery ML trains the model in-database with a CREATE MODEL statement, eliminating data movement and minimizing operational overhead, which aligns with Google's best practices for data that already lives in BigQuery. Option A (local scikit-learn) is impractical because a 50 PB dataset cannot be exported to or processed on a single machine. Option B (Dataproc/Spark) introduces unnecessary data movement and cluster-management overhead. Option C (Looker) is a BI tool; LookML defines data models for dashboards, not machine learning models. Therefore, Option D is the optimal choice for efficiency, scalability, and adherence to Google's recommendations for BigQuery-based machine learning.
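To make the Option D workflow concrete, here is a minimal sketch using the google-cloud-bigquery client to train and evaluate a churn model with BigQuery ML. The project ID, dataset, table, and column names (my-project, customers.history, is_churned, and the feature columns) are illustrative assumptions, not given in the question.

```python
# Minimal sketch of in-database churn-model training with BigQuery ML.
# All identifiers below (project, dataset, table, columns) are hypothetical.
from google.cloud import bigquery

client = bigquery.Client(project="my-project")  # assumed project ID

# CREATE MODEL trains a logistic-regression classifier entirely inside
# BigQuery, so none of the 50 PB of data leaves the warehouse.
create_model_sql = """
CREATE OR REPLACE MODEL `my-project.customers.churn_model`
OPTIONS (
  model_type = 'LOGISTIC_REG',
  input_label_cols = ['is_churned']
) AS
SELECT
  tenure_months,              -- subscription information (assumed column)
  monthly_engagement_score,   -- engagement metric (assumed column)
  age_bracket,                -- demographic feature (assumed column)
  is_churned                  -- label column (assumed)
FROM `my-project.customers.history`
"""
client.query(create_model_sql).result()  # blocks until training finishes

# ML.EVALUATE returns standard classification metrics for the trained model.
eval_sql = """
SELECT * FROM ML.EVALUATE(MODEL `my-project.customers.churn_model`)
"""
for row in client.query(eval_sql).result():
    print(dict(row.items()))
```

Once trained, ML.PREDICT can be called the same way to score current subscribers, again without moving data out of BigQuery.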
👍 1 · n2183712847 · 2025/02/27