Topic 1 Question 63
You work for a large retailer and have been asked to segment your customers by their purchasing habits. The purchase history of all customers has been uploaded to BigQuery. You suspect that there may be several distinct customer segments, however you are unsure of how many, and you don’t yet understand the commonalities in their behavior. You want to find the most efficient solution. What should you do?
A. Create a k-means clustering model using BigQuery ML. Allow BigQuery to automatically optimize the number of clusters.
B. Create a new dataset in Dataprep that references your BigQuery table. Use Dataprep to identify similarities within each column.
C. Use the Data Labeling Service to label each customer record in BigQuery. Train a model on your labeled data using AutoML Tables. Review the evaluation metrics to understand whether there is an underlying pattern in the data.
D. Get a list of the customer segments from your company’s Marketing team. Use the Data Labeling Service to label each customer record in BigQuery according to the list. Analyze the distribution of labels in your dataset using Data Studio.
Comments (11)
I will go for 'A': it is easy to build the model in BQML, where the data is already present, and with the k-means algorithm the number of clusters would be optimized automatically.
👍 4 Vedjha 2022/12/07 - Selected answer: A
ans: A, pretty sure.
C, D => discarded, very time consuming. B => yes, you can identify similarities within each column, but when I read "you don’t yet understand the commonalities in their behavior" I understand that this job would be difficult, because there could be many columns to analyze, and I don't think that this would be efficient.
A => BigQuery ML supports k-means clustering, it's easy and efficient to create, and it would automatically determine the number of clusters.
Also from the BigQuery ML docs: "K-means clustering for data segmentation; for example, identifying customer segments." (Source: https://cloud.google.com/bigquery-ml/docs/introduction#supported_models_in)
👍 4 wish0035 2022/12/15 - Selected answer: A
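As a rough illustration of option A, here is a minimal sketch that builds a BigQuery ML `CREATE MODEL` statement for k-means. The dataset, table, and column names are hypothetical and not taken from the question; `num_clusters` is deliberately left out of `OPTIONS` so that BigQuery ML chooses a default value.

```python
from typing import Sequence


def build_kmeans_model_sql(
    model_name: str, source_table: str, feature_columns: Sequence[str]
) -> str:
    """Return a BigQuery ML CREATE MODEL statement for a k-means model.

    The cluster-count option is intentionally omitted from OPTIONS,
    so BigQuery ML picks a default number of clusters.
    """
    columns = ",\n  ".join(feature_columns)
    return (
        f"CREATE OR REPLACE MODEL `{model_name}`\n"
        "OPTIONS (model_type = 'kmeans') AS\n"
        f"SELECT\n  {columns}\n"
        f"FROM `{source_table}`"
    )


# Hypothetical names, for illustration only.
sql = build_kmeans_model_sql(
    "my_dataset.customer_segments",
    "my_dataset.purchase_history",
    ["total_spend", "order_count", "avg_basket_size"],
)
print(sql)

# To actually train the model, the statement would be submitted to BigQuery,
# for example with the google-cloud-bigquery client (needs credentials):
#   from google.cloud import bigquery
#   bigquery.Client().query(sql).result()
```

Once the model is trained, `ML.PREDICT` can assign each customer row to a cluster and `ML.CENTROIDS` can be used to inspect what characterizes each segment.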
K-means is a good unsupervised learning algorithm for segmenting a population based on similarity.
We can use k-means directly in BQ, so I think it's "the most efficient way".
Labeling is not a good option since we don't really know what makes one customer similar to another, and why use Dataprep if we can use BQ directly?
👍 3 LearnSodas 2022/12/15
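To make "segmenting based on similarity" concrete, here is a small pure-Python sketch of the k-means idea (Lloyd's algorithm). It is not how BigQuery ML implements it; the customer data is made up, and the number of clusters is fixed by hand at 2 rather than chosen automatically.

```python
import random


def nearest(point, centroids):
    """Index of the centroid closest to point (squared Euclidean distance)."""
    return min(
        range(len(centroids)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(point, centroids[i])),
    )


def kmeans(points, k, iterations=20, seed=0):
    """Cluster points into k groups; returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k randomly chosen points
    for _ in range(iterations):
        # Assignment step: put each point in the cluster of its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(p, centroids)].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        centroids = [
            tuple(sum(dim) / len(cluster) for dim in zip(*cluster))
            if cluster
            else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    labels = [nearest(p, centroids) for p in points]
    return centroids, labels


# Hypothetical customers: (orders per year, average basket value).
customers = [(2, 15.0), (3, 18.0), (1, 12.0), (40, 20.0), (38, 22.0), (42, 19.0)]
centroids, labels = kmeans(customers, k=2)
print(labels)  # occasional buyers and frequent buyers get different labels
```

No labels are supplied up front: the two groups emerge only from how close the customers are to each other, which is why labeling-based options do not fit the question.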