Topic 1 Question 151
You work for an advertising company, and you've developed a Spark ML model to predict click-through rates at advertisement blocks. You've been developing everything at your on-premises data center, and now your company is migrating to Google Cloud. Your data center will be closing soon, so a rapid lift-and-shift migration is necessary. However, the data you've been using will be migrated to BigQuery. You periodically retrain your Spark ML models, so you need to migrate existing training pipelines to Google Cloud. What should you do?
A. Use Vertex AI for training existing Spark ML models
B. Rewrite your models on TensorFlow, and start using Vertex AI
C. Use Dataproc for training existing Spark ML models, but start reading data directly from BigQuery
D. Spin up a Spark cluster on Compute Engine, and train Spark ML models on the data exported from BigQuery
Comments (11)
- Selected answer: C
C is the answer.
https://cloud.google.com/dataproc/docs/concepts/overview Dataproc is a managed Spark and Hadoop service that lets you take advantage of open source data tools for batch processing, querying, streaming, and machine learning. Dataproc automation helps you create clusters quickly, manage them easily, and save money by turning clusters off when you don't need them. With less time and money spent on administration, you can focus on your jobs and your data.
Integrated — Dataproc has built-in integration with other Google Cloud Platform services, such as BigQuery, Cloud Storage, Cloud Bigtable, Cloud Logging, and Cloud Monitoring, so you have more than just a Spark or Hadoop cluster—you have a complete data platform. For example, you can use Dataproc to effortlessly ETL terabytes of raw log data directly into BigQuery for business reporting.
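Option C's "reading data directly from BigQuery" could look roughly like the sketch below: a PySpark job that loads a table through the spark-bigquery connector and refits an existing Spark ML pipeline. This is only an illustration under assumptions. The table name, column names, bucket path, and the logistic regression stage are hypothetical stand-ins for whatever the real pipeline uses, and the connector is assumed to be available on the Dataproc cluster (preinstalled on the image or supplied as a jar at submit time).

```python
"""Sketch: retrain a Spark ML pipeline on Dataproc, reading directly from BigQuery."""


def table_id(project: str, dataset: str, table: str) -> str:
    """Build the fully qualified table name passed to the connector."""
    return f"{project}.{dataset}.{table}"


def train(source_table: str, model_dir: str, feature_cols: list, label_col: str):
    # pyspark is imported lazily so table_id() works without a Spark install.
    from pyspark.ml import Pipeline
    from pyspark.ml.classification import LogisticRegression
    from pyspark.ml.feature import VectorAssembler
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("ctr-retrain").getOrCreate()

    # Read the table through the spark-bigquery connector; no export step.
    # Assumption: the connector is on the cluster's classpath.
    df = spark.read.format("bigquery").option("table", source_table).load()

    # Hypothetical pipeline stages; the existing Spark ML stages would go here.
    assembler = VectorAssembler(inputCols=feature_cols, outputCol="features")
    lr = LogisticRegression(featuresCol="features", labelCol=label_col)
    model = Pipeline(stages=[assembler, lr]).fit(df)

    # Persist the fitted pipeline; a gs:// path works on Dataproc.
    model.write().overwrite().save(model_dir)
    return model


# Hypothetical invocation inside the job script:
# train(
#     table_id("my-project", "ads", "clicks"),
#     "gs://my-bucket/models/ctr",
#     ["position", "hour_of_day"],
#     "clicked",
# )
```

The only change relative to an on-premises job is the data source line; the Spark ML code itself stays as it was, which is what makes this a lift-and-shift path.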
👍 3 zellck 2022/12/01
- Selected answer: A
The new answer is Vertex AI, which now has a feature for running Spark ML workloads.
👍 3 Prudvi3266 2023/04/22
- Selected answer: A
The question is whether it is faster to move a Spark ML job to Vertex AI or to Dataproc. I am personally not sure. I would go for Dataproc, as notebooks are not mentioned, but the Google article says:
"Dataproc Serverless components for Vertex AI Pipelines that further simplify MLOps for Spark, Spark SQL, PySpark and Spark jobs."
👍 3 vaga1 2023/05/10