ExamTopics

Professional Data Engineer
  • Topic 1 Question 286

    You have thousands of Apache Spark jobs running in your on-premises Apache Hadoop cluster. You want to migrate the jobs to Google Cloud. You want to use managed services to run your jobs instead of maintaining a long-lived Hadoop cluster yourself. You have a tight timeline and want to keep code changes to a minimum. What should you do?

    • Move your data to BigQuery. Convert your Spark scripts to a SQL-based processing approach.

    • Rewrite your jobs in Apache Beam. Run your jobs in Dataflow.

    • Copy your data to Compute Engine disks. Manage and run your jobs directly on those instances.

    • Move your data to Cloud Storage. Run your jobs on Dataproc.
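
    Of the options above, the Dataproc route is the one that satisfies both constraints: Dataproc is a managed Spark/Hadoop service, clusters can be ephemeral (created per job and deleted afterward, so no long-lived cluster to maintain), and because Dataproc ships with the Cloud Storage connector, existing Spark code typically runs unchanged apart from swapping hdfs:// paths for gs:// paths. A minimal PySpark sketch of that change follows; the bucket, paths, and column name are hypothetical placeholders, not part of the original question.

        # Minimal PySpark sketch of a lift-and-shift to Dataproc.
        # The only code change for a typical job is pointing I/O at
        # Cloud Storage (gs://) instead of HDFS (hdfs://).
        # Bucket, paths, and column name are hypothetical placeholders.
        from pyspark.sql import SparkSession

        spark = SparkSession.builder.appName("migrated-job").getOrCreate()

        # Before (on-premises): spark.read.parquet("hdfs:///data/events/")
        # After (Dataproc + Cloud Storage connector):
        events = spark.read.parquet("gs://example-bucket/data/events/")

        # The transformation logic itself is untouched by the migration.
        daily_counts = events.groupBy("event_date").count()

        daily_counts.write.mode("overwrite").parquet(
            "gs://example-bucket/output/daily_counts/")

        spark.stop()

    A job like this can then be submitted to the cluster with gcloud dataproc jobs submit pyspark, which is why this option is the best fit for a tight timeline with minimal code changes; the other options all require rewriting the jobs (SQL or Apache Beam) or self-managing infrastructure on Compute Engine.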

