ExamTopics

Associate Data Practitioner
  • Topic 1 Question 24

    Your organization has a petabyte of application logs stored as Parquet files in Cloud Storage. You need to quickly perform a one-time SQL-based analysis of the files and join them to data that already resides in BigQuery. What should you do?

    • Create a Dataproc cluster, and write a PySpark job to join the data from BigQuery to the files in Cloud Storage.

    • Launch a Cloud Data Fusion environment, use plugins to connect to BigQuery and Cloud Storage, and use the SQL join operation to analyze the data.

    • Create external tables over the files in Cloud Storage, and perform SQL joins to tables in BigQuery to analyze the data.

    • Use the bq load command to load the Parquet files into BigQuery, and perform SQL joins to analyze the data.
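
    To make the external-table option concrete, here is a minimal sketch of the two BigQuery SQL statements it would involve: a `CREATE EXTERNAL TABLE` definition over the Parquet files, and a join against a native BigQuery table. The project, dataset, table, bucket, and column names are hypothetical placeholders, and the helper functions are illustrative only, not part of any library. The sketch only builds the SQL strings; `run_sql` assumes the `google-cloud-bigquery` package is installed and credentials are configured.

```python
def external_table_ddl(table_id: str, gcs_uri: str) -> str:
    """Build DDL for an external table over Parquet files in Cloud Storage.

    Parquet files carry their own schema, so no column list is given here.
    """
    return (
        f"CREATE EXTERNAL TABLE `{table_id}`\n"
        f"OPTIONS (format = 'PARQUET', uris = ['{gcs_uri}']);"
    )


def join_query(external_id: str, native_id: str, key: str) -> str:
    """Build a query joining the external table to a native BigQuery table."""
    return (
        f"SELECT u.{key}, COUNT(*) AS log_rows\n"
        f"FROM `{native_id}` AS u\n"
        f"JOIN `{external_id}` AS l ON l.{key} = u.{key}\n"
        f"GROUP BY u.{key};"
    )


def run_sql(sql: str):
    """Run a statement in BigQuery (assumes google-cloud-bigquery and credentials)."""
    from google.cloud import bigquery  # imported lazily; not needed to build SQL

    client = bigquery.Client()
    return list(client.query(sql).result())


if __name__ == "__main__":
    # Hypothetical identifiers, for illustration only.
    ext = "my-project.logs.app_logs_ext"
    print(external_table_ddl(ext, "gs://my-bucket/app-logs/*.parquet"))
    print(join_query(ext, "my-project.crm.users", "user_id"))
```

    With an external table the files stay in Cloud Storage and are read at query time, whereas `bq load` would first copy the data into BigQuery storage.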

