Examtopics

AWS Certified Data Engineer - Associate
  • Topic 1 Question 98

    A company wants to use machine learning (ML) to perform analytics on data that is in an Amazon S3 data lake. The company has two data transformation requirements that will give consumers within the company the ability to create reports.

    The company must perform daily transformations on 300 GB of data that is in a variety of formats and must arrive in Amazon S3 at a scheduled time. The company must also perform one-time transformations of terabytes of archived data that is in the S3 data lake. The company uses Amazon Managed Workflows for Apache Airflow (Amazon MWAA) Directed Acyclic Graphs (DAGs) to orchestrate processing.

    Which combination of tasks should the company schedule in the Amazon MWAA DAGs to meet these requirements MOST cost-effectively?

    (Choose two.)
    • For daily incoming data, use AWS Glue crawlers to scan and identify the schema.

    • For daily incoming data, use Amazon Athena to scan and identify the schema.

    • For daily incoming data, use Amazon Redshift to perform transformations.

    • For daily and archived data, use Amazon EMR to perform data transformations.

    • For archived data, use Amazon SageMaker to perform data transformations.

