Topic 1 Question 80
A data engineer needs to build an extract, transform, and load (ETL) job. The ETL job will process daily incoming .csv files that users upload to an Amazon S3 bucket. The size of each S3 object is less than 100 MB. Which solution will meet these requirements MOST cost-effectively?
A. Write a custom Python application. Host the application on an Amazon Elastic Kubernetes Service (Amazon EKS) cluster.
B. Write a PySpark ETL script. Host the script on an Amazon EMR cluster.
C. Write an AWS Glue PySpark job. Use Apache Spark to transform the data.
D. Write an AWS Glue Python shell job. Use pandas to transform the data.
Comments (14)
- Selected answer: C
An AWS Glue Python shell job is billed at $0.44 per DPU-hour. An AWS Glue PySpark job is billed at $0.29 per DPU-hour with flexible execution and $0.44 per DPU-hour with standard execution. Source: https://aws.amazon.com/glue/pricing/
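The per-DPU-hour rate is only one factor in the bill; the number of DPUs a job runs on matters as well. The sketch below works through the arithmetic using the rates quoted above. The DPU counts are assumptions taken from my reading of the Glue documentation (a Python shell job can run on 0.0625 DPU, while a Spark job needs at least 2 DPUs), and the 5-minute runtime is purely hypothetical, so treat the result as an illustration rather than a quote.

```python
# Rates quoted in the comment above (USD per DPU-hour).
PYTHON_SHELL_RATE = 0.44
SPARK_STANDARD_RATE = 0.44
SPARK_FLEX_RATE = 0.29

# Assumptions (not stated in the question): smallest DPU allocation per
# job type, and a hypothetical 5-minute daily run.
PYTHON_SHELL_DPUS = 0.0625
SPARK_MIN_DPUS = 2
RUNTIME_MINUTES = 5


def job_cost(rate_per_dpu_hour, dpus, runtime_minutes):
    """Cost of one run: rate x DPUs x hours."""
    return rate_per_dpu_hour * dpus * (runtime_minutes / 60)


shell_cost = job_cost(PYTHON_SHELL_RATE, PYTHON_SHELL_DPUS, RUNTIME_MINUTES)
flex_cost = job_cost(SPARK_FLEX_RATE, SPARK_MIN_DPUS, RUNTIME_MINUTES)
standard_cost = job_cost(SPARK_STANDARD_RATE, SPARK_MIN_DPUS, RUNTIME_MINUTES)

print(f"Python shell:   ${shell_cost:.4f} per run")
print(f"Spark flex:     ${flex_cost:.4f} per run")
print(f"Spark standard: ${standard_cost:.4f} per run")
```

Under these assumptions the Python shell run is the cheapest of the three, even though the flexible-execution Spark rate is lower per DPU-hour.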
👍 10 halogi 2024/03/27
- Selected answer: D
Option D (write an AWS Glue Python shell job and use pandas to transform the data) is the most cost-effective solution for the described scenario.
AWS Glue’s Python shell jobs are a good fit for smaller-scale ETL tasks, especially when dealing with .csv files that are less than 100 MB each. The use of pandas, a powerful and efficient data manipulation library in Python, makes it an ideal tool for processing and transforming these types of files. This approach avoids the overhead and additional costs associated with more complex solutions like Amazon EKS or EMR, which are generally more suited for larger-scale, more complex data processing tasks.
Given the requirements – processing daily incoming small-sized .csv files – this solution provides the necessary functionality with minimal resources, aligning well with the goal of cost-effectiveness.
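As a rough idea of what such a pandas transform might look like, here is a minimal sketch. The column names (`order_id`, `amount`, `country`) and the cleaning rules are hypothetical, since the question does not describe the data, and the sample input is an in-memory string so the snippet runs without any AWS access.

```python
import io

import pandas as pd


def transform(csv_source):
    """Clean and aggregate one small CSV file (hypothetical schema)."""
    df = pd.read_csv(csv_source)
    # Drop rows with no amount (hypothetical cleaning rule).
    df = df.dropna(subset=["amount"])
    # Normalize country codes: trim whitespace, upper-case.
    df["country"] = df["country"].str.strip().str.upper()
    # Total amount per country.
    return df.groupby("country", as_index=False)["amount"].sum()


# In-memory stand-in for an uploaded .csv file.
sample = io.StringIO(
    "order_id,amount,country\n"
    "1,10.5,us\n"
    "2,,de\n"
    "3,4.5, US\n"
    "4,7.0,de\n"
)
result = transform(sample)
print(result)
```

In an actual Glue Python shell job the input would presumably come from S3 instead, for example by fetching the object with boto3 and passing its bytes to `transform` through `io.BytesIO`. A file under 100 MB fits comfortably in memory for this kind of processing.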
👍 8 atu1789 2024/02/16
- Selected answer: D
This really comes down to two options, since the Spark and Python shell approaches are similar. I would go with pandas, although it is close to 50/50 and it could be Spark. I hope not to find this question on the exam.
👍 7 pypelyncar 2024/06/11