Topic 1 Question 28

AWS Certified Data Engineer - Associate

A company uses an Amazon QuickSight dashboard to monitor usage of one of the company's applications. The company uses AWS Glue jobs to process data for the dashboard. The company stores the data in a single Amazon S3 bucket. The company adds new data every day. A data engineer discovers that dashboard queries are becoming slower over time. The data engineer determines that the root cause of the slowing queries is long-running AWS Glue jobs. Which actions should the data engineer take to improve the performance of the AWS Glue jobs?

Select two.
- A. Partition the data that is in the S3 bucket. Organize the data by year, month, and day.
- B. Increase the AWS Glue instance size by scaling up the worker type.
- C. Convert the AWS Glue schema to the DynamicFrame schema class.
- D. Adjust AWS Glue job scheduling frequency so the jobs run half as many times each day.
- E. Modify the IAM role that grants access to AWS Glue to grant access to all S3 features.
User votes
Comments (2)
- Choices believed to be correct: AB
  A. Partition the data that is in the S3 bucket. Organize the data by year, month, and day.
  
  • Partitioning data in Amazon S3 can significantly improve query performance. By organizing the data by year, month, and day, AWS Glue and Amazon QuickSight can scan only the relevant partitions of data, which reduces the amount of data read and processed. This approach is particularly effective for time-series data, where queries often target specific time ranges.
  
  B. Increase the AWS Glue instance size by scaling up the worker type.
  
  • Scaling up the worker type can provide more computational resources to the AWS Glue jobs, enabling them to process data faster. This can be especially beneficial when dealing with large datasets or complex transformations. It’s important to monitor the performance improvements and cost implications of scaling up.
  👍 10
  rralucard_ 2024/08/03
- Partition the Data in Amazon S3:
  
  AWS documentation on optimizing Amazon S3 performance: https://docs.aws.amazon.com/AmazonS3/latest/userguide/optimizing-performance.html
  
  AWS Glue documentation on partitioning data for AWS Glue jobs: https://docs.aws.amazon.com/glue/latest/dg/how-it-works.html#how-partitioning-works
  
  Best practices for partitioning in Amazon S3: https://docs.aws.amazon.com/AmazonS3/latest/userguide/best-practices-partitioning.html
  
  Optimizing AWS Glue Job Settings:
  
  AWS Glue documentation on optimizing job performance: https://docs.aws.amazon.com/glue/latest/dg/best-practices.html
  
  AWS Glue documentation on scaling AWS Glue job resources: https://docs.aws.amazon.com/glue/latest/dg/monitor-profile-glue-job-cloudwatch-metrics.html
  
  These documentation resources describe the practices AWS recommends for optimizing AWS Glue jobs, and they support the suggested actions for addressing the slowing job performance.
  👍 2
  certplan 2024/09/20
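
To illustrate the year/month/day layout discussed in the comments, here is a minimal Python sketch. It builds Hive-style partition prefixes (`year=…/month=…/day=…`) and a matching filter string. The bucket name `example-bucket`, the `usage` prefix, and all function names are hypothetical, chosen only for illustration; the filter string follows the Spark SQL style that AWS Glue accepts as a pushdown predicate, but it is only constructed here and not sent to any AWS service.

```python
from datetime import date, timedelta


def partition_prefix(base: str, day: date) -> str:
    """Return a Hive-style prefix such as base/year=2024/month=08/day=03/."""
    return (
        f"{base.rstrip('/')}/"
        f"year={day.year:04d}/month={day.month:02d}/day={day.day:02d}/"
    )


def prefixes_for_range(base: str, start: date, end: date) -> list[str]:
    """List the daily prefixes covering an inclusive date range.

    A job that reads only these prefixes skips every other day's data,
    which is the point of partitioning a bucket that grows daily.
    """
    span = (end - start).days
    return [partition_prefix(base, start + timedelta(days=i)) for i in range(span + 1)]


def partition_predicate(day: date) -> str:
    """Build a filter on the partition columns for a single day.

    Assumption: the partition columns are named year, month, and day,
    and are stored as zero-padded strings.
    """
    return (
        f"year == '{day.year:04d}' and "
        f"month == '{day.month:02d}' and "
        f"day == '{day.day:02d}'"
    )


if __name__ == "__main__":
    base = "s3://example-bucket/usage"  # hypothetical location
    for prefix in prefixes_for_range(base, date(2024, 8, 30), date(2024, 9, 1)):
        print(prefix)
    print(partition_predicate(date(2024, 8, 3)))
```

With this layout, a daily job would typically touch one day's prefix rather than the whole bucket, so its runtime should stay roughly flat as history accumulates.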