Topic 1 Question 87

AWS Certified Data Engineer - Associate

A retail company uses Amazon Aurora PostgreSQL to process and store live transactional data. The company uses an Amazon Redshift cluster for a data warehouse.

An extract, transform, and load (ETL) job runs every morning to update the Redshift cluster with new data from the PostgreSQL database. The company has grown rapidly and needs to cost optimize the Redshift cluster.

A data engineer needs to create a solution to archive historical data. The data engineer must be able to run analytics queries that effectively combine data from live transactional data in PostgreSQL, current data in Redshift, and archived historical data. The solution must keep only the most recent 15 months of data in Amazon Redshift to reduce costs.

Which combination of steps will meet these requirements? (Choose two.)
- A. Configure the Amazon Redshift Federated Query feature to query live transactional data that is in the PostgreSQL database.
- B. Configure Amazon Redshift Spectrum to query live transactional data that is in the PostgreSQL database.
- C. Schedule a monthly job to copy data that is older than 15 months to Amazon S3 by using the UNLOAD command. Delete the old data from the Redshift cluster. Configure Amazon Redshift Spectrum to access historical data in Amazon S3.
- D. Schedule a monthly job to copy data that is older than 15 months to Amazon S3 Glacier Flexible Retrieval by using the UNLOAD command. Delete the old data from the Redshift cluster. Configure Redshift Spectrum to access historical data from S3 Glacier Flexible Retrieval.
- E. Create a materialized view in Amazon Redshift that combines live, current, and historical data from different sources.
Comments (9)
- Option A: Amazon Redshift Federated Query lets Redshift query the live transactional data in the PostgreSQL database directly, without importing it first. The most recent live data is therefore available to analytics queries.

  Option C: A monthly job that copies data older than 15 months to Amazon S3 keeps only the most recent 15 months in Amazon Redshift, which reduces storage costs. The historical data remains available to analytics queries through Amazon Redshift Spectrum.
  
  👍 7
  lalitjhawar, 2024/06/14
- Selected answer: A
  Choice A ensures that live transactional data from PostgreSQL can be accessed directly within Redshift queries.
  
  Choice C archives historical data in Amazon S3, reducing storage costs in Redshift while still making the data accessible via Redshift Spectrum.
  
  (to Admin: I can't select multiple answers on the voting comment)
  
  👍 4
  tgv, 2024/06/15
- Selected answer: A
  A and C are correct. D is not correct, because Redshift Spectrum cannot read from S3 Glacier Flexible Retrieval.
  
  👍 4
  artworkad, 2024/06/15
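
The archive step that the comments describe for option C can be sketched as a small Python helper that builds the SQL for the monthly job. This is only an illustration under stated assumptions: the table `sales`, the column `sale_date`, the S3 prefix, and the IAM role ARN are hypothetical, and the statements are only generated here, not executed against a cluster.

```python
from datetime import date


def months_ago(today: date, months: int) -> date:
    """Return the first day of the month that lies `months` months before `today`."""
    total = today.year * 12 + (today.month - 1) - months
    return date(total // 12, total % 12 + 1, 1)


def archive_statements(today: date, table: str, date_col: str,
                       s3_prefix: str, iam_role: str,
                       keep_months: int = 15) -> list[str]:
    """Build the UNLOAD and DELETE statements for rows older than the cutoff.

    All identifiers passed in are hypothetical placeholders.
    """
    cutoff = months_ago(today, keep_months).isoformat()
    # Inside UNLOAD ('...'), literal single quotes must be doubled.
    inner = f"SELECT * FROM {table} WHERE {date_col} < ''{cutoff}''"
    unload = (
        f"UNLOAD ('{inner}') TO '{s3_prefix}' "
        f"IAM_ROLE '{iam_role}' FORMAT AS PARQUET;"
    )
    # Run the DELETE only after the UNLOAD has been verified.
    delete = f"DELETE FROM {table} WHERE {date_col} < '{cutoff}';"
    return [unload, delete]


if __name__ == "__main__":
    for stmt in archive_statements(
        today=date(2024, 6, 15),
        table="sales",                                   # hypothetical table
        date_col="sale_date",                            # hypothetical column
        s3_prefix="s3://example-bucket/archive/sales_",  # hypothetical bucket
        iam_role="arn:aws:iam::123456789012:role/ExampleRole",  # placeholder ARN
    ):
        print(stmt)
```

Once the archived files are in S3, an external schema created with `CREATE EXTERNAL SCHEMA ... FROM DATA CATALOG` would expose them through Redshift Spectrum, and one created with `CREATE EXTERNAL SCHEMA ... FROM POSTGRES` would expose the Aurora PostgreSQL tables through Federated Query (option A), so that a single query can combine all three sources.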