Topic 1 Question 50
A company stores time-series data about user clicks in an Amazon S3 bucket. The raw data consists of millions of rows of user activity every day. ML engineers access the data to develop their ML models. The ML engineers need to generate daily reports and analyze click trends over the past 3 days by using Amazon Athena. The company must retain the data for 30 days before archiving the data. Which solution will provide the HIGHEST performance for data retrieval?
A. Keep all the time-series data without partitioning in the S3 bucket. Manually move data that is older than 30 days to separate S3 buckets.
B. Create AWS Lambda functions to copy the time-series data into separate S3 buckets. Apply S3 Lifecycle policies to archive data that is older than 30 days to S3 Glacier Flexible Retrieval.
C. Organize the time-series data into partitions by date prefix in the S3 bucket. Apply S3 Lifecycle policies to archive partitions that are older than 30 days to S3 Glacier Flexible Retrieval.
D. Put each day's time-series data into its own S3 bucket. Use S3 Lifecycle policies to archive S3 buckets that hold data that is older than 30 days to S3 Glacier Flexible Retrieval.
Comments (2)
- Selected answer: C
This is more of an S3 question than an ML one.
👍 1 GiorgioGss 2024/11/27
- Selected answer: C
Time-series data → Partition by date in S3 → Optimized Athena queries → S3 lifecycle policies → Move partitions >30 days to S3 Glacier Flexible Retrieval
👍 1 Saransundar 2024/12/05
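
To make the reasoning in the comments concrete, here is a minimal Python sketch of the date-partitioned layout, an Athena-style query that touches only the last 3 daily partitions, and a lifecycle rule that moves objects to S3 Glacier Flexible Retrieval after 30 days. The table name `clicks`, the prefix `clicks/`, the partition column `dt`, and the rule ID are all hypothetical and do not come from the question. The sketch also assumes "past 3 days" means today plus the two preceding days. It only builds strings and a dictionary and makes no AWS calls.

```python
from datetime import date, timedelta

# Hypothetical names, chosen only for illustration.
TABLE = "clicks"
PREFIX = "clicks/"


def partition_prefix(day: date) -> str:
    """Return a Hive-style date prefix such as clicks/dt=2024-12-05/."""
    return f"{PREFIX}dt={day.isoformat()}/"


def last_n_days_query(today: date, n: int = 3) -> str:
    """Build SQL that filters on the partition column dt.

    Filtering on the partition column lets the query engine skip
    every prefix outside the requested days instead of scanning
    the whole bucket.
    """
    days = [(today - timedelta(days=i)).isoformat() for i in range(n)]
    in_list = ", ".join(f"'{d}'" for d in days)
    return (
        f"SELECT dt, COUNT(*) AS clicks FROM {TABLE} "
        f"WHERE dt IN ({in_list}) GROUP BY dt ORDER BY dt"
    )


# Lifecycle rule in the dictionary shape that boto3's
# put_bucket_lifecycle_configuration accepts. "GLACIER" is the storage
# class name for S3 Glacier Flexible Retrieval.
LIFECYCLE = {
    "Rules": [
        {
            "ID": "archive-after-30-days",  # hypothetical rule ID
            "Filter": {"Prefix": PREFIX},
            "Status": "Enabled",
            "Transitions": [{"Days": 30, "StorageClass": "GLACIER"}],
        }
    ]
}

if __name__ == "__main__":
    print(partition_prefix(date(2024, 12, 5)))
    print(last_n_days_query(date(2024, 12, 5)))
```

With this layout, the daily report reads 3 prefixes rather than 30 days of raw rows, which is why option C gives the fastest retrieval. A single rule on the shared prefix handles archiving, so no Lambda copies or per-day buckets are needed.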