Topic 1 Question 153
A retail company uses an Amazon Redshift data warehouse and an Amazon S3 bucket. The company ingests retail order data into the S3 bucket every day.
The company stores all order data at a single path within the S3 bucket. The data has more than 100 columns. The company ingests the order data from a third-party application that generates more than 30 files in CSV format every day. Each CSV file is between 50 and 70 MB in size.
The company uses Amazon Redshift Spectrum to run queries that select sets of columns. Users aggregate metrics based on daily orders. Recently, users have reported that the performance of the queries has degraded. A data engineer must resolve the performance issues for the queries.
Which combination of steps will meet this requirement with the LEAST development effort? (Choose two.)
A. Configure the third-party application to create the files in a columnar format.
B. Develop an AWS Glue ETL job to convert the multiple daily CSV files to one file for each day.
C. Partition the order data in the S3 bucket based on order date.
D. Configure the third-party application to create the files in JSON format.
E. Load the JSON data into the Amazon Redshift table in a SUPER type column.
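For context on the majority vote in the comments below (A and C): the scenario's queries select subsets of a 100+ column table and aggregate by day, which is exactly what a columnar format plus date partitioning speeds up. A minimal PySpark sketch of that target layout, assuming hypothetical bucket paths and a hypothetical `order_ts` timestamp column, whether the conversion happens in the third-party application or in an AWS Glue ETL job:

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("orders-csv-to-parquet").getOrCreate()

# Hypothetical paths -- the question only says the data sits at a single
# path in one S3 bucket.
SOURCE = "s3://example-retail-bucket/orders/raw/"      # ~30 CSV files/day
TARGET = "s3://example-retail-bucket/orders/curated/"  # partitioned Parquet

# Read the wide (100+ column) CSV files; a header row is assumed.
orders = spark.read.option("header", "true").csv(SOURCE)

# Derive the partition column from the hypothetical order timestamp.
orders = orders.withColumn("order_date", F.to_date(F.col("order_ts")))

# Write columnar, date-partitioned output: Redshift Spectrum can then read
# only the referenced columns and only the partitions a query filters on.
(orders.write
    .mode("append")
    .partitionBy("order_date")
    .parquet(TARGET))
```

With this layout, a daily-aggregation query touches one `order_date=...` prefix instead of scanning every row and column of every CSV file ever ingested.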
User votes
Comments (3)
- Selected answer: AC
  👍 1 · emupsx1 · 2024/11/24
- Selected answer: BC
  No, because option A involves modifying the third-party application so that it generates files in a columnar format, which can be more complex or unfeasible, whereas option B uses a Glue job to consolidate the CSVs without touching the source. Option C is still essential for partitioning by date and optimizing the queries.
  👍 1 · italiancloud2025 · 2025/02/18
- Selected answer: AC
  Using Parquet or ORC is efficient, and so is partitioning by order date, so the range of data scanned is smaller.
  👍 1 · Ell89 · 2025/02/26
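Both AC votes rely on Spectrum-side partition pruning, which only works once the partitions are declared. A hedged sketch of that side, using the real redshift_connector driver but hypothetical connection details, schema, and column names; it assumes an external schema named `spectrum` already exists and that the data lives at the partitioned Parquet location from the sketch above:

```python
import redshift_connector

# Hypothetical connection details.
conn = redshift_connector.connect(
    host="example-cluster.abc123.us-east-1.redshift.amazonaws.com",
    database="dev",
    user="awsuser",
    password="example-password",
)
# External-table DDL cannot run inside a transaction block in Redshift.
conn.autocommit = True
cur = conn.cursor()

# One-time DDL: external table over the date-partitioned Parquet data.
cur.execute("""
    CREATE EXTERNAL TABLE spectrum.orders (
        order_id  varchar(64),
        order_ts  timestamp,
        amount    numeric(12, 2)
        -- ... remaining columns of the 100+ column schema ...
    )
    PARTITIONED BY (order_date date)
    STORED AS PARQUET
    LOCATION 's3://example-retail-bucket/orders/curated/'
""")

# Daily step (hypothetical date): register the new partition so queries
# that filter on order_date prune down to that day's S3 prefix.
cur.execute("""
    ALTER TABLE spectrum.orders
    ADD IF NOT EXISTS PARTITION (order_date = '2024-11-24')
    LOCATION 's3://example-retail-bucket/orders/curated/order_date=2024-11-24/'
""")
```

Alternatively, an AWS Glue crawler can keep the partition metadata up to date in the Glue Data Catalog instead of issuing ALTER TABLE statements each day.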