Topic 1 Question 54
A company uses Amazon SageMaker for its ML workloads. The company's ML engineer receives a 50 MB Apache Parquet data file to build a fraud detection model. The file includes several correlated columns that are not required. What should the ML engineer do to drop the unnecessary columns in the file with the LEAST effort?
A. Download the file to a local workstation. Perform one-hot encoding by using a custom Python script.
B. Create an Apache Spark job that uses a custom processing script on Amazon EMR.
C. Create a SageMaker processing job by calling the SageMaker Python SDK.
D. Create a data flow in SageMaker Data Wrangler. Configure a transform step.
User votes
Comments (2)
- Selected answer: D
👍 2 GiorgioGss 2024/11/27
- Selected answer: D
Parquet data file → SageMaker Data Wrangler → Explore data → Transform → Drop unnecessary columns → Clean and preprocess data → Export to S3 → Fraud detection model
👍 1 Saransundar 2024/12/04
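For anyone wondering what the Data Wrangler transform step amounts to, here is a rough pandas sketch of dropping columns. It is only an illustration, not what Data Wrangler runs internally. The column names, the tiny in-memory table, and the `drop_columns` helper are all made up for the example; with the real file the frame would come from `pd.read_parquet`, which needs `pyarrow` or `fastparquet` installed.

```python
import pandas as pd


def drop_columns(df: pd.DataFrame, unwanted: list[str]) -> pd.DataFrame:
    """Return a copy of df without the unwanted columns (hypothetical helper)."""
    return df.drop(columns=unwanted)


# Hypothetical stand-in for the Parquet file from the question.
# With the real file this would be something like pd.read_parquet(<path>).
transactions = pd.DataFrame({
    "amount_usd": [10.0, 250.0, 31.5],
    "amount_cents": [1000, 25000, 3150],  # redundant: amount_usd * 100
    "merchant_id": [7, 7, 12],
    "is_fraud": [0, 1, 0],
})

# Drop the redundant, correlated column; the original frame is left unchanged.
cleaned = drop_columns(transactions, ["amount_cents"])
print(list(cleaned.columns))
```

This is roughly the script you would have to write, run, and maintain yourself with the other options, which is why D is the least-effort choice: Data Wrangler does the same thing through a configured transform step with no custom code.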