Topic 1 Question 235
A company is training machine learning (ML) models on Amazon SageMaker by using 200 TB of data that is stored in Amazon S3 buckets. The training data consists of individual files that are each larger than 200 MB in size. The company needs a data access solution that offers the shortest processing time and the least amount of setup.
Which solution will meet these requirements?
A. Use File mode in SageMaker to copy the dataset from the S3 buckets to the ML instance storage.
B. Create an Amazon FSx for Lustre file system. Link the file system to the S3 buckets.
C. Create an Amazon Elastic File System (Amazon EFS) file system. Mount the file system to the training instances.
D. Use FastFile mode in SageMaker to stream the files on demand from the S3 buckets.
Comments (8)
- Selected answer: D
When to use fast file mode:
For larger datasets with larger files (more than 50 MB per file), the first option is to try fast file mode, which is more straightforward to use than FSx for Lustre because it doesn't require creating a file system, or connecting to a VPC. Fast file mode is ideal for large file containers (more than 150 MB), and might also do well with files more than 50 MB.
👍 7 mawsman 2023/04/17
- Selected answer: B
The solution that meets the requirements of the company is B, which involves creating an Amazon FSx for Lustre file system and linking it to the S3 buckets. Amazon FSx for Lustre is a fully managed, high-performance file system optimized for compute-intensive workloads, such as machine learning training. It is designed to provide low latencies and high throughput for processing large data sets, and it can directly access data from S3 buckets without any data movement or copying. This solution requires minimal setup and provides the shortest processing time since the data can be accessed in parallel by multiple instances.
👍 4 oso0348 2023/03/19
- Selected answer: B
👍 3 sevosevo 2023/03/18
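
For readers comparing the options, the FastFile approach in option D comes down to a single setting on the training job's input channel. Below is a minimal sketch, assuming the job is created through the CreateTrainingJob API; the helper name fastfile_channel, the bucket name, and the prefix are hypothetical placeholders, not values from the question.

```python
# Sketch only: builds one input channel entry for a SageMaker
# CreateTrainingJob request. The S3 URI below is a placeholder.

def fastfile_channel(s3_uri: str, channel_name: str = "training") -> dict:
    """Return a Channel entry that streams S3 objects on demand (FastFile)."""
    return {
        "ChannelName": channel_name,
        "InputMode": "FastFile",  # other accepted values: "File", "Pipe"
        "DataSource": {
            "S3DataSource": {
                "S3DataType": "S3Prefix",
                "S3Uri": s3_uri,
                "S3DataDistributionType": "FullyReplicated",
            }
        },
    }


channel = fastfile_channel("s3://example-bucket/train/")
print(channel["InputMode"])
```

The resulting dict would go into the InputDataConfig list of the request. With the SageMaker Python SDK, the equivalent is to pass input_mode="FastFile" to sagemaker.inputs.TrainingInput. Neither path needs a file system or VPC configuration, which is the setup difference the first comment points to.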