Topic 1 Question 10
Case study - An ML engineer is developing a fraud detection model on AWS. The training dataset includes transaction logs, customer profiles, and tables from an on-premises MySQL database. The transaction logs and customer profiles are stored in Amazon S3. The dataset has a class imbalance that affects the learning of the model's algorithm. Additionally, many of the features have interdependencies. The algorithm is not capturing all the desired underlying patterns in the data. Which AWS service or feature can aggregate the data from the various data sources?
A. Amazon EMR Spark jobs
B. Amazon Kinesis Data Streams
C. Amazon DynamoDB
D. AWS Lake Formation
Comments (17)
- Selected answer: A
Amazon EMR with Spark is an excellent choice for aggregating, processing, and transforming large datasets from multiple sources (e.g., Amazon S3 and an on-premises MySQL database). Spark jobs can handle both structured and unstructured data. While Lake Formation is great for managing data lakes, it doesn’t provide the ETL and data processing capabilities required to aggregate and transform datasets from multiple sources.
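To make the claim above concrete, here is a minimal PySpark sketch of a job that reads two S3 datasets and one MySQL table and writes the joined result back to S3. It is an illustration under stated assumptions, not a reference implementation: the join key `customer_id`, the file formats (JSON logs, Parquet profiles), and all paths, hosts, and table names are hypothetical, and the MySQL Connector/J JAR is assumed to be on the Spark classpath.

```python
from typing import Dict


def mysql_jdbc_options(host: str, database: str, table: str,
                       user: str, password: str, port: int = 3306) -> Dict[str, str]:
    """Build the option map for Spark's built-in JDBC data source."""
    return {
        "url": f"jdbc:mysql://{host}:{port}/{database}",
        "dbtable": table,
        "user": user,
        "password": password,
        # Driver class shipped with MySQL Connector/J 8.x (assumed to be installed).
        "driver": "com.mysql.cj.jdbc.Driver",
    }


def aggregate_sources(spark, logs_path: str, profiles_path: str,
                      jdbc_options: Dict[str, str], output_path: str) -> None:
    """Read both S3 datasets and the MySQL table, join them, and write to S3."""
    logs = spark.read.json(logs_path)             # assumption: logs are JSON
    profiles = spark.read.parquet(profiles_path)  # assumption: profiles are Parquet
    accounts = spark.read.format("jdbc").options(**jdbc_options).load()

    # Hypothetical join key; a real schema may need different keys per source.
    combined = (
        logs.join(profiles, on="customer_id", how="left")
            .join(accounts, on="customer_id", how="left")
    )
    combined.write.mode("overwrite").parquet(output_path)


def main() -> None:
    # Imported here so the helper above stays usable without Spark installed.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("fraud-data-aggregation").getOrCreate()
    options = mysql_jdbc_options(
        host="mysql.example.internal",  # hypothetical on-premises host
        database="bank",                # hypothetical database name
        table="accounts",               # hypothetical table name
        user="etl_user",
        password="change-me",
    )
    aggregate_sources(
        spark,
        logs_path="s3://example-bucket/transaction-logs/",
        profiles_path="s3://example-bucket/customer-profiles/",
        jdbc_options=options,
        output_path="s3://example-bucket/aggregated/",
    )
```

On a cluster, `main()` would be called from the script submitted as a Spark step. The cluster also needs network connectivity to the on-premises database, which this sketch does not cover.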
👍 10 tigrex73 2024/11/27
- Selected answer: D
Is it D, AWS Lake Formation? EMR Spark jobs are more manual.
👍 6 a4002bd 2024/11/26
- Selected answer: D
Yet another poorly worded AWS certification question. Here is my reasoning. The question is about "aggregate the data from S3 and on-premise mysql", and I take "aggregate" to mean "put in the same place". Therefore:
A. No. An EMR Spark job can connect to both S3 and MySQL (Spark can connect to a MySQL database), but it is better suited to processing data and then storing it in S3.
B. No. KDS is for delivering streaming data sources to specific destinations (S3, OpenSearch ...).
C. No. DynamoDB is a NoSQL database that is not a great fit here.
D. Yes. Lake Formation can "combine different types of structured and unstructured data into a centralized repository" (https://docs.aws.amazon.com/lake-formation/latest/dg/what-is-lake-formation.html), and "with Lake Formation, you can import your data using workflows". Because it is based on AWS Glue, it supports both S3 and MySQL.
👍 4 ninomfr64 2024/12/29