Topic 1 Question 141

AWS Certified Machine Learning - Specialty

A retail company wants to combine its customer orders with the product description data from its product catalog. The structure and format of the records in each dataset are different. A data analyst tried to use a spreadsheet to combine the datasets, but the effort resulted in duplicate records and records that were not properly combined. The company needs a solution that it can use to combine similar records from the two datasets and remove any duplicates. Which solution will meet these requirements?
- A. Use an AWS Lambda function to process the data. Use two arrays to compare equal strings in the fields from the two datasets and remove any duplicates.
- B. Create AWS Glue crawlers for reading and populating the AWS Glue Data Catalog. Call the AWS Glue SearchTables API operation to perform a fuzzy-matching search on the two datasets, and cleanse the data accordingly.
- C. Create AWS Glue crawlers for reading and populating the AWS Glue Data Catalog. Use the FindMatches transform to cleanse the data.
- D. Create an AWS Lake Formation custom transform. Run a transformation for matching products from the Lake Formation console to cleanse the data automatically.
Explanation
Reference: https://aws.amazon.com/lake-formation/features/
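For readers who want to see what the FindMatches option looks like in practice, here is a minimal sketch using the boto3 Glue client's `create_ml_transform` call. The transform name, database, table, role ARN, and primary-key column are all hypothetical, and the sketch assumes the crawled data is available as a single Data Catalog table with a common schema. It is an illustration, not the only way to set this up.

```python
def build_find_matches_request(database, table, role_arn, primary_key):
    """Build keyword arguments for the Glue CreateMLTransform API call."""
    return {
        "Name": "dedupe-orders-and-catalog",  # hypothetical transform name
        "InputRecordTables": [
            {"DatabaseName": database, "TableName": table},
        ],
        "Parameters": {
            "TransformType": "FIND_MATCHES",
            "FindMatchesParameters": {
                # Column that uniquely identifies each input record
                "PrimaryKeyColumnName": primary_key,
                # 0.0 favors recall, 1.0 favors precision (assumed midpoint)
                "PrecisionRecallTradeoff": 0.5,
            },
        },
        "Role": role_arn,
    }


def create_find_matches_transform(request):
    """Send the request to AWS Glue; needs boto3 and valid AWS credentials."""
    import boto3  # imported here so the builder works without AWS access

    glue = boto3.client("glue")
    response = glue.create_ml_transform(**request)
    return response["TransformId"]
```

The builder is kept separate from the API call so the request can be inspected before anything is sent to AWS. After creation, the transform still has to be taught with labeled examples and then applied in a Glue job, as the AWS tutorial describes.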

User votes
Comments (5)
- Selected answer: C
  C. AWS Glue can use the FindMatches transform to find duplicates.
  
  👍 19
  spaceexplorer 2022/04/30
- Selected answer: C
  It is C, as described in the tutorial: https://docs.aws.amazon.com/glue/latest/dg/machine-learning-transform-tutorial.html
  
  Lake Formation can also invoke the FindMatches algorithm (because it manages data ingestion through Glue), but we don't have a data lake in this example. No one would build a whole data lake, a process that takes days, only to find some matching records.
  
  👍 4
  uninit 2023/01/28
- D is correct
  
  👍 2
  [Removed] 2022/06/13
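To illustrate why comparing equal strings, as in the Lambda option, tends to miss records whose formats differ, here is a small sketch using Python's standard `difflib`. The product names and the 0.8 threshold are made up for illustration, and this is not the algorithm FindMatches uses; it only shows the gap between exact and approximate matching.

```python
from difflib import SequenceMatcher


def normalize(text):
    """Lowercase the text and collapse repeated whitespace."""
    return " ".join(text.lower().split())


def similarity(a, b):
    """Return a similarity ratio between 0.0 and 1.0 for two strings."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def is_probable_match(a, b, threshold=0.8):
    """Treat two strings as the same item when they are similar enough."""
    return similarity(a, b) >= threshold


# Hypothetical values: the same product as written in each dataset
order_item = "Acme Wireless Mouse M100"
catalog_item = "ACME wireless mouse  M-100"

print(order_item == catalog_item)                   # exact comparison
print(is_probable_match(order_item, catalog_item))  # approximate comparison
```

The exact comparison treats the two spellings as different products, while the approximate one pairs them. A hand-picked threshold like this does not generalize well across fields, which is the kind of tuning a learned matcher is meant to replace.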