Examtopics

AWS Certified Data Engineer - Associate
  • Topic 1 Question 148

    An investment company needs to manage and extract insights from a volume of semi-structured data that grows continuously.

    A data engineer needs to deduplicate the semi-structured data by removing duplicate records, including records that are duplicates except for common misspellings.

    Which solution will meet these requirements with the LEAST operational overhead?

    • Use the FindMatches feature of AWS Glue to remove duplicate records.

    • Use non-window functions in Amazon Athena to remove duplicate records.

    • Use Amazon Neptune ML and an Apache TinkerPop Gremlin script to remove duplicate records.

    • Use the global tables feature of Amazon DynamoDB to prevent duplicate data.
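
To make the requirement concrete: exact-match deduplication drops identical records but keeps misspelled variants, while similarity-based matching can catch both. The sketch below is only an illustration in plain Python using the standard-library `difflib`; it is not any of the AWS services listed above. The sample records, the function names, and the 0.85 similarity threshold are all hypothetical.

```python
from difflib import SequenceMatcher


def exact_dedup(records):
    """Keep the first occurrence of each record; only identical strings count as duplicates."""
    seen = set()
    kept = []
    for record in records:
        if record not in seen:
            seen.add(record)
            kept.append(record)
    return kept


def fuzzy_dedup(records, threshold=0.85):
    """Keep a record only if it is not sufficiently similar to any record already kept.

    The threshold is an assumed value chosen for this example, not a recommended setting.
    """
    kept = []
    for record in records:
        is_duplicate = any(
            SequenceMatcher(None, record.lower(), other.lower()).ratio() >= threshold
            for other in kept
        )
        if not is_duplicate:
            kept.append(record)
    return kept


# Hypothetical sample data: one exact duplicate and one misspelled duplicate.
records = ["Acme Capital", "Acme Capital", "Acme Captial", "Borealis Fund"]

print(exact_dedup(records))  # the misspelled variant survives
print(fuzzy_dedup(records))  # the misspelled variant is removed as well
```

Pairwise comparison like this scales quadratically with the number of records, so it only demonstrates the idea; for continuously growing data, a managed matching capability avoids maintaining such logic by hand.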

