Examtopics

AWS Certified Data Engineer - Associate
  • Topic 1 Question 118

    A banking company uses an application to collect large volumes of transactional data. The company uses Amazon Kinesis Data Streams for real-time analytics. The company’s application uses the PutRecord action to send data to Kinesis Data Streams.

    A data engineer has observed network outages during certain times of day. The data engineer wants to configure exactly-once delivery for the entire processing pipeline.

    Which solution will meet this requirement?

    • Design the application so it can remove duplicates during processing by embedding a unique ID in each record at the source.

    • Update the checkpoint configuration of the Amazon Managed Service for Apache Flink (previously known as Amazon Kinesis Data Analytics) data collection application to avoid duplicate processing of events.

    • Design the data source so events are not ingested into Kinesis Data Streams multiple times.

    • Stop using Kinesis Data Streams. Use Amazon EMR instead. Use Apache Flink and Apache Spark Streaming in Amazon EMR.
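
    The first option describes embedding a unique ID at the source and deduplicating during processing. Kinesis Data Streams provides at-least-once delivery, so PutRecord retries during a network outage can write the same payload more than once, and exactly-once processing has to be achieved by removing those duplicates downstream. The following is a minimal sketch of that idea, assuming boto3, a hypothetical stream named "transactions", a hypothetical dedup_id field, and an illustrative in-memory set in place of a durable deduplication store.

    import json
    import uuid

    import boto3

    kinesis = boto3.client("kinesis")
    STREAM_NAME = "transactions"  # hypothetical stream name


    def put_transaction(record: dict) -> None:
        """Producer side: embed a unique ID before calling PutRecord.

        Retries after a network outage can send the same payload twice;
        the embedded ID lets downstream consumers discard the duplicate.
        """
        record["dedup_id"] = str(uuid.uuid4())
        kinesis.put_record(
            StreamName=STREAM_NAME,
            Data=json.dumps(record).encode("utf-8"),
            PartitionKey=record.get("account_id", "unknown"),
        )


    # Consumer side: skip records whose dedup_id was already processed.
    # A real pipeline would persist seen IDs (for example in DynamoDB with
    # a TTL); the in-memory set below is only for illustration.
    _seen_ids: set[str] = set()


    def process_if_new(raw_data: bytes) -> None:
        record = json.loads(raw_data)
        dedup_id = record.get("dedup_id")
        if dedup_id in _seen_ids:
            return  # duplicate delivery, ignore
        _seen_ids.add(dedup_id)
        # ... apply business logic to the deduplicated record ...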

