Topic 1 Question 129
A company ingests machine learning (ML) data from web advertising clicks into an Amazon S3 data lake. Click data is added to an Amazon Kinesis data stream by using the Kinesis Producer Library (KPL). The data is loaded into the S3 data lake from the data stream by using an Amazon Kinesis Data Firehose delivery stream. As the data volume increases, an ML specialist notices that the rate of data ingested into Amazon S3 is relatively constant. There is also an increasing backlog of data for Kinesis Data Streams and Kinesis Data Firehose to ingest.
Which next step is MOST likely to improve the data ingestion rate into Amazon S3?
A. Increase the number of S3 prefixes for the delivery stream to write to.
B. Decrease the retention period for the data stream.
C. Increase the number of shards for the data stream.
D. Add more consumers using the Kinesis Client Library (KCL).
Comments (13)
👍 37: C is the correct answer. The number of shards is determined by (1) the number of transactions per second, times (2) the data blob size (e.g., 100 KB), measured against the limit that (3) one shard can ingest 1 MB/second.
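For illustration only, here is a minimal boto3 sketch of that sizing arithmetic and of resharding the stream (option C). The stream name and the throughput figures are assumptions, not values from the question:

```python
# Sketch: the stream name and throughput figures below are assumed for illustration.
import math
import boto3

records_per_second = 20        # assumed number of transactions per second
record_size_kb = 100           # assumed average data blob size in KB
per_shard_ingest_kb = 1_000    # one shard can ingest up to 1 MB/second

# Required shards = incoming throughput divided by the per-shard ingest limit.
required_shards = math.ceil(records_per_second * record_size_kb / per_shard_ingest_kb)

kinesis = boto3.client("kinesis")
kinesis.update_shard_count(
    StreamName="click-stream",          # hypothetical stream name
    TargetShardCount=required_shards,
    ScalingType="UNIFORM_SCALING",
)
```

Note that UpdateShardCount can at most double the current shard count in a single call, so a large increase may need to be applied in several steps.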
SophieSu (2021/09/26, 👍 9): The answer should be A. The reason shards are not the right answer is the absence of ProvisionedThroughputExceeded exceptions, which occur when a Kinesis data stream has too few shards. The scenario describes a consistent pace of delivery into S3 and a rising backlog of data in the stream (which indicates the stream is still able to ingest data), so the S3 write limit per prefix is at fault:
https://docs.aws.amazon.com/AmazonS3/latest/userguide/optimizing-performance.html
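As a hedged sketch of how option A could be carried out, the delivery stream's extended S3 destination can be updated so that objects are written under timestamp-based prefixes, spreading PUT requests across more S3 prefixes. The delivery stream name and prefix layout below are assumptions for illustration:

```python
# Sketch: the delivery stream name and prefix layout are assumed for illustration.
import boto3

firehose = boto3.client("firehose")
stream_name = "clicks-to-s3"  # hypothetical delivery stream name

description = firehose.describe_delivery_stream(
    DeliveryStreamName=stream_name
)["DeliveryStreamDescription"]

firehose.update_destination(
    DeliveryStreamName=stream_name,
    CurrentDeliveryStreamVersionId=description["VersionId"],
    DestinationId=description["Destinations"][0]["DestinationId"],
    ExtendedS3DestinationUpdate={
        # Partition objects by hour so writes fan out across many S3 prefixes.
        "Prefix": "clicks/!{timestamp:yyyy/MM/dd/HH}/",
        "ErrorOutputPrefix": "clicks-errors/!{firehose:error-output-type}/!{timestamp:yyyy/MM/dd}/",
    },
)
```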
dolorez (2022/05/21, 👍 6): This question is a piece of shard. I sharded trying to answer this question.
AddiWei (2022/02/08):