Topic 1 Question 15
You need to store and analyze social media postings in Google BigQuery at a rate of 10,000 messages per minute in near real-time. Initially, design the application to use streaming inserts for individual postings. Your application also performs data aggregations right after the streaming inserts. You discover that the queries after streaming inserts do not exhibit strong consistency, and reports from the queries might miss in-flight data. How can you adjust your application design?
A. Re-write the application to load accumulated data every 2 minutes.
B. Convert the streaming insert code to batch load for individual messages.
C. Load the original message to Google Cloud SQL, and export the table every hour to BigQuery via streaming inserts.
D. Estimate the average latency for data availability after streaming inserts, and always run queries after waiting twice as long.
Comments (17)
Answer: D. Description: Streamed data first lands in a buffer and is then written to storage. If we run queries while the data is still in the buffer, we will face the issues mentioned above. If we wait for BigQuery to write the data to storage, we won't face the issue, so we need to wait until it is written to storage.
👍 51 [Removed] 2020/03/26
D speaks about a near real-time approach. None of the others do.
👍 18 [Removed] 2020/03/16
Answer: D. What to learn or look for:
- In-flight data = real-time data, i.e. data that is still in the streaming pipeline and has not yet landed in BigQuery.
- Assume (in the best case) that a Dataflow streaming pipeline is running to send data to BigQuery. Why not option B? Changing streaming to batch upload is not the business requirement; we have to stick to streaming and real-time analysis.
Option D: make the BigQuery query run after waiting for some time (twice the average latency here). How would you do it?
- There is no setting in BigQuery to do it, right? So adjust it in your pipeline (Dataflow).
- For example, add a fixed window if you want to execute the aggregation query after 2 minutes. Code:
.apply(Window.<TableRow>into(FixedWindows.of(Duration.standardMinutes(2))))
.apply(BigQueryIO.writeTableRows().to("my_dataset.my_table"))
👍 2 musumusu 2023/02/23
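Option D can also be sketched in plain application code, without Dataflow: estimate the average time between a streaming insert and the row becoming visible to queries, then delay the aggregation query by twice that value. The following is only a minimal sketch under stated assumptions. The class name StreamingQueryDelay, both method names, and the sample latencies are hypothetical and do not come from the question or from any BigQuery API; the real latencies would have to be measured in your own application.

```java
import java.time.Duration;
import java.util.List;

public class StreamingQueryDelay {

    /** Average of the observed insert-to-queryable latencies (hypothetical helper). */
    static Duration averageLatency(List<Duration> samples) {
        if (samples.isEmpty()) {
            throw new IllegalArgumentException("need at least one latency sample");
        }
        long totalMillis = 0;
        for (Duration sample : samples) {
            totalMillis += sample.toMillis();
        }
        return Duration.ofMillis(totalMillis / samples.size());
    }

    /** Option D: wait twice the estimated average latency before querying. */
    static Duration queryDelay(List<Duration> samples) {
        return averageLatency(samples).multipliedBy(2);
    }

    public static void main(String[] args) {
        // Made-up measurements for illustration only; use values measured
        // in your own pipeline.
        List<Duration> samples = List.of(
                Duration.ofMillis(30),
                Duration.ofMillis(50),
                Duration.ofMillis(40));

        Duration delay = queryDelay(samples);
        System.out.println("Wait " + delay.toMillis()
                + " ms after the last streaming insert before running the aggregation query");
        // The application would sleep or schedule for `delay` here and then
        // run the aggregation query with its BigQuery client.
    }
}
```

The factor of two is taken directly from option D. It is a safety margin over the average, not a guarantee that every in-flight row has become visible.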