Topic 1 Question 284
You have a network of 1000 sensors. The sensors generate time series data: one metric per sensor per second, along with a timestamp. You already have 1 TB of data, and expect the data to grow by 1 GB every day. You need to access this data in two ways. The first access pattern requires retrieving the metric from one specific sensor stored at a specific timestamp, with a median single-digit millisecond latency. The second access pattern requires running complex analytic queries on the data, including joins, once a day. How should you store this data?
A. Store your data in BigQuery. Concatenate the sensor ID and timestamp, and use it as the primary key.
B. Store your data in Bigtable. Concatenate the sensor ID and timestamp, and use it as the row key. Perform an export to BigQuery every day.
C. Store your data in Bigtable. Concatenate the sensor ID and metric, and use it as the row key. Perform an export to BigQuery every day.
D. Store your data in BigQuery. Use the metric as a primary key.
Comments (2)
Selected Answer: B
B. Store your data in Bigtable. Concatenate the sensor ID and timestamp, and use it as the row key. Perform an export to BigQuery every day.
👍 2 · scaenruy · 2024/01/03

Selected Answer: B
- Bigtable excels at incredibly fast lookups by row key, often reaching single-digit millisecond latencies.
- Constructing the row key with sensor ID and timestamp enables efficient retrieval of specific sensor readings at exact timestamps.
- Bigtable's wide-column design effectively stores time series data, allowing for flexible addition of new metrics without schema changes.
- Bigtable scales horizontally to accommodate massive datasets (petabytes or more), easily handling the expected data growth.
👍 2 · raaad · 2024/01/09
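A minimal sketch of the row-key design the comment above describes, using the Python `google-cloud-bigtable` client. The project, instance, table, column family, and separator character are illustrative assumptions, not part of the question:

```python
from google.cloud import bigtable

# Hypothetical names; substitute your own project, instance, and table.
client = bigtable.Client(project="my-project")
instance = client.instance("sensor-instance")
table = instance.table("sensor-readings")

def read_metric(sensor_id: str, ts_epoch_seconds: int):
    """Point lookup: one metric from one sensor at one exact timestamp.

    Row key = sensor ID + '#' + timestamp, matching option B. Bigtable
    stores rows sorted by key, so a point read by the full key is a
    single-digit-millisecond operation.
    """
    # Zero-pad the timestamp so keys sort lexicographically by time.
    row_key = f"{sensor_id}#{ts_epoch_seconds:010d}".encode()
    row = table.read_row(row_key)
    if row is None:
        return None
    # Assumes a single column family "m" with one qualifier "value".
    return row.cells["m"][b"value"][0].value

print(read_metric("sensor-0042", 1704240000))
```

Putting the sensor ID first also groups each sensor's readings contiguously and avoids hotspotting on monotonically increasing timestamps. The second access pattern (daily joins and analytics) is then served by the export to BigQuery, for example via a Dataflow export to Cloud Storage followed by a BigQuery load.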