Topic 1 Question 207
You are collecting IoT sensor data from millions of devices across the world and storing the data in BigQuery. Your access pattern is based on recent data, filtered by location_id and device_version with the following query:
You want to optimize your queries for cost and performance. How should you structure your data?
A. Partition table data by create_date, location_id, and device_version.
B. Partition table data by create_date, cluster table data by location_id and device_version.
C. Cluster table data by create_date, location_id, and device_version.
D. Cluster table data by create_date, partition by location_id and device_version.
Comments (5)
- Selected answer: B
The best answer is B: partition table data by create_date, and cluster table data by location_id and device_version.
Here's a breakdown of why this structure is optimal:
Partitioning by create_date:
Aligns with the query pattern: queries filter for recent data on create_date, so partitioning on this column lets BigQuery prune partitions and scan less data, which lowers query cost and improves performance.
Manages data growth: partitioning segments data by date, making large datasets easier to manage and storage costs easier to optimize.
Clustering by location_id and device_version:
Enhances filtering: queries frequently filter by location_id and device_version, and clustering co-locates rows with the same values within each partition, which further reduces the data scanned.
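A minimal DDL sketch of the layout described in option B may make this concrete. The dataset name, table name, column types, and the reading column are assumptions for illustration; only create_date, location_id, and device_version come from the question.

```sql
-- Hypothetical dataset, table, and column types; only the three
-- column names create_date, location_id, device_version are from the question.
CREATE TABLE mydataset.sensor_readings (
  create_date    DATE NOT NULL,   -- assumed to be a DATE column
  location_id    STRING,          -- assumed type
  device_version STRING,          -- assumed type
  reading        FLOAT64          -- hypothetical payload column
)
PARTITION BY create_date                  -- one partitioning column only
CLUSTER BY location_id, device_version    -- sorts data within each partition
OPTIONS (require_partition_filter = TRUE);  -- optional guard against full scans
```

The require_partition_filter option is not part of the question; it is one way to make sure every query supplies a create_date filter.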
👍 2 e70ea9e 2023/12/30
- Selected answer: B
Answer is B:
- Partitioning the table by create_date allows us to efficiently query data based on time, which is common in access patterns that prioritize recent data.
- Clustering the table by location_id and device_version further organizes the data within each partition, making queries filtered by these columns more efficient and cost-effective.
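As a rough illustration of the kind of query that benefits, consider the sketch below. It is not the query from the question, which is not reproduced on this page. It assumes a hypothetical table mydataset.sensor_readings partitioned on create_date and clustered on location_id and device_version, with made-up filter values and a made-up reading column.

```sql
-- Illustrative only: table name, literal values, and the reading column are assumptions.
SELECT
  location_id,
  device_version,
  AVG(reading) AS avg_reading
FROM mydataset.sensor_readings
WHERE create_date >= DATE_SUB(CURRENT_DATE(), INTERVAL 7 DAY)  -- limits the scan to recent partitions
  AND location_id = 'loc-123'                                  -- first clustering column
  AND device_version = 'v2'                                    -- second clustering column
GROUP BY location_id, device_version;
```

The date predicate restricts the scan to recent partitions, and the filters on the clustering columns let BigQuery skip blocks inside those partitions.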
👍 2 raaad 2024/01/02
- Selected answer: B
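B is the only correct answer. A table can be partitioned on only one column, which rules out A and D. Clustering without partitioning (C) is allowed on BigQuery tables, but it gives up partition pruning on create_date, which the recent-data access pattern relies on.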
👍 1 Smakyel79 2024/01/07