Topic 1 Question 129
You use BigQuery as your centralized analytics platform. New data is loaded every day, and an ETL pipeline modifies the original data and prepares it for end users. This ETL pipeline is regularly modified and can generate errors, but sometimes the errors are detected only after two weeks. You need to provide a method to recover from these errors, and your backups should be optimized for storage costs. How should you organize your data in BigQuery and store your backups?
A. Organize your data in a single table, then export, compress, and store the BigQuery data in Cloud Storage.
B. Organize your data in separate tables for each month, then export, compress, and store the data in Cloud Storage.
C. Organize your data in separate tables for each month, and duplicate your data in a separate dataset in BigQuery.
D. Organize your data in separate tables for each month, and use snapshot decorators to restore the table to a time prior to the corruption.
Comments (17)
Should be B
👍 22

[Removed], 2020/03/22:
B. The question is specifically about organizing the data in BigQuery and storing backups.
👍 12

Ganshank, 2020/04/13:
Selected answer: B
B seems the best solution (though C is also a good candidate). D is incorrect: table decorators allow time travel back only up to 7 days (see https://cloud.google.com/bigquery/table-decorators). If you want to keep older snapshots, you have to save them into a separate table yourself (and pay for that storage).
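As a sketch of option B, each monthly table can be exported compressed to Cloud Storage with the `bq` CLI, and reloaded if a pipeline error is discovered later. The dataset, table, and bucket names below are placeholders, not from the question:

```shell
# Back up one monthly table as gzip-compressed CSV in Cloud Storage.
# A Nearline or Coldline bucket keeps backup storage costs low.
bq extract \
  --destination_format=CSV \
  --compression=GZIP \
  mydataset.sales_202001 \
  gs://my-backup-bucket/sales_202001/part-*.csv.gz

# Recover: reload only the affected month into a fresh table
# (bq load decompresses gzip CSV automatically).
bq load \
  --source_format=CSV \
  --autodetect \
  mydataset.sales_202001_restored \
  gs://my-backup-bucket/sales_202001/part-*.csv.gz
```

Splitting the data into monthly tables means only the months touched by the faulty ETL run need to be restored, rather than re-importing a single large table.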
👍 6

MaxNRG, 2022/01/09