Topic 1 Question 147
A financial services company receives a regular data feed from its credit card servicing partner. Approximately 5,000 records are sent every 15 minutes in plaintext, delivered over HTTPS directly into an Amazon S3 bucket with server-side encryption. This feed contains sensitive credit card primary account number (PAN) data. The company needs to automatically mask the PAN before sending the data to another S3 bucket for additional internal processing. The company also needs to remove and merge specific fields, and then transform the record into JSON format. Additionally, extra feeds are likely to be added in the future, so any design needs to be easily expandable.
Which solution will meet these requirements?
A. Invoke an AWS Lambda function on file delivery that extracts each record and writes it to an Amazon SQS queue. Invoke another Lambda function when new messages arrive in the SQS queue to process the records, writing the results to a temporary location in Amazon S3. Invoke a final Lambda function once the SQS queue is empty to transform the records into JSON format and send the results to another S3 bucket for internal processing.
B. Invoke an AWS Lambda function on file delivery that extracts each record and writes it to an Amazon SQS queue. Configure an AWS Fargate container application to automatically scale to a single instance when the SQS queue contains messages. Have the application process each record, and transform the record into JSON format. When the queue is empty, send the results to another S3 bucket for internal processing and scale down the AWS Fargate instance.
C. Create an AWS Glue crawler and custom classifier based on the data feed formats and build a table definition to match. Invoke an AWS Lambda function on file delivery to start an AWS Glue ETL job to transform the entire record according to the processing and transformation requirements. Define the output format as JSON. Once complete, have the ETL job send the results to another S3 bucket for internal processing.
D. Create an AWS Glue crawler and custom classifier based on the data feed formats and build a table definition to match. Perform an Amazon Athena query on file delivery to start an Amazon EMR ETL job to transform the entire record according to the processing and transformation requirements. Define the output format as JSON. Once complete, send the results to another S3 bucket for internal processing and scale down the EMR cluster.
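Whichever orchestration is chosen, the per-record transformation the question describes (mask the PAN, remove and merge fields, emit JSON) can be sketched in plain Python. The field names (`pan`, `first_name`, `last_name`, `partner_ref`) are assumptions for illustration only; in option C this logic would live inside the Glue ETL script, with the real schema coming from the crawler's table definition:

```python
import json

def transform_record(record: dict) -> str:
    """Mask the PAN, merge name fields, drop a field, and emit JSON.

    Field names are hypothetical; the actual feed schema would come
    from the Glue crawler's table definition.
    """
    pan = record.pop("pan")
    # Mask all but the last four digits of the PAN.
    masked_pan = "*" * (len(pan) - 4) + pan[-4:]
    # Merge two source fields into a single output field.
    cardholder = f"{record.pop('first_name')} {record.pop('last_name')}"
    # Remove a field that must not reach internal processing.
    record.pop("partner_ref", None)
    record.update({"masked_pan": masked_pan, "cardholder": cardholder})
    return json.dumps(record)
```

For example, a record with a 16-digit PAN comes back as JSON whose `masked_pan` keeps only the final four digits, with the name fields merged and the removed field absent.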
User votes
Comments (6)
Selected answer: C
Extract data from S3 + mask + send to another S3 + transform/process + load into S3: these are all ETL/ELT tasks, which should bring Glue to mind.
EMR is more focused on big data processing frameworks such as Hadoop and Spark, while Glue is focused on ETL. Moreover, 5,000 records every 15 minutes is not that much data, so I choose C.
👍 11

God_Is_Love 2023/03/10 - Selected answer: C
C is correct.
👍 4

zozza2023 2023/01/29 - Selected answer: C
C is correct. It processes the data in batch mode using a Glue ETL job, which can handle large amounts of data and can be scheduled to run periodically. This solution is also easily expandable for future feeds.
A: It uses multiple Lambda functions, an SQS queue, and a temporary S3 location, which increases operational overhead. B: Fargate may not be the most cost-effective option, and it may not handle large amounts of data well. D: Athena and EMR are both powerful tools, but they are more complex and can be more costly than Glue.
👍 3

masetromain 2023/01/16
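As a concrete sketch of option C's trigger, an S3-invoked Lambda function can start the Glue ETL job by name. The job name "mask-pan-feed" and the "--input_path" argument are assumptions rather than details from the question, and the Glue client is injectable so the handler can be exercised without AWS credentials:

```python
# Hypothetical trigger for option C: an S3 event notification invokes
# this Lambda, which starts one Glue job run per delivered feed file.

def lambda_handler(event, context, glue_client=None):
    """S3-triggered entry point; glue_client is injectable for testing."""
    if glue_client is None:
        import boto3  # provided by the AWS Lambda runtime
        glue_client = boto3.client("glue")

    job_run_ids = []
    for record in event.get("Records", []):  # S3 event notification shape
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]
        response = glue_client.start_job_run(
            JobName="mask-pan-feed",  # assumed job name
            Arguments={"--input_path": f"s3://{bucket}/{key}"},
        )
        job_run_ids.append(response["JobRunId"])
    return {"job_run_ids": job_run_ids}
```

Adding a future feed then means configuring another S3 event notification (and, if the schema differs, another crawler/classifier), which is what makes this design easy to expand.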