
You are working on a data processing project that involves analyzing large volumes of financial transaction data. The data includes sensitive information, such as customer identifiers and account details. Describe how you would use Apache Spark to create a secure ETL pipeline for this use case, and explain the considerations involved in protecting sensitive data throughout the pipeline.
A
Use Apache Spark's built-in functions to process the financial transaction data without any security measures, as they are not required for this use case.
B
Design a secure ETL pipeline using Apache Spark, incorporating data encryption, access control, and data masking techniques to protect sensitive information throughout the pipeline.
C
Use a traditional database system to store and process the financial transaction data, as it provides better security features than Apache Spark.
D
Focus only on the ETL process and ignore the security aspect, as it is not relevant to data processing.
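A minimal PySpark sketch illustrating the masking step described in option B is shown below. The column names (customer_id, account_number, amount) and storage paths are hypothetical, and encryption at rest and access control would normally be enforced at the cluster and storage layers (e.g., KMS-managed keys, IAM or Ranger policies) rather than inside the job itself.

```python
# Sketch of the transform stage from option B: pseudonymize identifiers
# and mask account details before data leaves the pipeline.
# Column names and paths below are illustrative assumptions.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, sha2, regexp_replace

spark = (
    SparkSession.builder
    .appName("secure-transactions-etl")
    # Encrypt Spark's internal RPC/shuffle traffic in transit.
    .config("spark.network.crypto.enabled", "true")
    .getOrCreate()
)

# Extract: read raw transactions from the landing zone (path is illustrative).
raw = spark.read.parquet("s3a://bucket/raw/transactions/")

# Transform: hash customer identifiers (pseudonymization) and mask all but
# the last four digits of the account number (data masking).
masked = (
    raw
    .withColumn("customer_id", sha2(col("customer_id").cast("string"), 256))
    .withColumn(
        "account_number",
        regexp_replace(col("account_number"), r"\d(?=\d{4})", "*"),
    )
)

# Load: write only the protected columns to the curated zone, where
# storage-level encryption and access policies apply.
(
    masked.select("customer_id", "account_number", "amount")
    .write.mode("overwrite")
    .parquet("s3a://bucket/curated/transactions/")
)
```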