
Answer-first summary for fast verification
Answer: Use Dataprep by Trifacta to build and maintain the transformation recipes, and execute them on a scheduled basis
The correct answer is A. Dataprep by Trifacta gives non-developers a graphical interface for building and maintaining data transformation recipes, and it can run those recipes on a schedule, which covers both the monthly files and their periodically changing schema. Options B, C, and D each depend on writing and maintaining code (SQL, Python, or Spark), so they do not meet the requirement that non-developer analysts be able to modify the transformations.
Author: LeetQuiz Editorial Team
As a Google Professional Data Engineer, you receive CSV files from a third-party vendor every month. The schema of these files changes every third month, so the data cleansing process must be easy to adapt. Your requirements for handling and transforming this data are that the transformations run automatically on a set schedule, that non-developer analysts can easily modify them, and that a graphical interface is available for designing them. What should you do?
A. Use Dataprep by Trifacta to build and maintain the transformation recipes, and execute them on a scheduled basis
B. Load each month's CSV data into BigQuery, and write a SQL query to transform the data to a standard schema. Merge the transformed tables together with a SQL query
C. Help the analysts write a Dataflow pipeline in Python to perform the transformation. The Python code should be stored in a revision control system and modified as the incoming data's schema changes
D. Use Apache Spark on Dataproc to infer the schema of the CSV file before creating a DataFrame. Then implement the transformations in Spark SQL before writing the data out to Cloud Storage and loading into BigQuery