You have a batch processing pipeline for structured data on Google Cloud, currently powered by PySpark for data transformations at scale. However, the pipeline takes more than twelve hours to run end to end. To improve both development speed and execution time, you want to move to a serverless tool that uses SQL syntax. Your raw data has already been transferred to Cloud Storage. What steps would you take to build the new pipeline on Google Cloud so that it meets both the development-speed and execution-time requirements?
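The scenario points toward BigQuery, which is serverless, uses SQL, and can ingest data directly from Cloud Storage. The sketch below shows one plausible shape for such a pipeline using the google-cloud-bigquery Python client; the project, bucket, table names, column names, and the Parquet file format are all hypothetical placeholders, not details given in the question.

```python
from google.cloud import bigquery

# Hypothetical identifiers for illustration only; substitute real values.
PROJECT_ID = "my-gcp-project"
RAW_URI = "gs://my-raw-bucket/structured/*.parquet"
STAGING_TABLE = "my-gcp-project.analytics.raw_events"
OUTPUT_TABLE = "my-gcp-project.analytics.daily_summary"

client = bigquery.Client(project=PROJECT_ID)

# 1. Batch-load the raw files from Cloud Storage into a native BigQuery table.
load_config = bigquery.LoadJobConfig(
    source_format=bigquery.SourceFormat.PARQUET,
    write_disposition=bigquery.WriteDisposition.WRITE_TRUNCATE,
)
load_job = client.load_table_from_uri(RAW_URI, STAGING_TABLE, job_config=load_config)
load_job.result()  # Wait for the load job to complete.

# 2. Express the transformation in SQL and materialize the result;
#    BigQuery executes the query serverlessly at scale.
transform_sql = f"""
CREATE OR REPLACE TABLE `{OUTPUT_TABLE}` AS
SELECT
  event_date,            -- hypothetical columns; adapt to the real schema
  COUNT(*) AS event_count
FROM `{STAGING_TABLE}`
GROUP BY event_date
"""
client.query(transform_sql).result()
```

Depending on the requirements, the load step could instead be replaced with an external table defined over the Cloud Storage files, so the SQL transformations query the raw data in place without a separate ingestion job.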