Ultimate access to all questions.
You are a Machine Learning Engineer at a retail company looking to optimize the ML pipeline for processing structured sales data on Google Cloud. The current pipeline uses PySpark for data transformations after moving raw data into Cloud Storage, but it's taking over 12 hours to run, which is not meeting the business's need for near-real-time analytics. The company requires a serverless solution that leverages SQL syntax for ease of use and efficiency. Additionally, the solution must be cost-effective, scalable, and minimize operational overhead. Which of the following approaches would best meet these requirements? (Choose one correct option)