
Google Professional Machine Learning Engineer
As a data scientist at a manufacturing company, you are tasked with analyzing the company's extensive historical sales data, which consists of hundreds of millions of records. For your exploratory data analysis (EDA), you need to calculate descriptive statistics such as mean, median, and mode; perform complex statistical tests for hypothesis evaluation; and plot feature variations over time. Your primary goal is to use as much of the sales data as possible while minimizing the computational resources required. Given the range of Google Cloud tools available, what should you do?
Explanation:
The correct answer is C: Use BigQuery to calculate the descriptive statistics. Use Vertex AI Workbench user-managed notebooks to visualize the time plots and run the statistical analyses. BigQuery executes aggregations server-side over the full table, so it can compute descriptive statistics across hundreds of millions of records without loading them into notebook memory. Vertex AI Workbench then only has to handle the query results it needs for plotting feature variations over time and for hypothesis tests that rely on Python statistical libraries. Splitting the work this way uses as much of the sales data as possible while keeping the notebook's compute requirements small.
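
As a rough illustration of the BigQuery half of this split, the sketch below builds a Standard SQL query that computes the mean, an approximate median, and an approximate mode in a single pass. The table name `my-project.sales.history` and the column `amount` are hypothetical placeholders, not names from the question, and the approximate aggregate functions are one possible choice (exact alternatives exist but generally cost more). The helper `run_stats_query` assumes the `google-cloud-bigquery` package, pandas, and valid credentials are available in the notebook.

```python
def build_stats_query(table: str, column: str) -> str:
    """Build a BigQuery Standard SQL query for mean, approximate median, and approximate mode.

    `table` and `column` are caller-supplied; the names used in the example
    below are hypothetical placeholders.
    """
    return f"""
SELECT
  AVG({column}) AS mean_value,
  -- 100 quantile buckets yield 101 boundaries; index 50 is the approximate median
  APPROX_QUANTILES({column}, 100)[OFFSET(50)] AS approx_median,
  -- most frequent value (approximate mode)
  APPROX_TOP_COUNT({column}, 1)[OFFSET(0)].value AS approx_mode
FROM `{table}`
""".strip()


def run_stats_query(table: str, column: str):
    """Run the query from a notebook and return a one-row pandas DataFrame.

    Assumes google-cloud-bigquery and pandas are installed and that
    application default credentials are configured.
    """
    from google.cloud import bigquery  # imported lazily so the builder works without it

    client = bigquery.Client()
    return client.query(build_stats_query(table, column)).to_dataframe()


if __name__ == "__main__":
    # Hypothetical table and column names, for illustration only.
    print(build_stats_query("my-project.sales.history", "amount"))
```

Only the single aggregated row comes back to the notebook, so plotting and statistical tests can then work on small result sets rather than the raw records.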