
Ultimate access to all questions.
In a large-scale data center, your team is tasked with developing a predictive maintenance solution to preempt server failures using unlabeled monitoring data. The solution must be scalable, cost-effective, and capable of handling real-time data streams. Given these constraints, what is the most appropriate initial step to take? Choose one correct option.
A
Hire a team of data analysts to manually label historical performance data and use this dataset to train a supervised learning model.
B
Implement a complex deep learning model to directly analyze the unlabeled data and predict failures without any initial labeling.
C
Apply a simple statistical method, such as calculating z-scores, to automatically label the data as 'normal' or 'anomalous' based on historical performance, then use this labeled data to train a supervised learning model.
D
Use an unsupervised learning algorithm to cluster the data into 'normal' and 'anomalous' groups without any initial labeling, and then monitor these clusters for changes over time.