You are a Machine Learning Engineer at a utility company tasked with improving the detection of defects in underground electric cables using thermal imaging. Your dataset comprises 10,000 thermal images, of which only 100 (1%) contain visible defects, making it highly imbalanced. The company requires a robust evaluation method that assesses the model's ability to correctly identify defects and also accounts for false positives, since in real-world maintenance an unnecessary excavation causes significant cost and operational disruption. Given these constraints, which of the following methods is the most effective for evaluating your model's performance on a test dataset? Choose the best option.