
As a Microsoft Fabric Analytics Engineer Associate, you are working on a project with a dataset that contains potential outliers. The project has strict compliance requirements, and the dataset is large, so scalability is a concern. You need to keep the analysis accurate while adhering to the project constraints. Which of the following is the BEST approach to identifying and resolving data outliers in this scenario? (Choose one)
A
Automatically remove all data points that fall outside a specified range or threshold without further analysis, to ensure scalability and compliance.
B
Manually review each outlier in the dataset to determine its validity, despite the time and resource constraints this may impose.
C
Use a combination of automated tools for initial outlier detection and manual review for validation, considering the dataset's distribution, the context of the data, and potential causes of outliers, to balance accuracy with scalability and compliance.
D
Proceed with the analysis ignoring the outliers, assuming they have minimal impact on the overall results, to meet project deadlines.
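
The workflow described in option C (automated detection followed by manual validation) can be sketched in a few lines. This is only a minimal illustration, assuming a single numeric column and the common 1.5 × IQR rule. The function name `flag_outliers_iqr`, the multiplier `k`, and the sample `readings` are all hypothetical and do not come from the question. A real project would pick a detection rule suited to the data's distribution and context.

```python
from statistics import quantiles


def flag_outliers_iqr(values, k=1.5):
    """Split values into (kept, flagged) using the IQR rule.

    Nothing is deleted: flagged points are returned with their
    original index so a reviewer can validate each one.
    The multiplier k=1.5 is a conventional default, not a requirement.
    """
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr

    kept, flagged = [], []
    for index, value in enumerate(values):
        if low <= value <= high:
            kept.append((index, value))
        else:
            flagged.append((index, value))
    return kept, flagged


# Hypothetical sample data for illustration only.
readings = [10, 12, 11, 13, 12, 11, 95, 10, 12, -40]
kept, flagged = flag_outliers_iqr(readings)

# The flagged list is the manual-review queue.
for index, value in flagged:
    print(f"row {index}: value {value} needs review")
```

The automated step scales with the dataset because it makes a single pass after computing the quartiles, while the manual step is limited to the small flagged subset. Keeping the original row index with each flagged value leaves a trail that a reviewer can trace back to the source data.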