Detailed Explanation
Understanding the Requirements
The question focuses on two specific objectives:
- Identify queries that return confidential information - This requires knowing which data is classified as confidential
- Identify users who executed those queries - This requires tracking query execution and user activity
Analysis of Each Option
A: Sensitivity-classification labels applied to columns that contain confidential information
- Optimal Choice: Sensitivity classification labels are metadata tags that identify which columns contain sensitive or confidential data according to organizational policies.
- Why it works: By classifying columns with sensitivity labels, the system can automatically detect when queries access confidential data. This provides the foundation for identifying which queries return confidential information.
- Azure Implementation: In Azure Synapse Analytics, sensitivity classification labels are applied to individual columns. Once a column is labeled, the audit log records the sensitivity labels of the data each query returns.
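As a rough illustration of what labeling a column involves, the sketch below builds a T-SQL ADD SENSITIVITY CLASSIFICATION statement as a string. The helper name classification_statement, the dbo.Customers.CreditCardNumber column, and the label and information type values are all hypothetical; check the exact syntax against the T-SQL reference for your SQL pool before running the statement.

```python
def classification_statement(schema, table, column,
                             label="Confidential",
                             information_type="Financial"):
    """Build an ADD SENSITIVITY CLASSIFICATION statement for one column.

    Hypothetical helper for illustration; it only produces the SQL text.
    """
    # Accept only simple identifiers so the generated SQL stays predictable.
    for name in (schema, table, column):
        if not name.isidentifier():
            raise ValueError(f"unsupported identifier: {name!r}")

    def quote_literal(value):
        # Double any single quotes, as T-SQL string literals require.
        return "'" + value.replace("'", "''") + "'"

    target = f"{schema}.{table}.{column}"
    return (
        f"ADD SENSITIVITY CLASSIFICATION TO {target} "
        f"WITH (LABEL = {quote_literal(label)}, "
        f"INFORMATION_TYPE = {quote_literal(information_type)});"
    )


# Example column name is an assumption, not taken from the question.
statement = classification_statement("dbo", "Customers", "CreditCardNumber")
print(statement)
```

The generated statement would then be run against the database by someone with permission to manage classifications.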
B: Resource tags for databases that contain confidential information
- Not Suitable: Resource tags are organizational metadata for Azure resources, not for data classification within databases. They help with resource management and billing but don't help identify specific queries that return confidential data.
C: Audit logs sent to a Log Analytics workspace
- Optimal Choice: Audit logs capture detailed information about database activities, including:
  - Which queries were executed
  - Who executed them (user identity)
  - When they were executed
  - What data was accessed
- Why it works: By sending audit logs to Log Analytics, you can quickly search, analyze, and create alerts for queries that access confidential data. This directly addresses the requirement to identify both the queries and the users.
D: Dynamic data masking for columns that contain confidential information
- Not Suitable: Dynamic data masking hides confidential data from unauthorized users but doesn't help identify which queries are accessing confidential information. It's a protection mechanism, not a detection or identification tool.
Why A and C are the Correct Combination
- Sensitivity Classification (A) provides the intelligence to know WHAT data is confidential
- Audit Logs (C) provide the tracking to know WHO accessed it and WHEN
Together, these components create a complete solution:
- Sensitivity classification identifies which columns contain confidential data
- Audit logs record each query, the user who ran it, and when it ran, including queries against those classified columns
- When combined, you can quickly query audit logs to find all queries that accessed sensitive columns and identify the users who executed them
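The combination can be sketched in a few lines of Python. This is a simplified, in-memory illustration, not the Log Analytics query itself: the records are invented, the field names (event_time, server_principal_name, statement, data_sensitivity_information) are modeled on audit record fields, and the sensitivity field is reduced to a plain label string, which is an assumption.

```python
from collections import defaultdict

# Hypothetical audit records; real records carry many more fields.
audit_records = [
    {"event_time": "2024-01-05T10:15:00Z",
     "server_principal_name": "alice@contoso.com",
     "statement": "SELECT CreditCardNumber FROM dbo.Customers",
     "data_sensitivity_information": "Confidential"},
    {"event_time": "2024-01-05T10:20:00Z",
     "server_principal_name": "bob@contoso.com",
     "statement": "SELECT Region FROM dbo.Sales",
     "data_sensitivity_information": ""},
    {"event_time": "2024-01-05T11:02:00Z",
     "server_principal_name": "carol@contoso.com",
     "statement": "SELECT Salary FROM dbo.Employees",
     "data_sensitivity_information": "Confidential"},
]


def confidential_queries_by_user(records, label="Confidential"):
    """Group statements that returned labeled data by the user who ran them."""
    result = defaultdict(list)
    for record in records:
        # An empty value means the query returned no classified columns.
        if label in record.get("data_sensitivity_information", ""):
            result[record["server_principal_name"]].append(record["statement"])
    return dict(result)


report = confidential_queries_by_user(audit_records)
for user, statements in sorted(report.items()):
    print(user, statements)
```

Without the classification labels (A), the filter has nothing to match on; without the audit records (C), there is no user or statement to report.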
Alternative Options Analysis
- B (Resource tags): Apply only at the Azure resource level (for example, the database), not at the column or query level
- D (Dynamic data masking): Prevents exposure but doesn't help with identification and tracking
The combination of A and C provides proactive identification (through classification) and comprehensive tracking (through auditing), which directly minimizes the time to identify problematic queries and their users.