Google Professional Data Engineer




You are working with a BigQuery dataset that contains customers' street addresses. What is the most effective method to retrieve all instances of street addresses from this dataset?




Explanation:

The correct answer is C: create a deep inspection job on each table in your dataset with Cloud Data Loss Prevention (DLP), using an inspection template that includes the STREET_ADDRESS infoType. A deep inspection job scans the table contents and reports the individual findings that match the infoType, which is what retrieving street addresses requires. Cloud DLP is designed for discovering, classifying, and protecting sensitive data, so it is the right tool for this task.
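As a rough illustration of option C, the sketch below builds an inspection template and a per-table inspection job as plain dictionaries, then submits them with the `google-cloud-dlp` Python client. The project, dataset, and table names, the `dlp_findings` output table, and the helper function names are all hypothetical placeholders, and the job options shown (such as `include_quote`) are one reasonable choice rather than the only valid configuration.

```python
def build_inspect_template(display_name="street-address-template"):
    """Inspection template body that looks only for the STREET_ADDRESS infoType."""
    return {
        "display_name": display_name,  # hypothetical name
        "inspect_config": {
            "info_types": [{"name": "STREET_ADDRESS"}],
            "include_quote": True,  # include the matched text in each finding
        },
    }


def build_inspect_job(project_id, dataset_id, table_id, template_name):
    """Inspection job body for a single BigQuery table."""
    return {
        "inspect_template_name": template_name,
        "storage_config": {
            "big_query_options": {
                "table_reference": {
                    "project_id": project_id,
                    "dataset_id": dataset_id,
                    "table_id": table_id,
                }
            }
        },
        # Save findings to a BigQuery table (assumed name: dlp_findings).
        "actions": [
            {
                "save_findings": {
                    "output_config": {
                        "table": {
                            "project_id": project_id,
                            "dataset_id": dataset_id,
                            "table_id": "dlp_findings",
                        }
                    }
                }
            }
        ],
    }


def submit_jobs(project_id, dataset_id, table_ids):
    """Create the template once, then one inspection job per table."""
    # Imported here so the builders above work without the library installed.
    from google.cloud import dlp_v2  # pip install google-cloud-dlp

    client = dlp_v2.DlpServiceClient()
    parent = f"projects/{project_id}/locations/global"
    template = client.create_inspect_template(
        request={"parent": parent, "inspect_template": build_inspect_template()}
    )
    for table_id in table_ids:
        client.create_dlp_job(
            request={
                "parent": parent,
                "inspect_job": build_inspect_job(
                    project_id, dataset_id, table_id, template.name
                ),
            }
        )
```

Calling `submit_jobs("my-project", "crm", ["customers", "orders"])` (placeholder names) would require valid Google Cloud credentials and the DLP API enabled on the project.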

  • Option A is incorrect because a discovery scan configuration is better suited to identifying sensitive data broadly, for example across an entire organization, than to inspecting the contents of one specific dataset.
  • Option B is incorrect because using REGEXP_CONTAINS to search for the word 'street' is unreliable: it misses addresses that do not contain that word and matches text that mentions 'street' without being an address.
  • Option D is incorrect because a de-identification job is intended for anonymizing or masking sensitive data to protect privacy, not for retrieving or identifying specific data like street addresses.
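The weakness of the keyword approach in option B can be shown with a small sketch. It uses Python's `re` module as a stand-in for BigQuery's `REGEXP_CONTAINS`, and the sample strings and the `naive_street_match` helper are made up for illustration.

```python
import re


def naive_street_match(text: str) -> bool:
    """Roughly mirrors REGEXP_CONTAINS(LOWER(col), r'street')."""
    return re.search(r"street", text.lower()) is not None


# (text, is_really_an_address) pairs; invented sample data.
samples = [
    ("1600 Amphitheatre Pkwy, Mountain View, CA", True),
    ("221B Baker Street, London", True),
    ("Loves streetwear and street food", False),
]

# Real addresses the keyword search fails to find.
missed = [t for t, is_addr in samples if is_addr and not naive_street_match(t)]

# Non-addresses the keyword search wrongly returns.
false_hits = [t for t, is_addr in samples if not is_addr and naive_street_match(t)]

print("missed:", missed)
print("false hits:", false_hits)
```

Here the parkway address is missed and the free-text comment is wrongly matched, which is the kind of error a dedicated infoType detector is meant to avoid.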

In summary, option C is the most accurate and efficient way to retrieve street addresses from your BigQuery dataset: a Cloud Data Loss Prevention deep inspection job that uses the STREET_ADDRESS infoType.