
Answer-first summary for fast verification
Answer: 1. Define a BigLake table. 2. Create a taxonomy of policy tags in Data Catalog. 3. Add policy tags to columns. 4. Process with the Spark-BigQuery connector or BigQuery SQL.
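The first three steps above can be sketched as BigQuery DDL plus a schema fragment. This is a minimal sketch: the project, dataset, bucket, connection, and policy-tag resource names are hypothetical placeholders, not values from the question.

```python
# Sketch of steps 1-3, composed as strings so the shapes are checkable.
# All resource names below are hypothetical placeholders.
import json

# Step 1: a BigLake table over Parquet files in Cloud Storage.
# Requires a pre-created Cloud resource connection (here `us.gcs-conn`).
create_biglake_table = """
CREATE EXTERNAL TABLE `my-project.lake_dataset.customer_events`
WITH CONNECTION `my-project.us.gcs-conn`
OPTIONS (
  format = 'PARQUET',
  uris = ['gs://my-lake-bucket/events/*.parquet']
);
""".strip()

# Steps 2-3: once a taxonomy exists in Data Catalog, each policy tag has a
# resource name like the one below. Attaching it to a column is a schema
# update (e.g. `bq update --schema schema.json`), where the sensitive field
# carries a `policyTags` entry:
pii_tag = "projects/my-project/locations/us/taxonomies/111/policyTags/222"
email_field = {
    "name": "email",
    "type": "STRING",
    "policyTags": {"names": [pii_tag]},
}

print(json.dumps(email_field, indent=2))
```

Once the tag is attached, only principals granted the Fine-Grained Reader role on that policy tag can read the `email` column; everyone else can still query the remaining columns.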
The correct answer is B. Here is the detailed explanation:
1. **BigLake Integration**: BigLake lets you define tables on top of data in Cloud Storage, bridging data lake storage and BigQuery's analytics engine without moving or duplicating the data. This approach is cost-effective and scalable.
2. **Data Catalog for Governance**: Creating a taxonomy of policy tags in Google Cloud's Data Catalog and applying those tags to specific columns of your BigLake tables enables fine-grained, column-level access control.
3. **Processing with Spark and SQL**: The Spark-BigQuery connector lets data scientists run Apache Spark jobs directly against BigQuery and BigLake tables, while BigQuery SQL serves the SQL workloads, so both processing needs are covered.
4. **Scalability into a Data Mesh**: BigLake and Data Catalog are designed to scale into a data mesh architecture, in which data ownership and governance are decentralized across domain teams.
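On the consumption side, the sketch below shows how a data scientist might reach the governed table through the Spark-BigQuery connector. The connector coordinate, version, and table name are assumptions; the PySpark snippet is kept as text because executing it needs a live Spark cluster and GCP credentials.

```python
# Sketch: reading the governed BigLake-backed table from Spark.
# Connector coordinate and table name are hypothetical. Policy tags are
# enforced server-side, so policy-tagged columns are readable only by
# principals holding the Fine-Grained Reader role.
connector_pkg = (
    "com.google.cloud.spark:spark-bigquery-with-dependencies_2.12:0.36.1"
)

# Equivalent PySpark, shown as text (needs a cluster and credentials to run):
pyspark_read = f"""
from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .appName("biglake-read")
         .config("spark.jars.packages", "{connector_pkg}")
         .getOrCreate())

df = (spark.read.format("bigquery")
      .option("table", "my-project.lake_dataset.customer_events")
      .load())
df.select("event_id", "event_ts").show()  # omit policy-tagged columns
"""
print(pyspark_read)
```

SQL-oriented users query the same BigLake table directly in BigQuery, so both halves of the team work against one governed copy of the data.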
Author: LeetQuiz Editorial Team
As part of a digital transformation initiative, your organization has migrated its on-premises Apache Hadoop Distributed File System (HDFS) data lake to Google Cloud Storage. The data being stored will be analyzed and processed by the data science team leveraging Apache Spark and SQL technologies. Given the sensitive nature of some of the data, it is crucial to enforce security policies at the column level. The selected solution must be both cost-effective and capable of scaling into a data mesh architecture. What approach should you take to meet these requirements?