
Explanation:
Option B is the most comprehensive and scalable approach. It leverages automation for data source discovery and machine learning for accurate data classification, which is crucial for a multinational corporation with diverse data sources. Including detailed metadata such as data lineage, usage statistics, and access controls ensures the catalog is robust and useful for various business needs.
Ultimate access to all questions.
No comments yet.
You are tasked with creating a data catalog for a multinational corporation that operates in various sectors including finance, healthcare, and retail. The data catalog needs to support multiple data sources, including on-premises databases, cloud storage, and third-party APIs. Describe the steps you would take to create this data catalog, including how you would handle data classification and the components of metadata you would include.
A
Start by identifying all data sources, then manually classify each data type and create metadata entries for each source.
B
Automate the discovery of data sources, use machine learning for data classification, and include detailed metadata such as data lineage, usage statistics, and access controls.
C
Focus only on the finance sector data, classify it manually, and include basic metadata like data type and source location.
D
Ignore data classification and metadata, simply list all data sources in a centralized repository.