Databricks Certified Data Engineer - Professional


A data engineer aims to incrementally ingest JSON data into a Delta table in near real-time. Which method correctly achieves this?





Explanation:

Option A is correct because it uses Databricks Auto Loader. The stream is read with format('cloudFiles'), and .option('cloudFiles.format', 'json') declares JSON as the source file format. Auto Loader keeps track of the files it has already processed, so only newly arrived JSON files in cloud storage are ingested into the Delta table, in near real-time. Writing with .writeStream and a checkpoint location makes the ingestion fault-tolerant: after a failure, the stream resumes from its recorded progress. This approach follows Databricks' recommended practice for incremental file ingestion.
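
The answer options are not reproduced here, so the following is only a minimal sketch of the pattern the explanation describes, not necessarily Option A verbatim. The function name `start_json_ingest`, its parameters, and the paths and table name in the usage note are hypothetical. The sketch also sets `cloudFiles.schemaLocation`, on the assumption that the schema is inferred rather than supplied, and it assumes a Databricks runtime, where the `cloudFiles` source is available.

```python
def start_json_ingest(spark, source_path, checkpoint_path, target_table):
    """Start an Auto Loader stream that appends new JSON files to a Delta table.

    All argument names are illustrative; `spark` is an active SparkSession
    on a Databricks cluster.
    """
    stream = (
        spark.readStream
        .format("cloudFiles")                                   # Auto Loader source
        .option("cloudFiles.format", "json")                    # incoming files are JSON
        .option("cloudFiles.schemaLocation", checkpoint_path)   # assumption: schema is inferred
        .load(source_path)
    )
    return (
        stream.writeStream
        .format("delta")
        .option("checkpointLocation", checkpoint_path)          # progress tracking for recovery
        .toTable(target_table)                                  # starts the streaming query
    )
```

On a cluster this might be called as `start_json_ingest(spark, "/mnt/raw/events", "/mnt/checkpoints/events", "bronze_events")`, where all three values are placeholders. Reusing the same checkpoint location across restarts is what lets the stream resume without re-ingesting files it has already processed.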