Databricks Certified Data Engineer - Associate


Question 28

A data engineering team is in the process of converting their existing data pipeline to utilize Auto Loader for incremental processing in the ingestion of JSON files. One data engineer comes across the following code block in the Auto Loader documentation:

streaming_df = (spark.readStream.format("cloudFiles")
 .option("cloudFiles.format", "json")
 .option("cloudFiles.schemaLocation", schemaLocation)
 .load(sourcePath))

Assuming that schemaLocation and sourcePath have been set correctly, which of the following changes does the data engineer need to make to convert this code block to use Auto Loader to ingest the data?

A. The data engineer needs to change the format("cloudFiles") line to format("autoLoader").

B. There is no change required. Databricks automatically uses Auto Loader for streaming reads.

C. There is no change required. The inclusion of format("cloudFiles") enables the use of Auto Loader.

D. The data engineer needs to add the .autoLoader line before the .load(sourcePath) line.

E. There is no change required. The data engineer needs to ask their administrator to turn on Auto Loader.

Explanation:

The correct answer is C because the code block shown is already using the proper Auto Loader syntax.

Key Points:

format("cloudFiles") is the correct format specification for Auto Loader in Databricks
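The cloudFiles.format option specifies the format of the source files (JSON in this case)
The cloudFiles.schemaLocation option names the directory where Auto Loader stores the inferred schema and tracks later changes to it, which is what makes schema inference and evolution possible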
The code already follows the standard Auto Loader pattern for streaming ingestion

Why other options are incorrect:

A: Auto Loader uses "cloudFiles" format, not "autoLoader"
B: Auto Loader is not automatically used for all streaming reads - it has to be selected explicitly with format("cloudFiles")
D: There is no .autoLoader method in the Auto Loader API
E: Auto Loader doesn't require administrator intervention to enable - it's available by default
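
To make the point behind option B concrete, here is a minimal sketch contrasting a plain Structured Streaming file read with an Auto Loader read. The function names and parameters are hypothetical; `spark` is assumed to be an existing SparkSession on a Databricks runtime, and `schema` is assumed to be a StructType or DDL string supplied by the caller.

```python
def plain_json_stream(spark, source_path, schema):
    """Built-in Structured Streaming file source; this does NOT use Auto Loader."""
    return (
        spark.readStream.format("json")
        .schema(schema)  # the plain file source expects an explicit schema by default
        .load(source_path)
    )


def auto_loader_json_stream(spark, source_path, schema_location):
    """Auto Loader read, selected explicitly through the cloudFiles source."""
    return (
        spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", "json")  # format of the incoming files
        .option("cloudFiles.schemaLocation", schema_location)  # where the inferred schema is kept
        .load(source_path)
    )
```

Both functions return a streaming DataFrame, but only the second one goes through Auto Loader, because only it requests the cloudFiles source.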

Auto Loader Syntax:

(spark.readStream.format("cloudFiles")
  .option("cloudFiles.format", "<file-format>")
  .option("cloudFiles.schemaLocation", "<schema-location>")
  .load("<source-path>"))

The provided code is already correctly configured for Auto Loader incremental processing.
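
The question only shows the read side. As a rough sketch of how that read might sit inside an incremental ingestion job, the function below pairs it with a streaming write. The function name, `checkpoint_path`, and `target_table` are hypothetical and not part of the question; `spark` is again assumed to be an existing SparkSession on a Databricks runtime, and the `availableNow` trigger assumes a runtime recent enough to support it.

```python
def ingest_json_incrementally(spark, source_path, schema_location,
                              checkpoint_path, target_table):
    """Read JSON files with Auto Loader and append them to a table."""
    streaming_df = (
        spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", "json")
        .option("cloudFiles.schemaLocation", schema_location)
        .load(source_path)
    )
    return (
        streaming_df.writeStream
        # The checkpoint records stream progress, so a rerun only picks up new files.
        .option("checkpointLocation", checkpoint_path)
        # Process everything currently available, then stop (batch-style run).
        .trigger(availableNow=True)
        .toTable(target_table)  # starts the query and returns a StreamingQuery
    )
```

Dropping the trigger line would leave the query running continuously instead of stopping once the backlog is processed; which mode fits depends on how the pipeline is scheduled.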
