Databricks Certified Data Engineer - Professional

Which of the following methods completes the code snippet below so that it returns a Spark DataFrame in a Structured Streaming query?

spark.readStream.format("kafka")
  .option("kafka.bootstrap.servers", "...")
  .option("subscribe", "topic")
  ------

.load()

.print()

.return()

.merge()

Explanation:

In Apache Spark Structured Streaming, the .load() method is used to create a streaming DataFrame from a data source. Here's why:

.load() - Correct. Called on the DataStreamReader returned by spark.readStream, it returns a streaming DataFrame for the configured source (Kafka in this case). The call is lazy: no records are read until a streaming query is started on that DataFrame.
.print() - Not a method of DataStreamReader. A DataFrame's schema is printed with .printSchema(), which is a debugging aid and does not load data.
.return() - Not a method in the Spark DataFrame or DataStreamReader API.
.merge() - Not a method of DataStreamReader. In the Databricks ecosystem, merge refers to the Delta Lake upsert operation (MERGE INTO, or DeltaTable.merge()), which modifies a table rather than loading a streaming source.

The correct sequence for creating a streaming DataFrame from Kafka is spark.readStream, then .format("kafka"), then the .option(...) calls, and finally .load().
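As a minimal PySpark sketch of the completed snippet: the helper name read_kafka_stream and its parameters are hypothetical, introduced only for illustration, and the sketch assumes an existing SparkSession with the Kafka connector available. The broker address is passed in by the caller rather than filled in here.

```python
def read_kafka_stream(spark, bootstrap_servers, topic):
    """Return a streaming DataFrame subscribed to a single Kafka topic.

    `spark` is assumed to be an existing SparkSession; this helper and its
    parameter names are illustrative, not part of the Spark API.
    """
    return (
        spark.readStream                                        # DataStreamReader
        .format("kafka")                                        # select the Kafka source
        .option("kafka.bootstrap.servers", bootstrap_servers)   # broker list
        .option("subscribe", topic)                             # topic to subscribe to
        .load()                                                 # streaming DataFrame (lazy)
    )
```

With a real session, the returned DataFrame's isStreaming property can be used to confirm that it is a streaming DataFrame; no data is read until a query is started on it.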

This streaming DataFrame can then be used with operations like .writeStream to output the processed data to various sinks.
