Databricks Certified Data Engineer - Professional

Explanation:

This question tests your knowledge of filtering files by extension in Auto Loader. The correct approach combines two things: the pathGlobfilter option, which restricts ingestion to files whose names match a glob pattern (here *.csv), and a wildcard in the path passed to .load(), which matches each of the four top-level directories (orders, employees, students, policies) that contain a data folder. The correct code snippet is:

spark.readStream.format("cloudFiles") \
    .option("cloudFiles.format", 'csv') \
    .schema(schema) \
    .option("pathGlobfilter", "*.csv") \
    .load("s3://bucket/*/data")

For more details, refer to the documentation on filtering files in Auto Loader.
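
The division of labor between the two patterns can be illustrated outside Spark. The following is a minimal plain-Python sketch, not Auto Loader's implementation: the helper names (matches_segments, is_selected) and the sample file names are hypothetical, the directory glob is matched one path segment at a time, and the sketch assumes files sit directly inside each data folder (nested subdirectories are ignored for simplicity).

```python
from fnmatch import fnmatchcase
import posixpath


def matches_segments(directory: str, dir_glob: str) -> bool:
    """Match a directory against a glob one path segment at a time,
    so that '*' stands for exactly one directory level (hypothetical helper)."""
    dir_parts = directory.strip("/").split("/")
    glob_parts = dir_glob.strip("/").split("/")
    if len(dir_parts) != len(glob_parts):
        return False
    return all(fnmatchcase(part, pattern)
               for part, pattern in zip(dir_parts, glob_parts))


def is_selected(file_path: str, dir_glob: str, name_glob: str) -> bool:
    """Approximate the two-step selection: the load-path glob picks the
    directory, and the file-name glob is applied to the base name only."""
    directory, name = posixpath.split(file_path)
    return matches_segments(directory, dir_glob) and fnmatchcase(name, name_glob)


# Hypothetical sample files, used only for illustration.
candidates = [
    "s3://bucket/orders/data/part-0001.csv",
    "s3://bucket/employees/data/part-0001.json",
    "s3://bucket/orders/archive/old.csv",
    "s3://bucket/students/data/part-0002.csv",
]

selected = [p for p in candidates
            if is_selected(p, "s3://bucket/*/data", "*.csv")]
for path in selected:
    print(path)
```

In this sketch only the two .csv files under a data folder are kept: the .json file fails the file-name pattern, and the file under archive fails the directory pattern.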

A team is executing a streaming query using Auto Loader to fetch CSV files from cloud storage. Which query correctly fetches only the files with a .csv extension from the specified locations: s3://bucket/orders/data/, s3://bucket/employees/data/, s3://bucket/students/data/, s3://bucket/policies/data/?

Last updated: January 20, 2026 at 14:03

spark.readStream.format("cloudFiles") \
    .option("cloudFiles.format", 'csv') \
    .schema(schema) \
    .load("s3://bucket/*/data/*.csv")

14.8%

spark.readStream.format("cloudFiles") \
    .option("cloudFiles.format", 'csv') \
    .schema(schema) \
    .option("pathGlobfilter", "*.csv") \
    .load("s3://bucket/*/data")

49.3%

spark.readStream.format("cloudFiles") \
    .option("cloudFiles.format", 'csv') \
    .schema(schema) \
    .option("pathFilter", "*.csv") \
    .load("s3://bucket/*/data")

14.8%

spark.readStream.format("cloudFiles") \
    .schema(schema) \
    .load("s3://bucket/*/data/*.csv")

8.1%