Databricks Certified Data Engineer - Associate

A data engineer has developed a data pipeline to ingest data from a JSON source using Auto Loader, but the engineer has not provided any type inference or schema hints in their pipeline. Upon reviewing the data, the data engineer has noticed that all of the columns in the target table are of the string type despite some of the fields only including float or boolean values. Which of the following describes why Auto Loader inferred all of the columns to be of the string type?

KKeng

Last updated: January 13, 2026 at 09:01

A. There was a type mismatch between the specific schema and the inferred schema

B. JSON data is a text-based format

C. Auto Loader only works with string data

D. All of the fields had at least one null value

E. Auto Loader cannot infer the schema of ingested data

Explanation:

When Auto Loader ingests JSON data without an explicit schema, schema hints, or column type inference enabled, it infers every column as a string because:

JSON is a text-based format that does not encode column data types: the files carry no schema, so Auto Loader's default behavior is to infer all JSON columns, including nested fields, as strings. This default avoids schema evolution problems caused by type mismatches.
Type inference requires explicit configuration: Auto Loader infers the column names by default, but to get proper column types (float, boolean, integer, etc.), you need to do one of the following:
- Provide an explicit schema
- Use the cloudFiles.inferColumnTypes option set to true
- Use schema hints
The other options are incorrect:
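- A: A type mismatch would occur only if a provided schema conflicted with the actual data, but here no schema was provided.
- C: Auto Loader can work with various data types, not just strings, when properly configured.
- D: Null values don't force the string type; the string default applies whether or not nulls are present, and Auto Loader can infer types even with nulls when inference is enabled.
- E: Auto Loader does infer the schema of ingested data. By default it infers the column names and treats every column as a string; the column types are inferred only when type inference is enabled.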
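
As a plain-Python illustration of the first point (standard library only; this is not Auto Loader's implementation, and the record and field names are made up): the JSON text does distinguish numbers and booleans from strings, so the all-string columns come from the default of keeping each field as its raw text, not from the values being unreadable.

```python
import json

# A made-up record with a float, a boolean, and a string field.
raw = '{"price": 9.99, "in_stock": true, "name": "widget"}'

# A typed parse recovers float and bool from the JSON literals.
typed = json.loads(raw)

# Rough stand-in for the all-strings default: keep each value as its JSON text.
as_strings = {
    key: value if isinstance(value, str) else json.dumps(value)
    for key, value in typed.items()
}

# Show the Python type recovered for each field, then the string-only view.
print({key: type(value).__name__ for key, value in typed.items()})
print(as_strings)
```

The second dictionary mirrors what the engineer saw in the target table: the values are all present, but every one of them is a string.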

Correct Configuration Example:

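# To enable type inference with Auto Loader for JSON.
# source_path and schema_path are placeholders; when the schema is inferred,
# Auto Loader expects cloudFiles.schemaLocation so it can track the schema.
df = spark.readStream.format("cloudFiles") \
    .option("cloudFiles.format", "json") \
    .option("cloudFiles.schemaLocation", schema_path) \
    .option("cloudFiles.inferColumnTypes", "true") \
    .load(source_path)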


Without this configuration, JSON fields are read as strings by default.
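
The schema hints route listed in the explanation can be sketched in a similar way. The helper below only assembles the option dictionary, so it can be checked without a cluster. The function name build_autoloader_options, the paths, and the column names are hypothetical; cloudFiles.format, cloudFiles.schemaLocation, cloudFiles.inferColumnTypes, and cloudFiles.schemaHints are the Auto Loader option names.

```python
from typing import Dict, Optional


def build_autoloader_options(
    schema_location: str,
    infer_column_types: bool = False,
    schema_hints: Optional[str] = None,
) -> Dict[str, str]:
    """Assemble Auto Loader options for a JSON source (illustrative helper)."""
    options = {
        "cloudFiles.format": "json",
        # Directory where Auto Loader tracks the inferred schema (placeholder path).
        "cloudFiles.schemaLocation": schema_location,
    }
    if infer_column_types:
        # Ask Auto Loader to infer float, boolean, etc. instead of all strings.
        options["cloudFiles.inferColumnTypes"] = "true"
    if schema_hints:
        # Pin the types of specific columns; the rest keep their inferred type.
        options["cloudFiles.schemaHints"] = schema_hints
    return options


# Hypothetical columns: only price and in_stock are pinned.
hinted = build_autoloader_options(
    "/tmp/schemas/orders", schema_hints="price DOUBLE, in_stock BOOLEAN"
)

# On Databricks (not executed here), the options would be passed to the reader:
# df = spark.readStream.format("cloudFiles").options(**hinted).load(source_path)
```

With hints alone, the pinned columns get the declared types while the remaining columns stay strings. An explicit schema would instead be passed with .schema(...) on the reader, in which case no inference takes place.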
