Databricks Certified Data Engineer - Associate

A data engineer has developed a data pipeline to ingest data from a JSON source using Auto Loader, but the engineer has not provided any type inference or schema hints in their pipeline. Upon reviewing the data, the data engineer has noticed that all of the columns in the target table are of the string type despite some of the fields only including float or boolean values.

Which of the following describes why Auto Loader inferred all of the columns to be of the string type?

KKeng

Last updated: January 13, 2026 at 09:03

A. There was a type mismatch between the specific schema and the inferred schema

B. JSON data is a text-based format

C. Auto Loader only works with string data

D. All of the fields had at least one null value

E. Auto Loader cannot infer the schema of ingested data

Explanation:

Correct Answer: B - JSON data is a text-based format

When Auto Loader ingests JSON without schema hints and without type inference enabled, it infers every column as a string. JSON is a text-based format that carries no schema, so the files themselves declare no column types. Here's why:

JSON Format Characteristics: JSON (JavaScript Object Notation) is plain text with no embedded schema. Values such as 123.45 or true/false are written as literals in the text, but nothing in the file declares a column to be a float or a boolean.
Auto Loader's Default Behavior: Auto Loader infers the schema by sampling the data. For formats that do not encode data types, such as JSON and CSV, it infers all columns as strings by default, including nested JSON fields, to avoid schema evolution problems caused by type mismatches. Setting cloudFiles.inferColumnTypes to true makes it infer column types instead.
Why Other Options Are Incorrect:
- A: Type mismatch between specific and inferred schema - The engineer specified no schema, so there was nothing for the inferred schema to mismatch.
- C: Auto Loader only works with string data - False. Auto Loader produces typed columns when a schema, schema hints, or type inference is provided.
- D: All fields had at least one null value - Null values can affect type inference, but they are not what makes every column a string here; that is the default for JSON.
- E: Auto Loader cannot infer schema - False. Auto Loader does infer the schema; for JSON it infers every column as a string by default.
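Solution: To get proper data types, the data engineer should do one of the following:
- Provide an explicit schema using .schema()
- Use schema hints with cloudFiles.schemaHints
- Enable type inference by setting cloudFiles.inferColumnTypes to true (cloudFiles.schemaEvolutionMode only controls how new columns are handled and does not change inferred types)
- Apply explicit casting after ingestion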
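
The reader options involved can be sketched as follows. This is a minimal sketch rather than a definitive pipeline: `cloudFiles.format`, `cloudFiles.schemaLocation`, `cloudFiles.inferColumnTypes`, and `cloudFiles.schemaHints` are documented Auto Loader options, but the helper names, the paths, and the columns `price` and `is_active` are hypothetical, and `spark` is assumed to be a Databricks session where the `cloudFiles` source is available.

```python
def autoloader_json_options(schema_location, infer_types=True, schema_hints=None):
    """Build Auto Loader reader options for a JSON source (hypothetical helper)."""
    options = {
        "cloudFiles.format": "json",
        # Where Auto Loader stores the inferred schema between runs.
        "cloudFiles.schemaLocation": schema_location,
    }
    if infer_types:
        # Without this option, JSON columns are inferred as strings.
        options["cloudFiles.inferColumnTypes"] = "true"
    if schema_hints:
        # Hints pin the types of specific columns,
        # e.g. "price DOUBLE, is_active BOOLEAN".
        options["cloudFiles.schemaHints"] = schema_hints
    return options


def read_json_stream(spark, source_path, options):
    """Start an Auto Loader stream; `spark` must be a Databricks SparkSession."""
    return spark.readStream.format("cloudFiles").options(**options).load(source_path)


# Hypothetical path and column names, for illustration only.
opts = autoloader_json_options(
    schema_location="/tmp/example/_schemas",
    schema_hints="price DOUBLE, is_active BOOLEAN",
)
```

On Databricks, `read_json_stream(spark, "/tmp/example/landing", opts)` would then return a streaming DataFrame. Keeping the options in a plain dictionary makes it easy to inspect exactly which settings a pipeline uses before any stream is started.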

This all-string default applies to text formats that do not encode data types, such as JSON and CSV, and is a common consideration when ingesting JSON with Auto Loader.
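
To make the difference concrete, the toy sketch below mimics the two behaviors on a pair of JSON records using only Python's standard library. It is not Auto Loader's implementation; the records, field names, and function names are invented for illustration.

```python
import json

# Two made-up JSON records, one per line, as they might appear in a source file.
RECORDS = [
    '{"id": "a1", "price": 123.45, "is_active": true}',
    '{"id": "a2", "price": 7.0, "is_active": false}',
]


def infer_all_strings(lines):
    """Mimic the default: every discovered field is typed as string."""
    schema = {}
    for line in lines:
        for field in json.loads(line):
            schema[field] = "string"
    return schema


def infer_types(lines):
    """Mimic type inference: derive each field's type from the parsed values."""
    names = {bool: "boolean", int: "bigint", float: "double", str: "string"}
    schema = {}
    for line in lines:
        for field, value in json.loads(line).items():
            inferred = names.get(type(value), "string")
            # Fall back to string if two records disagree on a field's type.
            if schema.setdefault(field, inferred) != inferred:
                schema[field] = "string"
    return schema


print(infer_all_strings(RECORDS))
print(infer_types(RECORDS))
```

The first call types all three fields as `string`, which matches what the engineer in the question observed. The second types `price` as `double` and `is_active` as `boolean`.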

Powered by GPT-5.2
