Databricks Certified Generative AI Engineer - Associate

What is the most performant way to store a DataFrame in a Vector Search index when the DataFrame has two columns: (1) the original document file name and (2) an array of text chunks for each document?

Last updated: December 5, 2025 at 14:03

Split the data into train and test sets, create a unique identifier for each document, then save to a Delta table

4.9%

Flatten the DataFrame to one chunk per row, create a unique identifier for each row, and save to a Delta table

Store each chunk as an independent JSON file in a Unity Catalog volume. For each JSON file, the key is the document section name and the value is the array of text chunks for that section

4.9%
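
The "flatten to one chunk per row" option describes a concrete transformation, so a small illustration may help. The following is a minimal plain-Python sketch, not Databricks code: the sample documents, the `flatten_chunks` helper, and the column names `chunk_id`, `file_name`, `chunk_index`, and `chunk_text` are all hypothetical, and hashing the file name plus chunk position is just one assumed way to get a stable unique identifier.

```python
import hashlib

# Hypothetical sample data: (file name, array of text chunks) per document.
documents = [
    ("guide.pdf", ["Intro text.", "Setup steps."]),
    ("faq.md", ["What is a chunk?"]),
]


def flatten_chunks(docs):
    """Return one row (dict) per chunk, each with a deterministic unique ID."""
    rows = []
    for file_name, chunks in docs:
        for position, chunk in enumerate(chunks):
            # Hash file name + position so the ID stays the same across reruns.
            key = f"{file_name}:{position}".encode("utf-8")
            chunk_id = hashlib.sha256(key).hexdigest()[:16]
            rows.append(
                {
                    "chunk_id": chunk_id,
                    "file_name": file_name,
                    "chunk_index": position,
                    "chunk_text": chunk,
                }
            )
    return rows


rows = flatten_chunks(documents)
for row in rows:
    print(row["chunk_id"], row["file_name"], row["chunk_index"], row["chunk_text"])
```

In PySpark the same reshaping would typically be done with `posexplode` (or `explode`) from `pyspark.sql.functions` on the array column, followed by writing the result in Delta format. The unique identifier per row matters because it can then serve as the key column for the index built on that table.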