
Answer-first summary for fast verification
Answer: Flatten the dataframe to one chunk per row, create a unique identifier for each row, and enable change feed on the output Delta table.
The correct answer is B because it covers all three requirements for Databricks Vector Search ingestion. First, the DataFrame must be flattened so that each text chunk occupies its own row, since a Vector Search index embeds one row per chunk. Second, each row needs a unique identifier, which Vector Search uses as the primary key for tracking and updating individual chunks. Most critically, Change Data Feed (CDF) must be enabled on the output Delta table (a point of unanimous consensus in the community discussion), because a Delta Sync Vector Search index relies on CDF to detect and ingest new or updated rows automatically. Option D comes close but omits the change feed requirement. Option A is incorrect: Auto Loader with a UDF that formats chunks as JSON is not the standard preparation path for Vector Search. Option C is insufficient because it neither flattens the array of chunks nor enables change feed; the original filename also cannot uniquely identify individual chunks.
Author: LeetQuiz Editorial Team
A Generative AI Engineer has developed scalable PySpark code to process unstructured PDF documents and split them into chunks for storage in a Databricks Vector Search index. The resulting DataFrame contains two columns: the original filename as a string and an array of text chunks from that document.
What steps must the Generative AI Engineer take to prepare and store these chunks for ingestion into Databricks Vector Search?
A
Use PySpark’s autoloader to apply a UDF across all chunks, formatting them in a JSON structure for Vector Search ingestion.
B
Flatten the dataframe to one chunk per row, create a unique identifier for each row, and enable change feed on the output Delta table.
C
Utilize the original filename as the unique identifier and save the dataframe as is.
D
Create a unique identifier for each document, flatten the dataframe to one chunk per row and save to an output Delta table.