
A Generative AI Engineer has developed scalable PySpark code to process unstructured PDF documents and split them into chunks for storage in a Databricks Vector Search index. The resulting DataFrame contains two columns: the original filename as a string and an array of text chunks from that document.
What steps must the Generative AI Engineer take to prepare and store these chunks for ingestion into Databricks Vector Search?
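For context, a minimal PySpark sketch of the typical preparation: explode the chunk array into one row per chunk, add a unique primary-key column (required for a Delta Sync Vector Search index), write the result to a Delta table, and enable Change Data Feed so the index can sync with it. The DataFrame, column, and table names are assumptions for illustration, and `spark` is the notebook's SparkSession; this is a sketch of one common approach, not the only valid answer.

```python
from pyspark.sql import functions as F

# `chunked_df` is the DataFrame described above: one row per PDF, with the
# filename and an array of text chunks. Column names are assumed.

# 1. Explode the chunk array so each chunk becomes its own row.
exploded_df = chunked_df.select(
    "filename",
    F.explode("chunks").alias("chunk_text"),
)

# 2. Add a unique primary-key column; a Delta Sync Vector Search index
#    requires a primary key on the source table.
prepared_df = exploded_df.withColumn("chunk_id", F.expr("uuid()"))

# 3. Persist the chunks as a Delta table (table name is hypothetical).
prepared_df.write.format("delta").mode("overwrite").saveAsTable("main.rag.pdf_chunks")

# 4. Enable Change Data Feed so the Vector Search index can stay in sync
#    with changes to the source table.
spark.sql(
    "ALTER TABLE main.rag.pdf_chunks "
    "SET TBLPROPERTIES (delta.enableChangeDataFeed = true)"
)
```

From here, a Delta Sync index can be created against `main.rag.pdf_chunks`, using `chunk_id` as the primary key and `chunk_text` as the column to embed.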