You are designing a Spark application to process a large dataset with complex transformations. The final output must be stored in Parquet format for efficient querying in a data lake environment. The dataset is expected to grow over time, and the solution must deliver high performance for both write and read operations while remaining cost-effective. Given these requirements, which of the following strategies would you choose, and why? (Choose one option.)