You are designing an Azure Databricks table that will persist an average of 20 million streaming events per day for use by incremental load pipeline jobs. The solution must minimize both storage costs and incremental load times.
What should you include in the design?