
Explanation:
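When selecting a Stream Analytics output format for this scenario, we must consider three key requirements: minimal query errors in both Databricks and PolyBase, fast query performance, and preserved data type information. The candidate formats are:
A: JSON
B: Parquet
C: CSV
D: Avro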
Cross-Platform Compatibility: Of the four options, Parquet is the only one that both PolyBase and Databricks support natively with typed columns, without extra parsing configuration or manual schema mapping.
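Performance Advantages: As a columnar storage format, Parquet lets a query engine read only the columns a query references and compresses each column efficiently, which speeds up analytical queries.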
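The benefit of a columnar layout can be sketched with a toy in-memory model. This is only an illustration, not how Parquet is implemented: real Parquet files add row groups, encodings, and compression, and the field names (`user`, `likes`, `text`) and helper functions below are hypothetical.

```python
# Toy model only: contrasts a row layout with a column layout.
rows = [
    {"user": "a", "likes": 10, "text": "hello"},
    {"user": "b", "likes": 30, "text": "world"},
    {"user": "c", "likes": 20, "text": "again"},
]


def to_columns(records):
    """Pivot a list of row dicts into a dict of column lists."""
    columns = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns


def average_row_layout(records, field):
    """Row layout: every field of every record is loaded to answer the query."""
    values_touched = sum(len(record) for record in records)
    total = sum(record[field] for record in records)
    return total / len(records), values_touched


def average_column_layout(columns, field):
    """Column layout: only the requested column is loaded."""
    column = columns[field]
    return sum(column) / len(column), len(column)


columns = to_columns(rows)
row_avg, row_touched = average_row_layout(rows, "likes")
col_avg, col_touched = average_column_layout(columns, "likes")
```

Both layouts return the same average, but the column layout touches 3 values where the row layout touches 9. The gap widens as records gain more fields that the query never uses.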
Data Integrity: Parquet files contain embedded schema information that preserves data types and ensures consistent interpretation across different query engines.
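To see why an embedded schema matters, consider what happens to types in a format that has none. The sketch below uses only the Python standard library and a hypothetical record (`post_id`, `score`, `verified`); it round-trips the record through CSV and shows that every value comes back as a string. Writing actual Parquet would require a library such as pyarrow, which is not used here.

```python
import csv
import io

# Hypothetical record with three different data types.
record = {"post_id": 42, "score": 0.75, "verified": True}

# Write the record to an in-memory CSV file.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=list(record))
writer.writeheader()
writer.writerow(record)

# Read it back: CSV carries no type information, so every value is a string.
buffer.seek(0)
restored = next(csv.DictReader(buffer))
type_names = {name: type(value).__name__ for name, value in restored.items()}

# The reader must supply a schema separately to recover the original types.
schema = {"post_id": int, "score": float, "verified": lambda s: s == "True"}
typed = {name: schema[name](value) for name, value in restored.items()}
```

With CSV, each consumer has to maintain a schema like `schema` above and keep it in sync with the producer. A Parquet file stores the equivalent information inside the file, so readers do not have to guess or re-declare the value types.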
Stream Analytics Integration: Azure Stream Analytics can efficiently output data in Parquet format to Azure Data Lake Storage, making it a seamless choice for this data pipeline architecture.
The combination of broad compatibility, superior query performance, and reliable data type preservation makes Parquet the clear recommendation for this scenario.
You are designing a solution to ingest streaming social media data with Azure Stream Analytics. The data will be stored in Azure Data Lake Storage and later queried by both Azure Databricks and PolyBase in Azure Synapse Analytics.
You need to recommend a Stream Analytics output format that minimizes query errors for both Databricks and PolyBase. The solution must prioritize fast query performance and preserve data type information.
What output format should you recommend?
A: JSON
B: Parquet
C: CSV
D: Avro