How should you model your lakehouse architecture to efficiently handle a high-volume, write-heavy IoT workload with millions of devices reporting every minute, while ensuring the ability to query data in near real-time?
A
Store raw data in a NoSQL database for write efficiency, periodically ETLing processed data into the lakehouse for analytical queries.
B
Implement a sharded approach, creating separate tables for subsets of devices, and use a metastore to track shards for querying.
C
Utilize a single wide table to store all IoT data, partitioned by device ID, leveraging Delta Lake's optimization for concurrent writes.
D
Apply a micro-batching technique that combines streaming ingestion with periodic optimization (compaction and indexing) of the stored data for analysis.
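The pattern described in option D can be illustrated with a small, purely in-memory sketch. It is not Delta Lake or Spark code: the `MicroBatchTable` class, its field names, and the batch and file sizes are all hypothetical, chosen only to show how many small micro-batch writes accumulate and how a periodic compaction pass rewrites them into fewer, sorted files.

```python
from dataclasses import dataclass, field
from typing import Dict, List

Record = Dict[str, object]


@dataclass
class MicroBatchTable:
    """Toy in-memory stand-in for a lakehouse table (hypothetical, not a real API)."""

    batch_size: int = 3        # records per micro-batch (assumed value)
    target_file_size: int = 6  # records per compacted file (assumed value)
    buffer: List[Record] = field(default_factory=list)
    files: List[List[Record]] = field(default_factory=list)

    def ingest(self, record: Record) -> None:
        """Buffer a record and write a small file once the micro-batch is full."""
        self.buffer.append(record)
        if len(self.buffer) >= self.batch_size:
            self.flush()

    def flush(self) -> None:
        """Write whatever is buffered as one small file."""
        if self.buffer:
            self.files.append(self.buffer)
            self.buffer = []

    def compact(self) -> None:
        """Merge small files into larger ones, clustered by device and time."""
        rows = [r for f in self.files for r in f]
        rows.sort(key=lambda r: (r["device_id"], r["ts"]))
        self.files = [
            rows[i:i + self.target_file_size]
            for i in range(0, len(rows), self.target_file_size)
        ]

    def query(self, device_id: str) -> List[Record]:
        """Scan all written files for one device's readings."""
        return [r for f in self.files for r in f if r["device_id"] == device_id]


# Usage: three hypothetical devices report four readings each.
table = MicroBatchTable()
for ts in range(4):
    for dev in ("dev-a", "dev-b", "dev-c"):
        table.ingest({"device_id": dev, "ts": ts, "temp": 20.0 + ts})
table.flush()

files_before = len(table.files)  # many small files from micro-batches
table.compact()
files_after = len(table.files)   # fewer, larger, sorted files
```

Data is queryable as soon as each micro-batch is flushed, while compaction runs separately to keep the file count down. In a real Delta Lake deployment, the compaction step would roughly correspond to running `OPTIMIZE` (optionally with `ZORDER BY`) on a schedule rather than to hand-written merging like this.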