AWS Certified Data Engineer - Associate

You are working on a data processing project that analyzes real-time streaming data from IoT devices. The data is time-series data with high velocity and variability. Describe how you would use Apache Spark to build an ETL pipeline for this use case, and explain the considerations involved in handling time-series data.

Last updated: January 24, 2026 at 14:02

Use Apache Spark's batch processing capabilities to process the time-series data at regular intervals, as real-time processing is not required.

Use Apache Spark Streaming to create a real-time ETL pipeline, with appropriate data sources, transformations, and sinks to handle the time-series data efficiently, considering time-window operations and data aggregation.

Use a traditional database system to store and process the time-series data, as it can handle high velocity and variability more effectively than Apache Spark.
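
The time-window operations and aggregation mentioned in the streaming option can be illustrated with a small sketch. This is plain Python, not Spark code: it only mimics the idea behind tumbling event-time windows with a watermark, which Spark Structured Streaming exposes through `window` and `withWatermark`. The `Reading` type, the `aggregate_stream` function, and the window and lateness values are hypothetical names and numbers chosen for illustration, and the watermark is advanced per event here, which is a simplification of how a real engine does it.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Reading:
    """One IoT measurement (hypothetical schema for this sketch)."""
    device_id: str
    event_time: int  # seconds, when the device took the reading
    value: float


def window_start(event_time: int, window_size: int) -> int:
    """Start of the tumbling window that contains event_time."""
    return event_time - (event_time % window_size)


def aggregate_stream(readings, window_size=60, allowed_lateness=30):
    """Average value per (device, window), dropping events that arrive too late.

    The watermark trails the largest event time seen so far by
    allowed_lateness seconds. An event is dropped when the window it
    belongs to ended at or before the current watermark.
    """
    sums = defaultdict(lambda: [0.0, 0])  # (device, window start) -> [total, count]
    dropped = []
    max_event_time = None

    for r in readings:
        if max_event_time is None or r.event_time > max_event_time:
            max_event_time = r.event_time
        watermark = max_event_time - allowed_lateness

        start = window_start(r.event_time, window_size)
        if start + window_size <= watermark:
            dropped.append(r)  # window already considered closed
            continue

        acc = sums[(r.device_id, start)]
        acc[0] += r.value
        acc[1] += 1

    averages = {key: total / count for key, (total, count) in sums.items()}
    return averages, dropped
```

The sketch keys every aggregate on event time rather than arrival time, so out-of-order readings still land in the correct window as long as they arrive before the watermark passes the end of that window. Bounding lateness this way is also what lets a streaming engine discard state for old windows instead of keeping it indefinitely.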