A company uses Amazon EC2 instances to ingest JSON-formatted data from on-premises sources at rates of up to 1 MB/s. Data is lost whenever an instance reboots. The team needs to query the ingested data in near-real time, and the solution must be scalable with minimal data loss. Which solution meets these requirements?