
Answer-first summary for fast verification
Answer: Change WHERE year = 2023 to WHERE extract(year FROM sales_data) = 2023.
If the 'year' field is not accurately parsing or available as a direct condition, extracting the year from the date column (e.g., extracting from a timestamp) correctly applies the filter to process rows for 2023, thus fixing the missing data in the results.
Author: Ritesh Yadav
Ultimate access to all questions.
Question 45 A data engineer is using Amazon Athena to analyze sales data that is in Amazon S3. The data engineer writes a query to retrieve sales amounts for 2023 for several products from a table named sales_data. However, the query does not return results for all of the products that are in the sales_data table. The data engineer needs to troubleshoot the query to resolve the issue.
The data engineer's original query is: SELECT product_name, sum(sales_amount) FROM sales_data WHERE year = 2023 GROUP BY product_name
How should the data engineer modify the Athena query to meet these requirements?
A
Replace sum(sales_amount) with count(*) for the aggregation.
B
Change WHERE year = 2023 to WHERE extract(year FROM sales_data) = 2023.
C
Add HAVING sum(sales_amount) > 0 after the GROUP BY clause.
D
Remove the GROUP BY clause.
No comments yet.