
Explanation:
To solve this problem, you need to parse the JSON data in the raw column and use the array_max function to find the maximum temperature recorded at 12:00 PM (which corresponds to the 3rd index in the temp array). The correct query is select array_max(from_json(raw:chicago[*].temp[3],‘array‘)) from temp_data. This question is challenging but preparing with similar examples will help. For more practice, refer to Databricks documentation on semi-structured data.
Ultimate access to all questions.
Given a table temp_data with a column raw containing JSON data that records temperatures for every four hours in the day for Chicago, how would you calculate the maximum temperature ever recorded at 12:00 PM across all days? Parse the JSON data and use the necessary array function to find the max temperature.
A
select max(raw.chicago.temp[3]) from temp_data
B
select array_max(raw.chicago[*].temp[3]) from temp_data
C
select array_max(from_json(raw[‘chicago‘].temp[3],‘array‘)) from temp_data
D
select array_max(from_json(raw:chicago[*].temp[3],‘array‘)) from temp_data
E
select max(from_json(raw:chicago[3].temp[3],‘array‘)) from temp_data
No comments yet.