
Answer-first summary for fast verification
Answer: REFRESH TABLE table_name
The correct answer is `REFRESH TABLE table_name`. When Spark first queries an external table, it caches the table's data and file locations for performance, so later queries in the same session reuse the cache instead of going back to cloud object storage. The side effect is that files added after that first query are not visible. `REFRESH TABLE` invalidates the cached entries for the table, so the next query reads the current set of files, including the new ones.
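The behaviour described above can be sketched with a toy model in plain Python. This is not Spark code and not how Spark is implemented; `ToyExternalTable`, `query`, `refresh`, and `demo` are hypothetical names made up for illustration. The sketch only assumes that a "table" is a directory of files whose listing is cached on the first query, and that a refresh simply drops that cached listing.

```python
import os
import tempfile


class ToyExternalTable:
    """Toy stand-in for an external table backed by a directory of files.

    Hypothetical illustration only: it mimics a per-session cached file
    listing, not Spark's real implementation.
    """

    def __init__(self, location):
        self.location = location
        self._cached_files = None  # populated lazily on the first query

    def query(self):
        # The first query lists the directory and caches the result;
        # later queries reuse the cached listing.
        if self._cached_files is None:
            self._cached_files = sorted(os.listdir(self.location))
        return list(self._cached_files)

    def refresh(self):
        # Analogue of REFRESH TABLE: drop the cached listing so the next
        # query lists the directory again.
        self._cached_files = None


def demo():
    with tempfile.TemporaryDirectory() as location:
        open(os.path.join(location, "part-0001.csv"), "w").close()
        table = ToyExternalTable(location)
        before = table.query()  # caches the listing

        # A new file arrives after the initial query.
        open(os.path.join(location, "part-0002.csv"), "w").close()
        stale = table.query()  # still served from the cached listing

        table.refresh()
        fresh = table.query()  # listing is rebuilt and sees the new file
        return before, stale, fresh
```

Calling `demo()` returns three listings: the one seen by the first query, the stale one seen after the new file arrived, and the one seen after the refresh. In a real Spark session the corresponding step would be running `REFRESH TABLE table_name` before querying again.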
Author: LeetQuiz Editorial Team
When defining external tables using formats like CSV, JSON, TEXT, or BINARY, any query on these tables caches the data and its location for performance reasons. This means that within a given Spark session, any new files that arrive after the initial query will not be available. How can we overcome this limitation?
A. UNCACHE TABLE table_name
B. BROADCAST TABLE table_name
C. REFRESH TABLE table_name
D. CACHE TABLE table_name
E. CLEAR CACHE table_name