
A data analyst has created a Delta table `sales` that is used by the entire data analysis team. They want help from the data engineering team to implement a series of tests to ensure the data is clean. However, the data engineering team uses Python for its tests rather than SQL. Which of the following commands could the data engineering team use to access `sales` in PySpark?

A. `SELECT * FROM sales`

B. There is no way to share data between PySpark and SQL.

C. `spark.sql("sales")`

D. `spark.delta.table("sales")`

E. `spark.table("sales")`

**Answer: E — `spark.table("sales")`**

## Explanation

In PySpark, there are several valid ways to read a table that is registered in the Spark catalog:

1. **`spark.table("sales")`** — the most direct way to access a registered table. When a Delta table is created with a name, it is registered in the Spark catalog, and `spark.table()` returns it as a DataFrame.
2. **`spark.sql("SELECT * FROM sales")`** — SQL can be run from PySpark, but the statement must be wrapped in `spark.sql()`; a bare `SELECT` is not Python.
3. **`spark.read.table("sales")`** — an equivalent method on the DataFrame reader API.
4. **`spark.read.format("delta").table("sales")`** — for explicitly specifying the Delta format.

**Why the other options are incorrect:**

- **A**: `SELECT * FROM sales` on its own is SQL, not valid Python syntax — it must be wrapped in `spark.sql()`.
- **B**: False — tables registered in the catalog can be accessed from both SQL and PySpark.
- **C**: `spark.sql()` expects a complete SQL statement, not just a table name, so `spark.sql("sales")` fails to parse.
- **D**: There is no `spark.delta.table()` method in PySpark.

The most straightforward way to access a registered Delta table in PySpark is `spark.table("sales")`.

Author: Keng Suppaseth