
Ultimate access to all questions.
A data engineer is working with two tables. Each of these tables is displayed below in its entirety.
sales
| customer_id | spend | units |
|---|---|---|
| a1 | 28.94 | 7 |
| a3 | 874.12 | 23 |
| a4 | 8.99 | 1 |
favorite_stores
| customer_id | store_id |
|---|---|
| a1 | s1 |
| a2 | s1 |
| a4 | s2 |
The data engineer runs the following query to join these tables together:
SELECT
sales.customer_id,
sales.spend,
favorite_stores.store_id
FROM sales
LEFT JOIN favorite_stores
ON sales.customer_id = favorite_stores.customer_id;
SELECT
sales.customer_id,
sales.spend,
favorite_stores.store_id
FROM sales
LEFT JOIN favorite_stores
ON sales.customer_id = favorite_stores.customer_id;
Which of the following will be returned by the above query?
A
| customer_id | spend | store_id |
|---|---|---|
| a1 | 28.94 | s1 |
| a4 | 8.99 | s2 |
B
| customer_id | spend | units | store_id |
|---|---|---|---|
| a1 | 28.94 | 7 | s1 |
| a4 | 8.99 | 1 | s2 |
C
| customer_id | spend | store_id |
|---|---|---|
| a1 | 28.94 | s1 |
| a3 | 874.12 | NULL |
| a4 | 8.99 | s2 |
D
| customer_id | spend | store_id |
|---|---|---|
| a1 | 28.94 | s1 |
| a2 | NULL | s1 |
| a3 | 874.12 | NULL |
| a4 | 8.99 | s2 |
E
| customer_id | spend | store_id |
|---|---|---|
| a1 | 28.94 | s1 |
| a2 | NULL | s1 |
| a4 | 8.99 | s2 |
Explanation:
The correct answer is C because:
sales) and matching rows from the right table (favorite_stores).NULL values are returned for the right table's columns.sales.customer_id, sales.spend, and favorite_stores.store_id.Analysis of each row:
store_id = s1sales table → store_id = NULL (no match in favorite_stores)store_id = s2favorite_stores → NOT included because LEFT JOIN starts from sales tableWhy other options are incorrect:
a3 (should be included with NULL store_id)units column (not selected in query) and missing row for a3a2 (should not be included as it's not in sales table)a2 (should not be included) and missing row for a3