In a data engineering project, you are working with two DataFrames: 'df_orders', containing the columns 'order_id', 'customer_id', and 'order_date', and 'df_customers', containing the columns 'customer_id', 'customer_name', and 'customer_age'. The analysis must include every order, even when the matching customer details are missing. Which of the following Spark SQL join operations meets this requirement? Choose the best option (a worked sketch follows the options).
A
Perform a left join to include all rows from 'df_orders' and only the matching rows from 'df_customers', with NULL values for non-matching rows from 'df_customers'.
B
Perform an inner join to include only the rows that have matching keys in both 'df_orders' and 'df_customers'.
C
Perform a right join to include all rows from 'df_customers' and only the matching rows from 'df_orders', with NULL values for non-matching rows from 'df_orders'.
D
Perform a full outer join to include all rows from both 'df_orders' and 'df_customers', with NULL values for non-matching rows from either DataFrame.
E
Both A and D are correct depending on the analysis requirements.
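For reference, here is a minimal PySpark sketch of the left join described in option A. Only the DataFrame names and column names come from the question; the session setup and sample rows are illustrative assumptions.

```python
from pyspark.sql import SparkSession

# Hypothetical session and sample data, assumed for illustration only.
spark = SparkSession.builder.appName("order-analysis").getOrCreate()

df_orders = spark.createDataFrame(
    [(1, 101, "2024-01-05"), (2, 102, "2024-01-06"), (3, 999, "2024-01-07")],
    ["order_id", "customer_id", "order_date"],
)
df_customers = spark.createDataFrame(
    [(101, "Alice", 34), (102, "Bob", 45)],
    ["customer_id", "customer_name", "customer_age"],
)

# A left join keeps every row of df_orders; customer columns are NULL
# wherever no matching customer_id exists in df_customers.
result = df_orders.join(df_customers, on="customer_id", how="left")
result.show()
```

With this sample data, order 3 (whose customer_id 999 has no match) still appears in the result, with NULL in 'customer_name' and 'customer_age', which is the behavior option A describes.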