Microsoft Fabric Analytics Engineer Associate DP-600

You are analyzing customer purchases in a Fabric notebook using PySpark. The analysis involves two DataFrames:

  1. transactions: This DataFrame contains transaction data with 10 million rows and five columns: transaction_id, customer_id, product_id, amount, and date. Each row corresponds to a single transaction.
  2. customers: This DataFrame holds customer details with 1,000 rows and three columns: customer_id, name, and country.

Your task is to join these DataFrames on the customer_id column while minimizing data shuffling. You start by writing the following code:

from pyspark.sql import functions as F
results = 

What code should you complete to populate the results DataFrame and achieve the goal of minimal data shuffling?
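
When one side of a join is tiny (here, 1,000 customers against 10 million transactions), a common way to avoid shuffling the large side is a broadcast hash join: the small table is copied to every worker and each partition of the large table is joined where it already sits. In PySpark this is usually requested with the `F.broadcast` hint, for example `transactions.join(F.broadcast(customers), "customer_id")`. The pure-Python sketch below only illustrates the idea and is not how Spark is implemented; the function name `broadcast_hash_join`, the partition layout, and the sample rows are all hypothetical.

```python
from collections import defaultdict


def broadcast_hash_join(large_partitions, small_rows, key):
    """Inner-join each partition of the large table against a full copy of the small table."""
    # Build the lookup once; in Spark, this copy would be shipped to every executor.
    lookup = defaultdict(list)
    for row in small_rows:
        lookup[row[key]].append(row)

    joined_partitions = []
    for partition in large_partitions:
        # Each partition is joined in place, so no row of the large table changes partition.
        joined = [
            {**left, **right}
            for left in partition
            for right in lookup.get(left[key], [])
        ]
        joined_partitions.append(joined)
    return joined_partitions


# Hypothetical sample data standing in for the two DataFrames.
customers = [
    {"customer_id": 1, "name": "Ana", "country": "PT"},
    {"customer_id": 2, "name": "Bo", "country": "SE"},
]
transaction_partitions = [
    [{"transaction_id": 10, "customer_id": 1, "product_id": 7, "amount": 5.0, "date": "2025-01-01"}],
    [
        {"transaction_id": 11, "customer_id": 2, "product_id": 8, "amount": 9.5, "date": "2025-01-02"},
        {"transaction_id": 12, "customer_id": 3, "product_id": 9, "amount": 1.0, "date": "2025-01-03"},
    ],
]

results = broadcast_hash_join(transaction_partitions, customers, "customer_id")
```

The output keeps the same number of partitions as the input, which is the point of the technique: only the small table is replicated, while the large table stays where it is. The transaction with customer_id 3 has no matching customer and is dropped, as in an inner join.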
