
Microsoft Fabric Analytics Engineer Associate DP-600
In a Fabric workspace that uses the default Spark starter pool and runtime version 1.2, you plan to read a CSV file named Sales_raw.csv from a lakehouse, select a subset of its columns, and save the result as a Delta table in the managed area of the lakehouse. Sales_raw.csv contains 12 columns. You have the following code:
from pyspark.sql import SparkSession
spark = SparkSession.builder.appName('example').getOrCreate()
df = spark.read.csv('/path/to/Sales_raw.csv', header=True)
selected_df = df.select('SalesOrderNumber', 'OrderDate', 'CustomerName', 'UnitPrice')
selected_df.write.format('delta').mode('overwrite').partitionBy('OrderDate').save('/path/to/output')
For each of the following statements, select Yes if the statement is true. Otherwise, select No.
- The code selects only specific columns from the DataFrame.
- Removing the partitionBy call results in no performance change.
- Adding inferSchema=True increases execution time.
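The last statement depends on how schema inference behaves: with header=True alone, Spark reads every CSV column as a string, whereas inferSchema=True requires scanning the data to decide each column's type before the typed read. The following is a minimal plain-Python sketch of that idea, not Spark's implementation. The sample rows, the Quantity column, and the helpers `infer_type` and `infer_schema` are hypothetical, and the type names only loosely mirror Spark's (Spark may also infer date or timestamp types, which this sketch does not attempt).

```python
import csv
import io

# Hypothetical sample standing in for a few columns of Sales_raw.csv.
SAMPLE = """SalesOrderNumber,OrderDate,UnitPrice,Quantity
SO1001,2021-01-05,19.99,2
SO1002,2021-01-06,5.00,10
"""


def infer_type(values):
    """Return the narrowest type name that fits every value in a column."""
    for name, cast in (("int", int), ("double", float)):
        try:
            for value in values:
                cast(value)
            return name
        except ValueError:
            continue
    # Nothing numeric fits, so fall back to string.
    return "string"


def infer_schema(text):
    """Scan every row once to decide each column's type (the extra pass)."""
    rows = list(csv.DictReader(io.StringIO(text)))
    return {col: infer_type([row[col] for row in rows]) for col in rows[0]}


schema = infer_schema(SAMPLE)
print(schema)
```

Without inference, the reader could hand back each field as a string immediately; the scan in `infer_schema` is additional work done before any typed data is available, which is why supplying an explicit schema is a common way to avoid that cost.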