
Explanation:
Correct Answer: D
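printSchema(True): printSchema() is a method that prints the DataFrame's schema as a tree. The question treats the True argument as a switch for more detailed output. The documented PySpark signature has no boolean flag (recent releases accept only an optional integer level that limits how deep the nesting is printed), so treat the argument as part of the question's framing. D is still the only option that calls printSchema on the DataFrame as a method.
.schema vs .printSchema(): This is a common exam trick. printSchema() returns None, so you cannot extract data from it. schema is an attribute that returns a StructType object, whose .fields property you can iterate over.
Why others are wrong:
A and B call .schema like a function and read .fields from printSchema, which is a method and has no such attribute.
getNames() is not a method of a PySpark DataFrame or StructType.
C uses schema as a standalone name rather than as a DataFrame property.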
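The attribute-based check from the explanation can be sketched as a small helper. This is an illustration rather than part of the original question: the helper names and the demo field names (storeId, location) are hypothetical, and the demo schema is a plain stand-in object that only mimics the .fields and .name attributes of a StructType, so the sketch runs without a Spark session.

```python
from types import SimpleNamespace


def top_level_field_names(schema):
    """Return the names of the top-level fields of a StructType-like schema."""
    # A StructType exposes .fields, a list of StructField objects,
    # and each StructField has a .name attribute.
    return [field.name for field in schema.fields]


def has_top_level_field(schema, name):
    """Check whether `name` is one of the schema's top-level field names."""
    return name in top_level_field_names(schema)


# Stand-in for storesDF.schema (hypothetical field names, no Spark needed).
demo_schema = SimpleNamespace(
    fields=[SimpleNamespace(name="storeId"), SimpleNamespace(name="location")]
)

print(top_level_field_names(demo_schema))
print(has_top_level_field(demo_schema, "location"))
print(has_top_level_field(demo_schema, "city"))
```

With a real DataFrame the call would be has_top_level_field(storesDF.schema, "location"). Passing the result of storesDF.printSchema() would fail, because that method returns None and has no .fields to iterate over.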
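You are working with a Spark DataFrame called storesDF read from a nested JSON file. Your task is to:
Print the schema of storesDF.
Collect the names of its top-level fields into a list.
Check whether a field named "location" is in that list.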
Which of the following code blocks correctly performs these tasks?
A
storesDF.schema(True)
schema_fields = [f.name for f in storesDF.printSchema.fields]
print("location" in schema_fields)
B
storesDF.schema("True")
schema_fields = [f.name for f in storesDF.printSchema.fields]
C
schema.printSchema(True)
schema_fields = [f.name for f in storesDF.schema.fields]
D
storesDF.printSchema(True)
schema_fields = [f.name for f in storesDF.schema.fields]
print("location" in schema_fields)