
A data engineer is refactoring DLT code that contains multiple table definitions with similar patterns:
@dlt.table(name="t1_dataset")
def t1_dataset():
    return spark.read.table("t1")

@dlt.table(name="t2_dataset")
def t2_dataset():
    return spark.read.table("t2")

@dlt.table(name="t3_dataset")
def t3_dataset():
    return spark.read.table("t3")
They attempt to parameterize the table creation using this loop:
tables = ["t1", "t2", "t3"]

for t in tables:
    @dlt.table(name=f"{t}_dataset")
    def new_table():
        return spark.read.table(t)
After running the pipeline with this refactored code, the DAG displays incorrect configuration values for these tables. What should the data engineer do to correct this?
A. Wrap the for loop inside another table definition, using generalized names and properties in place of those from the inner table definition.
B. Convert the list of configuration values to a dictionary of table settings, using table names as keys.
C. Move the table definition into a separate function, and make calls to this function with different input parameters inside the for loop.
D. Load the configuration values for these tables from a separate file, located at a path provided by a pipeline parameter.
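For reference, the approach described in option C corresponds to the metaprogramming pattern Databricks documents for generating DLT tables in a loop. Below is a minimal sketch, assuming a DLT pipeline notebook where spark is available; the helper name create_table is illustrative, not part of the question.

import dlt

def create_table(source_table):
    # Each call defines a new inner function that closes over its own
    # source_table argument, rather than the shared loop variable.
    @dlt.table(name=f"{source_table}_dataset")
    def new_table():
        return spark.read.table(source_table)

for source_table in ["t1", "t2", "t3"]:
    create_table(source_table)

Because source_table is a function parameter, each generated table reads from its own source rather than from whatever value the loop variable holds when the pipeline graph is evaluated.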