
Answer-first summary for fast verification
Answer: Set the regenerate_outputs property of the pipeline to True., Set the allow_reuse property of each step in the pipeline to False.
The question requires ensuring all pipeline steps run every time, bypassing caching. Option B (set regenerate_outputs=True) forces regeneration of all outputs and disables data reuse for the current run. Option C (set allow_reuse=False for each step) prevents step reuse entirely by disabling caching. Both methods directly address the goal: B at the pipeline level and C at the step level. Option A is irrelevant as datastore choice doesn't affect caching. Option D is incorrect as restarting compute doesn't impact step reuse logic. Option E is invalid as 'outputs' property doesn't control execution behavior.
Author: LeetQuiz Editorial Team
Ultimate access to all questions.
You are using the Azure Machine Learning Python SDK to define a multi-step pipeline. You observe that during execution, some steps are skipped and cached output from a previous run is used. You need to ensure all steps run every time, regardless of whether parameters or the source directory contents have changed.
What are two distinct methods to accomplish this? Each correct answer presents a complete solution.
A
Use a PipelineData object that references a datastore other than the default datastore.
B
Set the regenerate_outputs property of the pipeline to True.
C
Set the allow_reuse property of each step in the pipeline to False.
D
Restart the compute cluster where the pipeline experiment is configured to run.
E
Set the outputs property of each step in the pipeline to True.
No comments yet.