Microsoft Certified Azure Data Scientist Associate - DP-100

Explanation:

The correct answer is B. The question states that the dataset consists of multiple large image files that must be streamed directly from its source. The as_mount() method covers this case: the files are mounted on the compute target and read from storage on demand rather than copied in full before the run starts, which suits large files such as images. Option A (to_pandas_dataframe()) is unsuitable because it loads data into an in-memory pandas DataFrame, a tabular representation that does not apply to a file dataset of images and does not stream. Option C uses a different argument name ('--data-data' instead of '--input-data') and passes the dataset without choosing an access mode, so nothing in it requests streaming. Option D (as_download()) copies the files to the compute target before the script runs, which contradicts the streaming requirement and is inefficient for large files. The community discussion strongly supports B, with upvoted comments emphasizing the streaming requirement and the inefficiency of downloading large image files.
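As a rough illustration of answer B, the sketch below builds the argument list and wraps it in a ScriptRunConfig. It assumes the Azure ML SDK v1 (azureml.core) and a FileDataset named ds. The helper names build_arguments and build_script_run_config, the script name train.py, and the optional input_name parameter are hypothetical and not part of the question.

```python
def build_arguments(ds, input_name=None):
    """Return the argument list from option B: the dataset passed in mount mode.

    `ds` is assumed to be an azureml FileDataset. `input_name` is an optional,
    hypothetical extra: when given, the dataset is first named with
    as_named_input() so that a lookup by that name in the script can match.
    """
    source = ds.as_named_input(input_name) if input_name else ds
    # as_mount() asks for the files to be mounted (streamed on demand)
    # instead of downloaded to the compute target.
    return ["--input-data", source.as_mount()]


def build_script_run_config(ds, compute_target, source_directory=".", script="train.py"):
    """Wrap the arguments in a ScriptRunConfig (script name is an assumption)."""
    # Imported here so the helper above can be used without the SDK installed.
    from azureml.core import ScriptRunConfig

    return ScriptRunConfig(
        source_directory=source_directory,
        script=script,
        arguments=build_arguments(ds),
        compute_target=compute_target,
    )
```

With a real FileDataset, build_arguments(ds) reproduces option B as written. Passing input_name='images' is an optional variation meant to line up with the script's run.input_datasets['images'] lookup; whether it is needed depends on how the run is configured.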

You plan to run a Python script as an Azure Machine Learning experiment. The script contains the following code:

from azureml.core import Dataset, Run

run = Run.get_context()
ds = run.input_datasets['images']

You must specify a file dataset as an input to the script. The dataset consists of multiple large image files and must be streamed directly from its source.

You need to write code to define a ScriptRunConfig object for the experiment and pass the ds dataset as an argument.

Which code segment should you use?

A. arguments = ['--input-data', ds.to_pandas_dataframe()]

16.7%

B. arguments = ['--input-data', ds.as_mount()]

50.0%

C. arguments = ['--data-data', ds]

16.7%

D. arguments = ['--input-data', ds.as_download()]

16.7%
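On the script side, a mounted file dataset is typically consumed as an ordinary directory. The sketch below assumes that the value received for --input-data at run time is a filesystem path to the mount point, and that the images can be recognized by file extension. The function names and the extension list are illustrative assumptions, not part of the question.

```python
import argparse
import os

# Hypothetical set of extensions treated as images in this sketch.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp", ".gif"}


def parse_input_path(argv=None):
    """Read the --input-data value, assumed to be the mount-point path."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--input-data", dest="input_data", required=True)
    # parse_known_args ignores any extra arguments the run may add.
    args, _unknown = parser.parse_known_args(argv)
    return args.input_data


def iter_image_paths(root):
    """Yield image file paths under root, walking subdirectories lazily."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            if os.path.splitext(name)[1].lower() in IMAGE_EXTENSIONS:
                yield os.path.join(dirpath, name)
```

Because iter_image_paths is a generator, each file is opened only when the script reaches it, which fits the on-demand reading that a mount is meant to provide.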