
A data engineer has set up a daily job at 5 PM that runs a notebook named show_regular.py. This notebook contains 5 cells, each outputting data as a PySpark DataFrame with the following sizes: Cell 1: 5.2 MB, Cell 2: 6.1 MB, Cell 3: 4.4 MB, Cell 4: 5.3 MB, Cell 5: 2.5 MB. The job fails on its first run. What data constraint caused this failure?
A
The output of the first cell cannot exceed 5 MB, which forced the job to fail.
B
The cause of the failure cannot be determined from the output sizes alone, as there is no limit on the output size of a notebook's cells.
C
The individual cell output is limited to 6 MB, which caused the job to fail.
D
The output size of all the cells in a notebook must not exceed 20 MB. Since the combined size of all the cell outputs exceeds 20 MB, the job failed.
E
The job failed for multiple reasons: the output of any individual cell in a notebook cannot exceed 6 MB, and the total size of all the outputs cannot exceed 18 MB.
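The question turns on simple arithmetic over the five cell output sizes. A quick sketch (the 20 MB threshold below is the limit stated in option D, not a value verified against current Databricks documentation) shows how the sizes compare against that constraint:

```python
# Output sizes (MB) of the five cells in show_regular.py, from the question.
cell_outputs_mb = [5.2, 6.1, 4.4, 5.3, 2.5]

total_mb = sum(cell_outputs_mb)
print(f"Total notebook output: {total_mb:.1f} MB")

# Threshold as stated in option D: combined cell output must not exceed 20 MB.
NOTEBOOK_OUTPUT_LIMIT_MB = 20
print(f"Exceeds {NOTEBOOK_OUTPUT_LIMIT_MB} MB limit: {total_mb > NOTEBOOK_OUTPUT_LIMIT_MB}")
```

The combined output comes to 23.5 MB, so under option D's stated constraint the notebook's total output would indeed exceed the limit.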