A data engineer has set up a daily job at 5 PM that runs a notebook named show_regular.py. The notebook contains five cells, each outputting a PySpark DataFrame of the following sizes: Cell 1: 5.2 MB; Cell 2: 6.1 MB; Cell 3: 4.4 MB; Cell 4: 5.3 MB; Cell 5: 2.5 MB. The job fails on its first run. What data constraint caused this failure?
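The arithmetic behind the question can be checked directly. The commonly cited Databricks constraint is that a job run's combined notebook output may not exceed 20 MB; no single cell here is unusually large, but the sum of all five outputs is. A minimal sketch, assuming that 20 MB total-output limit is the constraint being tested:

```python
# Reported per-cell output sizes from the question, in MB.
cell_outputs_mb = [5.2, 6.1, 4.4, 5.3, 2.5]

# Assumed constraint: Databricks caps a job run's combined notebook
# output at 20 MB (commonly cited limit; verify against current docs).
TOTAL_LIMIT_MB = 20

total_mb = sum(cell_outputs_mb)
print(f"Total output: {total_mb:.1f} MB")          # 23.5 MB
print(f"Exceeds {TOTAL_LIMIT_MB} MB limit:", total_mb > TOTAL_LIMIT_MB)
```

Since 23.5 MB > 20 MB, the total notebook output size, not any individual cell, is the constraint that would cause the run to fail.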