
A data engineer has a Job that has a complex run schedule, and they want to transfer that schedule to other Jobs. Rather than manually selecting each value in the scheduling form in Databricks, which of the following tools can the data engineer use to represent and submit the schedule programmatically?
A. pyspark.sql.types.DateType
B. datetime
C. pyspark.sql.types.TimestampType
D. Cron syntax
E. There is no way to represent and submit this information programmatically
Explanation:
Correct Answer: D - Cron syntax
In Databricks, when you need to programmatically represent and submit job schedules, you can use Cron syntax. This is because:
The Databricks Jobs API supports cron expressions: when creating or updating jobs via the Databricks REST API, you specify the schedule as a Quartz cron expression (the quartz_cron_expression field) together with a timezone (the timezone_id field).
Easy transferability: Cron syntax provides a standardized way to represent complex schedules that can be easily copied and reused across different jobs.
Programmatic control: Unlike manually selecting values in the UI form, Cron syntax allows you to define schedules in code, making it suitable for Infrastructure as Code (IaC) practices and automation.
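To make the transferability point concrete, here is a minimal Python sketch that reuses one schedule definition across several Jobs by building one request body per Job. The job IDs (101, 102) and the helper names are hypothetical, and the job_id / new_settings body shape is an assumption about what the Jobs API update endpoint expects; check the API reference for your workspace before sending anything.

```python
import json

# Schedule copied from the source Job (values taken from the example in this explanation).
SOURCE_SCHEDULE = {
    "quartz_cron_expression": "0 0 9 * * ?",
    "timezone_id": "America/Los_Angeles",
}


def build_update_payload(job_id, schedule):
    """Build a request body that applies `schedule` to one Job.

    Assumption: the update endpoint accepts a body of the form
    {"job_id": ..., "new_settings": {"schedule": {...}}}.
    """
    # Copy the dict so each payload can be modified independently.
    return {"job_id": job_id, "new_settings": {"schedule": dict(schedule)}}


def build_payloads(job_ids, schedule):
    """Build one update body per target Job, all sharing the same schedule."""
    return [build_update_payload(job_id, schedule) for job_id in job_ids]


if __name__ == "__main__":
    # Hypothetical target Job IDs.
    for payload in build_payloads([101, 102], SOURCE_SCHEDULE):
        print(json.dumps(payload, indent=2))
```

The sketch only builds the request bodies; sending them (for example with an HTTP client and a workspace token) is left out so that nothing here depends on a live workspace.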
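Why the other options are incorrect:
A and C: pyspark.sql.types.DateType and pyspark.sql.types.TimestampType are Spark SQL data types that describe DataFrame columns; they do not describe recurring schedules.
B: datetime is a Python module for working with individual dates and times; it does not express a recurring schedule that the Jobs scheduler accepts.
E: Incorrect, because a schedule can be represented as a cron expression and submitted through the Jobs API.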
Example of Cron syntax in Databricks API:
{
  "schedule": {
    "quartz_cron_expression": "0 0 9 * * ?",
    "timezone_id": "America/Los_Angeles"
  }
}
This example schedules a job to run daily at 9:00 AM in the specified timezone.
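To see why that expression means 9:00 AM every day, it helps to label each field. Quartz cron expressions have six required fields (seconds, minutes, hours, day of month, month, day of week) and an optional seventh for the year. The following Python sketch only splits and labels the fields; the function and constant names are hypothetical, and it does not validate the values inside each field.

```python
# Field order used by Quartz cron expressions; the year field is optional.
QUARTZ_FIELDS = [
    "seconds",
    "minutes",
    "hours",
    "day_of_month",
    "month",
    "day_of_week",
    "year",
]


def label_quartz_fields(expression):
    """Map each whitespace-separated part of a Quartz cron expression to its field name."""
    parts = expression.split()
    if len(parts) not in (6, 7):
        raise ValueError(
            f"expected 6 or 7 fields in a Quartz cron expression, got {len(parts)}"
        )
    # zip stops at the shorter sequence, so "year" is omitted for 6-field expressions.
    return dict(zip(QUARTZ_FIELDS, parts))


if __name__ == "__main__":
    for name, value in label_quartz_fields("0 0 9 * * ?").items():
        print(f"{name}: {value}")
```

Read field by field: second 0 of minute 0 of hour 9, on every day of every month, with ? meaning no specific value for the day of week.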