Databricks Certified Data Engineer - Associate

You are working as a Data Engineer on Azure Databricks and need to extract data from a remote database over a JDBC connection. The solution must be cost-effective, comply with data governance policies, and handle large volumes of data efficiently. Which of the following Spark SQL queries would you use to create a table named 'remote_data' from the JDBC connection? The query must include the driver class required for the JDBC connection and correctly specify the URL, table name, username, and password as options. Choose the best option from the following:




Explanation:

Option C is the correct answer because it includes the driver class, which is essential for establishing the JDBC connection, and correctly specifies the URL, table name, username, and password as options. The USING keyword followed by JDBC identifies the data source type, making it clear that the data is read from a JDBC source. This option meets all the stated requirements: it is cost-effective, complies with data governance policies, and can handle large volumes of data efficiently.
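
The general shape of the query the explanation describes is sketched below. The connection URL, driver class, table name, and credentials are placeholders for illustration only, not values from the question (in practice, credentials should come from a secret scope rather than being written inline):

```sql
-- Sketch of a CREATE TABLE ... USING JDBC statement with the required options.
-- All option values below are placeholders.
CREATE TABLE remote_data
USING JDBC
OPTIONS (
  url      'jdbc:postgresql://database-server:5432/mydb',  -- JDBC connection URL
  dbtable  'schema.source_table',                          -- remote table to read
  user     'username',
  password 'password',
  driver   'org.postgresql.Driver'                         -- driver class for the connection
);
```

Including the `driver` option ensures Spark can load the correct JDBC driver class, and `USING JDBC` declares the data source type, which is exactly what distinguishes the correct option.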