
Ultimate access to all questions.
A data engineering team is setting up development, testing, and production environments for a new data pipeline migration. They need thorough testing of both code and resulting data, aiming to use data closely resembling production.
A junior engineer proposes mounting production data directly to development and testing environments, allowing pre-production code to run against real data. Since all users have admin access in the development environment, the junior engineer volunteers to configure permissions and mount the data.
Which statement reflects the most appropriate best practice in this scenario?
A
All development, testing, and production code and data should exist in a single, unified workspace; creating separate environments for testing and development complicates administrative overhead.
B
In environments where interactive code will be executed, production data should only be accessible with read permissions; creating isolated databases for each environment further reduces risks.
C
As long as code in the development environment declares USE dev_db at the top of each notebook, there is no possibility of inadvertently committing changes back to production data sources._
D
Because Delta Lake versions all data and supports time travel, it is not possible for user error or malicious actors to permanently delete production data; as such, it is generally safe to mount production data anywhere.
E
Because access to production data will always be verified using passthrough credentials, it is safe to mount data to any Databricks development environment.