
After migrating ETL jobs to BigQuery, you must verify that the output of each migrated job matches the output of the original job. You have already loaded a table containing the original job’s output and now need to compare it with the migrated job’s output to confirm that the two are identical. The tables do not contain a primary key column that could be used to join them for comparison. How should you proceed with the comparison?
A. Select random samples from the tables using the RAND() function and compare the samples.
B. Select random samples from the tables using the HASH() function and compare the samples.
C. Use a Dataproc cluster and the BigQuery Hadoop connector to read the data from each table and calculate a hash from non-timestamp columns of the table after sorting. Compare the hashes of each table.
D. Create stratified random samples using the OVER() function and compare equivalent samples from each table.
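
The idea described in option C, hashing the non-timestamp columns after sorting and then comparing one hash per table, can be illustrated with a minimal local sketch. This is not the Dataproc or BigQuery Hadoop connector code itself: it uses plain Python lists of dictionaries in place of table reads, and the function name `table_fingerprint`, the column names, and the sample rows are all hypothetical, chosen only for illustration.

```python
import hashlib
import json


def table_fingerprint(rows, exclude=()):
    """Return a SHA-256 hex digest of the rows that ignores row order.

    `rows` is a list of dicts (a stand-in for rows read from a table).
    `exclude` names columns to skip, e.g. load timestamps that are
    expected to differ between the original and the migrated job.
    """
    # Serialize each row with sorted keys so column order does not matter.
    serialized = [
        json.dumps(
            {k: v for k, v in row.items() if k not in exclude},
            sort_keys=True,
            default=str,
        )
        for row in rows
    ]
    # Sorting gives a canonical row order without needing a primary key.
    # Duplicate rows are kept, so differing row counts change the digest.
    serialized.sort()
    digest = hashlib.sha256()
    for line in serialized:
        digest.update(line.encode("utf-8"))
        digest.update(b"\n")
    return digest.hexdigest()


# Hypothetical sample data: same content, different row order and timestamps.
original_rows = [
    {"customer": "a", "amount": 10, "loaded_at": "2024-01-01T00:00:00"},
    {"customer": "b", "amount": 20, "loaded_at": "2024-01-01T00:00:00"},
]
migrated_rows = [
    {"customer": "b", "amount": 20, "loaded_at": "2024-02-01T09:30:00"},
    {"customer": "a", "amount": 10, "loaded_at": "2024-02-01T09:30:00"},
]

same = table_fingerprint(original_rows, exclude=("loaded_at",)) == \
    table_fingerprint(migrated_rows, exclude=("loaded_at",))
print("outputs match:", same)
```

Because every row contributes to the digest, a single changed value, a missing row, or an extra duplicate produces a different hash, which is something a comparison of independently drawn random samples cannot guarantee.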