Google Professional Data Engineer

After migrating ETL jobs to BigQuery, you must verify that the output of each migrated job matches the output of the original job. You have already loaded a table containing the original job’s output and now need to compare it with the migrated job’s output to confirm that they are identical. The tables have no primary key column that would allow a straightforward join. How should you proceed with the comparison?

Select random samples from the tables using the RAND() function and compare the samples.

8.6%

Select random samples from the tables using the HASH() function and compare the samples.

13.6%

Use a Dataproc cluster and the BigQuery Hadoop connector to read the data from each table and calculate a hash from non-timestamp columns of the table after sorting. Compare the hashes of each table.

66.7%
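
The idea behind the hash-based option can be illustrated with a minimal Python sketch. It is not the Dataproc job itself: plain lists of dictionaries stand in for rows read from each BigQuery table, and the function name `table_fingerprint` and the column names `customer`, `amount`, and `loaded_at` are hypothetical. It also assumes that values in both tables share the same types and formatting, since the digest is built from each value's `repr`.

```python
import hashlib


def table_fingerprint(rows, timestamp_columns=()):
    """Return a SHA-256 digest of the rows that ignores row order
    and the columns named in timestamp_columns."""
    excluded = set(timestamp_columns)
    row_digests = []
    for row in rows:
        # Keep only the non-timestamp columns, in a fixed order by column name.
        kept = sorted(
            (name, value) for name, value in row.items() if name not in excluded
        )
        row_digests.append(hashlib.sha256(repr(kept).encode("utf-8")).hexdigest())

    # Sorting the per-row digests removes any dependence on row order
    # while still counting duplicate rows.
    table_hash = hashlib.sha256()
    for digest in sorted(row_digests):
        table_hash.update(digest.encode("ascii"))
    return table_hash.hexdigest()


# Hypothetical sample data: same content, different row order and load times.
original_rows = [
    {"customer": "a", "amount": 10, "loaded_at": "2024-01-01T00:00:00"},
    {"customer": "b", "amount": 20, "loaded_at": "2024-01-01T00:00:00"},
]
migrated_rows = [
    {"customer": "b", "amount": 20, "loaded_at": "2024-02-01T09:30:00"},
    {"customer": "a", "amount": 10, "loaded_at": "2024-02-01T09:30:00"},
]

tables_match = (
    table_fingerprint(original_rows, timestamp_columns=["loaded_at"])
    == table_fingerprint(migrated_rows, timestamp_columns=["loaded_at"])
)
print("Tables match:", tables_match)
```

Timestamp columns are excluded because load or processing times usually differ between the original and migrated runs even when the data is identical. Sorting before hashing matters because, without a primary key, there is no guaranteed row order to rely on. Unlike random sampling, this covers every row, so a single differing value changes the final digest.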