The data governance team is evaluating code for GDPR compliance regarding record deletion. The following logic is used to propagate delete requests from the user_lookup table to the user_aggregates table:
(spark.read
    .format("delta")
    .option("readChangeData", True)
    .option("startingTimestamp", '2021-08-22 00:00:00')
    .option("endingTimestamp", '2021-08-29 00:00:00')
    .table("user_lookup")
    .createOrReplaceTempView("changes"))

spark.sql("""
    DELETE FROM user_aggregates
    WHERE user_id IN (
        SELECT user_id
        FROM changes
        WHERE _change_type = 'delete'
    )
""")
Assuming user_id is a unique key and all users requesting deletion have been removed from user_lookup, does successfully executing this logic ensure that the records deleted from user_aggregates are no longer accessible? Explain why.