
Ultimate access to all questions.
In the context of optimizing a production demand forecasting pipeline, Z-score normalization is applied to data stored in BigQuery during preprocessing. With new training data added weekly, the pipeline must efficiently handle this incremental data without significant manual intervention or excessive computation time. Considering the need for scalability, cost-effectiveness, and minimal latency in preprocessing, which of the following approaches is the BEST to adopt? Choose the most appropriate option.
A
Leverage the normalizer_fn argument within TensorFlow's Feature Column API for normalization, requiring data to be exported from BigQuery for processing._
B
Implement the normalization algorithm directly in SQL for execution within BigQuery, utilizing its computational resources for preprocessing.
C
Use Apache Spark with the Dataproc connector for BigQuery to normalize the data, involving additional infrastructure setup and management.
D
Perform data normalization using Google Kubernetes Engine, introducing complexity and overhead for container orchestration.