
Explanation:
Correct Answer: C
Option C (Store the raw social media posts in Cloud Storage, and write the data extracted from the API into BigQuery) meets every stated requirement at the lowest cost and with the fewest steps.
Raw posts archiving + reprocessing: Cloud Storage is the ideal, lowest-cost option for storing large volumes of raw data (hundreds of thousands of posts daily). It offers cheap, durable, long-term storage with easy access for future reprocessing (e.g., if you want to re-run with a new NLP model or different features).
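To make the archiving point concrete, here is a minimal sketch of how one daily batch could be laid out for later reprocessing. The object-name layout (raw-posts/dt=YYYY-MM-DD/posts.jsonl), the function names, and the post fields are assumptions for illustration, not part of the question; the actual upload to Cloud Storage is left out.

```python
import json
from datetime import date


def raw_object_name(batch_date: date, prefix: str = "raw-posts") -> str:
    """Build a date-partitioned object name for one daily batch (hypothetical layout)."""
    return f"{prefix}/dt={batch_date.isoformat()}/posts.jsonl"


def to_jsonl(posts: list[dict]) -> str:
    """Serialize posts as newline-delimited JSON, one post per line."""
    # Default ensure_ascii=True escapes every non-ASCII and control character,
    # so a post can never introduce a stray line break into the file.
    return "\n".join(json.dumps(p, sort_keys=True) for p in posts) + "\n"


def from_jsonl(payload: str) -> list[dict]:
    """Parse an archived batch back into post dicts for reprocessing."""
    return [json.loads(line) for line in payload.split("\n") if line.strip()]
```

Keeping one object per day means a single day can be re-read and re-analyzed without touching the rest of the archive.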
Extracted data (topics + sentiment) for analysis: BigQuery is purpose-built for this. It excels at running analytical queries on structured/semi-structured data at scale, with excellent performance and low cost for the types of aggregations and joins needed for dashboards.
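As a rough illustration of why the extracted data suits an analytical store, the sketch below flattens one API result into one row per (post, topic) and then computes the kind of aggregate a dashboard would request. The result shape ({"topics": [...], "sentiment": float}) and both function names are hypothetical; the aggregation is a plain-Python stand-in for what would be a GROUP BY topic with AVG(sentiment) in BigQuery.

```python
from collections import defaultdict


def to_rows(post_id: str, batch_date: str, analysis: dict) -> list[dict]:
    """Flatten one hypothetical API result into one row per (post, topic)."""
    return [
        {
            "post_id": post_id,
            "batch_date": batch_date,
            "topic": topic,
            "sentiment": analysis["sentiment"],
        }
        for topic in analysis["topics"]
    ]


def mean_sentiment_by_topic(rows: list[dict]) -> dict[str, float]:
    """In-memory stand-in for a GROUP BY topic / AVG(sentiment) query."""
    totals: dict[str, list[float]] = defaultdict(lambda: [0.0, 0.0])
    for row in rows:
        totals[row["topic"]][0] += row["sentiment"]  # running sum
        totals[row["topic"]][1] += 1                 # running count
    return {topic: s / n for topic, (s, n) in totals.items()}
```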
Dashboards & sharing: BigQuery connects natively to Looker Studio (formerly Google Data Studio) for building and sharing dashboards. Data can be shared with external stakeholders through IAM, authorized views, or Analytics Hub.
Minimal cost & fewest steps (why the other options fall short):
A (Both in BigQuery): Possible (raw posts can be stored in STRING or JSON columns), but not the cheapest fit for an archive. Cloud Storage offers colder storage classes (Nearline, Coldline, Archive) priced for data that is rarely read, and keeping raw text in BigQuery adds to the bytes scanned, and so to the on-demand query cost, whenever a query touches those columns.
B (Both in Cloud SQL): Poor choice. Cloud SQL is a relational database for OLTP workloads. It is not built for analytical scans over a table that grows by hundreds of thousands of text records a day, lacks the analytics performance of BigQuery, and would be more expensive and complex to operate at this scale.
D (Feed directly from source to API → BigQuery): Violates the “store raw posts for archiving and potential reprocessing” requirement. You would lose the original raw data. It also assumes a streaming/direct feed, while the scenario specifies batch loading once per day.
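Recommended Architecture Summary:
Daily batch of raw posts → Cloud Storage (archive) → API (topic and sentiment extraction) → BigQuery (extracted data) → dashboards.
This pattern is cost-efficient and scalable, and it satisfies both the archiving and the analysis requirements.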
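The flow of option C can be sketched end to end with in-memory stand-ins. Everything here is an assumption for illustration: the dict plays the role of the Cloud Storage bucket, the list plays the role of the BigQuery table, and analyze is whatever callable wraps the API. The point is the ordering (archive first, then extract) and that reprocessing needs only the archive.

```python
from typing import Callable


def extract(batch_date: str, posts: list[dict], analyze: Callable[[str], dict]) -> list[dict]:
    """Run the analyzer over each post and attach identifying columns."""
    return [
        {"post_id": p["id"], "batch_date": batch_date, **analyze(p["text"])}
        for p in posts
    ]


def run_daily_batch(
    batch_date: str,
    posts: list[dict],
    archive: dict[str, list[dict]],  # stand-in for a Cloud Storage bucket
    table: list[dict],               # stand-in for a BigQuery table
    analyze: Callable[[str], dict],  # stand-in for the API call
) -> None:
    """Archive the raw batch first, then write the extracted data for analysis."""
    archive[batch_date] = list(posts)
    table.extend(extract(batch_date, posts, analyze))


def reprocess(
    batch_date: str,
    archive: dict[str, list[dict]],
    analyze: Callable[[str], dict],
) -> list[dict]:
    """Re-run a (possibly newer) analyzer over the archived raw posts."""
    return extract(batch_date, archive[batch_date], analyze)
```

Option D corresponds to skipping the archive step: run_daily_batch would still fill the table, but reprocess would have nothing to read.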
NO.22 You want to analyze hundreds of thousands of social media posts daily at the lowest cost and with the fewest steps. You have the following requirements:
A
Store the social media posts and the data extracted from the API in BigQuery.
B
Store the social media posts and the data extracted from the API in Cloud SQL.
C
Store the raw social media posts in Cloud Storage, and write the data extracted from the API into BigQuery.
D
Feed the social media posts into the API directly from the source, and write the extracted data from the API into BigQuery.