
Explanation:
Correct Answer: C
Option C (Store the raw social media posts in Cloud Storage, and write the data extracted from the API into BigQuery) is the only option that meets every stated requirement while keeping cost and the number of steps low.
Raw posts archiving + reprocessing: Cloud Storage is the ideal, lowest-cost option for storing large volumes of raw data (hundreds of thousands of posts daily). It offers cheap, durable, long-term storage with easy access for future reprocessing (e.g., if you want to re-run with a new NLP model or different features).
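As an illustration of the archiving step, here is a minimal sketch that writes one day's batch as newline-delimited JSON under a date-based object name. The `raw-posts` prefix, the object layout, and the helper names are assumptions made for this example, not part of the question; `upload_batch` needs the `google-cloud-storage` package and credentials, so it imports the client lazily and the two pure helpers can be run on their own.

```python
import json
from datetime import date


def archive_object_name(day, prefix="raw-posts"):
    """Build a date-based object name, e.g. raw-posts/2024/01/31/posts.ndjson."""
    return f"{prefix}/{day:%Y/%m/%d}/posts.ndjson"


def to_ndjson(posts):
    """Serialize posts as newline-delimited JSON, one post per line."""
    return "\n".join(json.dumps(p, ensure_ascii=False, sort_keys=True) for p in posts)


def upload_batch(bucket_name, day, posts):
    """Upload one day's raw posts (requires google-cloud-storage and credentials)."""
    # Imported lazily so the helpers above work without the client library.
    from google.cloud import storage

    blob = storage.Client().bucket(bucket_name).blob(archive_object_name(day))
    blob.upload_from_string(to_ndjson(posts), content_type="application/x-ndjson")
    return f"gs://{bucket_name}/{blob.name}"
```

Keeping one object per day makes it straightforward to re-read a single day's posts if they ever need to be reprocessed.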
Extracted data (topics + sentiment) for analysis: BigQuery is purpose-built for this. It excels at running analytical queries on structured/semi-structured data at scale, with excellent performance and low cost for the types of aggregations and joins needed for dashboards.
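To make the analysis side concrete, the sketch below flattens one post's API result into a row and shows the kind of aggregation a dashboard might run. The column names, the table name `my_project.social.post_analysis`, and the helper functions are hypothetical. The load helper uses a batch load job, which matches the once-per-day requirement, and imports the BigQuery client lazily because it needs the `google-cloud-bigquery` package and credentials.

```python
from collections import defaultdict


def to_row(post_id, posted_date, score, magnitude, categories):
    """Flatten one post's API result into a row (hypothetical schema)."""
    return {
        "post_id": post_id,
        "posted_date": posted_date,        # "YYYY-MM-DD" string, loadable as DATE
        "sentiment_score": score,          # document sentiment score, -1.0 to 1.0
        "sentiment_magnitude": magnitude,
        "topics": list(categories),        # category names, stored as a repeated field
    }


# Aggregation a dashboard might run; the table name is a placeholder.
DAILY_TOPIC_SENTIMENT_SQL = """
SELECT posted_date, topic, COUNT(*) AS posts, AVG(sentiment_score) AS avg_sentiment
FROM `my_project.social.post_analysis`, UNNEST(topics) AS topic
GROUP BY posted_date, topic
"""


def avg_sentiment_by_topic(rows):
    """Local stand-in for the query above, handy for checking the row shape."""
    totals = defaultdict(lambda: [0.0, 0])
    for row in rows:
        for topic in row["topics"]:
            totals[topic][0] += row["sentiment_score"]
            totals[topic][1] += 1
    return {topic: total / count for topic, (total, count) in totals.items()}


def load_rows(table_id, rows):
    """Batch-load rows into BigQuery (requires google-cloud-bigquery and credentials)."""
    from google.cloud import bigquery  # lazy import; not needed for the helpers above

    bigquery.Client().load_table_from_json(rows, table_id).result()
```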
Dashboards & sharing: BigQuery connects directly to Looker Studio (formerly Google Data Studio) for building dashboards. Reports can be shared with stakeholders inside and outside the organization, and access to the underlying data can be controlled with IAM, authorized views, or Analytics Hub.
Minimal cost & fewest steps:
A (Both in BigQuery): Possible (you can store raw data as STRING or JSON columns), but not ideal for pure archiving/reprocessing. Raw text that is rarely read is cheaper to keep in Cloud Storage's colder storage classes (Nearline, Coldline, Archive), and reprocessing can read the archived files directly instead of running queries that are billed for the data they scan.
B (Both in Cloud SQL): Poor choice. Cloud SQL is a relational database for OLTP workloads. It is not designed for analytical queries over a dataset that grows by hundreds of thousands of text records every day, it lacks the analytics performance of BigQuery, and it requires a provisioned, always-running instance, which adds cost and operational work at this scale.
D (Feed directly from source to API → BigQuery): Violates the “store raw posts for archiving and potential reprocessing” requirement. You would lose the original raw data. It also assumes a streaming/direct feed, while the scenario specifies batch loading once per day.
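Recommended Architecture Summary:
Daily batch of raw posts → Cloud Storage (archive) → Cloud Natural Language API (topics and sentiment) → BigQuery (analysis) → Looker Studio (dashboards).
This pattern keeps storage costs low, scales with post volume, and uses managed services at every step.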
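The order of operations in this pattern can be sketched as a small function with the three services passed in as callables. Everything here is illustrative: `run_daily_batch`, the stand-in functions, and the bucket name are hypothetical, and in practice the callables would wrap the Cloud Storage, Cloud Natural Language API, and BigQuery clients.

```python
def run_daily_batch(posts, archive, analyze, load):
    """Archive the raw batch, then analyze each post and load the extracted rows."""
    # The raw copy is written first so reprocessing stays possible later.
    archive_uri = archive(posts)
    rows = [{"post_id": post["id"], **analyze(post["text"])} for post in posts]
    load(rows)
    return archive_uri, rows


# Stand-ins for the real services, so the flow can be run locally.
def fake_archive(posts):
    return "gs://example-bucket/raw-posts/2024/01/31/posts.ndjson"


def fake_analyze(text):
    return {"sentiment_score": 1.0 if "love" in text else -1.0}


loaded = []
uri, rows = run_daily_batch(
    [{"id": "1", "text": "love it"}, {"id": "2", "text": "broken again"}],
    archive=fake_archive,
    analyze=fake_analyze,
    load=loaded.extend,
)
```

Passing the services in as arguments keeps the batch logic testable without cloud credentials; it is a design choice for this sketch, not something the question prescribes.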
You are tasked with analyzing a substantial volume of social media posts on a daily basis. Your primary objectives include achieving this with minimal cost and using the fewest steps possible. Here are your specific requirements:
✑ Batch-load hundreds of thousands of social media posts once per day.
✑ Utilize the Cloud Natural Language API to process these posts to extract topics and sentiment.
✑ Store the raw social media posts for archiving and potential reprocessing needs.
✑ Develop and share dashboards with stakeholders both within and outside the organization.
Given these requirements, you need to determine the most effective strategy for storing the data extracted from the Cloud Natural Language API for analysis, in addition to archiving the raw social media posts for historical reference. What approach should you take?
A
Store the social media posts and the data extracted from the API in BigQuery.
B
Store the social media posts and the data extracted from the API in Cloud SQL.
C
Store the raw social media posts in Cloud Storage, and write the data extracted from the API into BigQuery.
D
Feed the social media posts into the API directly from the source, and write the extracted data from the API into BigQuery.