
Ultimate access to all questions.
In a global gaming company with millions of users, players communicate in real-time via a chat feature supporting over 20 languages, using the Cloud Translation API for real-time message translation. You're tasked with building an ML system to moderate this chat in real-time without altering the existing infrastructure. The initial model used an in-house word2vec model on translated messages, but performance varies significantly across languages due to translation inaccuracies and latency. The company emphasizes the importance of fairness and accuracy across all languages to ensure a positive user experience. Given these constraints, which of the following approaches would BEST improve the model's performance while maintaining real-time moderation? (Choose one correct option)
A
Replace the in-house word2vec model with a pre-trained multilingual BERT model to leverage its deep understanding of context across languages.
B
Train a separate classifier for each language using the original, untranslated chat messages to eliminate translation-induced errors.
C
Implement a hybrid approach that uses the Cloud Translation API for languages with high accuracy and the original messages for languages with poor translation quality.
D
Apply data augmentation techniques to the translated messages to increase the diversity of the training data and improve model robustness.