
Answer-first summary for fast verification
Answer: Use the original audio sampling rate, and transcribe the audio by using the Speech-to-Text API with asynchronous recognition.
Google's best practices for the Speech-to-Text API recommend capturing audio at 16 kHz or higher when you control the recording. When the audio already exists at a lower rate, they advise sending it at its native sample rate instead of resampling it. Upsampling 8 kHz telephony audio to 16 kHz adds no new information and does not improve accuracy, which rules out C and D. Synchronous recognition is limited to audio of about one minute or less, and these recordings typically exceed one minute, which rules out A. Asynchronous recognition handles longer audio: you submit a file stored in Cloud Storage, receive a long-running operation, and retrieve the transcription when it completes, which is efficient for batch processing. Therefore, the correct answer is B: Use the original audio sampling rate, and transcribe the audio by using the Speech-to-Text API with asynchronous recognition.
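The reasoning above can be sketched in Python with the google-cloud-speech client library. This is a minimal illustration, not a reference implementation. The helper names (choose_method, recognition_settings, transcribe_call), the LINEAR16 encoding, the en-US language code, and the 600-second timeout are assumptions made for the example; the question states only the 8 kHz sample rate and that recordings live in Cloud Storage.

```python
# Assumed threshold: synchronous recognition is meant for audio of
# roughly one minute or less.
SYNC_LIMIT_SECONDS = 60


def choose_method(duration_seconds: float) -> str:
    """Return the name of the client method suited to the audio length."""
    if duration_seconds <= SYNC_LIMIT_SECONDS:
        return "recognize"
    return "long_running_recognize"


def recognition_settings(native_rate_hz: int = 8000,
                         language_code: str = "en-US") -> dict:
    """Keep the native sample rate; no upsampling step is applied."""
    return {"sample_rate_hertz": native_rate_hz, "language_code": language_code}


def transcribe_call(gcs_uri: str, timeout_seconds: int = 600) -> str:
    """Asynchronously transcribe one recording stored in Cloud Storage."""
    # Imported here so the helpers above work without the library installed.
    from google.cloud import speech

    client = speech.SpeechClient()
    config = speech.RecognitionConfig(
        # Assumption: recordings are uncompressed 16-bit PCM (LINEAR16).
        encoding=speech.RecognitionConfig.AudioEncoding.LINEAR16,
        **recognition_settings(),
    )
    audio = speech.RecognitionAudio(uri=gcs_uri)

    # Starts a long-running operation; result() blocks until it finishes
    # or the timeout elapses.
    operation = client.long_running_recognize(config=config, audio=audio)
    response = operation.result(timeout=timeout_seconds)

    return " ".join(
        result.alternatives[0].transcript
        for result in response.results
        if result.alternatives
    )
```

With credentials configured, a call such as transcribe_call("gs://your-bucket/call.wav") would return the transcript as one string; the bucket and object names here are placeholders. In a batch pipeline you would more likely store the operation and poll it later than block on result().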
Author: LeetQuiz Editorial Team
You work at an organization that maintains a cloud-based communication platform that integrates conventional chat, voice, and video conferencing into one platform. The audio recordings of voice calls on this platform are stored in Cloud Storage. These recordings have an 8 kHz sample rate and typically exceed one minute in duration. The organization aims to introduce a new feature that will automatically transcribe these voice call recordings into text. This transcription will be used in future applications like call summarization and sentiment analysis. What is the best way to implement the voice call transcription feature while adhering to Google-recommended best practices?
A
Use the original audio sampling rate, and transcribe the audio by using the Speech-to-Text API with synchronous recognition.
B
Use the original audio sampling rate, and transcribe the audio by using the Speech-to-Text API with asynchronous recognition.
C
Upsample the audio recordings to 16 kHz, and transcribe the audio by using the Speech-to-Text API with synchronous recognition.
D
Upsample the audio recordings to 16 kHz, and transcribe the audio by using the Speech-to-Text API with asynchronous recognition.