
Answer-first summary for fast verification
Answer: C and D. C: Redesign the data ingestion process to use the gsutil tool to send the CSV files to a storage bucket in parallel. D: Assemble 1,000 files into a tape archive (TAR) file. Transmit the TAR files instead, and disassemble the CSV files in the cloud upon receiving them.
The bottleneck in this scenario is per-file latency across a very large number of small files, not bandwidth: the question states that bandwidth utilization is already low. Options A and B therefore do little, because compressing files that are already under 4 KB (A) or raising the bandwidth limit (B) leaves the per-file overhead untouched. Option C, redesigning the data ingestion process to use the gsutil tool for parallel transfers, addresses the latency by keeping many transfers in flight at once, which raises throughput. Option D, bundling 1,000 files into each tape archive (TAR) file and extracting the individual CSV files in the cloud, cuts the number of transfers and with it the overhead of sending many small files. Together, these two actions provide the headroom needed for the doubling in volume expected from seasonality.
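The reasoning above can be checked with a rough back-of-the-envelope calculation that uses only the figures given in the question. This is a sketch under stated assumptions: every file is taken to be the full 4 KB (an upper bound), 1 Mbps is treated as 1,000,000 bits per second, and the function names are illustrative rather than part of any tool.

```python
# Figures taken from the question; file size is an upper bound (files are < 4 KB).
FILES_PER_HOUR = 20_000
MAX_FILE_BYTES = 4 * 1024
LINK_MBPS = 50
LATENCY_SECONDS = 0.200


def payload_mbps(files_per_hour, file_bytes=MAX_FILE_BYTES):
    """Average bandwidth needed for the raw file payload, in Mbps."""
    bits_per_hour = files_per_hour * file_bytes * 8
    return bits_per_hour / 3600 / 1_000_000


def per_file_budget_seconds(files_per_hour):
    """Time available per file if files are sent strictly one at a time."""
    return 3600 / files_per_hour


if __name__ == "__main__":
    for volume in (FILES_PER_HOUR, 2 * FILES_PER_HOUR):
        print(
            f"{volume} files/hour: "
            f"{payload_mbps(volume):.3f} Mbps of {LINK_MBPS} Mbps, "
            f"{per_file_budget_seconds(volume):.2f} s per file "
            f"vs {LATENCY_SECONDS:.2f} s latency"
        )
```

Under these assumptions the payload needs well under 1 Mbps even after the volume doubles, while the time available per file in a one-at-a-time transfer is of the same order as the 200 ms latency. That is consistent with a setup that is barely keeping up despite low bandwidth utilization.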
Author: LeetQuiz Editorial Team
Your company generates 20,000 files every hour, each formatted as a comma-separated values (CSV) file, and each file is less than 4 KB in size. To process these files, they must be ingested onto the Google Cloud Platform. Your company site experiences a latency of 200 ms to Google Cloud, and has an Internet connection bandwidth limited to 50 Mbps. Currently, you use a secure FTP (SFTP) server deployed on a virtual machine in Google Compute Engine for data ingestion. A dedicated machine running a local SFTP client is responsible for transmitting the CSV files as they are. The objective is to ensure that reports using data from the previous day are available to executives by 10:00 a.m. each day. Presently, this setup is barely managing the existing volume, despite low bandwidth utilization. Anticipating seasonality, your company expects the number of files to double over the next three months. Which two actions should you take? (Choose two.)
A
Introduce data compression for each file to increase the rate of file transfer.
B
Contact your internet service provider (ISP) to increase your maximum bandwidth to at least 100 Mbps.
C
Redesign the data ingestion process to use the gsutil tool to send the CSV files to a storage bucket in parallel.
D
Assemble 1,000 files into a tape archive (TAR) file. Transmit the TAR files instead, and disassemble the CSV files in the cloud upon receiving them.
E
Create an S3-compatible storage endpoint in your network and use Google Cloud Storage Transfer Service to transfer on-premises data to the designated storage bucket.
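The bundling step described in option D could look roughly like the following sketch, which uses Python's standard `tarfile` module to pack CSV payloads into archives of up to 1,000 files and to unpack them again. The function names (`bundle_csvs`, `unbundle`) and the in-memory approach are assumptions made for illustration; the upload itself and the cloud-side trigger for unpacking are not shown.

```python
import io
import tarfile


def _make_tar(batch):
    """Pack a list of (name, bytes) pairs into one uncompressed TAR archive."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, data in batch:
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def bundle_csvs(named_payloads, batch_size=1000):
    """Yield TAR archives (as bytes), each holding up to batch_size files."""
    batch = []
    for item in named_payloads:
        batch.append(item)
        if len(batch) == batch_size:
            yield _make_tar(batch)
            batch = []
    if batch:  # final, possibly smaller, archive
        yield _make_tar(batch)


def unbundle(tar_bytes):
    """Return {file name: contents} for every regular file in the archive."""
    with tarfile.open(fileobj=io.BytesIO(tar_bytes), mode="r") as tar:
        return {
            member.name: tar.extractfile(member).read()
            for member in tar.getmembers()
            if member.isfile()
        }
```

With 20,000 files per hour and 1,000 files per archive, this turns 20,000 small transfers into 20 larger ones per hour.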