
## Answer

**B.** Remove personally identifiable information (PII) from the customer data before fine-tuning the LLM.
## Detailed Explanation

When fine-tuning a large language model (LLM) with sensitive customer data in a regulated industry like banking, the primary concern is preventing the model from learning, memorizing, or potentially exposing private customer information. The question specifically asks for a solution that ensures the model does not reveal any private customer data.

### Analysis of Options

**Option B: Remove personally identifiable information (PII) from the customer data before fine-tuning the LLM.**
- **Optimal Choice**: This is the most direct and effective solution. By removing PII (such as names, addresses, account numbers, and social security numbers) from the training dataset before fine-tuning, the model never has access to this sensitive information. This follows the principle of data minimization and prevents the model from learning patterns that could lead to privacy breaches. It addresses the root cause by eliminating sensitive data from the training process entirely.

**Option A: Use Amazon Bedrock Guardrails.**
- **Less Suitable**: While Amazon Bedrock Guardrails are valuable for filtering harmful content and controlling model outputs during inference, they are not designed to prevent the model from learning sensitive information during fine-tuning. Guardrails operate at runtime to filter inputs and outputs, but they do not address the fundamental issue of sensitive data being incorporated into the model's weights during training.

**Option C: Increase the Top-K parameter of the LLM.**
- **Incorrect**: The Top-K parameter controls the diversity of generated text by limiting how many candidate tokens are considered during inference. Increasing Top-K makes outputs more varied but has no relationship to data privacy or to preventing the model from learning sensitive information.

**Option D: Store customer data in Amazon S3. Encrypt the data before fine-tuning the LLM.**
- **Insufficient**: While encrypting data at rest in Amazon S3 is a good security practice, it does not prevent the model from learning sensitive information during fine-tuning. When the data is decrypted for the fine-tuning process, the model still sees the full content, including any PII. Encryption protects data storage, not the learning process itself.

### Key Principles Applied

1. **Data Minimization**: The most effective privacy protection is to not expose sensitive data to the model in the first place.
2. **Prevention vs. Mitigation**: Removing PII prevents the problem, while the other approaches attempt to mitigate risks after the fact.
3. **Regulatory Compliance**: For financial institutions subject to regulations such as GDPR, GLBA, or CCPA, data anonymization and de-identification are recognized best practices.

Therefore, **Option B** is the correct solution: it directly addresses the requirement by ensuring sensitive customer information is never presented to the model during fine-tuning, thereby preventing the model from learning or potentially revealing private data.
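To make the PII-removal step concrete, here is a minimal sketch of redacting a training record before it enters a fine-tuning dataset. The regex patterns, labels, and `redact_pii` function name are illustrative assumptions, not a production design: pattern matching alone misses names and free-form identifiers, so a real pipeline would use a dedicated PII detector (for example, Amazon Comprehend's PII detection) for broader coverage.

```python
import re

# Illustrative patterns for a few common US-format PII types.
# Regexes alone will not catch names or free-form PII; a managed
# detection service should be used for production redaction.
PII_PATTERNS = {
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
    "PHONE": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
    "ACCOUNT": re.compile(r"\b\d{10,16}\b"),
}

def redact_pii(text: str) -> str:
    """Replace matched PII spans with typed placeholder tokens."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text

record = "SSN 123-45-6789, email jdoe@example.com, loan account 1234567890."
print(redact_pii(record))
```

Redacting to typed placeholders such as `[SSN]` (rather than deleting the span) keeps the sentence structure intact, so the fine-tuned model still learns the conversational pattern of loan inquiries without ever seeing the underlying identifiers.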
Author: LeetQuiz Editorial Team
## Question

A bank is using Amazon Bedrock to fine-tune a large language model (LLM) for assisting customers with loan inquiries. The bank must guarantee that the model does not disclose any private customer information.

Which solution fulfills these requirements?
A. Use Amazon Bedrock Guardrails.
B. Remove personally identifiable information (PII) from the customer data before fine-tuning the LLM.
C. Increase the Top-K parameter of the LLM.
D. Store customer data in Amazon S3. Encrypt the data before fine-tuning the LLM.