
Answer-first summary for fast verification
Answer: Extracting the prompt template
## Explanation of the Correct Answer The correct answer is **D: Extracting the prompt template**. ### Why Option D is Correct **Extracting the prompt template** is a specific type of prompt injection attack where an attacker crafts inputs designed to reveal the underlying system instructions, configuration, or base prompt that guides the LLM's behavior. This attack directly targets the model's configured behavior by attempting to: 1. **Expose the foundational instructions** - The attacker seeks to uncover the initial prompt or system message that defines the LLM's persona, constraints, safety guidelines, and operational parameters. 2. **Reveal proprietary configurations** - Many organizations customize LLMs with specific instructions, business rules, or proprietary information embedded in the prompt template. Extracting this exposes how the model has been specifically configured. 3. **Understand behavioral boundaries** - By discovering the prompt template, attackers can learn what restrictions have been placed on the model and potentially find ways to circumvent them. 4. **Gain insight into system architecture** - The prompt template often reveals information about how the LLM has been integrated into a larger system, including any specific guardrails or specialized instructions. This attack is particularly concerning because it directly targets the configured behavior itself, rather than just manipulating outputs. Once an attacker extracts the prompt template, they gain significant understanding of how the LLM has been programmed to respond, which can enable more sophisticated follow-up attacks. ### Analysis of Other Options **A: Prompted persona switches** - This involves manipulating the LLM to adopt a different persona or role than intended. While this is a valid prompt injection technique, it doesn't directly expose the configured behavior; instead, it attempts to override or change that behavior temporarily. **B: Exploiting friendliness and trust** - This technique leverages the LLM's tendency to be helpful and cooperative to extract information or bypass restrictions. While effective for certain attacks, it focuses on manipulating the interaction style rather than directly revealing the underlying configuration or system instructions. **C: Ignoring the prompt template** - This describes a situation where the LLM fails to follow its instructions, but this is typically a failure mode or vulnerability rather than a deliberate attack technique. Attackers don't "ignore" the prompt template; they either exploit it or attempt to extract it. ### Best Practices Consideration From an AWS AI Practitioner perspective, protecting against prompt template extraction involves: - Implementing input validation and sanitization - Using prompt obfuscation techniques - Monitoring for unusual patterns in user queries - Implementing rate limiting and query analysis - Regularly updating and rotating prompt templates when possible This aligns with AWS's security best practices for AI systems, which emphasize protecting the integrity of model configurations and preventing unauthorized access to system instructions.
Ultimate access to all questions.
No comments yet.
Author: LeetQuiz Editorial Team
Which type of prompt attack reveals the underlying system instructions or configured behavior of a large language model (LLM)?
A
Prompted persona switches
B
Exploiting friendliness and trust
C
Ignoring the prompt template
D
Extracting the prompt template