
Explanation:
Correct Answer: B (Hierarchical Chunking)
Why Hierarchical Chunking is the best solution:
Preserves semantic context across related paragraphs: Hierarchical chunking organizes content into parent-child relationships. Each parent chunk (1,000 tokens) holds the broader context, and each child chunk (200 tokens) holds a more granular piece of that parent, so the link between a detail and its surrounding paragraphs is kept.
Maintains context at scale: The small child chunks (200 tokens) allow precise retrieval of specific information, while the parent chunks they belong to (1,000 tokens) carry the broader meaning that spans multiple paragraphs. Because this structure is applied uniformly during ingestion, it holds across the entire corpus.
Overlap strategy: The 50-token overlap between adjacent chunks maintains continuity and prevents context from being lost at chunk boundaries.
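The parent-child structure described above can be illustrated with a minimal sketch. This is not how Amazon Bedrock implements hierarchical chunking internally. It assumes whitespace-separated words stand in for model tokens, and the names `split_with_overlap`, `ParentChunk`, and `hierarchical_chunks` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ParentChunk:
    tokens: list                                   # broad context (up to parent_tokens)
    children: list = field(default_factory=list)   # granular windows cut from this parent


def split_with_overlap(tokens, max_tokens, overlap):
    """Cut tokens into windows of at most max_tokens; each window
    repeats the last `overlap` tokens of the previous window."""
    if not 0 <= overlap < max_tokens:
        raise ValueError("overlap must be non-negative and smaller than max_tokens")
    step = max_tokens - overlap
    windows = []
    start = 0
    while start < len(tokens):
        windows.append(tokens[start:start + max_tokens])
        if start + max_tokens >= len(tokens):
            break  # this window already reached the end of the text
        start += step
    return windows


def hierarchical_chunks(text, parent_tokens=1000, child_tokens=200, overlap=50):
    """Build parent chunks, then split each parent into child chunks.
    Assumption: the same overlap is applied at both levels."""
    tokens = text.split()  # assumption: words approximate tokens
    return [
        ParentChunk(window, split_with_overlap(window, child_tokens, overlap))
        for window in split_with_overlap(tokens, parent_tokens, overlap)
    ]
```

In a retrieval pipeline built on this shape, the child windows would be embedded and searched, and each match would be traced back to its parent to recover the surrounding context.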
Why other options are not optimal:
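A (Fixed-size chunking): Splits text every 300 tokens regardless of paragraph or section boundaries. The 10% overlap softens boundary effects but does not link related paragraphs to one another.
C (Semantic chunking): Places chunk boundaries where the meaning of adjacent sentences diverges, which keeps each chunk coherent, but the result is a flat set of chunks with no parent level to carry the broader context.
D (No chunking, manual splitting): Treats each file as a single chunk, so quality depends entirely on manual splitting, which does not scale to an entire corpus. Reranking only reorders retrieved results and does not restore context lost at ingestion.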
Key Takeaway: Hierarchical chunking is particularly effective for knowledge bases because it maintains both granular details (in child chunks) and broader context (in parent chunks), which is essential for preserving semantic relationships across related paragraphs at scale.
The company needs to improve the knowledge base to preserve semantic context across related paragraphs on the scale of the entire corpus of data.
Which solution will meet these requirements?
A. Configure the knowledge base to use fixed-size chunking. Set a 300-token maximum chunk size and a 10% overlap between chunks. Use an appropriate Amazon Bedrock embedding model.
B. Configure the knowledge base to use hierarchical chunking. Use parent chunks that contain 1,000 tokens and child chunks that contain 200 tokens. Set a 50-token overlap between chunks.
C. Configure the knowledge base to use semantic chunking. Use a buffer size of 1 and a breakpoint percentile threshold of 85% to determine chunk boundaries based on content meaning.
D. Configure the knowledge base not to use chunking. Manually split each document into separate files before ingestion. Apply post-processing reranking during retrieval.
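For reference, the four options can be written out as chunking configurations. The sketch below uses plain Python dictionaries whose field names follow the Amazon Bedrock `CreateDataSource` request shape as best recalled, so verify them against the current API reference before use. The `maxTokens` value for semantic chunking is an assumption, because the option does not state one.

```python
# Option A: fixed-size chunking, 300-token chunks with 10% overlap.
FIXED_SIZE = {
    "chunkingStrategy": "FIXED_SIZE",
    "fixedSizeChunkingConfiguration": {"maxTokens": 300, "overlapPercentage": 10},
}

# Option B: hierarchical chunking, 1,000-token parents and 200-token children.
HIERARCHICAL = {
    "chunkingStrategy": "HIERARCHICAL",
    "hierarchicalChunkingConfiguration": {
        # First entry is the parent level, second is the child level.
        "levelConfigurations": [{"maxTokens": 1000}, {"maxTokens": 200}],
        "overlapTokens": 50,
    },
}

# Option C: semantic chunking with buffer size 1 and an 85% breakpoint threshold.
SEMANTIC = {
    "chunkingStrategy": "SEMANTIC",
    "semanticChunkingConfiguration": {
        "bufferSize": 1,
        "breakpointPercentileThreshold": 85,
        "maxTokens": 300,  # assumption: not given in the option text
    },
}

# Option D: no chunking; each ingested file is treated as one chunk.
NO_CHUNKING = {"chunkingStrategy": "NONE"}

# Assumed placement: the chosen dictionary goes under
# vectorIngestionConfiguration["chunkingConfiguration"] in the data source request.
vector_ingestion_configuration = {"chunkingConfiguration": HIERARCHICAL}
```

Only the hierarchical configuration carries two levels, which is what lets a retrieved 200-token child be tied back to its 1,000-token parent.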