
## Answer (answer-first summary for fast verification)

**B: Add a higher penalty for collisions in the reward function**
## Explanation

In reinforcement learning (RL), the **reward function** is crucial for guiding the agent's behavior. The robot is currently prioritizing speed (reaching the destination quickly) over safety (avoiding collisions), which indicates that the current reward function doesn't sufficiently penalize collisions.

**Why option B is correct:**

- Adding a higher penalty for collisions in the reward function will make the robot learn to avoid collisions more effectively
- The robot will learn that taking risky shortcuts with collisions results in lower overall rewards
- This encourages the robot to find safer paths even if they take slightly longer

**Why the other options are incorrect:**

- **A: Reduce the number of training episodes.** This would likely worsen performance, as the robot needs more training to learn safe behaviors
- **C: Increase the learning rate to speed up exploration.** A higher learning rate might cause instability and doesn't directly address the safety issue
- **D: Remove penalties to avoid discouraging movement.** This would make the problem worse by encouraging even more risky behavior

In Amazon SageMaker RL, adjusting the reward function is a common technique to shape desired behaviors in reinforcement learning agents.
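The reward-shaping idea above can be sketched in plain Python. This is a minimal illustration, not the SageMaker RL environment API; the function name, reward weights, and trajectory lengths are all assumptions chosen for the example:

```python
# Hypothetical reward function sketch (not a SageMaker RL API);
# all names and numeric values are illustrative assumptions.
def compute_reward(reached_goal: bool, collided: bool,
                   goal_reward: float = 100.0,
                   collision_penalty: float = -50.0,
                   step_cost: float = -0.1) -> float:
    """Per-step reward: a small time cost keeps the robot moving,
    a large negative term punishes collisions, and reaching the
    goal pays out a bonus."""
    reward = step_cost
    if collided:
        reward += collision_penalty
    if reached_goal:
        reward += goal_reward
    return reward

# Compare a safe detour (12 steps, no collisions) against a risky
# shortcut (8 steps, 2 collisions) by summing per-step rewards:
safe = sum(compute_reward(i == 11, False) for i in range(12))
risky = sum(compute_reward(i == 7, i in (2, 5)) for i in range(8))
# With the higher collision penalty, safe > risky, so the learned
# policy is pushed toward the longer but collision-free path.
```

If the collision penalty were removed (option D) or made small relative to the per-step cost, the risky shortcut would yield the higher return, which is exactly the unsafe behavior being observed.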
Author: Ritesh Yadav
## Question

A team is training a warehouse robot using Amazon SageMaker RL. They notice that the robot reaches the destination quickly but takes risky shortcuts that lead to frequent collisions. Which adjustment should be made to improve safety?

- **A.** Reduce the number of training episodes
- **B.** Add a higher penalty for collisions in the reward function
- **C.** Increase the learning rate to speed up exploration
- **D.** Remove penalties to avoid discouraging movement