
Ultimate access to all questions.
Deep dive into the quiz with AI chat providers.
We prepare a focused prompt with your quiz and certificate details so each AI can offer a more tailored, in-depth explanation.
A team is training a warehouse robot using Amazon SageMaker RL. They notice that the robot reaches the destination quickly but takes risky shortcuts that lead to frequent collisions. Which adjustment should be made to improve safety?
A
Reduce the number of training episodes
B
Add a higher penalty for collisions in the reward function
C
Increase the learning rate to speed up exploration
D
Remove penalties to avoid discouraging movement
Explanation:
In reinforcement learning (RL), the reward function is crucial for guiding the agent's behavior. The robot is currently prioritizing speed (reaching the destination quickly) over safety (avoiding collisions). This indicates that the current reward function doesn't sufficiently penalize collisions.
Why option B is correct:
Why other options are incorrect:
In Amazon SageMaker RL, adjusting the reward function is a common technique to shape desired behaviors in reinforcement learning agents.