
Explanation:
This scenario describes a classic reinforcement learning problem because:
Reward/Penalty System: The AI learns by receiving rewards (for faster routes) and penalties (for delays), which is the fundamental mechanism of reinforcement learning.
Learning from Experience: The system improves decisions by learning from previous deliveries, which aligns with reinforcement learning's trial-and-error approach.
Optimization Goal: The objective is to optimize delivery routes over time through feedback from the environment.
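Why other options are incorrect:
Unsupervised learning: Finds structure (such as clusters) in unlabeled data; it has no reward or penalty signal to learn from.
Self-supervised learning: Derives its training labels from the input data itself, for example by predicting a hidden part of the input, rather than learning from rewards for actions taken.
Transfer learning: Reuses knowledge from a model trained on one task for a related task; by itself it does not involve learning from rewards and penalties.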
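Key Reinforcement Learning Concepts in this scenario:
Agent: The route-optimization AI.
Environment: The delivery setting in which routes are carried out and feedback is produced.
Action: Choosing a delivery route.
Reward: Positive feedback for faster routes and a penalty for delays.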
This approach allows the AI to discover optimal strategies through interaction with its environment, making reinforcement learning the most appropriate choice.
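The reward-and-penalty loop described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the logistics company's actual system: the route names, mean delivery times, deadline, penalty size, and the epsilon-greedy strategy are all assumptions chosen to keep the example small.

```python
import random

# Hypothetical routes with assumed mean delivery times in minutes (illustrative only).
MEAN_MINUTES = {"highway": 30.0, "downtown": 45.0, "ring_road": 38.0}
DEADLINE_MINUTES = 40.0  # assumed threshold: slower deliveries count as delayed
DELAY_PENALTY = 20.0     # assumed extra penalty for a delayed delivery


def reward_for(minutes):
    """Faster deliveries earn a higher reward; delays incur an extra penalty."""
    reward = -minutes
    if minutes > DEADLINE_MINUTES:
        reward -= DELAY_PENALTY
    return reward


def train(episodes=2000, epsilon=0.1, seed=0):
    """Learn an average-reward estimate per route with epsilon-greedy selection."""
    rng = random.Random(seed)
    value = {route: 0.0 for route in MEAN_MINUTES}  # estimated reward per route
    count = {route: 0 for route in MEAN_MINUTES}    # deliveries tried per route
    for _ in range(episodes):
        if rng.random() < epsilon:
            route = rng.choice(list(MEAN_MINUTES))  # explore: try a random route
        else:
            route = max(value, key=value.get)       # exploit: best estimate so far
        # Simulated environment feedback: a noisy delivery time for the chosen route.
        minutes = rng.gauss(MEAN_MINUTES[route], 3.0)
        count[route] += 1
        # Incremental mean: move the estimate toward the observed reward.
        value[route] += (reward_for(minutes) - value[route]) / count[route]
    return value


values = train()
best_route = max(values, key=values.get)
print(best_route, values)
```

Under these assumed numbers, the estimate for the fastest route should end up highest, so the agent comes to prefer it purely from feedback on past deliveries. No labeled examples of "correct" routes are ever provided.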
Q1. (Reinforcement Learning Scenario) A logistics company is developing a delivery-route optimization AI that improves decisions by learning from previous deliveries — rewarding faster routes and penalizing delays. Which learning approach best fits this problem?
A. Unsupervised learning
B. Self-supervised learning
C. Reinforcement learning
D. Transfer learning