
## Answer

**C) Reinforcement learning**
## Explanation

This scenario describes a classic **reinforcement learning** problem because:

1. **Reward/Penalty System**: The AI learns by receiving rewards (for faster routes) and penalties (for delays), which is the fundamental mechanism of reinforcement learning.
2. **Learning from Experience**: The system improves decisions by learning from previous deliveries, which aligns with reinforcement learning's trial-and-error approach.
3. **Optimization Goal**: The objective is to optimize delivery routes over time through feedback from the environment.

**Why the other options are incorrect:**

- **A) Unsupervised learning**: This involves finding patterns in data without explicit rewards or penalties (e.g., clustering, dimensionality reduction).
- **B) Self-supervised learning**: This uses the structure of the data itself to create supervisory signals, not external rewards and penalties.
- **D) Transfer learning**: This involves applying knowledge learned from one task to a different but related task, not optimizing through rewards and penalties.

**Key reinforcement learning concepts in this scenario:**

- **Agent**: The delivery-route optimization AI
- **Environment**: The delivery network and traffic conditions
- **Actions**: Choosing different delivery routes
- **Rewards**: Positive feedback for faster routes
- **Penalties**: Negative feedback for delays

This approach allows the AI to discover optimal strategies through interaction with its environment, making reinforcement learning the most appropriate choice.
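The agent/environment/reward mapping above can be sketched in code. The following is a minimal, hypothetical illustration (not the company's actual system): the route names and travel times are invented, and a simple bandit-style value estimate stands in for a full reinforcement-learning pipeline. The agent repeatedly chooses a route, receives a reward equal to the negative travel time (so delays act as penalties), and updates its estimate of each route's value.

```python
import random

# Hypothetical toy problem: three candidate delivery routes with
# different average travel times (minutes). Real systems would use
# logged delivery data instead of these invented numbers.
ROUTES = ["highway", "downtown", "backroads"]
MEAN_TIME = {"highway": 30, "downtown": 45, "backroads": 38}

def deliver(route):
    """Simulate one delivery: mean travel time plus random traffic noise."""
    return random.gauss(MEAN_TIME[route], 5)

# Action-value estimates, one per route. Reward = negative travel time,
# so faster routes earn higher rewards and delays are penalized.
q = {r: 0.0 for r in ROUTES}
counts = {r: 0 for r in ROUTES}
epsilon = 0.1  # exploration rate for epsilon-greedy action selection

random.seed(42)
for episode in range(2000):
    # Epsilon-greedy: mostly exploit the best-known route, sometimes explore.
    if random.random() < epsilon:
        route = random.choice(ROUTES)
    else:
        route = max(q, key=q.get)
    reward = -deliver(route)  # penalty grows with delay
    counts[route] += 1
    # Incremental mean update of the route's estimated value.
    q[route] += (reward - q[route]) / counts[route]

best = max(q, key=q.get)
print("Learned best route:", best)
```

After enough deliveries, the value estimates converge toward each route's true (negative) average time and the agent settles on the fastest route, which is exactly the trial-and-error optimization the question describes.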
Author: Ritesh Yadav
**Q1. (Reinforcement Learning Scenario)** A logistics company is developing a delivery-route optimization AI that improves decisions by learning from previous deliveries — rewarding faster routes and penalizing delays. Which learning approach best fits this problem?

- A) Unsupervised learning
- B) Self-supervised learning
- C) Reinforcement learning
- D) Transfer learning