
A quantitative analyst supporting the acquisitions team of a European corporate real estate firm is using the decision tree technique to create a model for forecasting property prices. The analyst compiles a training data set consisting of information from 10 recent property sales, as shown in the following table:
| Property | Use of site | Occupancy status (Y=occupied) | Expected positive cash flow | Sale price greater than EUR 8,000,000 |
|----------|-------------|------------------------------|----------------------------|----------------------------------------|
| 1 | Office | Y | Y | Y |
| 2 | Retail | N | Y | N |
| 3 | Retail | Y | N | N |
| 4 | Office | N | Y | Y |
| 5 | Retail | N | Y | Y |
| 6 | Retail | N | N | N |
| 7 | Office | N | Y | N |
| 8 | Retail | Y | Y | Y |
| 9 | Retail | N | N | N |
| 10 | Retail | Y | Y | Y |

The table also includes the target variable of the model: a class label indicating whether the property was sold for a price greater than EUR 8,000,000. The analyst selects occupancy status as the feature used for the root node of the decision tree. What is the estimated information gain of the split produced by this root node?

A. 0.08
B. 0.37
C. 0.44
D. 0.82