Videos and questions for Chapter 1 of the course "Empirical Economics with R" at Ulm University (taught by Sebastian Kranz)
We have estimated the following linear regression in R:
\[price_t = \beta_0 + \beta_1 temp_t + \beta_2 rainwinter_t + \beta_3 rainharvest_t + u_t\]
Which interpretation of our estimated coefficient \(\hat \beta_1 = 22.3\) is correct?
What about the following interpretation?
An average temperature increase by 1 °C increases a vintage's price index by 22.3 units.
We found the following regression results:
\[ \begin{aligned} \hat p_t &= \hat \beta_0 + \hat \beta_1 temp_t + \hat \beta_2 rainwinter_t + \hat \beta_3 rainharvest_t + \hat \beta_4 age_t \\ &= -323 + 19.1 \cdot temp_t - 0.1 \cdot rainwinter_t + 0.056 \cdot rainharvest_t + 0.8 \cdot age_t \end{aligned} \]
Compare the estimated coefficient \(\hat \beta_1\) for temperature and \(\hat \beta_4\) for age. Is the following statement correct?
10 additional years of age increase our prediction of a vintage's price index by less than a 0.5 °C higher average temperature during the growing season would.
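Holding the other regressors fixed, the two predicted changes follow from scaling each estimated coefficient by the stated change:
\[ 10 \cdot \hat \beta_4 = 10 \cdot 0.8 = 8 \qquad \text{versus} \qquad 0.5 \cdot \hat \beta_1 = 0.5 \cdot 19.1 = 9.55 \]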
What about the following statement?
This suggests that the enjoyment of drinking a Bordeaux wine depends much more strongly on the temperature during its growing season than on its age.
Consider the example from the video above. If you want to estimate the total causal effect of obtaining a university degree on wages, should you estimate the short regression:
\[wage = \beta_0^S + \beta_1^S degree + u^S\]
or the long regression where you control for being a manager?
\[wage = \beta_0^L + \beta_1^L degree + \beta_2^L manager + u^L\]
Ashenfelter et al. used as dependent variable not the price index in levels but its natural logarithm:
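One way to relate the two regressions is the standard omitted-variable formula for OLS estimates. As a sketch, assume an auxiliary regression of \(manager\) on \(degree\) with estimated slope \(\hat \delta_1\) (a symbol introduced here for illustration, not taken from the course material). Then the estimated coefficients satisfy:
\[ \hat \beta_1^S = \hat \beta_1^L + \hat \beta_2^L \cdot \hat \delta_1 \]
The short-regression coefficient thus also picks up the part of the wage difference that runs through the manager variable, while the long-regression coefficient holds manager status fixed.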
\[\log p_t = \beta_0 + \beta_1 temp_t + \beta_2 rainwinter_t + \beta_3 rainharvest_t + \beta_4 age_t + u_t\]
What is the correct interpretation of the estimate \(\hat \beta_1 = 0.616\) in the regression above?
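With a logged dependent variable, a coefficient describes a change in \(\log p_t\), not in \(p_t\) itself. As a worked sketch, holding the other regressors fixed, a 1 °C higher temperature multiplies the predicted price index by the factor
\[ \exp(\hat \beta_1) = \exp(0.616) \approx 1.85 \]
that is, an increase of roughly 85%. The common shortcut reads the coefficient as an approximate change of \(100 \cdot \hat \beta_1 \% = 61.6\%\). That approximation is close for small coefficients and becomes rougher for one of this size.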
The figures also show the R-squared of each regression. Consider the following two statements for a regression with dependent variable \(y\):
A: The R-squared is the square of the correlation between \(y_t\) and \(\hat y_t\), i.e. between the actual and predicted values of the dependent variable.
B: The R-squared measures the share of the variance of \(y_t\) that can be explained by the estimated regression model.
Which statements are correct?
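The two descriptions of the R-squared can be compared numerically. The following is a minimal sketch in Python (not the course's R), using simulated data rather than the wine data. All names and numbers in it (`ols_simple`, `r2_share`, `r2_corr`, the data-generating values) are illustrative assumptions.

```python
import random

def ols_simple(x, y):
    """Return (intercept, slope) of a least-squares fit of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return my - slope * mx, slope

def variance(v):
    """Variance of v around its mean (divided by n)."""
    m = sum(v) / len(v)
    return sum((a - m) ** 2 for a in v) / len(v)

def correlation(u, v):
    """Pearson correlation between u and v."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v)) / len(u)
    return cov / (variance(u) ** 0.5 * variance(v) ** 0.5)

# Simulated data: hypothetical values, not the Bordeaux data set.
random.seed(1)
x = [random.gauss(17, 1) for _ in range(200)]
y = [-300 + 20 * a + random.gauss(0, 15) for a in x]

b0, b1 = ols_simple(x, y)
y_hat = [b0 + b1 * a for a in x]            # fitted values
resid = [a - b for a, b in zip(y, y_hat)]   # residuals

# Statement B: share of the variance of y explained by the model.
r2_share = 1 - variance(resid) / variance(y)
# Statement A: squared correlation between actual and fitted values.
r2_corr = correlation(y, y_hat) ** 2

print(r2_share, r2_corr)
```

For a least-squares fit that includes an intercept, the two printed numbers should agree up to floating-point error.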
Can we use our estimated model to predict log price indices for new data points for which we only know the weather variables but not the price?
Great, you have finished the video lectures for Chapter 1!
Now would be a great time to start with the RTutor problem set of this chapter.