Empirical Economics 3

Confidence intervals, t-values, p-values and significance stars

We found the following regression result:

Does that mean we are 95% confident that submitting an additional homework problem set causes an increase in the average exam score between 0.22 and 0.679?

Probably yes

Probably no.

We run the following simulation:

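n = 10000                                # sample size
alpha0 = 0; alpha1 = 1; alpha2 = 1       # true coefficients of the data generating process
u = rnorm(n,0,1)                         # error term, drawn independently of x1 and x2
x2 = rnorm(n,0,1)                        # second explanatory variable
x1 = x2+rnorm(n,0,1)                     # x1 equals x2 plus independent noise
y = alpha0 + alpha1*x1 + alpha2*x2 + u   # outcome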

Does alpha1=1 measure the causal effect from x1 on y in our simulation?

Yes

One can draw the causal relationships in the data generation process above as follows:

Now assume we estimate the short regression:

$y = \beta_0 + \beta_1 x_1 + \varepsilon$

For a large sample size, will the OLS estimator $\hat \beta_1$ of the short regression converge to the causal effect $\alpha_1$ of $x_1$ on $y$ in our example? In other words, is $\beta_1^* = \alpha_1$ in the short regression, where $\beta_1^*$ denotes the value to which $\hat \beta_1$ converges?

Yes

Still assume that we generate the data with our simulation and estimate the short regression:

$y = \beta_0 + \beta_1 x_1 + \varepsilon$

Is the OLS estimator $\hat \beta_1$ a consistent estimator of the causal effect $\alpha_1=1$ of $x_1$ on $y$ in our example?

Yes

No, and $\hat \beta_1$ has a negative bias.

No, and $\hat \beta_1$ has a positive bias.
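The question can also be checked numerically. The following is a minimal sketch that reruns the simulation from above and estimates the short regression; the seed and the names `short` and `b1_short` are assumptions added here for illustration and are not part of the lecture material.

```r
# Sketch: simulate the data generating process and estimate the short regression.
set.seed(123)  # assumption: seed chosen only to make the run reproducible

n <- 10000
alpha0 <- 0; alpha1 <- 1; alpha2 <- 1
u  <- rnorm(n, 0, 1)
x2 <- rnorm(n, 0, 1)
x1 <- x2 + rnorm(n, 0, 1)
y  <- alpha0 + alpha1 * x1 + alpha2 * x2 + u

# Short regression: x2 is left out and ends up in the error term
short <- lm(y ~ x1)
b1_short <- unname(coef(short)["x1"])

# Compare the estimate with the causal effect alpha1 = 1
print(b1_short)
```

Comparing the printed estimate with $\alpha_1 = 1$ for several seeds and larger `n` shows whether the gap shrinks as the sample grows.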

Now assume we estimate the long regression:

$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \eta$

Is the OLS estimator $\hat \beta_1$ of the long regression a consistent estimator of the causal effect $\alpha_1=1$ of $x_1$ on $y$ in our example?

Yes

No, and $\hat \beta_1$ has a negative bias.

No, and $\hat \beta_1$ has a positive bias.
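A minimal sketch for the long regression, again rerunning the simulation from above. The seed and the names `long` and `b1_long` are assumptions made for this illustration.

```r
# Sketch: simulate the data generating process and estimate the long regression.
set.seed(123)  # assumption: seed chosen only to make the run reproducible

n <- 10000
alpha0 <- 0; alpha1 <- 1; alpha2 <- 1
u  <- rnorm(n, 0, 1)
x2 <- rnorm(n, 0, 1)
x1 <- x2 + rnorm(n, 0, 1)
y  <- alpha0 + alpha1 * x1 + alpha2 * x2 + u

# Long regression: x2 is included as a control variable
long <- lm(y ~ x1 + x2)
b1_long <- unname(coef(long)["x1"])

# Compare the estimate with the causal effect alpha1 = 1
print(b1_long)
```

Because `x2` is now a regressor, the remaining error term is just `u`, which was drawn independently of `x1` and `x2`.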

Assume we again estimate the short regression $y = \beta_0 + \beta_1 x_1 + \varepsilon$. Is the OLS estimator $\hat \beta_1$ a consistent estimator of $\beta_1$?

Yes

It depends on how we define $\beta_1$.
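The role of the definition can be made explicit with a short derivation. This is a sketch that assumes $\beta_1$ is defined as the probability limit $\beta_1^*$ of the short-regression OLS estimator, and it uses $Cov(x_1, u) = 0$, which holds in the simulation because $u$ is drawn independently of $x_1$:

$$\beta_1^* = \frac{Cov(x_1, y)}{Var(x_1)} = \frac{\alpha_1 Var(x_1) + \alpha_2 Cov(x_1, x_2) + Cov(x_1, u)}{Var(x_1)} = \alpha_1 + \alpha_2 \frac{Cov(x_1, x_2)}{Var(x_1)}$$

In the simulation, $Cov(x_1, x_2) = Var(x_2) = 1$ and $Var(x_1) = 1 + 1 = 2$, so $\beta_1^* = 1 + 1 \cdot \frac{1}{2} = 1.5$. Under this definition $\hat \beta_1$ is consistent for $\beta_1^*$ by construction; if $\beta_1$ is instead defined as the causal effect $\alpha_1 = 1$, it is not.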

Causal graphs and endogeneity

Bias of OLS estimator in the homework example

The causal effects of education

Assume that intelligence is the main confounder and that we have some sources of exogenous variation. How would we expect the estimator $\hat \beta_1$ for the causal effect of edu to change if we add the IQ score as a control variable?

$\hat \beta_1$ should go down.

$\hat \beta_1$ should go up.

Randomized Experiments

Here are the results of our 3 regressions:


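Dependent variable: ecolbs

| | (1) | (2) | (3) |
|---|---|---|---|
| ecoprc | -0.845** | -2.926*** | -2.949*** |
| | (0.331) | (0.588) | (0.593) |
| regprc | | 3.029*** | 3.060*** |
| | | (0.711) | (0.715) |
| male | | | -0.108 |
| | | | (0.227) |
| inseason | | | -0.176 |
| | | | (0.206) |
| hhsize | | | 0.053 |
| | | | (0.069) |
| age | | | 0.001 |
| | | | (0.007) |
| faminc | | | 0.003 |
| | | | (0.003) |
| Constant | 2.388*** | 1.965*** | 1.703*** |
| | (0.372) | (0.380) | (0.591) |
| Observations | 660 | 660 | 660 |
| R² | 0.010 | 0.036 | 0.041 |

Note: *p<0.1; **p<0.05; ***p<0.01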
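The significance stars in the table follow from the estimates and standard errors. The sketch below recomputes t-values, two-sided p-values and 95% confidence intervals for the `ecoprc` coefficient in regressions (1) and (2). It assumes that the p-values come from a t distribution with 660 minus the number of estimated coefficients as degrees of freedom, and it works with the rounded numbers from the table, so the results are approximate. All object names are chosen here for illustration.

```r
# Estimates and standard errors for ecoprc, copied from columns (1) and (2)
est <- c(reg1 = -0.845, reg2 = -2.926)
se  <- c(reg1 = 0.331,  reg2 = 0.588)

# Degrees of freedom: observations minus estimated coefficients (incl. constant)
k  <- c(reg1 = 2, reg2 = 3)
df <- 660 - k

# t-value, two-sided p-value and 95% confidence interval
t_val    <- est / se
p_val    <- 2 * pt(-abs(t_val), df)
ci_lower <- est - qt(0.975, df) * se
ci_upper <- est + qt(0.975, df) * se

# Star thresholds as stated in the note below the table
stars <- ifelse(p_val < 0.01, "***",
         ifelse(p_val < 0.05, "**",
         ifelse(p_val < 0.1,  "*", "")))

print(data.frame(est, se, t_val, p_val, ci_lower, ci_upper, stars))
```

The same steps apply to any other coefficient in the table once its estimate, standard error and the number of regressors in its column are filled in.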

Here are 5 quiz questions related to these regression results. You can find more detailed explanations of the answers in the lecture slides.

a) Part 1: If the experiment is well randomized, is the OLS estimator in our second regression $ecolbs = \beta_0 + \beta_1 ecoprc + \beta_2 regprc + u$ consistent?

Yes

a) Part 2: Are the signs of $\hat \beta_1 < 0$ and $\hat \beta_2 > 0$ consistent with what we would expect from economic theory?

Yes

b) If we don't add `regprc` (see the first regression), does the OLS estimator seem to be biased? If yes, in which direction?

Not biased

Upward biased

Downward biased

c) Looking at the regression results, what is the likely sign of the correlation between the two prices `ecoprc` and `regprc` in the experiment?

Roughly zero (probably uncorrelated)

Positive

Negative
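The sign can be backed out from the table with a small calculation. When the short and the long regression are estimated on the same sample, OLS satisfies the in-sample identity: short coefficient on `ecoprc` = long coefficient on `ecoprc` + long coefficient on `regprc` × slope of an auxiliary regression of `regprc` on `ecoprc`. The sketch below solves this identity for the auxiliary slope using the rounded values from regressions (1) and (2); the variable names are chosen here for illustration.

```r
# Coefficients copied from the regression table (rounded values)
b1_short <- -0.845   # ecoprc in regression (1)
b1_long  <- -2.926   # ecoprc in regression (2)
b2_long  <-  3.029   # regprc in regression (2)

# Identity: b1_short = b1_long + b2_long * delta_hat, where delta_hat is the
# slope from regressing regprc on ecoprc. Solve for delta_hat.
delta_hat <- (b1_short - b1_long) / b2_long

print(delta_hat)
```

The slope of that auxiliary regression has the same sign as the sample correlation between the two prices, so its sign answers the question.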

d) Assume you were not sure whether the prices were indeed correctly randomized over households, i.e. chosen independently of household characteristics. Which of the following results suggest that we indeed had proper randomization?

The fact that none of the estimated coefficients for household characteristics is significant in regression 3.

The fact that the coefficient $\beta_1$ for `ecoprc` barely changes between regressions 2 and 3.

Interpreting Effect Sizes

Non-linear effects

Heterogeneous effects

Great, you have finished the video lectures for this quite long Chapter 3!

Maybe after a short break, it is a good time to start with the RTutor problem set of this chapter.
