library(wooldridge)
data("wage2")
Status: ported 2026-05-18. Reviewed by editor: pending.
By the end of this chapter the reader should be able to:
summary(), confint(), qt(), qf(), pt(), pf(), anova() and manual arithmetic on the fitted model.
Once we hold hours, experience and age fixed, do IQ scores and schooling jointly matter for monthly earnings, or could the apparent contribution of cognitive ability and education have arisen by chance?
We met IQ briefly in Chapter 3 as a candidate proxy for unobserved ability. Now we have to decide, formally, whether the coefficients we estimate are far enough from zero to take seriously. The tools of the chapter — \(t\)-statistics, \(p\)-values, confidence intervals and the \(F\)-test — are designed precisely for that decision.
Chapters 2 and 3 produced point estimates \(\hat{\beta}_j\) from a sample. These numbers are random: a different sample would have given different numbers. The question of inference is whether what we learned from our sample lets us say anything credible about the unknown population parameter \(\beta_j\).
Statistical inference provides two complementary tools, hypothesis tests and confidence intervals (Wooldridge 2020).
Both tools rely on knowing the sampling distribution of \(\hat{\beta}_j\). Under Gauss–Markov we only know its mean (\(\beta_j\)) and variance; we do not know its shape. To say anything precise about probabilities — “the chance of observing a coefficient this far from zero, if the truth is zero, is less than 1%” — we need a distribution. That is what the next assumption gives us.
Gauss–Markov gave us unbiasedness, a closed-form variance and the BLUE property of OLS. For exact (finite-sample) inference we need one more assumption.
Conditional on the regressors \(\mathbf{X}\), the population error \(u\) is normally distributed with mean zero and constant variance:
\[ u \mid \mathbf{X} \;\sim\; \mathcal{N}(0,\,\sigma^2). \]
The full set MLR.1–MLR.6 is called the Classical Linear Model (CLM) assumptions.
Why normality, and where does it come from? In many applied settings \(u\) is a sum of a large number of small, independent omitted factors, and a central-limit heuristic suggests its distribution should be roughly bell-shaped. The assumption is strong: it can fail badly for skewed outcomes such as wages or counts. We will see in Chapter 6 that taking \(\log\) of the dependent variable often does a lot to make the residuals look symmetric.
The pay-off is large. Under MLR.1–MLR.6, the OLS estimator inherits the normality of \(u\):
\[ \hat{\beta}_j \mid \mathbf{X} \;\sim\; \mathcal{N}\!\left(\beta_j,\,\operatorname{Var}(\hat{\beta}_j)\right). \]
This is an exact statement that holds in any sample size, not just asymptotically. With \(n\) large, the central limit theorem delivers approximately the same conclusion even without MLR.6, but only the CLM gives us exact small-sample inference.
MLR.6 is a statement about the error \(u\), conditional on the regressors. It is not the claim that the marginal distribution of \(y\) is normal. Wages, for example, are right-skewed; that does not by itself contradict MLR.6, because the systematic component \(\beta_0 + \beta_1 x_1 + \cdots\) can absorb the skewness.
If \(\sigma^2\) were known we could standardise \(\hat{\beta}_j\) directly:
\[ Z \;=\; \frac{\hat{\beta}_j - \beta_j}{\sqrt{\operatorname{Var}(\hat{\beta}_j)}} \;\sim\; \mathcal{N}(0,1). \]
In practice \(\sigma^2\) is unknown and is replaced by the OLS estimator \(\hat{\sigma}^2 = \mathrm{SSR}/(n-k-1)\). Plugging \(\hat{\sigma}^2\) into the standard error introduces an extra source of variability, and the standardised statistic is no longer standard normal: it is \(t\)-distributed with \(n-k-1\) degrees of freedom,
\[ t \;=\; \frac{\hat{\beta}_j - \beta_j}{\operatorname{se}(\hat{\beta}_j)} \;\sim\; t_{\,n-k-1}. \]
The \(t\)-distribution is symmetric, bell-shaped and centred at zero, but it has heavier tails than \(\mathcal{N}(0,1)\) — a reflection of the extra uncertainty from estimating \(\sigma^2\). As the degrees of freedom \(n-k-1\) grow, the tails thin out and \(t_{n-k-1}\) converges to \(\mathcal{N}(0,1)\). For \(n-k-1 \geq 120\) the two distributions are virtually indistinguishable; for small samples the \(t\) is noticeably wider.
That is the whole reason this chapter uses \(t\) critical values like \(2.014\) (two-sided, \(\alpha = 0.05\), \(df = 45\)) rather than the familiar \(1.96\) from the normal table.
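A quick way to see this convergence is to tabulate the two-sided 5% critical value for a few degrees of freedom next to its normal counterpart. The sketch below uses only base R's qt() and qnorm(); the particular df values are arbitrary illustrations, not values taken from the chapter's data.

```r
# Two-sided 5% critical values of t_df for several (illustrative) df values
df_grid <- c(5, 10, 30, 45, 120, 929)
t_crit  <- qt(0.975, df = df_grid)   # upper 2.5% quantile of t_df
z_crit  <- qnorm(0.975)              # upper 2.5% quantile of N(0, 1)

# Side-by-side table: the excess over the normal value shrinks as df grows
data.frame(df = df_grid,
           t_crit = round(t_crit, 4),
           excess_over_normal = round(t_crit - z_crit, 4))
```

The excess column should fall steadily towards zero, which is the "tails thin out" statement in numerical form.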
A \((1-\alpha)\times 100\%\) confidence interval for \(\beta_j\) is
\[ \hat{\beta}_j \;\pm\; t_{\alpha/2,\,n-k-1}\,\cdot\,\operatorname{se}(\hat{\beta}_j), \]
where \(t_{\alpha/2,\,n-k-1}\) is the critical value of the \(t\)-distribution that leaves probability \(\alpha/2\) in each tail.
For the canonical 95% interval, \(\alpha = 0.05\) and the critical value is \(t_{0.025,\,n-k-1}\). With moderately large samples this is close to \(1.96\), but in small samples it can be appreciably larger.
A 95% confidence interval is a statement about the procedure, not about the realised interval. If we drew many independent samples and built one interval from each, about 95% of those intervals would contain the true \(\beta_j\). It is not correct to say that there is a 95% probability that the particular interval \([L, U]\) we computed from our one sample contains \(\beta_j\) — \(\beta_j\) is a fixed (if unknown) number, and the realised interval either covers it or it does not.
Confidence intervals and two-sided hypothesis tests are two sides of the same coin: a \((1-\alpha)\) CI for \(\beta_j\) contains zero if and only if the two-sided \(t\)-test of \(H_0:\beta_j = 0\) at level \(\alpha\) fails to reject.
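The repeated-sampling interpretation can be checked by simulation. The following is a minimal sketch on artificial data (not wage2): the true slope, sample size, number of replications and seed are all illustrative assumptions, and the errors are drawn as normal so that MLR.6 holds by construction.

```r
set.seed(123)                 # arbitrary seed, for reproducibility only
n_reps <- 2000                # number of simulated samples (illustrative)
n      <- 30                  # observations per sample
beta1  <- 2                   # "true" slope, known only because we simulate

covered <- logical(n_reps)
for (r in seq_len(n_reps)) {
  x  <- rnorm(n)
  y  <- 1 + beta1 * x + rnorm(n)               # normal errors
  ci <- confint(lm(y ~ x), "x", level = 0.95)  # 1 x 2 matrix: lower, upper
  covered[r] <- ci[1] <= beta1 && beta1 <= ci[2]
}
coverage <- mean(covered)
coverage                      # share of intervals that contain beta1
```

The share should land close to 0.95. Each individual interval either covers the true slope or misses it; the 95% describes the long-run behaviour of the procedure.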
We test claims about a single coefficient \(\beta_j\) using a standard four-step procedure. Let \(c\) be the hypothesised value (usually \(c = 0\), “the variable does not matter”).
Step 1. State \(H_0\) and \(H_1\). There are three useful forms:
The choice between one- and two-sided is driven by economic theory, not by the data: if theory tells us a priori that \(\beta_j\) cannot be negative, a one-sided test is appropriate.
Step 2. Choose a significance level \(\alpha\) and read off the critical value. Standard choices are \(\alpha = 0.10\), \(0.05\), \(0.01\). The critical value is
Step 3. Compute the \(t\)-statistic.
\[ t \;=\; \frac{\hat{\beta}_j - c}{\operatorname{se}(\hat{\beta}_j)}. \]
For the default \(c = 0\) this reduces to \(t = \hat{\beta}_j / \operatorname{se}(\hat{\beta}_j)\), which is exactly the number R reports in the third column of summary(lm(...)).
Step 4. Compare and conclude.
If \(H_0\) is rejected we say the coefficient is statistically significant at the chosen level. If we fail to reject we say the data are consistent with \(H_0\) — we never “accept” \(H_0\), because the test was designed to detect departures from it, not to confirm it.
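For reference, the alternatives, critical values and rejection rules that go with Steps 1, 2 and 4 can be collected in one table. This is the standard textbook formulation, assumed here to be the one intended, with \(t\) the statistic from Step 3:

```latex
\begin{array}{llll}
\text{Alternative} & H_1 & \text{Critical value} & \text{Reject } H_0 \text{ if} \\ \hline
\text{two-sided}   & \beta_j \neq c & t_{\alpha/2,\,n-k-1} & |t| > t_{\alpha/2,\,n-k-1} \\
\text{right-sided} & \beta_j > c    & t_{\alpha,\,n-k-1}   & t > t_{\alpha,\,n-k-1} \\
\text{left-sided}  & \beta_j < c    & t_{\alpha,\,n-k-1}   & t < -t_{\alpha,\,n-k-1}
\end{array}
```

The two-sided test splits \(\alpha\) across both tails, which is why its critical value uses \(\alpha/2\); a one-sided test puts all of \(\alpha\) in the tail named by \(H_1\).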
A coefficient can be statistically significant (small \(p\)-value) but economically tiny, and a coefficient can be economically large but statistically insignificant in a small sample. Always report both: the magnitude of \(\hat{\beta}_j\) in the units of the problem, and the precision with which it is estimated. A 1% return to one extra year of education and a 10% return are both “significantly different from zero” in a large sample, but only one of them is policy-relevant.
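The first half of this warning is easy to reproduce on artificial data. In the sketch below (simulated data, with an arbitrary seed and sample size chosen purely for illustration) the true slope is only 0.02, tiny relative to a noise standard deviation of 1, yet the very large sample makes it highly significant.

```r
set.seed(2024)                     # arbitrary seed
n <- 200000                        # very large sample (illustrative)
x <- rnorm(n)
y <- 0.02 * x + rnorm(n)           # true slope 0.02, noise sd 1

fit  <- summary(lm(y ~ x))$coefficients
b    <- fit["x", "Estimate"]       # estimated slope: small in magnitude
pval <- fit["x", "Pr(>|t|)"]       # two-sided p-value: very small
c(estimate = b, p_value = pval)
```

A small \(p\)-value here reflects the sample size, not the importance of the effect, which is why the magnitude has to be reported alongside it.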
The four-step procedure forces us to fix \(\alpha\) in advance. A more informative alternative is to report the \(p\)-value.
The \(p\)-value is the probability, computed under \(H_0\), of observing a test statistic at least as extreme as the one we got. Equivalently, it is the smallest significance level \(\alpha\) at which we would reject \(H_0\).
For a two-sided test of \(H_0:\beta_j = 0\) with computed statistic \(t\),
\[ p \;=\; 2 \cdot \Pr\!\left(T_{n-k-1} > |t|\right) \;=\; 2\,\bigl[1 - F_t(|t|;\,n-k-1)\bigr], \]
where \(F_t\) is the CDF of the \(t_{n-k-1}\) distribution. For a one-sided test the \(p\)-value is exactly half of this (when the sign of \(\hat{\beta}_j\) agrees with the alternative).
Conventional rules of thumb (Wooldridge 2020):
R reports two-sided \(p\)-values by default in the Pr(>|t|) column of summary(lm(...)), together with significance stars: *** for \(p < 0.001\), ** for \(p < 0.01\), * for \(p < 0.05\), . for \(p < 0.10\).
Statistical significance tells us that \(\hat{\beta}_j\) is unlikely to have arisen by sampling noise if the truth were zero. It says nothing about whether \(x_j\) causes \(y\). Causal interpretation still requires the population assumption \(\mathbb{E}[u \mid \mathbf{X}] = 0\) (MLR.4) — no \(p\)-value, however small, can rescue a regression contaminated by omitted-variable bias or by reverse causality.
Single-coefficient \(t\)-tests are the right tool when we have one parameter in mind. Often, though, the question is whether several coefficients are jointly zero:
“Are educ and IQ jointly irrelevant for wages once we control for hours, experience and age?”
“… exper add anything to the model?”
These are joint hypotheses, and they require a joint test. Doing \(q\) separate \(t\)-tests does not answer the joint question, because the size of the combined procedure is no longer \(\alpha\), and because two coefficients can be jointly informative even when each is individually borderline.
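The size problem can be made concrete with a small Monte Carlo. The sketch below uses artificial data in which the null is true by construction (neither regressor affects y), and compares the rule "reject if either individual t-test rejects at 5%" with a single joint F-test at 5%. Sample size, number of replications and seed are illustrative assumptions.

```r
set.seed(42)                       # arbitrary seed
n_reps <- 2000; n <- 50; alpha <- 0.05
reject_either <- logical(n_reps)   # "at least one t-test rejects"
reject_joint  <- logical(n_reps)   # "the joint F-test rejects"

for (r in seq_len(n_reps)) {
  x1 <- rnorm(n); x2 <- rnorm(n)
  y  <- 1 + rnorm(n)               # H0 true: both slopes are zero
  s  <- summary(lm(y ~ x1 + x2))
  p_t <- s$coefficients[c("x1", "x2"), "Pr(>|t|)"]
  f   <- s$fstatistic              # named vector: value, numdf, dendf
  p_F <- pf(f["value"], f["numdf"], f["dendf"], lower.tail = FALSE)
  reject_either[r] <- any(p_t < alpha)
  reject_joint[r]  <- p_F < alpha
}
c(either_t_test = mean(reject_either), joint_F_test = mean(reject_joint))
```

With both slopes truly zero, the F-test should reject in roughly 5% of samples, while the "either t-test" rule rejects noticeably more often. Because the only regressors here are x1 and x2, the joint test coincides with the overall F-statistic that summary() reports.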
Let the unrestricted model be the regression with \(k\) slopes,
\[ y = \beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k + u. \]
Suppose we want to test that the last \(q\) slopes are zero,
\[ H_0:\;\beta_{k-q+1} = \beta_{k-q+2} = \cdots = \beta_k = 0, \]
against the alternative that at least one of them is non-zero. The restricted model imposes \(H_0\) by dropping those \(q\) regressors:
\[ y = \beta_0 + \beta_1 x_1 + \cdots + \beta_{k-q} x_{k-q} + u. \]
Let \(\mathrm{SSR}_U\) and \(\mathrm{SSR}_R\) denote the sum of squared residuals of the unrestricted and restricted models. Because dropping regressors can never reduce the SSR, we always have \(\mathrm{SSR}_R \geq \mathrm{SSR}_U\); the question is whether the gap is big enough to be inconsistent with \(H_0\).
Under \(H_0\) and MLR.1–MLR.6,
\[ F \;=\; \frac{(\mathrm{SSR}_R - \mathrm{SSR}_U)/q}{\mathrm{SSR}_U/(n-k-1)} \;\sim\; F_{q,\,n-k-1}. \]
Reject \(H_0\) at level \(\alpha\) if \(F > F_{q,\,n-k-1,\,\alpha}\), the upper-\(\alpha\) critical value of the \(F\) distribution with \(q\) and \(n-k-1\) degrees of freedom.
An equivalent formulation uses the \(R^2\) of each model:
\[ F \;=\; \frac{(R^2_U - R^2_R)/q}{(1 - R^2_U)/(n-k-1)}. \]
The two formulas are algebraically identical whenever \(y\) is the same in both regressions; they differ only when the restricted model has a different dependent variable (e.g. an \(F\)-test of \(\log y\) versus \(y\), which the formula above does not cover).
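A numerical check of this equivalence, as a sketch on simulated data rather than wage2: the variable names, coefficients and seed below are illustrative assumptions, and both models share the same dependent variable y.

```r
set.seed(1)                         # arbitrary seed
n  <- 200
x1 <- rnorm(n); x2 <- rnorm(n); x3 <- rnorm(n)
y  <- 1 + 0.5 * x1 + 0.3 * x2 + rnorm(n)

m_u  <- lm(y ~ x1 + x2 + x3)        # unrestricted model, k = 3
m_r  <- lm(y ~ x1)                  # restricted model: drops x2 and x3
q    <- 2                           # number of restrictions
df_u <- df.residual(m_u)            # n - k - 1

# SSR form
SSR_u <- sum(resid(m_u)^2)
SSR_r <- sum(resid(m_r)^2)
F_ssr <- ((SSR_r - SSR_u) / q) / (SSR_u / df_u)

# R-squared form
R2_u <- summary(m_u)$r.squared
R2_r <- summary(m_r)$r.squared
F_r2 <- ((R2_u - R2_r) / q) / ((1 - R2_u) / df_u)

c(F_ssr = F_ssr, F_r2 = F_r2)
```

The two entries should agree to floating-point precision, because both models have the same total sum of squares when y is unchanged.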
A special case is the test that every slope is zero,
\[ H_0:\;\beta_1 = \beta_2 = \cdots = \beta_k = 0. \]
Here the restricted model is the regression on a constant alone, and \(R^2_R = 0\). The \(F\)-statistic collapses to
\[ F \;=\; \frac{R^2/k}{(1-R^2)/(n-k-1)}, \]
which is exactly the number R prints at the bottom line of summary(lm(...)) under “F-statistic”, together with its \(p\)-value.
For a single restriction (\(q = 1\)), the \(F\)-statistic equals the square of the corresponding \(t\)-statistic:
\[ F \;=\; t^2, \qquad F_{1,\,n-k-1} \;=\; \bigl(t_{n-k-1}\bigr)^2. \]
The \(t\)- and \(F\)-tests deliver identical conclusions in this case; the \(F\) machinery is only strictly necessary when \(q \geq 2\).
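The same identity holds for critical values, which is why the two tests always agree when \(q = 1\). A minimal check with base R's qf() and qt(), using an illustrative residual degrees of freedom of 929:

```r
df_resid <- 929                                 # illustrative residual df
F_crit   <- qf(0.95, df1 = 1, df2 = df_resid)   # upper 5% point of F(1, df)
t_crit   <- qt(0.975, df = df_resid)            # two-sided 5% point of t(df)
c(F_crit = F_crit, t_crit_squared = t_crit^2)
```

Rejecting when \(F > F_{1,\,n-k-1,\,\alpha}\) is therefore the same event as rejecting when \(|t| > t_{\alpha/2,\,n-k-1}\). Note that the equivalence is with the two-sided \(t\)-test; the \(F\)-test cannot express a one-sided alternative.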
We work with wage2 from the wooldridge package: a cross-section of \(n = 935\) U.S. men in the 1980 National Longitudinal Survey, with information on monthly earnings (wage), weekly hours (hours), an IQ test score (IQ), years of schooling (educ), years of work experience (exper), tenure with the current employer (tenure) and age. The goal is to translate the four-step inference machinery into R commands and then check that the built-in shortcuts give the same answers.
library(wooldridge)
data("wage2")
summary()
Start with a simple regression of monthly wage on weekly hours:
model1 <- lm(wage ~ hours, data = wage2)
summary(model1)
Call:
lm(formula = wage ~ hours, data = wage2)
Residuals:
    Min      1Q  Median      3Q     Max
-839.72 -287.21  -52.38  200.46 2131.26
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  981.315     81.575   12.03   <2e-16 ***
hours         -0.532      1.832   -0.29    0.772
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 404.6 on 933 degrees of freedom
Multiple R-squared: 9.033e-05, Adjusted R-squared: -0.0009814
F-statistic: 0.08429 on 1 and 933 DF, p-value: 0.7716
The summary() block reports, for each coefficient, the estimate \(\hat{\beta}_j\), its standard error \(\operatorname{se}(\hat{\beta}_j)\), the \(t\)-statistic \(\hat{\beta}_j/\operatorname{se}(\hat{\beta}_j)\), and the two-sided \(p\)-value. At the bottom we see the overall \(F\)-statistic and its \(p\)-value, which test \(H_0: \beta_{\text{hours}} = 0\) in this single-regressor case.
Now move to a richer specification:
model2 <- lm(wage ~ hours + educ + exper + IQ + age, data = wage2)
summary(model2)
Call:
lm(formula = wage ~ hours + educ + exper + IQ + age, data = wage2)
Residuals:
    Min      1Q  Median      3Q     Max
-883.49 -238.11  -47.36  190.09 2144.24
Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept) -778.4403   171.2506  -4.546 6.20e-06 ***
hours         -2.5406     1.6811  -1.511  0.13104
educ          52.4505     7.2981   7.187 1.36e-12 ***
exper         10.9390     3.7196   2.941  0.00335 **
IQ             5.2917     0.9383   5.640 2.26e-08 ***
age           14.4836     4.6657   3.104  0.00197 **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 368.9 on 929 degrees of freedom
Multiple R-squared: 0.1722, Adjusted R-squared: 0.1678
F-statistic: 38.66 on 5 and 929 DF, p-value: < 2.2e-16
Read the table line by line: educ and IQ both have large \(t\)-statistics and tiny \(p\)-values, so each is individually significant at the 1% level even after controlling for the others. exper and age are also positive and significant at the 1% level (\(p \approx 0.003\) and \(p \approx 0.002\)). hours has a negative coefficient, but with \(t = -1.51\) and \(p = 0.131\) it is not statistically distinguishable from zero at conventional levels. The negative sign may hint that hours are correlated with occupational mix, but the estimate is too imprecise to read much into it.
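The overall \(F\)-statistic on the last line of the model2 output can be recovered by hand from the reported \(R^2\), using the overall-significance formula with \(k = 5\) slopes and \(n - k - 1 = 929\) residual degrees of freedom. The inputs below are the rounded values printed by summary(), so the result agrees with R's 38.66 only up to rounding:

```latex
F \;=\; \frac{R^2/k}{(1-R^2)/(n-k-1)}
  \;=\; \frac{0.1722/5}{(1-0.1722)/929}
  \;\approx\; \frac{0.03444}{0.000891}
  \;\approx\; 38.65
```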
qt()
Pull out the coefficients and standard errors programmatically, then build a 95% CI by hand.
beta_hat <- coef(model2)
se_hat <- summary(model2)$coefficients[, "Std. Error"]
n <- nobs(model2)
k <- length(beta_hat) - 1 # number of slope coefficients
df <- n - k - 1
t_crit <- qt(0.975, df)
t_crit                      # critical value t_{0.025, df}
[1] 1.962521
The critical value is close to \(1.96\) because the degrees of freedom are large, but not identical. The manual 95% CI for \(\beta_{\text{IQ}}\) is
IQ_LB <- beta_hat["IQ"] - t_crit * se_hat["IQ"]
IQ_UB <- beta_hat["IQ"] + t_crit * se_hat["IQ"]
c(lower = IQ_LB, upper = IQ_UB)
lower.IQ upper.IQ
3.450211 7.133145
and for \(\beta_{\text{hours}}\):
hours_LB <- beta_hat["hours"] - t_crit * se_hat["hours"]
hours_UB <- beta_hat["hours"] + t_crit * se_hat["hours"]
c(lower = hours_LB, upper = hours_UB)
lower.hours upper.hours
 -5.8397528   0.7584696
The built-in shortcut delivers all the intervals in one call:
confint(model2, level = 0.95)
                   2.5 %       97.5 %
(Intercept) -1114.523218 -442.3574662
hours          -5.839753    0.7584696
educ           38.127749   66.7731627
exper           3.639234   18.2387119
IQ              3.450211    7.1331452
age             5.327113   23.6400802
The numbers in the IQ and hours rows match what we computed by hand. Notice that the 95% intervals for educ, exper, IQ and age all exclude zero (consistent with their small \(p\)-values in the summary()), while the interval for hours contains zero (consistent with its \(p\)-value of 0.131).
Suppose we want to test \(H_0: \beta_{\text{educ}} = 0\) against the two-sided alternative at the 5% level. Step by step:
b_educ <- coef(model2)["educ"]
se_educ <- summary(model2)$coefficients["educ", "Std. Error"]
t_stat <- b_educ / se_educ
p_value <- 2 * pt(-abs(t_stat), df) # two-sided p-value
c(t = t_stat, p = p_value)
      t.educ       p.educ
7.186848e+00 1.361083e-12
summary(model2)$coefficients["educ", c("t value", "Pr(>|t|)")]
     t value     Pr(>|t|)
7.186848e+00 1.361083e-12
The bottom two lines are identical (up to rounding): the manual computation reproduces exactly what R prints. The \(t\)-statistic is far above the 5% critical value of roughly \(1.96\), so we reject \(H_0\). Education has a statistically significant partial effect on monthly earnings even after controlling for hours, experience, IQ and age.
Now the headline question of the chapter: are educ and IQ jointly relevant once hours, exper and age are in the model? Formally,
\[ H_0:\;\beta_{\text{educ}} = \beta_{\text{IQ}} = 0 \quad\text{vs.}\quad H_1:\;\text{at least one of them }\neq 0, \]
so \(q = 2\).
The unrestricted model is model2 above. The restricted model drops educ and IQ:
model3 <- lm(wage ~ hours + exper + age, data = wage2)
summary(model3)
Call:
lm(formula = wage ~ hours + exper + age, data = wage2)
Residuals:
    Min      1Q  Median      3Q     Max
-749.69 -279.16  -48.16  203.20 2208.66
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  224.411    162.071   1.385  0.16649
hours         -1.175      1.812  -0.649  0.51665
exper         -9.430      3.443  -2.739  0.00628 **
age           27.032      4.838   5.587 3.03e-08 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 398.4 on 931 degrees of freedom
Multiple R-squared: 0.03253, Adjusted R-squared: 0.02941
F-statistic: 10.44 on 3 and 931 DF, p-value: 9.332e-07
Compute the SSRs by hand:
SSR_u <- sum(residuals(model2)^2)
SSR_r <- sum(residuals(model3)^2)
c(SSR_unrestricted = SSR_u, SSR_restricted = SSR_r)
SSR_unrestricted   SSR_restricted
       126414757        147747973
Now the \(F\)-statistic:
q <- 2 # restrictions
df_u <- n - k - 1 # df of the unrestricted model
F_stat <- ((SSR_r - SSR_u) / q) / (SSR_u / df_u)
F_crit <- qf(0.95, df1 = q, df2 = df_u)
p_val <- 1 - pf(F_stat, df1 = q, df2 = df_u)
c(F = F_stat, F_crit_5pct = F_crit, p = p_val)
          F F_crit_5pct           p
  78.387043    3.005413    0.000000
The \(F\)-statistic is far above the 5% critical value and the \(p\)-value is essentially zero, so we strongly reject \(H_0\): educ and IQ are jointly significant determinants of wage.
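The printed \(p\)-value of 0.000000 is best read as "too small to survive the subtraction" rather than as exactly zero: when the lower-tail probability is that close to one, 1 - pf(...) loses the upper tail to floating-point rounding. The sketch below redoes the arithmetic from the two SSRs printed above (typed in by hand, so the result matches only up to their rounding) and asks pf() for the upper tail directly via its lower.tail argument.

```r
SSR_u <- 126414757   # unrestricted SSR, copied from the output above
SSR_r <- 147747973   # restricted SSR, copied from the output above
q     <- 2           # number of restrictions
df_u  <- 929         # residual df of the unrestricted model

F_stat  <- ((SSR_r - SSR_u) / q) / (SSR_u / df_u)
p_naive <- 1 - pf(F_stat, q, df_u)                   # subtraction from ~1
p_upper <- pf(F_stat, q, df_u, lower.tail = FALSE)   # upper tail computed directly
c(F = F_stat, p_naive = p_naive, p_upper = p_upper)
```

The upper-tail version returns a tiny but strictly positive number, in line with the "< 2.2e-16" that anova() reports for the same test.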
R provides the same test in one line via anova(), which compares two nested models:
anova(model3, model2)
Analysis of Variance Table
Model 1: wage ~ hours + exper + age
Model 2: wage ~ hours + educ + exper + IQ + age
  Res.Df       RSS Df Sum of Sq      F    Pr(>F)
1    931 147747973
2    929 126414757  2  21333216 78.387 < 2.2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
The F and Pr(>F) columns reproduce the manual computation. Reporting the \(F\)-test from anova() is the recommended workflow; the manual derivation matters because it lets you see exactly what is being compared.
The identity at the end of §4.7 is easy to verify numerically. Test \(H_0: \beta_{\text{educ}} = 0\) both ways:
t_educ <- summary(model2)$coefficients["educ", "t value"]
mR <- lm(wage ~ hours + exper + IQ + age, data = wage2) # drop educ only
ft <- anova(mR, model2)
F_single <- ft$F[2]
c(t_squared = t_educ^2, F = F_single)
t_squared         F
 51.65078  51.65078
The two numbers agree, as the algebra predicts.
Suppose theory tells us that an extra year of experience cannot lower monthly earnings, so the relevant alternative is right-sided: \(H_0: \beta_{\text{exper}} \leq 0\) vs \(H_1: \beta_{\text{exper}} > 0\). The same \(t\)-statistic feeds a different \(p\)-value:
co <- summary(model2)$coefficients["exper", ]
t_exper <- co["t value"]
p_one_sided <- 1 - pt(t_exper, df) # right-tail
p_two_sided <- 2 * pt(-abs(t_exper), df) # default in summary()
c(t = t_exper, p_one_sided = p_one_sided, p_two_sided = p_two_sided)
          t.t value p_one_sided.t value p_two_sided.t value
        2.940922029         0.001676777         0.003353554
When \(t > 0\) and the alternative is on the right, the one-sided \(p\)-value is exactly half the two-sided one. A coefficient that is borderline under the default two-sided test can become clearly significant once we are willing to commit to a sign a priori — but that commitment must come from economics, not from peeking at the data first.
Six short multiple-choice questions. Try each one before opening the answer.
Adding MLR.6 (normality of \(u\)) to the Gauss–Markov assumptions allows us to:
Answer: C. Unbiasedness (A) follows from MLR.1–MLR.4; BLUE (B) from MLR.1–MLR.5; the estimator \(\hat{\sigma}^2\) (D) is defined regardless of MLR.6. Only MLR.6 gives us exact normality of \(\hat{\beta}_j\) and therefore exact \(t\) and \(F\) inference in finite samples.
Under MLR.1–MLR.6 we use \(t_{n-k-1}\) rather than \(\mathcal{N}(0,1)\) for inference on a single coefficient because:
Answer: B. Replacing the unknown \(\sigma\) in the standard error with \(\hat{\sigma}\) introduces extra variability that the \(t\)-distribution accounts for. As \(n - k - 1 \to \infty\), the \(t\) converges to the standard normal.
A 95% confidence interval for \(\beta_j\) that excludes zero implies:
Answer: A. A CI and a two-sided test at the matching level are algebraically equivalent: the CI excludes the null value if and only if the test rejects. The CI says nothing about magnitude (C) or about OLS bias (D), and a single sample cannot certify (B).
A coefficient has a two-sided \(p\)-value of \(0.03\). Which statement is correct?
Answer: A. Recall \(p\) is the smallest \(\alpha\) at which we reject; \(0.01 < 0.03 < 0.05\) places significance between the 1% and 5% levels. Statistical significance carries no information about economic magnitude.
To test \(H_0:\beta_1 = \beta_2 = 0\) jointly in a regression with \(k\) slopes and sample size \(n\), we use:
Answer: B. Separate \(t\)-tests do not control the size of the joint procedure and miss the case in which two regressors are individually weak but jointly informative.
A regressor \(x_j\) has a coefficient with \(p < 0.001\). Which of the following is true?
Answer: D. Section 4.6 is explicit on this point: inference is about ruling out sampling noise, not about ruling out confounding.
Exercise 4.1 ★ — Reading a summary() output. Using the wage2 dataset, estimate the model
\[ \mathrm{wage} = \beta_0 + \beta_1\,\mathrm{educ} + \beta_2\,\mathrm{exper} + \beta_3\,\mathrm{tenure} + u \]
with lm() and inspect summary(). (a) Which coefficients are significant at the 5% level? (b) Report the magnitude and the standard error of \(\hat{\beta}_1\); what is the economic interpretation? (c) State the null hypothesis that the overall \(F\)-statistic at the bottom of summary() tests.
(a) educ, exper and tenure are individually significant at the 5% level. (b) \(\hat{\beta}_1\) is roughly 60 dollars of monthly earnings per extra year of schooling, with a standard error of about 6, an effect that is both statistically and economically meaningful. (c) The overall \(F\)-statistic tests \(H_0:\beta_1 = \beta_2 = \beta_3 = 0\) (none of the regressors matters) against the alternative that at least one slope is non-zero.
Exercise 4.2 ★ — Manual 95% CI. For the same model, build a 95% confidence interval for \(\beta_{\mathrm{educ}}\) from scratch, using coef(), summary()$coefficients[, "Std. Error"] and qt(). Verify your interval against confint(m)["educ", ]. Does the interval contain zero? What conclusion follows for a two-sided \(t\)-test of \(H_0:\beta_{\mathrm{educ}} = 0\) at the 5% level?
m <- lm(wage ~ educ + exper + tenure, data = wage2)
b <- coef(m)["educ"]
se <- summary(m)$coefficients["educ", "Std. Error"]
df <- nobs(m) - length(coef(m)) # n - k - 1
tc <- qt(0.975, df)
ci_manual <- c(lower = b - tc * se, upper = b + tc * se)
ci_manual                                  # manual interval
confint(m)["educ", ]                       # should match ci_manual
The interval does not contain zero, so the two-sided \(t\)-test of \(H_0:\beta_{\mathrm{educ}} = 0\) at the 5% level rejects, consistent with the (tiny) Pr(>|t|) value in summary().
Exercise 4.3 ★ — Manual \(F\)-test. In the model \(\mathrm{wage} = \beta_0 + \beta_1\,\mathrm{educ} + \beta_2\,\mathrm{exper} + \beta_3\,\mathrm{tenure} + u\) on wage2, test \(H_0:\beta_2 = \beta_3 = 0\) by comparing the SSRs of the unrestricted and the restricted (drop exper and tenure) models. Compute the critical value with qf() at the 5% level and the \(p\)-value with pf(). Verify against anova().
mU <- lm(wage ~ educ + exper + tenure, data = wage2)
mR <- lm(wage ~ educ, data = wage2)
SSR_u <- sum(resid(mU)^2)
SSR_r <- sum(resid(mR)^2)
n <- nobs(mU); k <- length(coef(mU)) - 1; q <- 2
df_u <- n - k - 1
F_stat <- ((SSR_r - SSR_u) / q) / (SSR_u / df_u)
F_crit <- qf(0.95, q, df_u)
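p_val <- pf(F_stat, q, df_u, lower.tail = FALSE)   # upper tail directly; 1 - pf() can round to 0
c(F = F_stat, F_crit_5pct = F_crit, p = p_val)
anova(mR, mU)                                      # should reproduce F_stat and p_val
The \(F\)-statistic is well above \(F_{2,\,n-k-1,\,0.05}\) and the \(p\)-value is essentially zero. We reject \(H_0\): experience and tenure are jointly significant given education.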
Exercise 4.4 ★★ — One-sided test from theory. Economic theory suggests that an extra year of tenure cannot decrease monthly earnings, so the relevant alternative is right-sided: \(H_0:\beta_{\mathrm{tenure}} \leq 0\) vs \(H_1:\beta_{\mathrm{tenure}} > 0\). Using the regression of wage on educ + exper + tenure, compute the one-sided \(p\)-value from summary() output and decide at the 5% level. How does it compare with the default two-sided \(p\)-value that R prints?
A full answer is given in the Instructor Edition.
Exercise 4.5 ★★ — Joint vs individual significance. In the model of Exercise 4.1, add IQ and age as regressors. (a) Are IQ and age individually significant at the 5% level? (b) Are they jointly significant at the 5% level (use anova())? (c) Construct an example, or explain in words, in which two regressors are individually insignificant yet jointly significant. What feature of the data drives this gap?
A full answer is given in the Instructor Edition.
Exercise 4.6 ★★★ — \(F\) from \(R^2\). Show, starting from the SSR-based formula, that whenever the unrestricted and restricted models share the same dependent variable, the \(F\)-statistic can be written as
\[ F \;=\; \frac{(R^2_U - R^2_R)/q}{(1 - R^2_U)/(n-k-1)}. \]
Then verify the identity numerically for the \(F\)-test of \(H_0:\beta_{\mathrm{exper}} = \beta_{\mathrm{tenure}} = 0\) in Exercise 4.3, by pulling the two \(R^2\) values out of summary() and plugging them in. Why does the formula fail if the dependent variable in the restricted model is \(\log(\mathrm{wage})\) rather than \(\mathrm{wage}\)?
A full answer is given in the Instructor Edition.