Why Do We Need Interaction Terms?
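Level: Basic concepts
Consider a few examples:
• Drug effectiveness: the same drug works differently depending on the patient's age
• Value of education: the wage effect of education may differ by gender
• Advertising effectiveness: the effect of TV advertising varies with income level
• Class size: the effect of small classes may depend on the share of English learners
Ignoring such dependencies is like measuring "the effect of exercise" without accounting for age or health. A workout that is good for someone in their 20s can be dangerous for someone in their 70s!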
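In a basic linear regression:
$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + u$
In this model, the effect of $X_1$ ($\beta_1$) is constant regardless of the value of $X_2$.
Adding an interaction term changes this:
$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 (X_1 \times X_2) + u$
Now the effect of $X_1$ depends on the value of $X_2$:
$\frac{\partial Y}{\partial X_1} = \beta_1 + \beta_3 X_2$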
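Consider a model that predicts happiness:
- If it rains: happiness falls (you get wet)
- If you have an umbrella: happiness rises (you are prepared)
- But the effect of the umbrella depends on the weather:
  - Sunny day: the umbrella is just extra baggage (negative effect)
  - Rainy day: the umbrella is a lifesaver (strongly positive effect)
This is exactly what an interaction effect is!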
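The umbrella story can be written as a tiny regression-style function. This is only a sketch: the four coefficients are hypothetical numbers chosen to reproduce the signs described above, not estimates from any data.

```python
# Hypothetical coefficients, chosen only to illustrate the umbrella story.
B0 = 7.0             # baseline happiness: sunny day, no umbrella
B_RAIN = -3.0        # rain lowers happiness
B_UMBRELLA = -0.5    # carrying an umbrella on a sunny day is a nuisance
B_INTERACTION = 3.0  # extra value of the umbrella when it rains


def happiness(rain: int, umbrella: int) -> float:
    """Predicted happiness for binary rain and umbrella indicators."""
    return (B0 + B_RAIN * rain + B_UMBRELLA * umbrella
            + B_INTERACTION * rain * umbrella)


def umbrella_effect(rain: int) -> float:
    """Effect of having an umbrella, holding the weather fixed."""
    return happiness(rain, 1) - happiness(rain, 0)


for rain in (0, 1):
    weather = "rainy" if rain else "sunny"
    print(f"{weather} day: umbrella effect = {umbrella_effect(rain):+.1f}")
```

The umbrella effect equals `B_UMBRELLA` on a sunny day and `B_UMBRELLA + B_INTERACTION` on a rainy day, so the interaction coefficient is exactly the amount by which the weather changes the umbrella's effect.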
Three Types of Interaction Terms
Binary × Binary Interactions
Interaction between two dummy variables
Example: the wage effect of Gender × Marriage
Binary × Continuous Interactions
Interaction between a dummy variable and a continuous variable
Example: High English Learners × Student-Teacher Ratio
Continuous × Continuous Interactions
Interaction between two continuous variables
Example: the wage effect of Education × Experience
Binary × Binary Interactions
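Level: Intermediate
Model specification:
$Y = \beta_0 + \beta_1 D_1 + \beta_2 D_2 + \beta_3 (D_1 \times D_2) + u$
Here $D_1$ and $D_2$ are binary (0 or 1) variables, which gives four possible groups:
$(D_1=0, D_2=0)$, $(D_1=0, D_2=1)$, $(D_1=1, D_2=0)$, $(D_1=1, D_2=1)$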
Let's define:
- $HiSTR = 1$ if STR ≥ 20 (large class), 0 otherwise
- $HiEL = 1$ if PctEL ≥ 10% (many English learners), 0 otherwise
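Regression result (coefficients as used in the group table that follows):
$\widehat{TestScore} = 664.1 - 18.2 \, HiEL - 1.9 \, HiSTR - 3.5 \, (HiSTR \times HiEL)$
Average scores for the four groups: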
| | Low STR (small class) | High STR (large class) | Difference |
|---|---|---|---|
| Low EL | 664.1 (Base) | 664.1 - 1.9 = 662.2 | -1.9 |
| High EL | 664.1 - 18.2 = 645.9 | 664.1 - 18.2 - 1.9 - 3.5 = 640.5 | -5.4 |
| Difference | -18.2 | -21.7 | -3.5 (interaction) |
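• Low EL schools: effect of reducing class size = 1.9 points
• High EL schools: effect of reducing class size = 5.4 points
That is, the benefit of small classes is larger in schools with many English learners!
This matches the intuition that small classes are more effective when many students need individual attention.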
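The four group means can be reproduced directly from the coefficients in the table. This is a minimal sketch; the function and constant names are illustrative only.

```python
# Coefficients taken from the four-group table in the text.
B0 = 664.1       # Low STR, Low EL (base group)
B_HISTR = -1.9   # large class
B_HIEL = -18.2   # many English learners
B_INTER = -3.5   # HiSTR x HiEL interaction


def predicted_score(histr: int, hiel: int) -> float:
    """Predicted test score for a (HiSTR, HiEL) group."""
    return B0 + B_HISTR * histr + B_HIEL * hiel + B_INTER * histr * hiel


def large_class_effect(hiel: int) -> float:
    """Change in score when moving from a small to a large class."""
    return predicted_score(1, hiel) - predicted_score(0, hiel)


for hiel in (0, 1):
    label = "High EL" if hiel else "Low EL"
    print(f"{label}: small={predicted_score(0, hiel):.1f}, "
          f"large={predicted_score(1, hiel):.1f}, "
          f"effect={large_class_effect(hiel):.1f}")
```

The difference between the two printed effects equals the interaction coefficient, which is why the table's bottom-right comparison and $\beta_3$ carry the same information.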
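Model:
$\ln(wage) = \beta_0 + \beta_1 female + \beta_2 married + \beta_3 (female \times married) + \ldots$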
Estimation results:
lwage = 1.342 - 0.131*female + 0.272*married - 0.308*femxmar + ...
(0.053) (0.065) (0.065) (0.082)
Wage differences by group:
| Group | Expected log(wage) | Interpretation |
|---|---|---|
| Single Male (base) | 1.342 | Reference group |
| Single Female | 1.342 - 0.131 = 1.211 | 13.1% lower than single males |
| Married Male | 1.342 + 0.272 = 1.614 | 27.2% higher than single males |
| Married Female | 1.342 - 0.131 + 0.272 - 0.308 = 1.175 | 16.7% lower than single males |
• Marriage premium for males: +27.2%
• Marriage premium for females: +27.2% - 30.8% = -3.6%
• Marriage brings a wage premium for men but a penalty for women!
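The group table and the two marriage premiums follow mechanically from the four estimated coefficients. The sketch below uses only the reported point estimates and ignores the other regressors hidden behind the "...".

```python
# Point estimates from the lwage regression in the text.
B0, B_FEMALE, B_MARRIED, B_FEMXMAR = 1.342, -0.131, 0.272, -0.308


def log_wage(female: int, married: int) -> float:
    """Expected log wage for a gender/marital-status group."""
    return (B0 + B_FEMALE * female + B_MARRIED * married
            + B_FEMXMAR * female * married)


def marriage_premium(female: int) -> float:
    """Log-wage difference, married minus single, within one gender."""
    return log_wage(female, 1) - log_wage(female, 0)


print(f"Marriage premium, males:   {marriage_premium(0):+.3f}")
print(f"Marriage premium, females: {marriage_premium(1):+.3f}")
```

Because the dependent variable is in logs, these differences are approximate proportional changes; the approximation is good for small values such as these.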
For model: $Y = \beta_0 + \beta_1 D_1 + \beta_2 D_2 + \beta_3 (D_1 \times D_2) + u$
Step 1: Write out all four cases:
- $(D_1=0, D_2=0)$: $E(Y) = \beta_0$
- $(D_1=1, D_2=0)$: $E(Y) = \beta_0 + \beta_1$
- $(D_1=0, D_2=1)$: $E(Y) = \beta_0 + \beta_2$
- $(D_1=1, D_2=1)$: $E(Y) = \beta_0 + \beta_1 + \beta_2 + \beta_3$
Step 2: Calculate differences:
- Effect of $D_1$ when $D_2=0$: $\beta_1$
- Effect of $D_1$ when $D_2=1$: $\beta_1 + \beta_3$
- Difference in effects: $\beta_3$ (the interaction coefficient)
Binary × Continuous Interactions
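Level: Intermediate
Model specification:
$Y = \beta_0 + \beta_1 D + \beta_2 X + \beta_3 (D \times X) + u$
This model produces two regression lines:
- When $D=0$: $Y = \beta_0 + \beta_2 X$
- When $D=1$: $Y = (\beta_0 + \beta_1) + (\beta_2 + \beta_3) X$
• $\beta_1$: difference in intercepts
• $\beta_3$: difference in slopes
• $\beta_2 + \beta_3$: slope when $D=1$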
Model: $TestScore = \beta_0 + \beta_1 HiEL + \beta_2 STR + \beta_3 (STR \times HiEL) + u$
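Results (coefficients as used in the two regression lines that follow):
$\widehat{TestScore} = 682.2 + 5.6 \, HiEL - 0.97 \, STR - 1.28 \, (STR \times HiEL)$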
Two separate regression lines:
For Low EL schools (HiEL = 0):
$\widehat{TestScore} = 682.2 - 0.97 \times STR$
For High EL schools (HiEL = 1):
$\widehat{TestScore} = 682.2 + 5.6 + (-0.97 - 1.28) \times STR$
$= 687.8 - 2.25 \times STR$
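• Low EL schools: reducing STR by 1 → TestScore rises by 0.97 points
• High EL schools: reducing STR by 1 → TestScore rises by 2.25 points
The effect of smaller classes is more than twice as large in schools with many English learners!
This suggests that small classes are more effective the more students need individual instruction.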
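The two regression lines can be recovered from the single interacted equation by plugging in HiEL = 0 and HiEL = 1. A minimal sketch using the reported coefficients (names are illustrative):

```python
# Coefficients from the TestScore regression with the STR x HiEL interaction.
B0, B_HIEL, B_STR, B_INTER = 682.2, 5.6, -0.97, -1.28


def regression_line(hiel: int) -> tuple:
    """Return (intercept, slope on STR) for the given HiEL group."""
    intercept = B0 + B_HIEL * hiel
    slope = B_STR + B_INTER * hiel
    return intercept, slope


def predicted_score(student_teacher_ratio: float, hiel: int) -> float:
    intercept, slope = regression_line(hiel)
    return intercept + slope * student_teacher_ratio


for hiel in (0, 1):
    intercept, slope = regression_line(hiel)
    print(f"HiEL={hiel}: TestScore = {intercept:.1f} {slope:+.2f} * STR")
```

Dividing the two slopes gives the "more than twice as large" comparison in the text.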
Hypothesis Testing:
- Same slope? $H_0: \beta_3 = 0$
  $t = -1.28/0.97 = -1.32$ → Fail to reject
- Same intercept? $H_0: \beta_1 = 0$
  $t = 5.6/19.5 = 0.29$ → Fail to reject
- Same lines? $H_0: \beta_1 = \beta_3 = 0$
  $F = 89.94$ (p-value < 0.001) → Reject!
The individual t-tests are not significant, but the joint F-test is.
This happens because the two regressors being tested, HiEL and STR×HiEL, are highly correlated.
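The two t-statistics above are just estimate divided by standard error. The sketch below recomputes them and applies the large-sample two-sided 5% critical value of 1.96; the helper names are illustrative.

```python
def t_statistic(estimate: float, std_error: float) -> float:
    """t-statistic for H0: coefficient = 0."""
    return estimate / std_error


def rejects_at_5_percent(t: float) -> bool:
    """Two-sided test using the large-sample normal critical value 1.96."""
    return abs(t) > 1.96


tests = {
    "same slope (STR x HiEL)": (-1.28, 0.97),
    "same intercept (HiEL)": (5.6, 19.5),
}
for name, (estimate, se) in tests.items():
    t = t_statistic(estimate, se)
    print(f"{name}: t = {t:.2f}, reject = {rejects_at_5_percent(t)}")
```

The joint F-statistic cannot be recomputed this way, because it also depends on the covariance between the two coefficient estimates.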
The three patterns a Binary × Continuous interaction can produce:
(a) Different intercepts, same slope
Model: $Y = \beta_0 + \beta_1 D + \beta_2 X$
Two parallel lines
(b) Different intercepts, different slopes
Model: $Y = \beta_0 + \beta_1 D + \beta_2 X + \beta_3 (D \times X)$
Two lines that differ in both intercept and slope
(c) Same intercept, different slopes
Model: $Y = \beta_0 + \beta_2 X + \beta_3 (D \times X)$
Two lines starting from the same point
Continuous × Continuous Interactions
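Level: Advanced
Model specification:
$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 (X_1 \times X_2) + u$
Now the marginal effects vary with the value of the other variable: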
Effect of $X_1$:
$\frac{\partial Y}{\partial X_1} = \beta_1 + \beta_3 X_2$
Effect of $X_2$:
$\frac{\partial Y}{\partial X_2} = \beta_2 + \beta_3 X_1$
Model: $TestScore = \beta_0 + \beta_1 STR + \beta_2 PctEL + \beta_3 (STR \times PctEL) + u$
Results:
Effect of class size at different PctEL levels:
| PctEL | Effect of 1 unit increase in STR | Interpretation |
|---|---|---|
| 0% | -1.12 | No English learners: moderate negative effect |
| 10% | -1.12 + 0.0012(10) = -1.108 | Slightly smaller negative effect |
| 20% | -1.12 + 0.0012(20) = -1.096 | Even smaller negative effect |
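| 93.3% | -1.12 + 0.0012(93.3) ≈ -1.008 | Still negative: the effect barely changes |
Even when the share of English learners is very high (93.3%), the estimated class-size effect is still about -1.0. It would reach zero only at PctEL = 1.12/0.0012 ≈ 933%, which is impossible.
Values far from the bulk of the data are extrapolations and should be interpreted with caution!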
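The conditional effect of STR is a linear function of PctEL, so it is easy to tabulate. The sketch below uses the two coefficients from the table (-1.12 and 0.0012) and also solves for the PctEL at which the effect would be zero.

```python
# Coefficients on STR and STR x PctEL from the table in the text.
B_STR, B_INTER = -1.12, 0.0012


def str_effect(pct_el: float) -> float:
    """Marginal effect of one more student per teacher at a given PctEL."""
    return B_STR + B_INTER * pct_el


def break_even_pct_el() -> float:
    """PctEL at which the estimated STR effect would equal zero."""
    return -B_STR / B_INTER


for pct_el in (0, 10, 20, 93.3):
    print(f"PctEL = {pct_el:>5}%: effect of STR = {str_effect(pct_el):.3f}")
print(f"Effect reaches zero at PctEL = {break_even_pct_el():.1f}%")
```

Since PctEL is a percentage, any break-even value above 100 lies outside the feasible range.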
Model: $Wage = \beta_0 + \beta_1 Educ + \beta_2 Tenure + \beta_3 (Educ \times Tenure) + ...$
Results:
wage = 0.318 + 0.404*educ - 0.147*tenure + 0.0237*educxten + ...
(0.881) (0.069) (0.083) (0.0074)
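Return to education at different tenure levels:
$\frac{\partial wage}{\partial educ} = 0.404 + 0.0237 \times tenure$
For example, the effect of four additional years of education on the wage:
• New hire (Tenure = 1): $[0.404 + 0.0237(1)] \times 4 = \$1.71$
• Fifth year (Tenure = 5): $[0.404 + 0.0237(5)] \times 4 = \$2.09$
This means education and experience are complementary:
• The more educated a worker is, the more they learn from experience
• The more experience a worker has, the better they put education to use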
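The return to education at any tenure level follows from the coefficients on educ and educxten. A short sketch with the reported point estimates (function names are illustrative):

```python
# Point estimates on educ and educ x tenure from the wage regression.
B_EDUC, B_EDUCXTEN = 0.404, 0.0237


def return_to_education(tenure: float) -> float:
    """Wage change per additional year of education at a given tenure."""
    return B_EDUC + B_EDUCXTEN * tenure


def effect_of_extra_education(extra_years: float, tenure: float) -> float:
    """Wage change from extra_years of education, holding tenure fixed."""
    return return_to_education(tenure) * extra_years


for tenure in (1, 5):
    effect = effect_of_extra_education(4, tenure)
    print(f"Tenure = {tenure}: 4 more years of education -> ${effect:.2f}")
```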
Joint significance tests:
- $H_0$: educ = educxten = 0
  $F = 42.28$, p-value = 0.0000 → Reject
- $H_0$: tenure = educxten = 0
  $F = 14.22$, p-value = 0.0000 → Reject
- $H_0$: educ = tenure = educxten = 0
  $F = 35.57$, p-value = 0.0000 → Reject
For model: $Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 (X_1 \times X_2) + u$
Before: $Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 (X_1 \times X_2)$
After: $Y + \Delta Y = \beta_0 + \beta_1 (X_1 + \Delta X_1) + \beta_2 X_2 + \beta_3 [(X_1 + \Delta X_1) \times X_2]$
Subtract to get:
$\Delta Y = \beta_1 \Delta X_1 + \beta_3 X_2 \Delta X_1$
$\frac{\Delta Y}{\Delta X_1} = \beta_1 + \beta_3 X_2$
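The After-Before result can be checked numerically: compute the prediction before and after changing $X_1$, subtract, and compare with $\beta_1 + \beta_3 X_2$. The coefficient values below are hypothetical and serve only as a check.

```python
def predict(x1: float, x2: float, coefs: tuple) -> float:
    """Y = b0 + b1*X1 + b2*X2 + b3*(X1*X2), without the error term."""
    b0, b1, b2, b3 = coefs
    return b0 + b1 * x1 + b2 * x2 + b3 * x1 * x2


def after_before_effect(x1: float, x2: float, dx1: float, coefs: tuple) -> float:
    """(After - Before) / dX1, holding X2 fixed."""
    before = predict(x1, x2, coefs)
    after = predict(x1 + dx1, x2, coefs)
    return (after - before) / dx1


coefs = (1.0, 0.5, -0.2, 0.1)  # hypothetical (b0, b1, b2, b3)
x2 = 2.0
numeric = after_before_effect(x1=3.0, x2=x2, dx1=1.0, coefs=coefs)
formula = coefs[1] + coefs[3] * x2
print(f"After-Before: {numeric:.4f}, formula b1 + b3*X2: {formula:.4f}")
```

Because the model is linear in $X_1$ for a fixed $X_2$, the result does not depend on the starting value of $X_1$ or on the size of the change.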
Multicollinearity in Interaction Models
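Level: Advanced
Interaction terms are built from their component variables, so they tend to be correlated with them. For example:
• STR and STR×HiEL are correlated
• Female and Female×Married are correlated
• Educ and Educ×Tenure are correlated
This is like including both "height" and "height × weight" in the same regression.
For a tall person, "height × weight" is naturally likely to be large too!
Symptoms:
- Standard errors of individual coefficients increase sharply
- Individual t-tests are insignificant, but the joint F-test is significant
- Coefficient estimates are unstable (sensitive to small changes in the data)
Example: STR and PctEL interaction
• STR coefficient t-stat = -1.90 (insignificant at the 5% level)
• STR×PctEL coefficient t-stat = 0.06 (insignificant)
• But joint F-test = 3.89 (p = 0.021) → significant!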
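Solutions:
- Centering: use variables measured as deviations from their means
  $(X - \bar{X}) \times (Z - \bar{Z})$ instead of $X \times Z$
- Use joint tests: prefer F-tests to individual t-tests
- Focus on economic meaning: the size and direction of the effect, not statistical significance alone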
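To see why centering helps, the sketch below compares the correlation between a variable and its interaction term before and after centering. The data are synthetic (independent uniform draws with a fixed seed), chosen only for illustration; they are not the school data used in the text.

```python
import random
from math import sqrt


def pearson(a: list, b: list) -> float:
    """Sample Pearson correlation between two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


def interaction_correlations(x: list, z: list) -> tuple:
    """corr(Z, X*Z) with raw variables and with mean-centered variables."""
    raw = pearson(z, [xi * zi for xi, zi in zip(x, z)])
    mean_x, mean_z = sum(x) / len(x), sum(z) / len(z)
    xc = [xi - mean_x for xi in x]
    zc = [zi - mean_z for zi in z]
    centered = pearson(zc, [a * b for a, b in zip(xc, zc)])
    return raw, centered


rng = random.Random(0)  # fixed seed: synthetic, reproducible data
x = [rng.uniform(14, 26) for _ in range(500)]  # hypothetical STR-like values
z = [rng.uniform(0, 80) for _ in range(500)]   # hypothetical PctEL-like values

raw, centered = interaction_correlations(x, z)
print(f"corr(Z, X*Z) raw:      {raw:.3f}")
print(f"corr(Z, X*Z) centered: {centered:.3f}")
```

Centering changes how the individual coefficients are interpreted (each becomes the effect at the mean of the other variable), but it does not change the fitted values or the interaction coefficient.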
A regression that includes Female and Female×Experience:
wage = ... + 0.590*educ + 0.057*exper - 0.878*female - 0.056*femxexp
(0.064) (0.016) (0.352) (0.018)
Now let's redefine the variable:
new = femxexp - female
wage = ... + 0.590*educ + 0.057*exper - 0.439*female - 0.056*new
(0.064) (0.016) (0.050) (0.018)
• Standard error on Female: 0.352 → 0.050 (a large drop!)
• The t-statistic improves, so the coefficient looks more significant
• But the model's predictions are identical (a different representation of the same model)
Real-World Applications
Level: Applied
Research Questions:
- Are there nonlinear effects of class size reduction?
- Are there nonlinear interactions between PctEL and STR?
Full Model with Cubic STR and Interactions:
Key Findings:
| Test | F-statistic | p-value | Conclusion |
|---|---|---|---|
| All STR variables = 0 | 5.91 | 0.001 | STR matters! |
| $STR^2$, $STR^3$ = 0 | 5.81 | 0.003 | Nonlinearity exists |
| All interaction terms = 0 | 5.81 | 0.003 | Interactions matter |
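• The effect of class size reduction is not linear
• The effect depends on the share of English learners
• A one-size-fits-all approach is inappropriate for policy making
Forward causality:
• Smaller classes → better instruction → higher test scores
Reverse causality:
• Low test scores → additional resources allocated → smaller classes
Unless this problem is addressed, the OLS estimates will be biased!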
Simultaneous Causality in Equations:
(a) Causal effect of X on Y: $Y_i = \beta_0 + \beta_1 X_i + u_i$
(b) Causal effect of Y on X: $X_i = \gamma_0 + \gamma_1 Y_i + v_i$
Problem: $corr(X_i, u_i) \neq 0$ because:
- Large $u_i$ → Large $Y_i$
- Large $Y_i$ → Large $X_i$ (if $\gamma_1 > 0$)
- Therefore: $X_i$ and $u_i$ are correlated!
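The chain of reasoning above can be made quantitative. Substituting (a) into (b) and solving for $X_i$ gives $X_i = (\gamma_0 + \gamma_1 \beta_0 + \gamma_1 u_i + v_i)/(1 - \gamma_1 \beta_1)$, so $Cov(X_i, u_i) = \gamma_1 \sigma_u^2 / (1 - \gamma_1 \beta_1)$. This relies on two added assumptions that are not stated in the text: $u_i$ and $v_i$ are uncorrelated, and $\gamma_1 \beta_1 \neq 1$. The parameter values in the sketch are hypothetical.

```python
def cov_x_u(beta1: float, gamma1: float, var_u: float) -> float:
    """Cov(X, u) implied by the two-equation system.

    Assumes u and v are uncorrelated and gamma1 * beta1 != 1.
    """
    denominator = 1.0 - gamma1 * beta1
    if denominator == 0:
        raise ValueError("gamma1 * beta1 must differ from 1")
    return gamma1 * var_u / denominator


# Hypothetical parameter values, for illustration only.
print(cov_x_u(beta1=-0.5, gamma1=0.5, var_u=4.0))  # feedback from Y to X
print(cov_x_u(beta1=-0.5, gamma1=0.0, var_u=4.0))  # no feedback
```

With no feedback from Y to X ($\gamma_1 = 0$) the covariance is zero and the simultaneity problem disappears.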
Solutions:
- Randomized experiments: assign X randomly
- Instrumental variables: covered in Chapter 12
- Panel data methods: fixed effects can address part of the problem
Which specification should you choose?
- Economic theory:
  • Does theory suggest a particular functional form?
  • Diminishing returns? Complementarity?
- Data exploration:
  • Check for patterns with scatter plots
  • Try several specifications
- Statistical tests:
  • t-tests for individual terms
  • F-tests for groups of terms
  • Information criteria (AIC, BIC)
- Economic significance:
  • Is the effect large enough to matter in practice?
  • Are the results plausible?
- Robustness checks:
  • Are the results consistent across specifications?
  • Out-of-sample prediction
Practice Problems for Exam 2
Level: Exam practice
The following regression is estimated for baseball players:
$\ln(\widehat{salary}) = \underset{(2.18)}{10.34} - \underset{(0.125)}{0.198} \times black - \underset{(0.153)}{0.190} \times hispan + \underset{(0.0050)}{0.0125} \times (black \times percblck)$
$\quad + \underset{(0.0098)}{0.0201} \times (hispan \times perchisp) + \text{other factors}$
(standard errors in parentheses)
where percblck is the percentage of the city's population that is black, and perchisp is the percentage Hispanic.
(a) How do you interpret the coefficient on black?
(b) What is the salary difference between black and white players in a city with 10% black population?
(c) At what black population percentage do black and white players earn the same?
(d) Test whether Hispanic players earn differently from white players.
(a) Interpretation of black coefficient:
The coefficient -0.198 represents the log salary difference between black and white players in a city with 0% black population. Since the interaction term is included, this coefficient alone doesn't tell the full story.
(b) Salary difference at 10% black population:
$\ln(salary)_{black} - \ln(salary)_{white} = -0.198 + 0.0125(10) = -0.198 + 0.125 = -0.073$
Black players earn approximately 7.3% less than white players in such cities.
(c) Equal salary point:
Set the difference to zero: $-0.198 + 0.0125 \times percblck = 0$
$percblck = 0.198 / 0.0125 = 15.84\%$
At 15.84% black population, there's no racial wage gap.
(d) Testing Hispanic wage difference:
This requires a joint test of $H_0: \beta_{hispan} = \beta_{hispan \times perchisp} = 0$
Individual t-tests may not be reliable due to multicollinearity between hispan and hispan×perchisp.
Need F-test for joint significance.
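Parts (b) and (c) use the same linear expression for the black-white log salary gap, evaluated at one value of percblck and then solved for zero. A small sketch with the reported coefficients (function names are illustrative):

```python
# Coefficients on black and black x percblck from the salary regression.
B_BLACK, B_BLACK_PERCBLCK = -0.198, 0.0125


def log_salary_gap(percblck: float) -> float:
    """ln(salary) of black players minus white players at a given percblck."""
    return B_BLACK + B_BLACK_PERCBLCK * percblck


def break_even_percblck() -> float:
    """City black-population percentage at which the estimated gap is zero."""
    return -B_BLACK / B_BLACK_PERCBLCK


print(f"Gap at 10% black population: {log_salary_gap(10):.3f}")
print(f"Gap is zero at percblck = {break_even_percblck():.2f}%")
```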
Consider the wage equation:
$\ln(wage) = \beta_0 + \beta_1 educ + \beta_2 exper + \beta_3 female + \beta_4 (female \times educ) + u$
Estimation results:
$\ln(\widehat{wage}) = \underset{(0.93)}{-2.27} + \underset{(0.071)}{0.626} \times educ + \underset{(0.010)}{0.026} \times exper - \underset{(1.436)}{0.060} \times female - \underset{(0.120)}{0.140} \times (female \times educ)$
(standard errors in parentheses)
(a) Write separate wage equations for males and females.
(b) Calculate the return to education for males and females.
(c) Test whether education has the same effect for both genders at 5% level.
(d) For a female with 16 years of education, what's the wage penalty compared to a similar male?
(a) Separate equations:
Males (female = 0):
$\ln(wage) = -2.27 + 0.626 \times educ + 0.026 \times exper$
Females (female = 1):
$\ln(wage) = (-2.27 - 0.060) + (0.626 - 0.140) \times educ + 0.026 \times exper$
$= -2.33 + 0.486 \times educ + 0.026 \times exper$
(b) Returns to education:
- Males: 62.6% per year of education
- Females: 48.6% per year of education
- Difference: 14.0 percentage points
(c) Test for equal education effects:
$H_0: \beta_4 = 0$ (female×educ coefficient = 0)
$t = -0.140 / 0.120 = -1.17$
$|t| = 1.17 < 1.96$ → Fail to reject at 5% level
We cannot conclude that education effects differ by gender.
(d) Wage penalty for female with 16 years education:
$\ln(wage)_{female} - \ln(wage)_{male} = -0.060 - 0.140(16) = -0.060 - 2.24 = -2.30$
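A log difference of $-2.30$ is far too large for the usual percentage approximation: converted exactly, it implies wages about 90% lower ($e^{-2.30} - 1 \approx -0.90$), not "230% lower". A gap of this size is implausible, which suggests possible specification issues; the very large standard error on female (1.436) is another warning sign.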
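The calculations in parts (c) and (d) can be checked in a few lines. The sketch also converts the log difference to an exact percentage, since the approximation "log difference ≈ percentage change" only holds for small values.

```python
from math import exp

# Estimates from the wage equation with the female x educ interaction.
B_FEMALE, B_FEMALE_EDUC = -0.060, -0.140
SE_FEMALE_EDUC = 0.120


def t_stat_equal_returns() -> float:
    """t-statistic for H0: coefficient on female x educ = 0."""
    return B_FEMALE_EDUC / SE_FEMALE_EDUC


def female_log_wage_gap(educ: float) -> float:
    """ln(wage) of a female minus a comparable male at a given education."""
    return B_FEMALE + B_FEMALE_EDUC * educ


def exact_percent_change(log_difference: float) -> float:
    """Exact percentage change implied by a difference in logs."""
    return (exp(log_difference) - 1.0) * 100.0


gap = female_log_wage_gap(16)
print(f"t = {t_stat_equal_returns():.2f}")
print(f"log gap at educ = 16: {gap:.2f}")
print(f"exact percentage gap: {exact_percent_change(gap):.1f}%")
```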
A researcher estimates the effect of class size on test scores, allowing for interactions with both English learners and income:
$TestScore = \beta_0 + \beta_1 STR + \beta_2 HiEL + \beta_3 LowInc$
$\quad\quad\quad\quad + \beta_4 (STR \times HiEL) + \beta_5 (STR \times LowInc)$
$\quad\quad\quad\quad + \beta_6 (HiEL \times LowInc) + \beta_7 (STR \times HiEL \times LowInc) + u$
where HiEL = 1 if PctEL ≥ 10%, and LowInc = 1 if average income < \$15,000.
(a) How many different regression lines does this model allow?
(b) Write the marginal effect of STR for each group.
(c) If $\beta_7 > 0$, what does this mean economically?
(d) How would you test whether the STR effect is the same for all groups?
(a) Number of regression lines:
With two binary variables, we have $2 \times 2 = 4$ groups:
- (HiEL = 0, LowInc = 0): High income, few English learners
- (HiEL = 1, LowInc = 0): High income, many English learners
- (HiEL = 0, LowInc = 1): Low income, few English learners
- (HiEL = 1, LowInc = 1): Low income, many English learners
(b) Marginal effects of STR:
| Group | Marginal Effect of STR |
|---|---|
| HiEL = 0, LowInc = 0 | $\beta_1$ |
| HiEL = 1, LowInc = 0 | $\beta_1 + \beta_4$ |
| HiEL = 0, LowInc = 1 | $\beta_1 + \beta_5$ |
| HiEL = 1, LowInc = 1 | $\beta_1 + \beta_4 + \beta_5 + \beta_7$ |
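The table is the result of differentiating the model with respect to STR and plugging in each (HiEL, LowInc) pair. The sketch below does this mechanically; the coefficient values are hypothetical, because the problem gives no estimates.

```python
from itertools import product


def str_marginal_effect(hiel: int, lowinc: int,
                        b1: float, b4: float, b5: float, b7: float) -> float:
    """dTestScore/dSTR = b1 + b4*HiEL + b5*LowInc + b7*HiEL*LowInc."""
    return b1 + b4 * hiel + b5 * lowinc + b7 * hiel * lowinc


# Hypothetical coefficients, for illustration only.
coefs = dict(b1=-1.0, b4=-0.5, b5=-0.3, b7=0.2)

for hiel, lowinc in product((0, 1), repeat=2):
    effect = str_marginal_effect(hiel, lowinc, **coefs)
    print(f"HiEL={hiel}, LowInc={lowinc}: STR effect = {effect:+.2f}")
```

Only the group with HiEL = 1 and LowInc = 1 involves $\beta_7$; $\beta_2$, $\beta_3$ and $\beta_6$ shift intercepts and never appear in the slope.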
(c) Economic meaning of $\beta_7 > 0$:
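The triple interaction $\beta_7 > 0$ means that the STR slope in schools with both many English learners AND low income is higher by $\beta_7$ than what we'd expect from just adding the two separate interaction effects.
Since the effect of STR on test scores is typically negative, a positive $\beta_7$ makes the slope for these doubly disadvantaged schools less negative than $\beta_1 + \beta_4 + \beta_5$: they gain less from smaller classes than the sum of the two separate interactions would suggest. A negative $\beta_7$ would instead indicate synergy, with schools facing both challenges benefiting even more from smaller classes.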
(d) Testing equal STR effects:
$H_0: \beta_4 = \beta_5 = \beta_7 = 0$
This is a joint F-test with 3 restrictions. If we fail to reject, then STR has the same effect ($\beta_1$) for all groups.
A researcher estimates:
$ColGPA = \beta_0 + \beta_1 hsGPA + \beta_2 skipped + \beta_3 bfriend + \beta_4 (bfriend \times skipped) + u$
where bfriend = 1 if student has boyfriend/girlfriend, skipped = average classes skipped per week.
Individual t-tests show:
- bfriend: t = 1.2 (not significant)
- bfriend × skipped: t = -1.5 (not significant)
But the F-test for $H_0: \beta_3 = \beta_4 = 0$ gives F = 8.5 (p < 0.01).
(a) Explain this apparent contradiction.
(b) Should you conclude that having a boyfriend/girlfriend affects GPA?
(c) How would you interpret the interaction term if $\beta_4 < 0$?
(a) Explaining the contradiction:
This is a classic case of multicollinearity. The variables bfriend and bfriend×skipped are highly correlated because:
- bfriend×skipped = 0 whenever bfriend = 0
- bfriend×skipped = skipped whenever bfriend = 1
This correlation inflates standard errors, making individual t-tests less powerful. However, the joint F-test can still detect that together these variables explain significant variation in GPA.
(b) Conclusion about boyfriend/girlfriend effect:
Yes, based on the significant F-test, we should conclude that having a boyfriend/girlfriend affects GPA, but the effect depends on class attendance behavior. We cannot rely on individual t-tests due to multicollinearity.
(c) Interpretation of negative interaction ($\beta_4 < 0$):
If $\beta_4 < 0$, it means:
- The negative effect of skipping class is worse for students with boyfriends/girlfriends
- Or equivalently: Having a boyfriend/girlfriend is more harmful for students who skip classes frequently
- This suggests that relationship distractions compound with poor attendance habits
Exam 2 Key Checklist: Interactions
Interpreting Interactions
• Binary × Binary: compare 4 groups
• Binary × Continuous: two different regression lines
• Continuous × Continuous: Conditional marginal effects
• Always use After-Before method!
Hypothesis Testing
• Individual t-tests may fail due to multicollinearity
• Joint F-tests are more reliable
• Test both individual and joint hypotheses
Common Mistakes to Avoid
• Forgetting to include main effects
• Misinterpreting coefficients in presence of interactions
• Relying only on t-tests when multicollinearity exists
• Not considering economic significance
Practical Tips
• Draw tables for binary interactions
• Calculate marginal effects at different values
• Check for multicollinearity
• Consider centering variables
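✓ Practice computing marginal effects when an interaction term is present
✓ Make a habit of organizing Binary × Binary results in a table
✓ Understand why joint tests matter under multicollinearity
✓ Be able to give an economic interpretation of real data
✓ Master the After-Before method
✓ Work systematically to avoid calculation errors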