Censored and Truncated Regression Models

A censored regression model is appropriate when the value of the dependent variable is unavailable whenever it is above or below some threshold.

A truncated regression model is appropriate when cases are systematically not collected or not reported whenever the dependent variable is above or below some threshold.

The Tobit model, named for Tobin (1958), is a special case of a censored regression model.
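The distinction between censoring and truncation can be sketched in a few lines. This is only an illustration: the threshold of zero, the function names, and the convention of recording a censored value at the threshold together with a flag are assumptions made for the example.

```python
def censor(values, threshold=0.0):
    """Keep every case, but hide values below the threshold.

    Each case becomes (recorded_value, was_censored). Recording the
    threshold itself for censored cases is a convention assumed here.
    """
    return [(max(v, threshold), v < threshold) for v in values]


def truncate(values, threshold=0.0):
    """Drop every case whose value falls below the threshold."""
    return [v for v in values if v >= threshold]


if __name__ == "__main__":
    latent = [-1.2, 0.3, 2.0, -0.1]  # hypothetical latent outcomes
    print(censor(latent))    # same number of cases, some values hidden
    print(truncate(latent))  # fewer cases, no record of the dropped ones
```

A censored sample keeps the count of hidden cases, which is what later makes a selection model estimable; a truncated sample does not.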


Description

This is a modification of the OLS model, where the dependent variable Y is related to the independent variable(s) X as Yᵢ = bXᵢ + Uᵢ.

Univariate

Suppose that the variable of interest is unobserved if it is less than zero. The expected value among observed cases is then expressed as E[Yᵢ|Xᵢ,Yᵢ≥0]. Substituting the model equation for Yᵢ yields E[bXᵢ + Uᵢ|Xᵢ,bXᵢ + Uᵢ≥0], and because the expectation is conditioned on a given Xᵢ, this simplifies to bXᵢ + E[Uᵢ|Xᵢ,bXᵢ + Uᵢ≥0]. Algebraically this is rewritten as:

expectation1.svg

where σ is the standard deviation of the error term Uᵢ. Inserting that standard deviation term standardizes the error, which makes the expression easy to decompose into terms of the p.d.f. and c.d.f. of the standard normal distribution. Altogether, the expected value is:

expectation2.svg
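A sketch of the algebra behind the two expressions above, assuming that Uᵢ is normally distributed with mean zero and standard deviation σ and is independent of Xᵢ (an assumption the text implies but does not state). Here φ and Φ stand for the standard normal p.d.f. and c.d.f., and λ(c) = φ(c)/Φ(c).

```latex
\begin{align*}
E[Y_i \mid X_i, Y_i \ge 0]
  &= bX_i + \sigma\, E\!\left[\frac{U_i}{\sigma} \,\middle|\, \frac{U_i}{\sigma} \ge -\frac{bX_i}{\sigma}\right] \\
  &= bX_i + \sigma\, \frac{\phi(bX_i/\sigma)}{\Phi(bX_i/\sigma)} \\
  &= bX_i + \sigma\, \lambda(bX_i/\sigma)
\end{align*}
```

The second line uses the mean of a standard normal variable truncated from below at -c, which is φ(c)/Φ(c).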

The hazard rate, or inverse Mills ratio (IMR), is denoted as λ here. Sometimes λ evaluated for a given bXᵢ is denoted as λᵢ.
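As a rough numerical check, the closed-form expected value can be compared with a simulation. The sketch below assumes normal errors and takes λ(c) = φ(c)/Φ(c); the function names and the values of bXᵢ and σ are hypothetical.

```python
import random
from statistics import NormalDist, fmean

STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def inverse_mills(c):
    """lambda(c) = pdf(c) / cdf(c) for the standard normal.

    Not numerically safe for very negative c, where cdf(c) underflows.
    """
    return STD_NORMAL.pdf(c) / STD_NORMAL.cdf(c)


def expected_observed(b_x, sigma):
    """Closed form for E[Y | X, Y >= 0] = bX + sigma * lambda(bX / sigma)."""
    return b_x + sigma * inverse_mills(b_x / sigma)


def simulated_observed_mean(b_x, sigma, n=200_000, seed=0):
    """Average of Y = bX + U over the draws that satisfy Y >= 0."""
    rng = random.Random(seed)
    draws = (b_x + rng.gauss(0.0, sigma) for _ in range(n))
    return fmean(y for y in draws if y >= 0)


if __name__ == "__main__":
    print(expected_observed(1.0, 2.0))
    print(simulated_observed_mean(1.0, 2.0))
```

The two printed numbers should agree up to simulation noise.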

Provided that the sample is censored (i.e., not truncated), it should be possible to estimate λᵢ using a probit model. This reveals that the selection bias seen in the initial model can be treated as omitted variable bias, and can be corrected by using the model Yᵢ = bXᵢ + σλᵢ + Vᵢ.
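The two-step idea can be sketched on simulated data. This is a minimal illustration under stated assumptions, not a general implementation: a single predictor with no intercept, normal errors, a hand-written Fisher scoring routine for the probit step, and hypothetical function names and parameter values throughout.

```python
import random
from statistics import NormalDist

N01 = NormalDist()  # standard normal


def fit_probit_slope(xs, observed, iterations=10):
    """Fisher scoring for P(observed | x) = Phi(a * x): one slope, no intercept."""
    a = 0.0
    for _ in range(iterations):
        score = info = 0.0
        for x, d in zip(xs, observed):
            p = N01.cdf(a * x)
            dens = N01.pdf(a * x)
            w = dens / (p * (1.0 - p))
            score += x * w * (d - p)   # derivative of the log-likelihood
            info += x * x * dens * w   # expected information
        a += score / info
    return a


def ols_two_regressors(x1, x2, y):
    """Least squares for y = c1*x1 + c2*x2 (no intercept), 2x2 normal equations."""
    s11 = sum(u * u for u in x1)
    s22 = sum(u * u for u in x2)
    s12 = sum(u * v for u, v in zip(x1, x2))
    t1 = sum(u * v for u, v in zip(x1, y))
    t2 = sum(u * v for u, v in zip(x2, y))
    det = s11 * s22 - s12 * s12
    return (s22 * t1 - s12 * t2) / det, (s11 * t2 - s12 * t1) / det


def two_step_demo(n=20_000, b=1.0, sigma=2.0, seed=1):
    """Return (a_hat, b_naive, b_hat, sigma_hat) for one simulated sample."""
    rng = random.Random(seed)
    xs = [rng.uniform(-1.0, 3.0) for _ in range(n)]
    ys = [b * x + rng.gauss(0.0, sigma) for x in xs]
    observed = [1 if y >= 0 else 0 for y in ys]

    # Step 1: probit on the full (censored) sample estimates a = b / sigma.
    a_hat = fit_probit_slope(xs, observed)

    # Step 2: regress observed Y on X and the estimated inverse Mills ratio.
    sel_x = [x for x, d in zip(xs, observed) if d]
    sel_y = [y for y, d in zip(ys, observed) if d]
    lam = [N01.pdf(a_hat * x) / N01.cdf(a_hat * x) for x in sel_x]
    b_naive = sum(x * y for x, y in zip(sel_x, sel_y)) / sum(x * x for x in sel_x)
    b_hat, sigma_hat = ols_two_regressors(sel_x, lam, sel_y)
    return a_hat, b_naive, b_hat, sigma_hat


if __name__ == "__main__":
    print(two_step_demo())
```

In this setup the naive regression on observed cases overstates b, while the coefficient on the added λᵢ term serves as an estimate of σ.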

Bivariate

Suppose the variable of interest is unobserved if a second variable is less than zero, and suppose that the error terms of the two variables are drawn from a joint normal distribution. In other words, the model is specified as:

  • Y₁ᵢ = bXᵢ + U₁ᵢ

  • Y₂ᵢ = γZᵢ + U₂ᵢ

    • Xᵢ and Zᵢ can be the same, but often the system is only solvable when Zᵢ has more predictors.

Following the same procedure as above, it can be demonstrated that:

expectation3.svg

expectation4.svg

where λᵢ is specifically shorthand for λ evaluated for a given γZᵢ/√σ₂,₂.
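A sketch of what the two conditional expectations look like, inferred from the univariate result and the definition of λᵢ above rather than taken from the images. It assumes that (U₁ᵢ, U₂ᵢ) are jointly normal with mean zero, variances σ₁,₁ and σ₂,₂, and covariance σ₁,₂.

```latex
\begin{align*}
E[Y_{1i} \mid X_i, Y_{2i} \ge 0] &= bX_i + \frac{\sigma_{1,2}}{\sqrt{\sigma_{2,2}}}\,\lambda_i \\
E[Y_{2i} \mid Z_i, Y_{2i} \ge 0] &= \gamma Z_i + \frac{\sigma_{2,2}}{\sqrt{\sigma_{2,2}}}\,\lambda_i \\
\lambda_i &= \frac{\phi\!\left(\gamma Z_i/\sqrt{\sigma_{2,2}}\right)}{\Phi\!\left(\gamma Z_i/\sqrt{\sigma_{2,2}}\right)}
\end{align*}
```

The first line follows because, under joint normality, the expectation of U₁ᵢ given U₂ᵢ is (σ₁,₂/σ₂,₂)U₂ᵢ.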

Adding these omitted variables leads to a model specified as:

  • Y₁ᵢ = bXᵢ + (σ₁,₂/√σ₂,₂)λᵢ + V₁ᵢ

  • Y₂ᵢ = γZᵢ + (σ₂,₂/√σ₂,₂)λᵢ + V₂ᵢ

  • E[V₂²] = σ₂,₂(1 + φᵢλᵢ - λᵢ²)

    • as the chance of selection approaches 0%, this term approaches 0; as the chance of selection approaches 100%, it approaches σ₂,₂.

  • E[V₁V₂] = σ₁,₂(1 + φᵢλᵢ - λᵢ²)

    • as the chance of selection approaches 0%, this term approaches 0.

  • E[V₁²] = σ₁,₁[(1 - ρ²) + ρ²(1 + φᵢλᵢ - λᵢ²)]

    • as the chance of selection approaches 0%, this term approaches σ₁,₁(1 - ρ²).

where φᵢ here denotes the standardized selection threshold, -γZᵢ/√σ₂,₂ (not the standard normal p.d.f.); and ρ = σ₁,₂/√(σ₁,₁σ₂,₂).
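The three moment formulas above can be checked against a simulation. The sketch below assumes jointly normal errors and reads φᵢ as the standardized threshold -γZᵢ/√σ₂,₂, which is the reading under which 1 + φᵢλᵢ - λᵢ² equals the variance of a truncated standard normal. All function names and parameter values are hypothetical.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean

N01 = NormalDist()  # standard normal


def selection_terms(index):
    """index = gamma*Z / sqrt(s22). Returns (lambda_i, 1 + phi_i*lambda_i - lambda_i**2)."""
    lam = N01.pdf(index) / N01.cdf(index)
    phi_i = -index  # assumption: phi_i is the standardized threshold
    return lam, 1.0 + phi_i * lam - lam * lam


def theoretical_moments(gamma_z, s11, s12, s22):
    """(E[V2^2], E[V1 V2], E[V1^2]) from the formulas in the text."""
    _, factor = selection_terms(gamma_z / sqrt(s22))
    rho_sq = s12 * s12 / (s11 * s22)
    return (s22 * factor,
            s12 * factor,
            s11 * ((1.0 - rho_sq) + rho_sq * factor))


def simulated_moments(gamma_z, s11, s12, s22, n=200_000, seed=2):
    """Same three moments, estimated from draws that satisfy Y2 >= 0."""
    rng = random.Random(seed)
    lam, _ = selection_terms(gamma_z / sqrt(s22))
    slope = s12 / sqrt(s22)                   # loading of U1 on standardized U2
    resid_sd = sqrt(s11 - s12 * s12 / s22)    # part of U1 unrelated to U2
    v1, v2 = [], []
    for _ in range(n):
        e2 = rng.gauss(0.0, 1.0)
        u2 = sqrt(s22) * e2
        if gamma_z + u2 < 0:
            continue  # case not selected
        u1 = slope * e2 + resid_sd * rng.gauss(0.0, 1.0)
        v1.append(u1 - slope * lam)
        v2.append(u2 - sqrt(s22) * lam)
    return (fmean(x * x for x in v2),
            fmean(p * q for p, q in zip(v1, v2)),
            fmean(x * x for x in v1))


if __name__ == "__main__":
    print(theoretical_moments(0.5, 2.0, 0.8, 1.0))
    print(simulated_moments(0.5, 2.0, 0.8, 1.0))
```

Evaluating `selection_terms` at a large positive index (selection nearly certain) gives a factor close to 1, and at a large negative index (selection nearly impossible) a factor close to 0.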



Statistics/CensoredAndTruncatedRegressionModels (last edited 2026-02-17 15:27:04 by DominicRicottone)