Stata logit
The logit command fits a logit model.
Usage
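This example is from the UCLA Statistical Consulting Group (https://stats.idre.ucla.edu/stata/dae/logistic-regression/):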
. use https://stats.idre.ucla.edu/stat/stata/dae/binary, clear
. logit admit gre gpa i.rank
Iteration 0: log likelihood = -249.98826
Iteration 1: log likelihood = -229.66446
Iteration 2: log likelihood = -229.25955
Iteration 3: log likelihood = -229.25875
Iteration 4: log likelihood = -229.25875
Logistic regression Number of obs = 400
LR chi2(5) = 41.46
Prob > chi2 = 0.0000
Log likelihood = -229.25875 Pseudo R2 = 0.0829
------------------------------------------------------------------------------
admit | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
gre | .0022644 .001094 2.07 0.038 .0001202 .0044086
gpa | .8040377 .3318193 2.42 0.015 .1536838 1.454392
|
rank |
2 | -.6754429 .3164897 -2.13 0.033 -1.295751 -.0551346
3 | -1.340204 .3453064 -3.88 0.000 -2.016992 -.6634158
4 | -1.551464 .4178316 -3.71 0.000 -2.370399 -.7325287
|
_cons | -3.989979 1.139951 -3.50 0.000 -6.224242 -1.755717
------------------------------------------------------------------------------
Compare the output of logistic, which always shows odds ratios; on logit, the or option must be specified to show them.
See here for details on factor variables.
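As a minimal sketch of the comparison above, the same model can be fit both ways. This assumes the UCLA example dataset used earlier is still reachable at the same URL.

```stata
* Load the example data used above
use https://stats.idre.ucla.edu/stat/stata/dae/binary, clear

* logit reports coefficients (log odds) by default;
* the or option switches the display to odds ratios
logit admit gre gpa i.rank, or

* logistic fits the same model and reports odds ratios by default
logistic admit gre gpa i.rank
```

Both commands fit the same model, so only the displayed scale of the estimates should differ.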
Estimates
The model can then be used by the following post-estimation commands:
predict creates a variable storing the predicted probability for each case
margins displays the marginal predicted probabilities
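The two commands listed above might be used as follows. This is a sketch only: p_admit is a hypothetical variable name chosen for illustration, and the choice to compute margins over rank is an assumption about what is of interest.

```stata
* Refit the model from the Usage section
use https://stats.idre.ucla.edu/stat/stata/dae/binary, clear
logit admit gre gpa i.rank

* Store the predicted probability of admit == 1 for each case
* (p_admit is a hypothetical name; pr is the default statistic)
predict p_admit, pr

* Average predicted probability at each level of rank
margins rank
```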
Tips
One-way causation
If a variable predicts failure or success perfectly, the model cannot be fit with it. Stata's default solution is to omit that variable and any cases with that problematic data pattern.
. use https://www.stata-press.com/data/r18/repair, clear
. logit foreign b3.repair
note: 1.repair != 0 predicts failure perfectly;
1.repair omitted and 10 obs not used.
Iteration 0: Log likelihood = -26.992087
Iteration 1: Log likelihood = -22.483187
Iteration 2: Log likelihood = -22.230498
Iteration 3: Log likelihood = -22.229139
Iteration 4: Log likelihood = -22.229138
Logistic regression Number of obs = 48
LR chi2(1) = 9.53
Prob > chi2 = 0.0020
Log likelihood = -22.229138 Pseudo R2 = 0.1765
-------------------------------------------------------------------------------
foreign | Coefficient Std. err. z P>|z| [95% conf. interval]
-------------+-----------------------------------------------------------------
repair |
1 | 0 (empty)
2 | -2.197225 .7698003 -2.85 0.004 -3.706005 -.6884436
|
_cons | -1.85e-17 .4714045 -0.00 1.000 -.9239359 .9239359
-------------------------------------------------------------------------------
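One way to see the problematic data pattern before fitting is a simple cross-tabulation. The sketch below assumes the same repair dataset as above.

```stata
* Load the example data used above
use https://www.stata-press.com/data/r18/repair, clear

* An empty cell in this table (a level of repair with no
* foreign == 1 cases) is the pattern that triggers the note
tabulate foreign repair

* Count cases in the category that Stata dropped
count if repair == 1 & foreign == 1
```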
Two-way causation
Similarly, if a variable predicts both failure and success perfectly, the model cannot be fit with it. Stata does not have a default solution and will stop execution.
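The behavior described above can be reproduced with a small made-up dataset. The variables y and x below are hypothetical and constructed so that x separates the outcome exactly; capture is used only so that a do-file would continue past the error.

```stata
* Toy data (hypothetical): every case with x > 3 is a success,
* every case with x <= 3 is a failure
clear
input y x
0 1
0 2
0 3
1 4
1 5
1 6
end

* capture noisily shows the error message without halting a do-file
capture noisily logit y x

* A nonzero return code indicates that the model was not fit
display "return code: " _rc
```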
Completely determined
Consider this example:
. use https://www.stata-press.com/data/r18/auto, clear
(1978 Automobile Data)
. drop if foreign == 0 & gear_ratio > 3.1
(6 observations deleted)
. logit foreign mpg weight gear_ratio
Iteration 0: log likelihood = -42.806086
Iteration 1: log likelihood = -17.438677
Iteration 2: log likelihood = -11.209232
Iteration 3: log likelihood = -8.2749141
Iteration 4: log likelihood = -7.0018452
Iteration 5: log likelihood = -6.5795946
Iteration 6: log likelihood = -6.4944116
Iteration 7: log likelihood = -6.4875497
Iteration 8: log likelihood = -6.4874814
Iteration 9: log likelihood = -6.4874814
Logistic regression Number of obs = 68
LR chi2(3) = 72.64
Prob > chi2 = 0.0000
Log likelihood = -6.4874814 Pseudo R2 = 0.8484
------------------------------------------------------------------------------
foreign | Coefficient Std. err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
mpg | -.4944907 .2655508 -1.86 0.063 -1.014961 .0259792
weight | -.0060919 .003101 -1.96 0.049 -.0121698 -.000014
gear_ratio | 15.70509 8.166234 1.92 0.054 -.3004359 31.71061
_cons | -21.39527 25.41486 -0.84 0.400 -71.20747 28.41694
------------------------------------------------------------------------------
note: 4 failures and 0 successes completely determined.
In this case, the warning means that the continuous variable (i.e., gear_ratio) predicts the outcome very well. This is also hinted at by the extremely large coefficient on that term in the fitted model.
If a standard error is omitted, the warning would instead suggest collinearity. This generally only happens with indicator terms created from interactions of categorical variables. The problematic term should be removed.
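To look at which cases are involved, one option is to inspect the fitted probabilities after the model above. This is a sketch under assumptions: p_hat is a hypothetical variable name, and listing the five smallest values is an arbitrary choice for illustration, not a rule from the manual.

```stata
* Recreate the example above
use https://www.stata-press.com/data/r18/auto, clear
drop if foreign == 0 & gear_ratio > 3.1
logit foreign mpg weight gear_ratio

* Predicted probability of foreign == 1 for each case
* (p_hat is a hypothetical name)
predict p_hat, pr

* Cases with fitted probabilities closest to 0 appear first
sort p_hat
list foreign mpg weight gear_ratio p_hat in 1/5
```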
See also
Stata manual for logit: https://www.stata.com/manuals/rlogit.pdf
Stata manual for logit post-estimation
