Stata Logit
The logit command runs a logistic regression.
Usage
Copied from here:
. use https://stats.idre.ucla.edu/stat/stata/dae/binary, clear

. logit admit gre gpa i.rank

Iteration 0:   log likelihood = -249.98826
Iteration 1:   log likelihood = -229.66446
Iteration 2:   log likelihood = -229.25955
Iteration 3:   log likelihood = -229.25875
Iteration 4:   log likelihood = -229.25875

Logistic regression                             Number of obs     =        400
                                                LR chi2(5)        =      41.46
                                                Prob > chi2       =     0.0000
Log likelihood = -229.25875                     Pseudo R2         =     0.0829

------------------------------------------------------------------------------
       admit |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         gre |   .0022644    .001094     2.07   0.038     .0001202    .0044086
         gpa |   .8040377   .3318193     2.42   0.015     .1536838    1.454392
             |
        rank |
          2  |  -.6754429   .3164897    -2.13   0.033    -1.295751   -.0551346
          3  |  -1.340204   .3453064    -3.88   0.000    -2.016992   -.6634158
          4  |  -1.551464   .4178316    -3.71   0.000    -2.370399   -.7325287
             |
       _cons |  -3.989979   1.139951    -3.50   0.000    -6.224242   -1.755717
------------------------------------------------------------------------------
Compare this with the output of logistic, which reports odds ratios by default. logit reports coefficients on the log-odds scale unless the or option is specified.
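The relationship between the two displays can be checked directly. The following is a minimal sketch that reuses the dataset and model from the example above; it relies only on the fact that an odds ratio is the exponentiated coefficient.

```stata
* Same data and model as the usage example above
use https://stats.idre.ucla.edu/stat/stata/dae/binary, clear

* Coefficients on the log-odds scale (the default for logit)
logit admit gre gpa i.rank

* An odds ratio is the exponentiated coefficient
display exp(_b[gpa])

* The same model displayed as odds ratios, two equivalent ways
logit admit gre gpa i.rank, or
logistic admit gre gpa i.rank
```

The value displayed for gpa should match the odds ratio that both of the last two commands report for that term.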
See here for details on factor variables.
Estimates
After fitting, the estimates can be accessed through either of the following commands:
predict creates a variable storing the predicted probability for each case
margins displays the marginal predicted probabilities
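A short sketch of both commands, assuming the admissions model from the usage example. The variable name p_admit is made up for illustration.

```stata
use https://stats.idre.ucla.edu/stat/stata/dae/binary, clear
logit admit gre gpa i.rank

* Store the predicted probability of admit == 1 for each case
* (p_admit is a hypothetical variable name)
predict p_admit, pr
summarize p_admit

* Average predicted probability at each level of rank
margins rank

* Average marginal effect of gpa on the predicted probability
margins, dydx(gpa)
```

margins rank works here because rank was entered as a factor variable (i.rank) in the model.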
Tips
One-way causation
If a variable predicts failure or success perfectly, its coefficient cannot be estimated, because the maximum likelihood estimate is infinite. Stata's default solution is to omit that variable and drop the cases that it predicts perfectly.
. use https://www.stata-press.com/data/r18/repair, clear

. logit foreign b3.repair

note: 1.repair != 0 predicts failure perfectly;
      1.repair omitted and 10 obs not used.

Iteration 0:  Log likelihood = -26.992087
Iteration 1:  Log likelihood = -22.483187
Iteration 2:  Log likelihood = -22.230498
Iteration 3:  Log likelihood = -22.229139
Iteration 4:  Log likelihood = -22.229138

Logistic regression                                     Number of obs =     48
                                                        LR chi2(1)    =   9.53
                                                        Prob > chi2   = 0.0020
Log likelihood = -22.229138                             Pseudo R2     = 0.1765

------------------------------------------------------------------------------
     foreign | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
      repair |
          1  |          0  (empty)
          2  |  -2.197225   .7698003    -2.85   0.004    -3.706005   -.6884436
             |
       _cons |  -1.85e-17   .4714045    -0.00   1.000    -.9239359    .9239359
------------------------------------------------------------------------------
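A cross-tabulation is one simple way to see why the note appears. This sketch uses the same dataset and checks for an empty cell in the outcome by predictor table.

```stata
use https://www.stata-press.com/data/r18/repair, clear

* A zero cell in the repair == 1 row means that level
* predicts the outcome perfectly
tabulate repair foreign

* Count the cases that logit drops for this reason
count if repair == 1
```

The count should agree with the number of observations reported as not used in the note.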
Two-way causation
Similarly, if a variable predicts both failure and success perfectly, the model cannot be fit with it. In this case Stata has no default remedy and stops with an error.
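The behavior can be reproduced with a small invented dataset. The six observations below are made up for illustration: every case with x above 3 is a success and every other case is a failure.

```stata
clear
* Hypothetical data: x separates the two outcomes completely
input y x
0 1
0 2
0 3
1 4
1 5
1 6
end

* capture noisily shows the error message without aborting a do-file
capture noisily logit y x
display "return code: " _rc
```

A nonzero return code indicates that logit refused to fit the model.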
Completely determined
Consider this example:
. use https://www.stata-press.com/data/r18/auto, clear
(1978 Automobile Data)

. drop if foreign == 0 & gear_ratio > 3.1
(6 observations deleted)

. logit foreign mpg weight gear_ratio

Iteration 0:   log likelihood = -42.806086
Iteration 1:   log likelihood = -17.438677
Iteration 2:   log likelihood = -11.209232
Iteration 3:   log likelihood = -8.2749141
Iteration 4:   log likelihood = -7.0018452
Iteration 5:   log likelihood = -6.5795946
Iteration 6:   log likelihood = -6.4944116
Iteration 7:   log likelihood = -6.4875497
Iteration 8:   log likelihood = -6.4874814
Iteration 9:   log likelihood = -6.4874814

Logistic regression                                     Number of obs =     68
                                                        LR chi2(3)    =  72.64
                                                        Prob > chi2   = 0.0000
Log likelihood = -6.4874814                             Pseudo R2     = 0.8484

------------------------------------------------------------------------------
     foreign | Coefficient  Std. err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         mpg |  -.4944907   .2655508    -1.86   0.063    -1.014961    .0259792
      weight |  -.0060919    .003101    -1.96   0.049    -.0121698    -.000014
  gear_ratio |   15.70509   8.166234     1.92   0.054    -.3004359    31.71061
       _cons |  -21.39527   25.41486    -0.84   0.400    -71.20747    28.41694
------------------------------------------------------------------------------
note: 4 failures and 0 successes completely determined.
In this case, the note means that the continuous variable (i.e., gear_ratio) predicts the outcome very well. This is also hinted at by the extremely large coefficient on that term in the fitted model.
If a standard error is missing from the output, the note would instead suggest collinearity. This generally only happens with indicator terms created from interactions of categorical variables. The problematic term should be removed.
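For the continuous-variable case, one way to find the cases behind the note is to list observations whose predicted probability is numerically 0 or 1. This is a sketch only: the 1e-6 cutoff and the variable name p_foreign are arbitrary choices for illustration, not the threshold Stata itself applies.

```stata
use https://www.stata-press.com/data/r18/auto, clear
drop if foreign == 0 & gear_ratio > 3.1
logit foreign mpg weight gear_ratio

* Store predictions in double precision so values near 0 and 1 are kept
predict double p_foreign, pr

* Assumed cutoff of 1e-6; adjust as needed
list make foreign gear_ratio p_foreign ///
    if p_foreign < 1e-6 | p_foreign > 1 - 1e-6
```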
See also
Stata manual for logit post-estimation