= Stata Logit =

The '''`logit`''' command runs a logistic regression.

----



== Usage ==

Copied from [[https://stats.idre.ucla.edu/stata/dae/logistic-regression/|here]]:

{{{
. use https://stats.idre.ucla.edu/stat/stata/dae/binary, clear

. logit admit gre gpa i.rank

Iteration 0:   log likelihood = -249.98826
Iteration 1:   log likelihood = -229.66446
Iteration 2:   log likelihood = -229.25955
Iteration 3:   log likelihood = -229.25875
Iteration 4:   log likelihood = -229.25875

Logistic regression                               Number of obs   =        400
                                                  LR chi2(5)      =      41.46
                                                  Prob > chi2     =     0.0000
Log likelihood = -229.25875                       Pseudo R2       =     0.0829

------------------------------------------------------------------------------
       admit |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         gre |   .0022644    .001094     2.07   0.038     .0001202    .0044086
         gpa |   .8040377   .3318193     2.42   0.015     .1536838    1.454392
             |
        rank |
          2  |  -.6754429   .3164897    -2.13   0.033    -1.295751   -.0551346
          3  |  -1.340204   .3453064    -3.88   0.000    -2.016992   -.6634158
          4  |  -1.551464   .4178316    -3.71   0.000    -2.370399   -.7325287
             |
       _cons |  -3.989979   1.139951    -3.50   0.000    -6.224242   -1.755717
------------------------------------------------------------------------------
}}}

Compare with [[Stata/Logistic|logistic]], which always reports odds ratios; `logit` reports them only when the `or` option is specified.

See [[Stata/Regress#Factor_Variables|here]] for details on factor variables.

----



== Estimates ==

The estimates can be accessed through either of the following commands:

 * `predict` creates a variable storing the predicted probability for each case
 * `margins` displays the marginal predicted probabilities

----



== Tips ==



=== One-way causation ===

If a variable predicts failure or success perfectly, the model cannot be fit with it. Stata's default solution is to omit that variable and any cases with that problematic data pattern.

{{{
. use https://www.stata-press.com/data/r18/repair, clear

. logit foreign b3.repair
note: 1.repair != 0 predicts failure perfectly;
      1.repair omitted and 10 obs not used.

Iteration 0:  Log likelihood = -26.992087
Iteration 1:  Log likelihood = -22.483187
Iteration 2:  Log likelihood = -22.230498
Iteration 3:  Log likelihood = -22.229139
Iteration 4:  Log likelihood = -22.229138

Logistic regression                                     Number of obs =     48
                                                        LR chi2(1)    =   9.53
                                                        Prob > chi2   = 0.0020
Log likelihood = -22.229138                             Pseudo R2     = 0.1765

-------------------------------------------------------------------------------
     foreign | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
-------------+-----------------------------------------------------------------
      repair |
          1  |          0  (empty)
          2  |  -2.197225   .7698003    -2.85   0.004    -3.706005   -.6884436
             |
       _cons |  -1.85e-17   .4714045    -0.00   1.000    -.9239359    .9239359
-------------------------------------------------------------------------------
}}}



=== Two-way causation ===

Similarly, if a variable predicts both failure and success perfectly, the model cannot be fit with it. Stata does not have a default solution and will stop execution.



=== Completely determined ===

Consider this example:

{{{
. use https://www.stata-press.com/data/r18/auto, clear
(1978 Automobile Data)

. drop if foreign == 0 & gear_ratio > 3.1
(6 observations deleted)

. logit foreign mpg weight gear_ratio

Iteration 0:   log likelihood = -42.806086
Iteration 1:   log likelihood = -17.438677
Iteration 2:   log likelihood = -11.209232
Iteration 3:   log likelihood = -8.2749141
Iteration 4:   log likelihood = -7.0018452
Iteration 5:   log likelihood = -6.5795946
Iteration 6:   log likelihood = -6.4944116
Iteration 7:   log likelihood = -6.4875497
Iteration 8:   log likelihood = -6.4874814
Iteration 9:   log likelihood = -6.4874814

Logistic regression                               Number of obs   =         68
                                                  LR chi2(3)      =      72.64
                                                  Prob > chi2     =     0.0000
Log likelihood = -6.4874814                       Pseudo R2       =     0.8484

------------------------------------------------------------------------------
     foreign | Coefficient  Std. err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         mpg |  -.4944907   .2655508    -1.86   0.063    -1.014961    .0259792
      weight |  -.0060919    .003101    -1.96   0.049    -.0121698    -.000014
  gear_ratio |   15.70509   8.166234     1.92   0.054    -.3004359    31.71061
       _cons |  -21.39527   25.41486    -0.84   0.400    -71.20747    28.41694
------------------------------------------------------------------------------
note: 4 failures and 0 successes completely determined.
}}}

In this case, the note means that the continuous variable (i.e., `gear_ratio`) predicts the outcome very well. The extremely large coefficient on that term in the fitted model hints at the same thing.

If a standard error is missing, the note would instead suggest collinearity. This generally happens only with indicator terms created from interactions of categorical variables. The problematic term should be removed.

----



== See also ==

[[https://www.stata.com/manuals/rlogitpostestimation.pdf|Stata manual for logit post-estimation]]

----
CategoryRicottone