Stata Mlogit
-mlogit- fits a multinomial logit model.
Usage
When the dependent variable has more than two unordered categories, rather than just two, -mlogit- should be used instead of -logit-. The two commands are otherwise very similar.
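As a minimal sketch of the syntax, the commands below fit a multinomial logit on one of Stata's example datasets. The dataset (sysdsn1) and its variables are assumptions chosen for illustration and are not used elsewhere on this page; baseoutcome() and rrr are standard -mlogit- options.

```stata
* Sketch only: sysdsn1 and its variables come from Stata's example data,
* not from this page.
webuse sysdsn1, clear

* Multinomial logit; by default the most frequent category is the base outcome
mlogit insure age male nonwhite

* Same model with an explicitly chosen base outcome,
* reported as relative-risk ratios instead of coefficients
mlogit insure age male nonwhite, baseoutcome(1) rrr
```

Apart from these options, the variable list is written exactly as it would be for -logit-.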
The key is to recognize whether -mlogit- or -ologit- is more appropriate. Even when the categories have a natural ordering, -ologit- is not necessarily the better model. As an example, adapted from https://www.statalist.org/forums/forum/general-stata-discussion/general/1653984-ordinal-or-multinomial-regression?p=1654012#post1654012:
. use https://www3.nd.edu/~rwilliam/statafiles/mroz.dta
. gen lfstatus = cond(hours==0, 0, cond(inrange(hours,1,1249), 1, 2))
. label define lfstatus 0 "non-participation" 1 "part-time work" 2 "full-time work"
. label values lfstatus lfstatus
. mlogit lfstatus kidslt6 kidsge6 age educ exper nwifeinc
Iteration 0: log likelihood = -809.85106
Iteration 1: log likelihood = -682.09452
Iteration 2: log likelihood = -676.45369
Iteration 3: log likelihood = -676.35678
Iteration 4: log likelihood = -676.35676
Multinomial logistic regression                         Number of obs =    753
                                                        LR chi2(12)   = 266.99
                                                        Prob > chi2   = 0.0000
Log likelihood = -676.35676                             Pseudo R2     = 0.1648
-----------------------------------------------------------------------------------
         lfstatus | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
------------------+----------------------------------------------------------------
non_participation |  (base outcome)
------------------+----------------------------------------------------------------
part_time_work    |
          kidslt6 |  -1.029752   .2192135    -4.70   0.000    -1.459402   -.6001012
          kidsge6 |   .1452962   .0810486     1.79   0.073    -.0135561    .3041485
              age |   -.061935   .0161806    -3.83   0.000    -.0936485   -.0302215
             educ |   .2352844   .0489108     4.81   0.000      .139421    .3311478
            exper |   .0836159   .0155026     5.39   0.000     .0532314    .1140004
         nwifeinc |  -.0191471   .0093588    -2.05   0.041    -.0374899   -.0008043
            _cons |  -1.051627   .9599877    -1.10   0.273    -2.933168    .8299147
------------------+----------------------------------------------------------------
full_time_work    |
          kidslt6 |   -2.04806   .2883306    -7.10   0.000    -2.613177   -1.482942
          kidsge6 |  -.0562924    .089552    -0.63   0.530    -.2318111    .1192262
              age |  -.1267562   .0173295    -7.31   0.000    -.1607214   -.0927911
             educ |   .2225451   .0508362     4.38   0.000     .1229081    .3221822
            exper |   .1554865   .0162169     9.59   0.000      .123702     .187271
         nwifeinc |  -.0218055   .0102905    -2.12   0.034    -.0419746   -.0016364
            _cons |   1.552764   .9850566     1.58   0.115    -.3779109     3.48344
-----------------------------------------------------------------------------------
. estimates store mlogit
. ologit lfstatus kidslt6 kidsge6 age educ exper nwifeinc
Iteration 0: log likelihood = -809.85106
Iteration 1: log likelihood = -686.68524
Iteration 2: log likelihood = -685.50088
Iteration 3: log likelihood = -685.49686
Iteration 4: log likelihood = -685.49686
Ordered logistic regression                             Number of obs =    753
                                                        LR chi2(6)    = 248.71
                                                        Prob > chi2   = 0.0000
Log likelihood = -685.49686                             Pseudo R2     = 0.1536
------------------------------------------------------------------------------
    lfstatus | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
     kidslt6 |  -1.390614   .1813509    -7.67   0.000    -1.746055   -1.035172
     kidsge6 |  -.0341089   .0623172    -0.55   0.584    -.1562484    .0880307
         age |  -.0916151   .0121459    -7.54   0.000    -.1154207   -.0678096
        educ |    .158401   .0356408     4.44   0.000     .0885464    .2282557
       exper |   .1159833   .0112204    10.34   0.000     .0939917     .137975
    nwifeinc |  -.0153582   .0073408    -2.09   0.036    -.0297459   -.0009705
-------------+----------------------------------------------------------------
       /cut1 |   -1.75244   .7084357                     -3.140949   -.3639319
       /cut2 |  -.3338748   .7054785                     -1.716587    1.048838
------------------------------------------------------------------------------
. estimates store ologit
. lrtest mlogit ologit, force
Likelihood-ratio test
Assumption: ologit nested within mlogit
LR chi2(6) = 18.28
Prob > chi2 = 0.0056
The model with more constraints and fewer estimated parameters is the simplified (nested) model. The model with fewer constraints and more estimated parameters is the full model. The difference in the number of estimated parameters gives the degrees of freedom of the test. If there is not a significant difference between two such models, then the simplified model is preferred.
The null hypothesis is that the simplified (i.e., -ologit-) model is true, and the likelihood-ratio chi-squared test statistic is calculated under that hypothesis. If the null hypothesis is rejected, as it is above, then the simplification is not justified and the more complex (i.e., -mlogit-) model is preferred.
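The statistic reported by -lrtest- can be reproduced by hand from the two log likelihoods shown above. The following is a minimal sketch. The scalar names are illustrative, and the parameter counts are read off the coefficient tables: 14 for -mlogit- (two equations of seven coefficients each) and 8 for -ologit- (six coefficients plus two cutpoints).

```stata
* Log likelihoods copied from the estimation output above
scalar ll_full   = -676.35676
scalar ll_nested = -685.49686

* LR statistic: twice the difference in log likelihoods (full minus nested)
scalar lr = 2*(scalar(ll_full) - scalar(ll_nested))

* Degrees of freedom: difference in the number of estimated parameters
scalar df = 14 - 8

* Upper-tail chi-squared probability; both lines should agree with
* the lrtest output above (LR chi2(6) = 18.28, Prob > chi2 = 0.0056)
display "LR chi2(" scalar(df) ") = " %5.2f scalar(lr)
display "Prob > chi2 = " %6.4f chi2tail(scalar(df), scalar(lr))
```

Working from the stored numbers rather than the stored estimates keeps the sketch independent of the estimation commands, at the cost of retyping the log likelihoods.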
See also
Stata manual for mlogit post-estimation
