= Stata Stepwise =

The '''`stepwise`''' command fits a model iteratively, removing (or adding) terms one step at a time.

----

== Usage ==

{{{
use http://stata-press.com/data/r14/auto, clear
generate weight2 = weight * weight
stepwise, pr(.2): regress mpg weight weight2 (displ gear) turn headroom foreign price
}}}

This is equivalent to interactively running:

{{{
// Full model (plain regress does not accept parentheses; displ and gear are tested jointly as one term)
regress mpg weight weight2 displ gear turn headroom foreign price
// Observe that the `headroom` term has the greatest non-significant (>=0.2) p-value, so remove it
regress mpg weight weight2 displ gear turn foreign price
// ... remove the `(displ gear)` term, both variables together
regress mpg weight weight2 turn foreign price
// ... remove the `price` term
regress mpg weight weight2 turn foreign
// Observe that all remaining terms are significant
}}}

By default, this method uses [[Econometrics/WaldTest|Wald tests]].

----

== Syntax ==

After using `stepwise`, calling the command again (or the underlying modeling command) without any arguments redisplays the stepwise estimation results.

The `stepwise` command is aliased to `sw`.

=== Terms ===

Terms on a modeling command are commonly variable names. They can also be [[Stata/Regress#Factor_Variables|factor variables]].

Parentheses group several variables into a single term. For example, in this model:

{{{
stepwise, pr(.2): regress y x1 x2 x3 x4 i.a
}}}

...each indicator variable of `a` is considered separately. Alternatively, in this model:

{{{
stepwise, pr(.2): regress y x1 x2 x3 x4 (i.a)
}}}

...the indicator variables of `a` are considered together, as one term.

=== Options ===

The '''`pr()`''' option specifies the significance level at or above which terms are removed, one step at a time. This mode is called '''backward selection'''.

Compare to the '''`pe()`''' option, which specifies the significance level below which terms are added, one step at a time. This mode is called '''forward selection'''.

`pr()` and `pe()` can be used simultaneously. At first the model is fit with backward selection.
Then excluded terms are re-examined for re-addition. Then included terms are re-examined for re-removal. This is repeated until all included terms are significant and all excluded terms are non-significant.

Because removal applies to p-values at or above `pr()` while addition applies to p-values below `pe()`, it can be necessary to give the two options slightly different values in order to apply effectively the same threshold, like:

{{{
stepwise, pr(0.050001) pe(0.05): regress mpg weight weight2 (displ gear) turn headroom foreign price
}}}

The '''`forward`''' option is only effective when using the `pr()` and `pe()` options simultaneously. At first the model is fit with forward selection. Then included terms are re-examined for re-removal. Then excluded terms are re-examined for re-addition.

The '''`hierarchical`''' option directs `stepwise` to consider terms in the order they are specified. Given a model fit on `x1`, `x2`, and `x3` in backward selection mode, `x3` is the first term considered for removal, regardless of how its p-value compares to those of the other terms. (This can be called '''backward hierarchical selection'''.) Given forward selection mode instead, `x1` is the first term considered for addition. (This can be called '''forward hierarchical selection'''.)

The '''`lockterm1`''' option locks the first term into the model, so that it is never removed. For example, to lock the term for `x1`, try:

{{{
stepwise, pr(0.2) lockterm1: logistic y x1 x2 x3
}}}

This option respects parentheses. To lock the terms for `x1` and `x2`, try:

{{{
stepwise, pr(0.2) lockterm1: logistic y (x1 x2) x3
}}}

Note that some modeling commands do not take a dependent variable, and some take more than one dependent variable, so it is misleading to assume that `lockterm1` always locks the second variable specified on the command line. It locks the first term, wherever the list of terms begins.

The '''`lr`''' option specifies the [[Econometrics/LikelihoodRatioTest|likelihood ratio test]] instead of the [[Econometrics/WaldTest|Wald test]].

----

== See also ==

[[https://www.stata.com/manuals/rstepwise.pdf|Stata manual for stepwise]]

----
CategoryRicottone