= Stata Collapse =

The '''`collapse`''' command replaces the dataset in memory with a dataset of summary statistics.

----

== Usage ==

Given a dataset like:

||'''id'''||'''foo'''||'''bar'''||'''baz'''||
||1 ||0 ||0 ||10 ||
||2 ||1 ||0 ||20 ||
||3 ||0 ||0 ||10 ||
||4 ||1 ||0 ||20 ||
||5 ||0 ||1 ||10 ||
||6 ||1 ||1 ||20 ||
||7 ||0 ||1 ||10 ||
||8 ||1 ||1 ||20 ||

Running `collapse (percent) foo (mean) baz, by(bar)` would create a dataset like:

||'''bar'''||'''foo'''||'''baz'''||
||0 ||50 ||15 ||
||1 ||50 ||15 ||

Here `foo` is 50 in each row because each `bar` group holds 4 of the 8 non-missing `foo` observations.

----

== Statistics ==

The `collapse` command can compute any of the following statistics:

 * `count`: count of non-missing values
 * `first`
 * `firstnm`: first non-missing
 * `iqr`: interquartile range
 * `last`
 * `lastnm`: last non-missing
 * `max`
 * `mean`
 * `median`
 * `min`
 * `pN` where `N` is any integer 1 through 99: `N`th percentile
 * `p50`: alias for `median`
 * `percent`: percent of non-missing values
 * `rawsum`: sum ''ignoring'' weights, ''except'' that observations with a weight of 0 are excluded
 * `sd`: standard deviation
 * `sebinomial`: standard error as `sqrt(p(1-p)/n)`
 * `semean`: standard error as `sd/sqrt(n)`
 * `sepoisson`: standard error as `sqrt(mean/n)`
 * `sum`

The default statistic is `mean`.

----

== See also ==

[[https://www.stata.com/manuals/dcollapse.pdf|Stata manual for collapse]]

----
CategoryRicottone
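
== Example ==

The usage example above can be reproduced with a short do-file. This is a minimal sketch that assumes the table is typed in by hand with `input`; the target names `mean_baz` and `sd_baz` are illustrative and do not come from the original page.

{{{
* Enter the example dataset from the Usage section
clear
input id foo bar baz
1 0 0 10
2 1 0 20
3 0 0 10
4 1 0 20
5 0 1 10
6 1 1 20
7 0 1 10
8 1 1 20
end

* collapse overwrites the data in memory, so keep a copy to return to
preserve
collapse (percent) foo (mean) baz, by(bar)
list
restore

* Two statistics of the same variable need distinct target names
* (mean_baz and sd_baz are hypothetical names chosen for this sketch)
collapse (mean) mean_baz=baz (sd) sd_baz=baz, by(bar)
list
}}}

The `preserve` and `restore` pair is only there so that both calls start from the same eight rows. Without it, the second `collapse` would run on the two-row result of the first.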