Stata Collapse
The collapse command replaces the dataset in memory with a dataset of summary statistics, optionally computed within groups.
Usage
Given a dataset like:
| id | foo | bar | baz |
| 1 | 0 | 0 | 10 |
| 2 | 1 | 0 | 20 |
| 3 | 0 | 0 | 10 |
| 4 | 1 | 0 | 20 |
| 5 | 0 | 1 | 10 |
| 6 | 1 | 1 | 20 |
| 7 | 0 | 1 | 10 |
| 8 | 1 | 1 | 20 |
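Running collapse (percent) foo (mean) baz, by(bar) would create a dataset with one row per value of bar. Each group holds 4 of the 8 non-missing foo observations, so foo is 50 in both rows, and baz is the mean of 10, 20, 10, and 20: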
| bar | foo | baz |
| 0 | 50 | 15 |
| 1 | 50 | 15 |
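The example above can be reproduced with a short do-file. This is a minimal sketch that types the eight observations in with input and then runs the same collapse call; it assumes a Stata version recent enough to support the percent statistic.

```stata
* Rebuild the example dataset in memory
clear
input id foo bar baz
1 0 0 10
2 1 0 20
3 0 0 10
4 1 0 20
5 0 1 10
6 1 1 20
7 0 1 10
8 1 1 20
end

* One output row per value of bar:
*   foo = percent of non-missing foo observations that fall in the group
*   baz = mean of baz within the group
collapse (percent) foo (mean) baz, by(bar)
list
```

Because collapse overwrites the data in memory, wrapping the call in preserve and restore is a common way to get the original observations back afterwards.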
Statistics
The collapse command can compute any of the following statistics:
count: count of non-missing values
first
firstnm: first non-missing
iqr: interquartile range
last
lastnm: last non-missing
max
mean
median
min
pN where N is any integer 1 through 99: Nth percentile
p50: alias for median
percent: percent of non-missing values
rawsum: sum that ignores any specified weights, except that observations with a weight of 0 are excluded
sd: standard deviation
sebinomial: standard error as sqrt(p(1-p)/n)
semean: standard error as sd/sqrt(n)
sepoisson: standard error as sqrt(mean/n)
sum
The default statistic to compute is mean.
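As a rough illustration of the default statistic and of requesting several statistics at once, the sketch below reuses the dataset from the Usage section. The target names mean_baz, sd_baz, se_baz, and n_baz are illustrative choices, not names that collapse produces on its own.

```stata
* Rebuild the example dataset in memory
clear
input id foo bar baz
1 0 0 10
2 1 0 20
3 0 0 10
4 1 0 20
5 0 1 10
6 1 1 20
7 0 1 10
8 1 1 20
end

* No statistic is named, so collapse falls back to the default, (mean)
preserve
collapse baz, by(bar)
list
restore

* Several statistics of the same variable need distinct target names,
* written as newname=oldname
collapse (mean) mean_baz=baz (sd) sd_baz=baz (semean) se_baz=baz (count) n_baz=baz, by(bar)

* se_baz should equal sd_baz / sqrt(n_baz) in each row
list
```

The newname=oldname form is needed in the second call because each statistic would otherwise try to write its result to a variable named baz.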