Revision 3 as of 2023-06-08 01:04:54
[[https://www.stata.com/manuals/dcollapse.pdf|Stata manual for collapse]]
Stata Collapse
The collapse command creates a dataset of summary statistics.
Usage
Given a dataset like:
id | foo | bar | baz |
1 | 0 | 0 | 10 |
2 | 1 | 0 | 20 |
3 | 0 | 0 | 10 |
4 | 1 | 0 | 20 |
5 | 0 | 1 | 10 |
6 | 1 | 1 | 20 |
7 | 0 | 1 | 10 |
8 | 1 | 1 | 20 |
Running collapse (percent) foo (mean) baz, by(bar) would create a dataset like:
bar | foo | baz |
0 | 50 | 15 |
1 | 50 | 15 |
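The example can be reproduced with a short do-file. This is a minimal sketch: it enters the eight observations with input and then runs the same collapse command. The foo column is 50 in both rows because, as far as the documented behavior goes, percent reports the share of all non-missing observations that fall in each by() group, and each value of bar holds 4 of the 8 observations.

```stata
* Rebuild the example dataset by hand
clear
input id foo bar baz
1 0 0 10
2 1 0 20
3 0 0 10
4 1 0 20
5 0 1 10
6 1 1 20
7 0 1 10
8 1 1 20
end

* Replace the data in memory with one row per value of bar
collapse (percent) foo (mean) baz, by(bar)

* Show the collapsed dataset
list
```

Because collapse replaces the dataset in memory, wrap it in preserve and restore when the original observations are still needed afterwards.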
Statistics
The collapse command can compute any of the following statistics:
count: count of non-missing values
first
firstnm: first non-missing
iqr: interquartile range
last
lastnm: last non-missing
max
mean
median
min
pN where N is any integer 1 through 99: Nth percentile
p50: alias for median
percent: percent of non-missing values
rawsum: sum ignoring weight factors except when the weight factor is 0
sd: standard deviation
sebinomial: standard error as sqrt(p(1-p)/n)
semean: standard error as sd/sqrt(n)
sepoisson: standard error as sqrt(mean/n)
sum
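Several of these statistics can be requested in one call. The sketch below assumes the auto example dataset that ships with Stata, and uses the newvar=varname form so that two statistics of the same variable do not collide on a name. The target names (mean_price, med_price, sd_price, n) are illustrative only.

```stata
* Load the example dataset bundled with Stata (assumed to be available)
sysuse auto, clear

* One row per value of foreign, with four statistics of price
collapse (mean) mean_price=price (median) med_price=price ///
    (sd) sd_price=price (count) n=price, by(foreign)

list
```

Each (stat) applies to every variable listed after it, until the next (stat) appears.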
The default statistic to compute is mean.
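A quick way to see the default in action, again assuming the bundled auto dataset: omitting the statistic should produce the same result as writing (mean) explicitly.

```stata
sysuse auto, clear

* No statistic given, so collapse falls back to the mean
preserve
collapse price, by(foreign)
list
restore

* Explicit equivalent
collapse (mean) price, by(foreign)
list
```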