Differences between revisions 3 and 11 (spanning 8 versions)
Revision 3 as of 2023-01-10 22:09:25
Size: 2235
Comment:
Revision 11 as of 2025-10-24 16:26:08
Size: 1301
Comment: Rewrite
Line 1: Line 1:
= Stata Aggregating Data = = Aggregating Data with Stata =

Stata offers several commands for computing aggregated statistics and translating datasets into aggregated formats.
Line 9: Line 11:
== Contract == == Statistics ==
Line 11: Line 13:
The '''`collapse`''' command is ideal for aggregated statistics. The [[Stata/Summarize|-summarize-]] command computes and [[Stata/StoredResults|stores]] descriptive statistics.

The [[Stata/Inspect|-inspect-]] command is useful for interactive exploration.

The [[Stata/Contract|-contract-]] and [[Stata/Collapse|-collapse-]] commands create datasets of aggregated statistics. The former produces frequencies and percentages for each combination of the listed variables, while the latter produces summary statistics such as means, medians, and sums.
Line 14: Line 20:
contract VAR, freq(Count) percent(Percentage) contract foo, freq(Count) percent(Percentage)
Line 21: Line 27:
== Collapse == == Wide and Long Data ==
Line 23: Line 29:
---- The [[Stata/Reshape|-reshape-]] command can be used to translate datasets between wide and long formats.
Line 25: Line 31:


== Reshape ==

The '''`reshape`''' command can be used to move data between wide and tall formats. It has additional, helpful features, such as a diagnostic `reshape errors` command.



=== Reshape Wide ===

To expand long data into a wide format, try:
To translate into a wide format, try:
Line 43: Line 39:
If the group indicator should be placed anywhere other than as a suffix to `VARSTUB`, use a single at sign (`@`) to indicate the placement. For example, if the groups are `1`, `2`, and `3`, then the command `reshape wide inc@r, i(KEYVAR) j(GROUPVAR)` would create the variables `inc1r`, `inc2r`, and `inc3r`.

If `GROUPVAR` is a string variable, the `string` option is mandatory.

If data has been transformed through a `reshape wide` command like above, then to restore data to the long format, try:

{{{
reshape long
}}}

The parameters of the transformation are stored and reused between calls.



=== Reshape Long ===

To contract wide data into a long format, try:
To translate into a long format, try:
Line 65: Line 45:
...`VARSTUB` will be created from the variable list `VARSTUB*`, and `GROUPVAR` will be created to indicate the source of `VARSTUB`. If the `string` option is specified, `GROUPVAR` will be a string variable.

If the variable list does not follow the simple pattern of `VARSTUB*`, it may be possible to specify with an at sign (`@`). For example, if the target variables are `inc1r`, `inc2r`, and `inc3r` and the intended `GROUPVAR` groups are `1`, `2`, and `3`, then the command `reshape long inc@r, i(KEYVAR) j(GROUPVAR)` would correctly create the variable `incr`.

If data has been transformed through a `reshape long` command like above, then to restore data to the wide format, try:

{{{
reshape wide
}}}

The parameters of the transformation are stored and reused between calls.
The variable `VARSTUB` will be created from the variable list `VARSTUB*`, and `GROUPVAR` will be created to indicate the source of `VARSTUB`.

Aggregating Data with Stata

Stata offers several commands for computing aggregated statistics and translating datasets into aggregated formats.


Statistics

The -summarize- command computes and stores descriptive statistics.
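
As a minimal sketch of how the stored results can be used, the following assumes the `auto` example dataset that ships with Stata in place of real data; the variable `price` comes from that dataset, not from this page.

```stata
* Load the example dataset shipped with Stata (an assumption for illustration)
sysuse auto, clear

* -summarize- leaves its results behind in r()
summarize price
display "Mean price: " r(mean)
display "Observations: " r(N)

* The detail option stores additional results, including percentiles
summarize price, detail
display "Median price: " r(p50)
```

The `r()` results are overwritten by the next command that stores results, so copy any needed values into a local macro or scalar first.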

The -inspect- command is useful for interactive exploration.

The -contract- and -collapse- commands create datasets of aggregated statistics. The former produces frequencies and percentages for each combination of the listed variables, while the latter produces summary statistics such as means, medians, and sums.

contract foo, freq(Count) percent(Percentage)
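
To show the two commands side by side, here is a hedged sketch that again assumes the `auto` example dataset; the variables `foreign` and `price` and the new names `mean_price`, `median_price`, and `n` are illustrative only. Both commands replace the dataset in memory, so the sketch wraps the first in -preserve- and -restore-.

```stata
* Load the example dataset shipped with Stata (an assumption for illustration)
sysuse auto, clear

* -contract- replaces the data with one row per value of foreign
preserve
contract foreign, freq(Count) percent(Percentage)
list
restore

* -collapse- replaces the data with one row per value of foreign,
* holding the requested summary statistics of price
collapse (mean) mean_price=price (median) median_price=price (count) n=price, by(foreign)
list
```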


Wide and Long Data

The -reshape- command can be used to translate datasets between wide and long formats.

To translate into a wide format, try:

reshape wide VARSTUB, i(KEYVAR) j(GROUPVAR)

A series of variables named VARSTUB* will be created, one for each value of GROUPVAR.

To translate into a long format, try:

reshape long VARSTUB, i(KEYVAR) j(GROUPVAR)

The variable VARSTUB will be created from the variable list VARSTUB*, and GROUPVAR will be created to hold the suffix identifying which of the VARSTUB* variables each value came from.
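
A small round trip may make the two forms concrete. This is only a sketch on made-up data: the names `id`, `inc`, and `year` are hypothetical stand-ins for KEYVAR, VARSTUB, and GROUPVAR.

```stata
* Build a tiny wide dataset by hand (made-up values for illustration)
clear
input id inc1 inc2 inc3
1 100 110 120
2 200 210 220
end

* Wide to long: inc1-inc3 become a single variable inc,
* and year records the suffix (1, 2, or 3) each value came from
reshape long inc, i(id) j(year)
list

* Long to wide: one inc* variable is created for each value of year
reshape wide inc, i(id) j(year)
list
```

After the first -reshape- the data has one row per combination of `id` and `year`; the second call restores the original layout.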


CategoryRicottone

Stata/AggregatingData (last edited 2025-10-24 16:26:08 by DominicRicottone)