Differences between revisions 3 and 10 (spanning 7 versions)
Revision 3 as of 2023-01-10 22:09:25 (size: 2235)
Revision 10 as of 2023-06-09 16:51:50 (size: 1282)
Aggregating Data with Stata

Stata offers several commands for computing aggregated statistics and translating datasets into aggregated formats.


Statistics

The summarize command computes and stores descriptive statistics.
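
As a minimal sketch of how the stored results can be used, the following assumes the `auto` example dataset that ships with Stata and its `price` variable; neither is part of this page and both are chosen only for illustration.

```stata
* Load an example dataset shipped with Stata (an assumption for illustration)
sysuse auto, clear

* Compute descriptive statistics for one variable
summarize price

* Results are stored in r() until the next r-class command runs
display "N    = " r(N)
display "mean = " r(mean)
display "sd   = " r(sd)

* The detail option additionally stores percentiles, such as the median
summarize price, detail
display "median = " r(p50)
```

Typing `return list` after `summarize` shows every stored result that is available.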

The inspect command is useful for interactive exploration.

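The contract and collapse commands replace the dataset in memory with a dataset of aggregated statistics. The former produces frequencies (counts and percentages) for each combination of the given variables, while the latter computes summary statistics such as means, medians, and sums.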

contract foo, freq(Count) percent(Percentage)
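
To compare the two commands side by side, here is a rough sketch that again assumes the `auto` example dataset; the variables `foreign`, `price`, and `mpg` and the new names `mean_price`, `med_mpg`, and `n` are illustrative choices. Because both commands replace the data in memory, each is wrapped in `preserve` and `restore`.

```stata
* Load an example dataset shipped with Stata (an assumption for illustration)
sysuse auto, clear

* contract: one row per value of foreign, with counts and percentages
preserve
contract foreign, freq(Count) percent(Percentage)
list
restore

* collapse: one row per value of foreign, with summary statistics
preserve
collapse (mean) mean_price=price (median) med_mpg=mpg (count) n=price, by(foreign)
list
restore
```

After the final `restore`, the original dataset is back in memory unchanged.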


Wide and Long Data

The reshape command can be used to translate datasets between wide and long formats.

To translate into a wide format, try:

reshape wide VARSTUB, i(KEYVAR) j(GROUPVAR)

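A series of variables named like VARSTUB* will be created, one for each value of GROUPVAR (for example, VARSTUB1 and VARSTUB2). GROUPVAR itself is dropped, and the result has one row per value of KEYVAR.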
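
A small worked example may make this concrete. The data below are made up for illustration, with `id` standing in for KEYVAR, `year` for GROUPVAR, and `inc` for VARSTUB.

```stata
* Hypothetical long data: one row per id and year
clear
input id year inc
1 2020 100
1 2021 110
2 2020 200
2 2021 220
end

* Spread inc across columns, one per value of year
reshape wide inc, i(id) j(year)

* The data now hold one row per id, with inc2020 and inc2021
list
```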

To translate into a long format, try:

reshape long VARSTUB, i(KEYVAR) j(GROUPVAR)

A single variable VARSTUB will be created by stacking the variables matching VARSTUB*, and GROUPVAR will be created to record which of those variables each value came from (that is, the suffix that followed VARSTUB).
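
The reverse direction can be sketched the same way. The data are again made up for illustration, with `id` standing in for KEYVAR, `inc` for VARSTUB, and `year` as the name chosen for the new GROUPVAR.

```stata
* Hypothetical wide data: one row per id, one inc column per year
clear
input id inc2020 inc2021
1 100 110
2 200 220
end

* Stack inc2020 and inc2021 into inc, recording the suffix in year
reshape long inc, i(id) j(year)

* The data now hold one row per id and year
list
```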


CategoryRicottone

Stata/AggregatingData (last edited 2025-10-24 16:26:08 by DominicRicottone)