Differences between revisions 6 and 14 (spanning 8 versions)

Econometrics Notation

Observations and Measurements

The number of observations is n.

The outcome variable is y. The outcome measurement for observation i is y_i.

If there is a single predictor, it may be specified as x; the measurement is x_i. More commonly, there is a set of predictors specified like x₁, x₂, and so on. The measurements are then x_1i, x_2i, and so on.

When expressing data with linear algebra, the outcome measurements are composed into vector y with size n, and the predictor measurements are composed into matrix X of shape n by p.
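As an illustration of that layout, the vector and matrix can be written out using the index order from above (predictor first, observation second). Whether *p* counts a leading column of ones for the constant term is not stated here, so treat that as an open assumption.

```latex
\mathbf{y} =
\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix},
\qquad
\mathbf{X} =
\begin{bmatrix}
x_{11} & x_{21} & \cdots & x_{p1} \\
x_{12} & x_{22} & \cdots & x_{p2} \\
\vdots & \vdots & \ddots & \vdots \\
x_{1n} & x_{2n} & \cdots & x_{pn}
\end{bmatrix}
```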

A very common exception: income is usually represented by Y or y, so literature that deals with income typically uses different letters for the outcome variable.

Error Terms

Error terms are variously represented by ε, e, u, or v. The error term for observation i is written like ε_i.

Statistics

There is a mixture of notations for scalar statistics. The conventional estimators for population mean μ, variance σ², standard deviation σ, covariance σ_xy, and correlation ρ_xy are:
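The formulas that followed this sentence did not survive as text. A reconstruction under common textbook conventions is sketched below. The sample symbols (x̄, s, r) and the n − 1 divisor are assumptions; some texts divide by n instead.

```latex
\begin{aligned}
\bar{x} &= \frac{1}{n} \sum_{i=1}^{n} x_i \\
s_x^2 &= \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2 \\
s_x &= \sqrt{s_x^2} \\
s_{xy} &= \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y}) \\
r_{xy} &= \frac{s_{xy}}{s_x \, s_y}
\end{aligned}
```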

For statistics involving multiple variables, some pieces of linear algebra notation are frequently introduced. For example, covariances are often collected in a covariance matrix. The covariance of x and y is written σ_xy; a variance is expressed as the covariance of a variable with itself, as in σ_xx.
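For two variables x and y, such a covariance matrix might look like the following. The symbol Σ for the matrix is an assumption, since the original figure is not available as text.

```latex
\boldsymbol{\Sigma} =
\begin{bmatrix}
\sigma_{xx} & \sigma_{xy} \\
\sigma_{yx} & \sigma_{yy}
\end{bmatrix},
\qquad
\sigma_{xx} = \sigma_x^2,
\quad
\sigma_{xy} = \sigma_{yx}
```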

Distributions

The normal distribution is commonly used in econometrics, and a shorthand notation has emerged: x_i ~ N(μ, σ²), where the second argument is conventionally the variance.

For multiple variables, the distribution is at minimum written as NI to emphasize that the variables are independently distributed. Some pieces of linear algebra notation are also introduced. For example, the joint statement of exogeneity and homoskedasticity is:
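The original figure is not recoverable as text. One common textbook form of this statement is sketched below. The conditioning on X, and the use of a covariance matrix σ²I as the second argument, are assumptions.

```latex
\boldsymbol{\varepsilon} \mid \mathbf{X} \sim N\!\left(\mathbf{0},\, \sigma^2 \mathbf{I}_n\right),
\qquad
\sigma^2 \mathbf{I}_n =
\begin{bmatrix}
\sigma^2 & 0 & \cdots & 0 \\
0 & \sigma^2 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & \sigma^2
\end{bmatrix}
```

The zero mean vector carries the exogeneity statement. The constant diagonal and zero off-diagonal entries carry homoskedasticity and uncorrelated errors.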

Note how the covariance matrix is fully expressed as the diagonal matrix of each term's variance.

Modeling

A univariate model is specified with a constant term α and a coefficient term β. A multivariate model of j variables specifies constant β₀ and coefficients β₁ through β_j. A linear algebra notation uses a coefficient vector β of size p.
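Written out, the three notations might look like this. The additive linear form with an error term ε is assumed here, in line with the error notation described earlier. Whether the constant is absorbed into β through a column of ones in X is also an assumption.

```latex
\begin{aligned}
y_i &= \alpha + \beta x_i + \varepsilon_i \\
y_i &= \beta_0 + \beta_1 x_{1i} + \cdots + \beta_j x_{ji} + \varepsilon_i \\
\mathbf{y} &= \mathbf{X} \boldsymbol{\beta} + \boldsymbol{\varepsilon}
\end{aligned}
```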

In any case, when a model is estimated, the estimated coefficients are notated differently. Scalar notations attach a hat, as in β̂₀. The linear algebra notation replaces β with b.

The predicted outcome from a model is also marked as an estimate by attaching a hat: ŷ.

The generic calculation of the residual for observation i is y_i - ŷ_i. The sum of squared residuals (SSR) is what is minimized to fit a model.
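The residual and SSR calculations can be sketched in plain Python for the single-predictor case. This is a minimal illustration, not a reference implementation. The function names and the data are hypothetical, and the closed-form least squares solution used to fit the model is the standard one rather than something stated on this page.

```python
def fit_simple_ols(x, y):
    """Return (alpha_hat, beta_hat) that minimize the sum of squared residuals
    for the single-predictor model y_i = alpha + beta * x_i + e_i."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sums of cross-deviations and squared deviations (the 1/(n-1) factors cancel).
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    beta_hat = s_xy / s_xx
    alpha_hat = y_bar - beta_hat * x_bar
    return alpha_hat, beta_hat


def residuals(x, y, alpha_hat, beta_hat):
    """Residual for each observation: y_i minus the predicted outcome y_hat_i."""
    return [yi - (alpha_hat + beta_hat * xi) for xi, yi in zip(x, y)]


def sum_squared_residuals(resid):
    """SSR: the quantity minimized when fitting the model."""
    return sum(e ** 2 for e in resid)


# Hypothetical data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.1, 5.9, 8.0]

alpha_hat, beta_hat = fit_simple_ols(x, y)
resid = residuals(x, y, alpha_hat, beta_hat)
ssr = sum_squared_residuals(resid)
```

Moving either estimate away from the fitted value, for example evaluating the residuals at `alpha_hat + 0.1`, yields a larger SSR, which is what "minimized" means here.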

And the coefficient of determination is:
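The formula itself did not survive as text. The usual definition, using SSR as defined above (the sum of squared residuals), is sketched below. The name SST for the total sum of squares is an assumption. Note that some texts use SSR for the regression sum of squares instead.

```latex
R^2 = 1 - \frac{\mathrm{SSR}}{\mathrm{SST}},
\qquad
\mathrm{SSR} = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2,
\qquad
\mathrm{SST} = \sum_{i=1}^{n} (y_i - \bar{y})^2
```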


Statistics/EconometricsNotation (last edited 2025-01-10 14:15:50 by DominicRicottone)

- Revision 6 as of 2023-10-28 05:32:36
  Size: 1022
  Editor: DominicRicottone
  Comment: Estimates and residuals
+ Revision 14 as of 2025-01-10 14:15:50
  Size: 3340
  Editor: DominicRicottone
  Comment: Killing Econometrics page
-Deletions are marked like this.
+Additions are marked like this.
 Line 5:
-== Data ==
+== Observations and Measurements ==
 Line 9:
-The outcome variable is ''y''. For observation ''i'', the outcome value is ''y,,i,,''.
+The outcome variable is ''y''. The outcome measurement for observation ''i'' is ''y,,i,,''.
 Line 11:
-The treatment variable is ''x,,1,,''. For observation ''i'', the treatment value is ''x,,1i,,''.
+If there is a single predictor, it may be specified as ''x''; the measurement is ''x,,i,,''. More commonly, there is a set of predictors specified like ''x,,1,,'', ''x,,2,,'', and so on. The measurements are then ''x,,1i,,'', ''x,,2i,,'', and so on.
 Line 13:
-The control variables are ''x,,2,,'' through ''x,,k,,'' (up to ''k'' - 1 control variables). For observation ''i'', a control value might be ''x,,2i,,''.
+When expressing data with [[LinearAlgebra|linear algebra]], the outcome measurements are composed into vector ''y'' with size ''n'', and the predictor measurements are composed into matrix '''''X''''' of shape ''n'' by ''p''.

A very common exception: income is usually represented by ''Y'' or ''y''. In relevant literature, expect to see different letters.



== Error Terms ==

Error terms are variably represented by ''ε'', ''e'', ''u'', or ''v''. The error term for observation ''i'' would be represented like ''ε,,i,,''.
-Line 19:
+Line 27:
-The average outcome is:
+There is a mixture of notations for scalar statistics. The conventional estimators for population mean ''μ'', variance ''σ^2^'', standard deviation ''σ'', covariance ''σ,,xy,,'', and correlation ''ρ,,xy,,'' are:
-Line 23:
+Line 31:
-The variance is:
-Line 26:
+Line 32:
-The standard deviation is:
-Line 31:
+Line 35:
-The covariance between the treatment and outcome is:
-Line 34:
+Line 36:
-The correlation between the treatment and outcome is:
 Line 39:
-Based on [[Econometrics/LinearRegression|regression]], the estimated outcome for observation ''i'' is:
+Frequently for multiple variable statistics, some pieces of [[LinearAlgebra|linear algebra]] notation are introduced. For example, covariances are frequently expressed in a covariance matrix. Covariances of ''x'' and ''y'' are specified as ''σ,,xy,,''; variances are expressed as covariances of ''x'' and ''x''.
 Line 41:
-{{attachment:estimate.svg}}
+{{attachment:covariancem.svg}}
 Line 43:
-And the residual is:
-Line 45:
+Line 44:
-{{attachment:residual.svg}}
+== Distributions ==

The [[Statistics/NormalDistribution|normal distribution]] is commonly used in econometrics, and a shorthand notation has emerged as ''x,,i,, ~ N(μ, σ)''.

For multiple variables, at minimum the distribution is specified as ''NI'' to emphasize independence of the distributions. Some pieces of [[LinearAlgebra|linear algebra]] notation are also introduced. For example, the joint statement of [[Statistics/Exogeneity|exogeneity]] and [[Statistics/Homoskedasticity|homoskedasticity]] is:

{{attachment:exo.svg}}

Note how the covariance matrix is fully expressed as the [[LinearAlgebra/SpecialMatrices#Diagonal_Matrices|diagonal matrix]] of each term's variance.



== Modeling ==

A univariate model is specified with a constant term ''α'' and a coefficient term ''β''. A multivariate model of ''j'' variables specifies constant ''β,,0,,'' and coefficients ''β,,1,,'' through ''β,,j,,''. A [[LinearAlgebra|linear algebra]] notation uses a coefficient vector ''β'' of size ''p''.

In any case, when a model is estimated, the estimated coefficients are notated differently. Scalar notations attach a hat, as in ''β̂,,0,,''. The linear algebra notation replaces ''β'' with ''b''.

The predicted outcome from a model is also marked as an estimate by attaching a hat: ''ŷ''.

The generic calculation of the residual for observation ''i'' is ''y,,i,, - ŷ,,i,,''. The sum of square residuals (SSR) is what is minimized to fit a model.

And the coefficient of determination is:

{{attachment:rsquared.svg}}