Differences between revisions 10 and 19 (spanning 9 versions)

Econometrics Notation

Observations and Measurements

The number of observations is n.

The outcome variable is y. The outcome measurement for observation i is y_i.

If there is a single predictor, it may be specified as x; the measurement is x_i. More commonly, there is a set of predictors specified like x₁, x₂, and so on. The measurements are then x_1i, x_2i, and so on.

When expressing data with linear algebra, the outcome measurements are composed into vector y with size n, and the predictor measurements are composed into matrix X of shape n by p.
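As a minimal sketch of that layout, assuming NumPy and made-up measurements (the names x1 and x2 and every value below are hypothetical, chosen only for illustration):

```python
import numpy as np

# Hypothetical data: n = 4 observations of p = 2 predictors.
x1 = np.array([1.0, 2.0, 3.0, 4.0])  # measurements x_1i
x2 = np.array([0.5, 0.1, 0.9, 0.3])  # measurements x_2i
y = np.array([2.1, 3.9, 6.2, 7.8])   # outcome vector y, size n

# Stack the predictor measurements column-wise into X, shape (n, p).
X = np.column_stack([x1, x2])
n, p = X.shape
```

Row i of X then holds every predictor measurement for observation i, and y[i] holds the matching outcome measurement.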

A very common exception: income is usually represented by Y or y. In literature that deals with income, expect to see a different letter for the outcome variable.

Error Terms

Error terms are variously represented by ε, e, u, or v. The error term for observation i would be represented like ε_i.

Statistics

There is a mixture of notations for scalar statistics. The conventional estimators for population mean μ, variance σ², standard deviation σ, covariance σ_xy, and correlation ρ_xy are:
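The formulas themselves do not appear in this text. As a sketch, assuming the usual sample estimators with n - 1 denominators (the symbols x̄, s, and r are common textbook choices, not confirmed by this page):

```latex
\begin{align*}
\bar{x} &= \frac{1}{n} \sum_{i=1}^{n} x_i \\
s_x^2   &= \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2 \\
s_x     &= \sqrt{s_x^2} \\
s_{xy}  &= \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y}) \\
r_{xy}  &= \frac{s_{xy}}{s_x \, s_y}
\end{align*}
```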

For statistics involving multiple variables, some pieces of linear algebra notation are frequently introduced. For example, covariances are frequently expressed in a covariance matrix. Covariances of x and y are specified as σ_xy; variances are expressed as the covariance of a variable with itself.
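For two variables x and y, such a matrix can be sketched as follows. The symbol Σ and the subscripts σ_xx and σ_yy are assumptions that follow the common convention, not notation confirmed by this page:

```latex
\Sigma =
\begin{bmatrix}
\sigma_{xx} & \sigma_{xy} \\
\sigma_{yx} & \sigma_{yy}
\end{bmatrix},
\qquad
\sigma_{xx} = \sigma_x^2, \quad
\sigma_{yy} = \sigma_y^2, \quad
\sigma_{xy} = \sigma_{yx}
```

The diagonal holds the variances and the off-diagonal entries hold the covariances, so the matrix is symmetric.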

Distributions

The normal distribution is commonly used in econometrics, and a shorthand notation has emerged as x_i ~ N(μ, σ).

For multiple variables, the distribution is at minimum specified as NI (normal and independent) to emphasize that the variables are independently distributed. Some pieces of linear algebra notation are also introduced.

Models

A linear model of k variables specifies constant β₀ and coefficients β₁ through β_k. The linear algebra notation uses a coefficient vector β of size p.

When the model is fit using regression, the estimated coefficients are notated using a hat, as in β̂₀. The linear algebra notation uses b instead.

The predicted outcome is also notated using a hat, as in ŷ_i.

Regardless of the regression method, the residual for observation i is y_i - ŷ_i. Least squares fits a model by minimizing the residual sum of squares (RSS).
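A minimal sketch of these quantities, assuming NumPy, an ordinary least squares fit via np.linalg.lstsq, and made-up data (all values are hypothetical):

```python
import numpy as np

# Hypothetical data, made up for illustration only.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.1, 5.9, 8.2, 9.8])

# Design matrix with a leading column of ones for the constant.
X = np.column_stack([np.ones_like(x), x])

# Ordinary least squares: b is the coefficient vector that minimizes RSS.
b, *_ = np.linalg.lstsq(X, y, rcond=None)

y_hat = X @ b                        # predicted outcomes, one per observation
residuals = y - y_hat                # residual for observation i: y_i - y_hat_i
rss = float(np.sum(residuals ** 2))  # residual sum of squares
```

Any other coefficient vector gives a larger RSS on the same data, which is the sense in which RSS is what the fit minimizes.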

And the coefficient of determination is:
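The formula does not appear in this text. As a sketch, assuming the standard definition in terms of RSS and the total sum of squares (TSS and ȳ, the mean of the outcome, are symbols introduced here rather than taken from this page):

```latex
R^2 = 1 - \frac{\mathrm{RSS}}{\mathrm{TSS}}
    = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}
```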

CategoryRicottone

Statistics/EconometricsNotation (last edited 2025-11-03 01:33:40 by DominicRicottone)

-  Revision 10 as of 2024-06-07 15:14:21
  Size: 2471
  Editor: DominicRicottone
  Comment: Rewrite 2
+   Revision 19 as of 2025-11-03 01:33:40
  Size: 3008
  Editor: DominicRicottone
  Comment: Links
 Line 27:
-== Distributions ==

The [[Statistics/NormalDistribution|normal distribution]] is frequently expressed in econometrics. The typical notation is ''x,,i,, ~ N(μ, σ)''.

For multiple variables, at minimum the distribution is specified as ''NI'' to emphasize independence of the distributions. Some pieces of [[LinearAlgebra|linear algebra]] notation are also introduced. For example, the joint statement of [[Econometrics/Exogeneity|exogeneity]] and [[Econometrics/Homoskedasticity|homoskedasticity]] is:

{{attachment:exo.svg}}

Note how the covariance matrix is fully expressed as the [[LinearAlgebra/SpecialMatrices#Diagonal_Matrices|diagonal matrix]] of each term's variance.



== Statistics ==

There is a mixture of notations for scalar statistics. The conventional estimators for population mean ''μ'', variance ''σ^2^'', standard deviation ''σ'', covariance ''σ,,xy,,'', and correlation ''ρ,,xy,,'' are:
+There is a mixture of notations for scalar statistics. The conventional estimators for population mean ''μ'', [[Statistics/Variance|variance]] ''σ^2^'', standard deviation ''σ'', [[Statistics/Covariance|covariance]] ''σ,,xy,,'', and [[Statistics/Correlation|correlation]] ''ρ,,xy,,'' are:
-Line 56:
+Line 39:
-Based on [[Econometrics/OrdinaryLeastSquares|OLS regression]], the estimated outcome for observation ''i'' is:
+Frequently for multiple variable statistics, some pieces of [[LinearAlgebra|linear algebra]] notation are introduced. For example, covariances are frequently expressed in a covariance matrix. Covariances of ''x'' and ''y'' are specified as ''σ,,xy,,''; variances are expressed as covariances of ''x'' and ''x''.
-Line 58:
+Line 41:
-{{attachment:estimate.svg}}
+{{attachment:covariancem.svg}}
-Line 60:
+Line 43:
-No matter the regression method, the residual is:
-Line 62:
+Line 44:
-{{attachment:residual.svg}}
-Line 64:
+Line 45:
-And the coefficient of determination, a.k.a. the ''R^2^'', is:
+== Distributions ==

The [[Statistics/NormalDistribution|normal distribution]] is commonly used in econometrics, and a shorthand notation has emerged as ''x,,i,, ~ N(μ, σ)''.

For multiple variables, at minimum the distribution is specified as ''NI'' to emphasize independence of the distributions. Some pieces of [[LinearAlgebra|linear algebra]] notation are also introduced.



== Models ==

A linear model of ''k'' variables specifies constant ''β,,0,,'' and coefficients ''β,,1,,'' through ''β,,k,,''. The [[LinearAlgebra|linear algebra]] notation uses a coefficient vector ''β'' of size ''p''.

When the model is fit using [[Statistics/OrdinaryLeastSquares|regression]], the estimated coefficients are notated using a hat, as in ''β̂,,0,,''. The linear algebra notation uses ''b'' instead.

The predicted outcome is also notated using a hat, as in ''ŷ,,i,,''.

The generic calculation of the [[Statistics/Residuals|residual]] for observation ''i'' is ''y,,i,, - ŷ,,i,,''. The residual sum of squares (RSS) is what is minimized to fit a model.

And the coefficient of determination is: