Differences between revisions 1 and 6 (spanning 5 versions)
Revision 1 as of 2023-10-28 05:18:15
Size: 1390
Revision 6 as of 2023-10-28 07:04:18
Size: 2049
Line 1: Line 1:
= Linear Regression =
= Ordinary Least Squares =
Line 3: Line 3:
A linear regression expresses the linear relation of a treatment variable to an outcome variable.
'''Ordinary Least Squares''' ('''OLS''') is a linear regression method. It minimizes the sum of squared residuals.
Line 11: Line 11:
== Regression Line ==

A regression line can be especially useful on a scatter plot.
== Univariate ==
Line 22: Line 20:

----



== Regression Computation ==
Line 81: Line 73:
----



== Linear Model ==

The linear model can be expressed as:

{{attachment:model1.svg}}

If these assumptions can be made:

 1. Linearity
 2. Exogeneity

{{attachment:model2.svg}}

 3.#3 Random sampling
 4. No perfect multicollinearity
 5. Homoskedasticity

Then OLS is the best linear unbiased estimator ('''BLUE''') for these coefficients.

Using the computation above, the coefficients are estimated to produce:

{{attachment:model3.svg}}

The variance for each coefficient is estimated as:

{{attachment:model4.svg}}

Where R^2^ is calculated as:

{{attachment:model5.svg}}

Note also that the standard deviation of the population error term is unknown, so it is estimated as:

{{attachment:model6.svg}}

Ordinary Least Squares

Ordinary Least Squares (OLS) is a linear regression method. It minimizes the sum of squared residuals.
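
As a rough illustration of what "minimizes" means here, the sketch below fits a line with NumPy and checks that nudging the fitted coefficients in any direction only increases the sum of squared residuals. The data, the helper name sum_squared_residuals, and the chosen perturbations are all made up for this example.

```python
import numpy as np


def sum_squared_residuals(X, y, beta):
    """Return the sum of squared residuals for the coefficient vector beta."""
    residuals = y - X @ beta
    return float(residuals @ residuals)


# Illustrative data only: a noisy line with intercept 2 and slope 3.
rng = np.random.default_rng(0)
x = rng.normal(size=50)
y = 2.0 + 3.0 * x + rng.normal(scale=0.5, size=50)

# Design matrix with a column of ones for the intercept.
X = np.column_stack([np.ones_like(x), x])

# Least-squares fit.
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
ssr_hat = sum_squared_residuals(X, y, beta_hat)

# Perturbing the fitted coefficients should never lower the criterion.
shifts = [np.array([0.1, 0.0]), np.array([0.0, -0.1]), np.array([0.05, 0.05])]
ssr_perturbed = [sum_squared_residuals(X, y, beta_hat + d) for d in shifts]
```

Minimizing the root mean square error leads to the same coefficients, because the square root and the division by the sample size do not change where the minimum lies.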


Univariate

The regression line passes through two points:

[ATTACH]

and

[ATTACH]

Take the generic equation form of a line:

[ATTACH]

Insert the first point into this form.

[ATTACH]

This can be trivially rewritten to solve for a in terms of b:

[ATTACH]

Insert the second point into the original form.

[ATTACH]

Now additionally insert the solution for a in terms of b.

[ATTACH]

Expand all terms to produce:

[ATTACH]

Cancelling terms leaves:

[ATTACH]

Giving a solution for b:

[ATTACH]

This solution is trivially rewritten as:

[ATTACH]

Expand the formula for correlation as:

[ATTACH]

Cancelling terms leaves:

[ATTACH]

Finally, b can be written compactly as:

[ATTACH]

Giving a generic formula for the regression line:

[ATTACH]
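
The attached formulas are not reproduced in this text, so the following is only a sketch under the usual textbook assumptions: the two points are taken to be (mean of x, mean of y) and (mean of x + s_x, mean of y + r * s_y), which gives the slope b = r * s_y / s_x = cov(x, y) / var(x) and the intercept a = mean of y - b * mean of x. The function names and the sample data are hypothetical.

```python
import numpy as np


def univariate_ols(x, y):
    """Return (a, b) for the line y = a + b*x using b = r * s_y / s_x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    r = np.corrcoef(x, y)[0, 1]   # sample correlation
    s_x = x.std(ddof=1)           # sample standard deviations
    s_y = y.std(ddof=1)
    b = r * s_y / s_x
    a = y.mean() - b * x.mean()   # the line passes through the point of means
    return a, b


def slope_from_covariance(x, y):
    """Equivalent slope written as cov(x, y) / var(x)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    cov_xy = np.cov(x, y)[0, 1]   # np.cov normalizes by n - 1 by default
    return cov_xy / np.var(x, ddof=1)


# Made-up sample data.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

a, b = univariate_ols(x, y)
b_cov = slope_from_covariance(x, y)

# Cross-check against NumPy's degree-1 polynomial fit (returns slope, intercept).
slope_ref, intercept_ref = np.polyfit(x, y, 1)
```

Both forms of the slope should agree with each other and with the reference fit up to floating-point error.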


Linear Model

The linear model can be expressed as:

model1.svg

If these assumptions can be made:

  1. Linearity
  2. Exogeneity

model2.svg

  3. Random sampling
  4. No perfect multicollinearity
  5. Homoskedasticity

Then OLS is the best linear unbiased estimator (BLUE) for these coefficients.

Using the computation above, the coefficients are estimated to produce:

[ATTACH]
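
For a model with several regressors, one common way to obtain the estimates is to solve the normal equations (X'X) b = X'y. The sketch below assumes that matrix form, which may differ in notation from the attached formula; the function name ols_coefficients and the simulated data are illustrative only.

```python
import numpy as np


def ols_coefficients(X, y):
    """Estimate coefficients by solving the normal equations (X'X) b = X'y."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    return np.linalg.solve(X.T @ X, X.T @ y)


# Simulated data with known coefficients (1.0, 2.0, -0.5), for illustration.
rng = np.random.default_rng(1)
n = 200
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1.0 + 2.0 * x1 - 0.5 * x2 + rng.normal(scale=0.3, size=n)

# The first column of ones carries the intercept.
X = np.column_stack([np.ones(n), x1, x2])

beta_hat = ols_coefficients(X, y)
fitted = X @ beta_hat
residuals = y - fitted
```

Solving the linear system directly avoids forming an explicit inverse. With nearly collinear regressors, np.linalg.lstsq is usually the more stable choice.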

The variance for each coefficient is estimated as:

[ATTACH]

Where R² is calculated as:

[ATTACH]

Note also that the standard deviation of the population error term is unknown, so it is estimated as:

[ATTACH]
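
The attached formulas are not visible in this text, so the sketch below assumes the standard textbook forms: the variance of the j-th slope is sigma² / (SST_j * (1 - R_j²)), where SST_j is the total variation in regressor j and R_j² comes from regressing regressor j on the other regressors, and sigma² is estimated as SSR / (n - k - 1). All names and data are hypothetical.

```python
import numpy as np


def r_squared(X, y):
    """R² = 1 - SSR / SST for a regression of y on X (X includes an intercept)."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    ssr = resid @ resid
    sst = ((y - y.mean()) ** 2).sum()
    return 1.0 - ssr / sst


def coefficient_variances(X, y):
    """Return (sigma2_hat, variances of the slope estimates).

    X is an n x (k+1) design matrix whose first column is all ones.
    """
    n, p = X.shape                      # p = k + 1
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2_hat = (resid @ resid) / (n - p)   # SSR / (n - k - 1)

    variances = []
    for j in range(1, p):               # skip the intercept column
        x_j = X[:, j]
        others = np.delete(X, j, axis=1)      # intercept plus other regressors
        sst_j = ((x_j - x_j.mean()) ** 2).sum()
        r2_j = r_squared(others, x_j)
        variances.append(sigma2_hat / (sst_j * (1.0 - r2_j)))
    return sigma2_hat, np.array(variances)


# Simulated data with correlated regressors, for illustration.
rng = np.random.default_rng(2)
n = 300
x1 = rng.normal(size=n)
x2 = 0.6 * x1 + rng.normal(scale=0.8, size=n)
y = 0.5 + 1.5 * x1 - 2.0 * x2 + rng.normal(size=n)
X = np.column_stack([np.ones(n), x1, x2])

sigma2_hat, slope_variances = coefficient_variances(X, y)

# Cross-check against the matrix form sigma2_hat * (X'X)^-1.
matrix_variances = np.diag(sigma2_hat * np.linalg.inv(X.T @ X))[1:]
```

The two computations should agree for the slope coefficients. The square root of each variance is the coefficient's standard error, and the square root of sigma2_hat is the estimated standard deviation of the error term.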


CategoryRicottone

Statistics/OrdinaryLeastSquares (last edited 2025-01-10 14:33:38 by DominicRicottone)