SQL Joins


Left Join

Query all rows in the left-hand side, matching data from the right-hand side if possible but returning NULL data otherwise.

One common use case is joining derivative data back to the original table.

SELECT *
FROM people
LEFT JOIN (
           SELECT FamilyID, COUNT(*) AS FamilySize FROM people GROUP BY FamilyID
) AS r
ON people.FamilyID = r.FamilyID

Can take multiple right-hand side tables.

SELECT *
FROM people
LEFT JOIN (
           SELECT FamilyID, COUNT(*) AS FamilySize FROM people GROUP BY FamilyID
) AS r1
ON people.FamilyID = r1.FamilyID
LEFT JOIN (
           SELECT FamilyID, ID as HeadOfHouseholdID
           FROM (
                 SELECT FamilyID, ID, ROW_NUMBER() OVER (PARTITION BY FamilyID) AS row_number
                 FROM people
           )
           WHERE row_number = 1
) AS r2
ON people.FamilyID = r2.FamilyID


Tips

Duplicate Columns

Joins will result in duplicate columns, if only on the key column used in the ON clause. Some engines (like sqlite) will store these duplicates with sequential names like ID:1.

Sometimes the only solution is to avoid *, and actually list the requested columns.

UPDATE FROM

In some circumstances, UPDATE FROM is a more appropriate option.

UPDATE people
SET LastName = r.LastName,
    FirstName = r.FirstName,
FROM (
      SELECT LastName, FirstName
      FROM people_pii
) AS r
WHERE people.ID = r.ID

ON vs WHERE

The following queries are not interchangeable: ON clauses are evaluated at match time, while WHERE clauses are evaluated after matching.

SELECT *
FROM transactions
LEFT JOIN people ON people.ID=transactions.ID
WHERE transactions.ID = 1

This will return only the rows in people where ID = 1.

SELECT *
FROM transactions
LEFT JOIN people ON people.ID=transactions.ID
    AND transactions.ID = 12345

This will return all rows in people, but joined columns will only be non-NULL for the rows where ID = 1.


CategoryRicottone

SQL/Joins (last edited 2025-10-05 16:44:08 by DominicRicottone)