SQL Joins
Left Join
Query all rows in the left-hand side, matching data from the right-hand side if possible but returning NULL data otherwise.
One common use case is joining derivative data back to the original table.
SELECT *
FROM people
LEFT JOIN (
SELECT FamilyID, COUNT(*) AS FamilySize FROM people GROUP BY FamilyID
) AS r
ON people.FamilyID = r.FamilyIDCan take multiple right-hand side tables.
SELECT *
FROM people
LEFT JOIN (
SELECT FamilyID, COUNT(*) AS FamilySize FROM people GROUP BY FamilyID
) AS r1
ON people.FamilyID = r1.FamilyID
LEFT JOIN (
SELECT FamilyID, ID as HeadOfHouseholdID
FROM (
SELECT FamilyID, ID, ROW_NUMBER() OVER (PARTITION BY FamilyID) AS row_number
FROM people
)
WHERE row_number = 1
) AS r2
ON people.FamilyID = r2.FamilyID
Tips
Duplicate Columns
Joins will result in duplicate columns, if only on the key column used in the ON clause. Some engines (like sqlite) will store these duplicates with sequential names like ID:1.
Sometimes the only solution is to avoid *, and actually list the requested columns.
UPDATE FROM
In some circumstances, UPDATE FROM is a more appropriate option.
UPDATE people
SET LastName = r.LastName,
FirstName = r.FirstName,
FROM (
SELECT LastName, FirstName
FROM people_pii
) AS r
WHERE people.ID = r.ID
ON vs WHERE
The following queries are not interchangeable: ON clauses are evaluated at match time, while WHERE clauses are evaluated after matching.
SELECT * FROM transactions LEFT JOIN people ON people.ID=transactions.ID WHERE transactions.ID = 1
This will return only the rows in people where ID = 1.
SELECT *
FROM transactions
LEFT JOIN people ON people.ID=transactions.ID
AND transactions.ID = 12345This will return all rows in people, but joined columns will only be non-NULL for the rows where ID = 1.
