Join 2 tables on matching columns
- Until now we've been querying a single table at a time
- One of the main strengths of a relational database is the ability to join tables together
- This enables us to piece data together from different places to answer a question
Think of a join as a left and right table:
- Left - The first table in the join (on the
FROM
line) - Right - The second table in the join (on the
INNER JOIN
line)
SELECT l.<field1>, l.<field2>, l.<field3>, r.<field1>, r.<field2>, r.<field3>
FROM <schema_name>.<table_name> | <view_name> AS l
INNER JOIN <schema_name>.<table_name> | <view_name> AS r
ON l.<field1> = r.<field1> AND l.<field2> = r.<field2>;
Terms
INNER JOIN
= Combines rows that have matching values in two or more tablesON
= The columns on which you want to join the tables- They must have compatable data and data types to get a meaningful result
Entity Relationship Diagram (ERD) Revisited
So how do we know what columns are joinable to which other table?
Let's look at the ERD:
- As an example, we can join
Sales_Header
toCustomers
.- This can be done by joining a column on each table called
customer_id
- Good naming conventions on tables often makes this job a lot easier
- This can be done by joining a column on each table called
SELECT *
FROM "sequel-mart-schema"."Sales_Header" AS sh
INNER JOIN "sequel-mart-schema"."Customers" AS cu
ON sh.customer_id = cu.customer_id;
- This returns transactions from the
Sales_Header
and more detailed information about each customer fromCustomers
- Other possible joins include:
Sales_Header
toStores
Sales_Header
toDates
Sales_Header
toSales_Detail
Sales_Detail
toProducts
Notes:
INNER JOIN
is the longer form of this type of join.- We can just use
JOIN
and the queries would run in the same way - Beware that
INNER JOIN
only returns where records exist in both tables- If something is in one table and not the other it will not be returned
Exercises:
- Join Sales_Header (alias sh) to Stores (alias st). Bring back the first 20 rows.
- Join Sales_Header (alias sh) to Dates (alias dt). Bring back the first 20 rows.
- Join Sales_Header (alias sh) to Sales_Detail (alias sd). Bring back the first 20 rows.
- Join Sales_Detail (alias sd) to Products (alias pr). Bring back the first 20 rows.