Let’s delve into the study of table joins in SQL — a crucial topic that often elicits numerous questions. This emphasis is intentional, aimed at guiding you through this complex subject step by step, starting from the fundamentals.
Our approach is meticulously didactic to ensure you grasp the core concepts. This foundational knowledge is vital as we progress to crafting increasingly intricate queries.
Initially, we must prepare and load the data. What we have is the establishment of a SCHEMA, the creation of the initial table CUSTOMERS, and populating this table with fictitious data:
The process involves a straightforward copy (Ctrl + C) and paste (Ctrl + V) action into MySQL Workbench.
Thus, the first step is to create the schema named chapter04. While it’s possible to create the schema using the right-click context menu and selecting the “Create Schema” option, we will proceed by executing this operation through code.
This approach not only familiarizes you with SQL syntax but also prepares you for more complex schema creation and manipulation tasks.
CREATE SCHEMA 'chapter04';
The next step involves creating the CUSTOMERS table. The creation of a table falls under the category of Data Definition Language (DDL) instructions.
Every CREATE command is a DDL instruction in SQL, aimed at constructing database objects. Therefore, both CREATE SCHEMA and CREATE TABLE are DDL commands intended to create something within the database.
In this case, we aim to create the CUSTOMERS table within chapter04, which will include the following columns: CUSTOMER_ID
, CUSTOMER_NAME
, CUSTOMER_ADDRESS
, CUSTOMER_CITY
, and CUSTOMER_STATE
.
CREATE TABLE 'chapter04'.'CUSTOMERS' (
'customer_id' INT NULL,
'customer_name' VARCHAR(50) NULL,
'customer_address' VARCHAR(50) NULL,
'customer_city' VARCHAR(50) NULL,
'customer_state' VARCHAR(50) NULL
);
Following this, we will populate the CUSTOMERS table with fictitious client data, utilizing the INSERT command to feed the table.
INSERT falls under Data Manipulation Language (DML), which pertains to altering data in some manner. As suggested by its name, INSERT is used to add records to a table.
Hence, we use INSERT INTO, indicating that what follows in uppercase is SQL syntax, the only exception being the table’s name.
Thus, INSERT INTO is the SQL command used to insert data into chapter04 in the CUSTOMERS table, specifying the columns CUSTOMER_ID
, CUSTOMER_NAME
, CUSTOMER_ADDRESS
, CUSTOMER_CITY
, and CUSTOMER_STATE
. Following this, the VALUES are the actual data to be inserted.
This structure outlines the column titles at the top and their corresponding values below. The script includes several fictitious records for different customers.
Upon right-clicking on our CUSTOMERS table and selecting the option Select Rows — Limit 1000, you will achieve the following result, which includes inserting data into the CUSTOMERS table and then selecting all records from it to verify the insertion:
INSERT INTO `chapter04`.`CUSTOMERS` (`customer_id`, `customer_name`, `customer_address`, `customer_city`, `customer_state`)
VALUES (1, "Tom Morello", "Street 67", "Orlando", "FL");
INSERT INTO `chapter04`.`CUSTOMERS` (`customer_id`, `customer_name`, `customer_address`, `customer_city`, `customer_state`)
VALUES (2, "Cris Ronaldo", "Street 64", "Austin", "TX");INSERT INTO `chapter04`.`CUSTOMERS` (`customer_id`, `customer_name`, `customer_address`, `customer_city`, `customer_state`)
VALUES (3, "John Lennon", "Street 42", "Miami", "FL");INSERT INTO `chapter04`.`CUSTOMERS` (`customer_id`, `customer_name`, `customer_address`, `customer_city`, `customer_state`)
VALUES (4, "Billy Joel", "Street 39", "Las Vegas", "NV");INSERT INTO `chapter04`.`CUSTOMERS` (`customer_id`, `customer_name`, `customer_address`, `customer_city`, `customer_state`)
VALUES (5, "Lady Gaga", "Street 45", "Albany ", "NY");SELECT * FROM chapter04.CUSTOMERS;
This script effectively populates the CUSTOMERS table with fictitious customer data, showcasing a variety of locations across the United States.
The SELECT * FROM chapter04.CUSTOMERS; command is used to display all entries in the CUSTOMERS table, ensuring that the data insertion process was successful.
Following the creation and population of the CUSTOMERS table, the next step involves creating the second table, ORDERS, and similarly loading it with data. This step further expands our database schema, allowing us to explore more complex SQL joins and queries.
CREATE TABLE 'chapter04'.'ORDERS' (
'order_id' INT NULL,
'customer_id' INT NULL,
'salesperson_id' INT NULL,
'order_date' DATETIME NULL,
'delivery_id' INT NULL);
INSERT INTO `chapter04`.`ORDERS` (`order_id`, `customer_id`, `salesperson_id`, `order_date`, `delivery_id`)
VALUES (1001, 1, 5, now(), 23);INSERT INTO `chapter04`.`ORDERS` (`order_id`, `customer_id`, `salesperson_id`, `order_date`, `delivery_id`)
VALUES (1002, 1, 7, now(), 24);INSERT INTO `chapter04`.`ORDERS` (`order_id`, `customer_id`, `salesperson_id`, `order_date`, `delivery_id`)
VALUES (1003, 2, 5, now(), 23);SELEC * FROM 'chapter04'.'ORDERS'
In the INSERT INTO statements for the ORDERS table, note that the ORDER_DATE
column is of the DATETIME type, necessitating the insertion of values in the DATETIME format.
To achieve this, we use the NOW()
function, which captures the exact date and time (hour, minute, second, day, month, year) at the moment of executing the command on the system.
It’s important to mention that SQL functions, like NOW()
, can vary slightly between different SQL Database Management Systems (DBMSs), as each implements the SQL language based on the ANSI standard but may introduce minor modifications.
Therefore, when working with specific SQL functions, consulting the documentation of the DBMS you are using is advisable.
However, using the NOW()
function is not mandatory; one could prepare a date, convert it to the DATETIME format, and then proceed with the insertion into the database.
With these steps, we have successfully inserted data into the second table, ORDERS.
We then move on to creating the third table, SALESPEOPLE, and perform the final data loading operation.
CREATE TABLE `chapter04`.`SALESPEOPLE` (
`salesperson_id` INT NULL,
`salesperson_name` VARCHAR(50) NULL
);
INSERT INTO `chapter04`.`SALESPEOPLE` (`salesperson_id`, `salesperson_name`)
VALUES (1, "Salesperson 1");INSERT INTO `chapter04`.`SALESPEOPLE` (`salesperson_id`, `salesperson_name`)
VALUES (2, "Salesperson 2");INSERT INTO `chapter04`.`SALESPEOPLE` (`salesperson_id`, `salesperson_name`)
VALUES (3, "Salesperson 3");INSERT INTO `chapter04`.`SALESPEOPLE` (`salesperson_id`, `salesperson_name`)
VALUES (4, "Salesperson 4");INSERT INTO `chapter04`.`SALESPEOPLE` (`salesperson_id`, `salesperson_name`)
VALUES (5, "Salesperson 5");INSERT INTO `chapter04`.`SALESPEOPLE` (`salesperson_id`, `salesperson_name`)
VALUES (6, "Salesperson 6");INSERT INTO `chapter04`.`SALESPEOPLE` (`salesperson_id`, `salesperson_name`)
VALUES (7, "Salesperson 7");SELECT * FROM chapter04.SALESPEOPLE;
To begin, the process essentially involves a simple copy (Ctrl + C) and paste (Ctrl + V) action into MySQL Workbench.
Hence, the initial step is to create the schema chapter04. Although it’s possible to create the schema using the right-click context menu and selecting Create Schema, we will proceed with this operation via code for each of the tables. Subsequently, we will delve into our joins.
From this point forward, I will construct queries alongside you. I highly recommend you follow along by executing the codes. Create the queries with me and understand the processes involved.
The first objective is to return the ORDER_ID
and CUSTOMER_NAME
— a query that retrieves these two pieces of information for inclusion in reports or charts, as needed.
We already know that CUSTOMER_NAME
is located in the CUSTOMERS table. However, do we have this column in any other table? No. Therefore, we need to fetch this information from the CUSTOMERS table.
Similarly, we want the ORDER_ID
, which is only available in the ORDERS table. Thus, we are clearly looking to retrieve information from two distinct tables. This is where the concept of table joining precisely comes into play. We aim to fetch data from two tables and establish a relationship between them to access the desired data.
For this relationship to be feasible, it’s not a random process. There must be a linking column between the tables. Looking at the ORDERS table, we have the `customer_id` column, indicating that for each order, there is a customer who placed that order.
This customer should be registered in the CUSTOMERS table, which also contains the CUSTOMER_ID
.
A question to consider: is it possible to make an order without an associated customer? Practically speaking, No!
For an order to exist, a customer must be there to place it. This is a fundamental modeling decision (and quite logical at that).
Customers are registered in the CUSTOMERS table, each assigned a unique ID. That’s why CUSTOMER_ID
is referred to as the primary key. It wouldn’t make sense to have two customers with the same ID, would it?
For every order placed in that company or system, there is a customer associated with that order. So, how do we retrieve the information we need? We apply the INNER JOIN!
SELECT O.order_id, C.customer_name
FROM chapter04.ORDERS AS O
INNER JOIN chapter04.CUSTOMERS AS C
ON O.customer_id = C.customer_id;
This query performs a SELECT
operation on ORDER_ID
from the ORDERS
table (aliased as O
) and CUSTOMER_NAME
from the CUSTOMERS
table (aliased as C
), within the chapter04
schema.
The INNER JOIN
is then utilized, linking the ORDERS
table (O
) with the CUSTOMERS
table (C
) by the common column CUSTOMER_ID
is, establishing the relationship necessary to retrieve the combined dataset.
The observation that only 3 orders are returned for 2 out of 5 registered customers illustrates a fundamental aspect of the INNER JOIN: it only returns data when there is a relationship in both tables. In this context, orders can only be retrieved when associated with a customer.
The INNER JOIN establishes a connection between tables based on a common relationship, such as the CUSTOMER_ID
column, and returns data when this relationship exists.
Each order is individual, with the 3 orders placed by two customers, Tom Morello and Cris Ronaldo.
This demonstrates a practical application of the INNER JOIN, which is typically used to return data where a clear relationship exists, without orphaned records, indicating tightly linked data.
However, an INNER JOIN can also be performed using a different syntax:
SELECT O.order_id, C.customer_name
FROM chapter04.ORDERS AS O,
chapter04.CUSTOMERS AS C
WHERE O.customer_id = C.customer_id;
This approach selects the ORDER_ID
and CUSTOMER_NAME
from the ORDERS and CUSTOMERS tables, respectively, but instead of using the `INNER JOIN` clause, it directly compares the two columns using the WHERE clause.
Comparing the two approaches:
- Using INNER JOINSELECT O.order_id, C.customer_name
FROM chapter04.ORDERS AS O
INNER JOIN chapter04.CUSTOMERS AS C
ON O.customer_id = C.customer_id;
- Using WHERESELECT O.order_id, C.customer_name
FROM chapter04.ORDERS. AS O,
chapter04.CUSTOMERS AS C
WHERE O.customer_id = C.customer_id;
The outcome is the same for both queries, illustrating the concept of INNERJOIN — the intersection in set theory taught in elementary education.
The intersection is the set of values common to both sets, mirroring the INNER JOIN’s purpose: to return data where there’s a relationship between the CUSTOMER_ID
of one table with the CUSTOMER_ID
of another, embodying the concept of intersection.
Regarding which query to use, it boils down to personal preference. While some may prefer the directness and simplicity of using the WHERE clause for its ease of understanding and direct table linkage, others might favor the clarity and explicitness of the INNER JOIN syntax.
Regardless of preference, it’s essential to be mindful of the syntax differences — INNER JOIN uses the ON clause for column linkage, whereas the WHERE clause approach does not, going directly into the table linkage.
Return order_id, customer_name, & salesperson_name
When joining three tables, it can get a bit more involved, yet SQL allows for an arbitrary number of tables to be joined via INNER JOIN, without a known limit.
However, if you find yourself needing to join more than five or six tables, it might be worthwhile to reassess the query. Often, it’s possible to rewrite the query to avoid fetching data from so many distinct tables, which can simplify the logic and potentially improve performance.
Here’s an example of how to perform an INNER JOIN across three tables, linking ORDERS, CUSTOMERS, and SALESPEOPLE to retrieve ORDER_ID
, CUSTOMER_NAME
, and SALESPERSON_NAME
:
- Inner Join with 3 tablesSELECT O.order_id, C.customer_name, S.salesperson_name
FROM chapter04.ORDERS AS O
INNER JOIN chapter04.CUSTOMERS AS C ON O.customer_id = C.customer_id
INNER JOIN chapter04.SALESPEOPLE AS S ON O.salesperson_id = S.salesperson_id;
This query demonstrates the intersection principle by fetching the number of records from the table with the least data.
In this scenario, the result includes 3 orders, with 2 customers and now 3 salespeople (salesperson 5, 6, and 7).
Alternatively, you can rewrite this query without using INNER JOIN, leveraging the WHERE clause for a potentially simpler understanding.
However, using the WHERE clause to simulate joins across multiple tables might lead to complex and less readable queries, especially as the number of tables increases.
Therefore, while INNER JOIN and WHERE can both achieve similar results, INNER JOIN tends to be more explicit and readable for complex relationships, making it the preferred approach for joining multiple tables.
-- Using WHERE clause for joins
SELECT O.order_id, C.customer_name, S.salesperson_name
FROM chapter04.ORDERS AS O,
chapter04.CUSTOMERS. AS C,
chapter04.SALESPEOPLE AS S
WHERE O.customer_id = C.customer_id
AND O.salesperson_id = S.salesperson_id;
Both queries will return the order identifier, the customer’s name, and the salesperson’s name associated with each order.
I don’t want to bias you towards using one or the other, but in my opinion, the query with the WHERE clause is a bit easier to understand.
I am once again selecting the fields I want from the three tables and then use the WHERE clause with two filters. First, comparing CUSTOMER_ID
and then comparing SALESPERSON_ID
.
So, this is the INNER JOIN, the type of join that, in practice, represents the intersection in set theory.
Before diving into other types of table joins in SQL, let’s delve deeper into INNER JOIN, as it’s the most commonly used type of join.
Additionally, INNER JOIN can be utilized alongside other SQL clauses, such as WHERE, ORDER BY, GROUP BY, and more. I’ll provide some examples to clarify this.
The INNER JOIN syntax I previously introduced adheres to the ANSI standard, which is an international convention. What does this mean?
Essentially, any Database Management System (DBMS) that supports the ANSI standard will allow you to execute queries in this format.
Nowadays, almost all DBMSs support the ANSI standard, so compatibility shouldn’t be an issue. The techniques I’m teaching you will work across any DBMS, provided it conforms to the ANSI standard. If in doubt, always refer to your DBMS’s documentation for confirmation.
# Inner Join - ANSI StandardSELECT O.order_id, C.customer_name
FROM chapter04.ORDERS. AS O
INNER JOIN chapter04.CUSTOMERS AS C
ON O.customer_id = C.customer_id;
Now, let’s delve into what happens when we modify our SQL command by omitting the word INNER from our JOIN operation.
To uncover the effects of this adjustment, an experiment is necessary. By simply using JOIN’in our query, we proceed to execute it. The outcome of this change reveals…
When you execute a query with INNER JOIN, you are performing a common and essential operation in SQL to combine rows from two or more tables.
Now, what happens if you remove the word INNER and just leave JOIN? To find out, it’s necessary to experiment. By executing with just JOIN, the result is the same as using INNER JOIN.
What does this mean? If you use just JOIN, you are implicitly executing an INNER JOIN, which represents the intersection between tables, based on set theory principles we’ve discussed so far.
You have the option to use INNER JOIN or simply JOIN. However, pay attention when working with LEFT JOIN or RIGHT JOIN, which I will introduce shortly. For these, you must explicitly specify LEFT or RIGHT. Omitting INNER defaults to a basic JOIN, which is the intersection of the tables.
Furthermore, I’ve used the reserved word ON to join the tables using the CUSTOMER_ID
column, a common column in both tables.
Since the CUSTOMER_ID
column name is the same in both tables, this query can be constructed slightly differently by using the reserved word USING.
# INNER JOIN when column name is the same in both tablesSELECT O.order_id, C.customer_name
FROM chapter04.ORDERS AS O
INNER JOIN chapter04.CUSTOMERS AS C
USING (customer_id)
When the column names are identical in both tables involved in a join, you can streamline your query by replacing the ON clause with USING.
By simply specifying the common column name within the USING clause, you achieve the same result as with ON, thus simplifying the syntax for the join operation.
A Clear and straightforward code will be immensely beneficial when you, or others, revisit it in the future to understand what was accomplished. Avoid creating overly complex code, as it complicates understanding and maintenance for everyone involved in the solution.
Let’s explore an example where we combine INNER JOIN, WHERE, and ORDER BY. This approach demonstrates how you can efficiently filter and sort the results of your joins, providing a powerful tool for querying and managing data.
SELECT O.order_id, C.customer_name
FROM chapter04.ORDERS AS O
INNER JOIN chapter04.CUSTOMERS AS C
ON O.customer_id = C.customer_id
WHERE C.customer_name LIKE 'Tom%'
ORDER BY O.order_id DESC;
To filter data where CUSTOMER_NAME
includes “Tom”, the **LIKE** operator is used to match patterns similar to the specified criteria.
Using LIKE ‘Tom% selects records where the customer’s name starts with Tom followed by any characters, with % symbolizing any sequence following Tom.
Subsequently, to organize the filtered results, ORDER BY ORDER_ID
DESC sorts the data by ORDER_ID
in descending order, presenting the highest ORDER_ID
first.
Now, I have filtered the results to only include customers named Tom, and then I sorted the results by ORDER_ID
in descending order.
Remember, the INNER JOIN is just a part of your query. You can use INNER JOIN with any other SQL clause, demonstrating that SQL queries can become quite intriguing.
Let’s explore another type of table join. I aim to be very didactic so that we can grasp the concept, which is the most important part. Understanding what is being done here is crucial, rather than focusing solely on the query itself.
In this example, using INNER JOIN, the goal is to return the ORDER_ID
from the ORDERS table and the CUSTOMER_NAME
from the CUSTOMERS table.
# Return order ID and customer nameSELECT O.order_id, C.customer_name
FROM chapter04.ORDERS AS O
INNER JOIN chapter04.CUSTOMERS AS C
ON O.customer_id = C.customer_id;
Given that some customers have not yet placed orders, the requirement is to craft a query that retrieves this specific information — returning all customers, with or without associated orders.
This scenario cannot be solved using an INNER JOIN, as it would only fetch customers who have placed orders. However, the goal to include customers without orders in the report, and subsequently create a chart or provide this information to management, can be efficiently met using a LEFT JOIN.
This join type is essential because it guarantees the inclusion of all records from the left table in the results, even if there are no corresponding matches in the right table.
# Left Join – All the data from the left table even without a corresponding match in the right tableSELECT C.customer_name, O.order_id
FROM chapter04.CUSTOMERS AS C
LEFT JOIN chapter04.ORDERS AS O
ON C.customer_id = O.customer_id;
The order of the tables significantly affects the outcome of your query!
Consider selecting CUSTOMER_NAME
and ORDER_ID
from the CUSTOMERS table, aliased as C, and performing a LEFT JOIN with the ORDERS table, aliased as O.
Specify the columns for establishing the relationship. Executing this will return all customers, with associated orders where available; orders are indicated by ORDER_ID
, and absence of orders is represented by NULL.
To enhance the presentation of the results, replacing NULL with a more descriptive value could be done using the CASE statement. This could indicate the absence of an order more clearly, though for this example, the distinction is not crucial.
The LEFT JOIN operation ensures all records from the left table are returned, even without a corresponding match in the right table.
A slight modification to the query involves adding OUTER after LEFT, forming a LEFT OUTER JOIN. Let’s examine the impact of this small adjustment on the query’s results:
# Left Outer JoinSELECT C.customer_name, O.order_id
FROM chapter04.CUSTOMERS AS C
LEFT OUTER JOIN chapter04.ORDERS AS O
ON C.customer_id = O.customer_id;
The result was exactly the same, indicating that using LEFT OUTER JOIN is equivalent to using LEFT JOIN. However, it’s important to note that this is not the same as using JOIN by itself.
Now, let’s remove the LEFT to see the outcome and further understand the distinctions between these join types.
# Join SELECT C.customer_name, O.order_id
FROM chapter04.CUSTOMERS AS C
JOIN chapter04.ORDERS AS O
ON C.customer_id = O.customer_id;
As emphasized for clarity, using JOIN alone results in only the intersection of the tables being returned.
Conversely, LEFT JOIN retrieves data from the left table even if there are no corresponding entries in the right table.
Additionally, you can use LEFT OUTER JOIN interchangeably with LEFT JOIN, as they yield the same outcome.
What happens if we reverse the order of the tables? There’s only one way to find out — by testing and experimenting!
We use the same query but now switch the positions of the CUSTOMERS and ORDERS tables in our FROM and JOIN clauses.
# Reversing the Order of TablesSELECT C.customer_name, O.order_id
FROM chapter04.ORDERS AS O
LEFT JOIN chapter04.CUSTOMERS AS C
ON C.customer_id = O.customer_id;
Executing the query with the tables reversed results in a different outcome, akin to an INNER JOIN. This difference underscores the significance of table order when using LEFT or RIGHT joins.
This is logical, as specifying LEFT or RIGHT in the join instructs the SQL execution engine to prioritize the left or right table, respectively. To further elaborate on this concept, let’s explore the RIGHT JOIN.
We execute essentially the same query, substituting LEFT with RIGHT, to observe the outcome. The goal now is to retrieve all data from the right table, even if there is no corresponding entry in the left table.
In practice, this involves reversing the order of the tables and using RIGHT instead of LEFT.
SELECT C.customer_name, O.order_id
FROM chapter04.ORDERS AS O
RIGHT JOIN chapter04.CUSTOMERS AS C
ON C.customer_id = O.customer_id;
Observing that executing either the reversed LEFT JOIN or the RIGHT JOIN in their respective scenarios yields the same results highlights an important decision-making aspect in SQL querying: whether to use LEFT or RIGHT joins depends on the order of the tables.
Sometimes, due to the complexity of a query and the effort required to change table orders, it might be more practical to adjust between LEFT and RIGHT joins.
The preference for always organizing table orders to utilize a LEFT JOIN for desired outcomes is a personal strategy aimed at establishing a standard across queries. This approach is believed to facilitate learning and memorization by maintaining consistency.
After exploring these concepts, it’s proposed to tackle a more challenging query: returning the order date, customer name, all sellers, regardless of whether they are associated with an order, and sorting the results by customer name.
This exercise will require applying the full range of SQL knowledge covered from the basics to data science applications.
It’s crucial to understand that no one will specify whether to use **LEFT JOIN, RIGHT JOIN, or INNER JOIN in real-world scenarios; it’s your responsibility to determine the most appropriate join based on the end goal.
This emphasizes the importance of problem-solving skills in SQL learning, as business domains may not be familiar with technical terms like INNER JOIN, nor is it their role to be. The onus is on you to select the best method for data retrieval based on the objectives at hand.
SELECT O.order_date, C.customer_name, S.salesperson_name
FROM ((chapter04.ORDERS AS O
JOIN chapter04.CUSTOMERS AS C ON O.customer_id = C.customer_id)
RIGHT JOIN chapter04.SALESPEOPLE AS S ON O.salesperson_id = S.salesperson_id)
ORDER BY C.customer_name;
For the scenario outlined, the query effectively addresses the challenge by returning the order date, customer name, and seller name, including sellers without associated orders.
If there’s no order linked to a seller, resulting in no order date or customer name, the output shows NULL.
This query joins the tables by starting with ORDERS, using an INNER JOIN with CUSTOMERS on CUSTOMER_ID
, followed by a RIGHT JOIN with SALESPEOPLE on SALESPERSON_ID
, and finally sorting by CUSTOMER_NAME
.
However, to enhance the presentation, especially for executive reports, addressing NULL values becomes essential. Delivering a report with raw NULL values might not be the most informative or visually appealing approach.
The solution involves treating all NULL values in the results. Specifically, for ORDER_DATE
and CUSTOMER_NAME
fields that return NULL, the query will replace them with “No Order”.
This ensures that the report delivered to management contains meaningful placeholders instead of raw NULL values, enhancing clarity and professionalism.
This approach exemplifies going beyond the basic requirements to deliver work that stands out in daily operations, showcasing the transformative effect of extra effort.
SELECT
CASE
WHEN O.order_date IS NULL THEN "No Order"
ELSE O.order_date
END AS order_date,
CASE
WHEN C.customer_name IS NULL THEN "No Order"
ELSE C.customer_name
END AS customer_name,
S.salesperson_name
FROM chapter04.ORDERS AS O
RIGHT JOIN chapter04.SALESPEOPLE AS S ON O.salesperson_id = S.salesperson_id
LEFT JOIN chapter04.CUSTOMERS AS C ON O.customer_id = C.customer_id
ORDER BY C.customer_name;
This SQL query extracts a list of sales orders along with customer and salesperson details from a sales database.
To address instances where an order or a customer may not be associated with a salesperson’s record, CASE statements are employed. If there’s no order date (ORDER_DATE
is NULL), the query displays “No Order” instead of a NULL value.
Likewise, for missing customer names (CUSTOMER_NAME
is NULL), it initially shows “No Order”, which should be corrected to “No Customer” to accurately represent situations without an associated customer.
The query strategically employs a RIGHT JOIN to include all salespeople, even those without linked orders, and a LEFT JOIN for integrating customer details when available.
The results are sorted by customer name, facilitating an alphabetically organized list that’s beneficial for sales activity review or customer-focused report generation.
With these enhancements, the query now presents a more refined and professional outcome. The formatted results are ready to be displayed in a table or saved to a CSV file, which can then be imported into tools like Power BI for graphical analysis.
This allows for quick identification of salespeople without associated orders, enabling the sales team to refine their performance and address any underperformances with readily available, well-structured information.
This discussion wraps up the exploration of three primary types of joins in SQL: INNER JOIN, LEFT JOIN, and RIGHT JOIN. However, the journey into SQL’s capabilities is far from over. Next, we will intentionally provoke Referential Integrity Errors to deepen our understanding.
Stay tuned for more insights in the next installment.🐼❤️