SQL SET Operators: A Comprehensive Guide

SQL SET Operators: A Comprehensive Guide

Day 51 Of 60 : Intoduction to SET Operators in SQL

SET Operators in SQL

  • SQL set operators are used to combine the results obtained from two or more queries into a single result.

  • The queries which contain two or more subqueries are known as compounded queries.

  • Set operations in SQL are used to combine or manipulate the result sets of multiple SELECT queries.

  • They allow you to perform operations similar to those in set theory, such as union, intersection, and difference, on the data retrieved from different tables or queries.

  • Set operations provide powerful tools for managing and manipulating data, enabling you to analyse and combine information in various ways.

There are four major types of SQL operators, namely:

  • Union

  • Union all

  • Intersect

  • Minus

Here is an abstract table for whatever we will be learning in this article.

SQL Set Operator

Function

Union

Combines distinct results of two or more SELECT statements.

Union All

Combines all results of two or more SELECT statements, including duplicates.

Intersect

Returns only the common records obtained from two or more SELECT statements.

Minus

Returns only those records which are exclusive to the first table.

Let us look into each set operator in more detail with examples. This article uses the following two tables to execute the queries. All the operations are performed on the comp1_employees and comp2_employees table that is given below.

comp1_employees:

employee_idemployee_nameemployee_city
1Bhim ShekhSurat
2Mehul MohanGoa
3Palash YadavAhmedabad
4Ela ShikhaDelhi
5Mrinal ThakurBangalore
6Sitara VaniBangalore
7Mohak JainGurugram
8Adesh PatelJaipur
9Kunal TandonDelhi
10Romit SoniMumbai

comp2_employees:

employee_idemployee_nameemployee_city
1Sahdev RamiahRaipur
2Mehul MohanGoa
3Mohak JainGurugram
4Pragun SarikaChennai
5Pooja SrivastavaBangalore
6Vani ShekhawatDelhi
7Mohak JainAhmedabad
8Neera ShahAhmedabad
9Poonam OberoiDelhi
10Abhishek SainiRaipur

1. UNION:

  • The UNION operator combines the result sets of two or more SELECT queries into a single result set.

  • It removes duplicates by default, meaning that if there are identical rows in the result sets, only one instance of each row will appear in the final result.

  • UNION will be used to combine the result of two select statements.

  • Duplicate rows will be eliminated from the results obtained after performing the UNION operation.

Rules for SQL UNION

  • Each table used within UNION must have the same number of columns.

  • The columns must have the same data types.

  • The columns in each table must be in the same order.

Syntax

The Syntax of the SQL UNION operator is:

SELECT columnnames FROM table1
UNION
SELECT columnnames FROM table2;

Now, let's take an example to clearly understand how the UNION operator works.

SELECT * FROM comp1_employees 
UNION  
SELECT * FROM comp2_employees;

The result of the above query is shown below:

employee_idemployee_nameemployee_city
1Bhim ShekhSurat
1Sahdev RamiahRaipur
2Mehul MohanGoa
3Mohak JainGurugram
3Palash YadavAhmedabad
4Ela ShikhaDelhi
4Pragun SarikaChennai
5Mrinal ThakurBangalore
5Pooja SrivastavaBangalore
6Sitara VaniBangalore
6Vani ShekhawatDelhi
7Mohak JainGurugram
8Adesh PatelJaipur
8Neera ShahAhmedabad
9Kunal TandonDelhi
9Poonam OberoiDelhi
10Abhishek SainiRaipur
10Romit SoniMumbai
  • As shown in the above code snippet, there are two SELECT queries, and a UNION operator is used.

  • The first SELECT query will fetch the records from comp1_employees, and the second SELECT query will fetch the records from comp2_employees, and the UNION operation is performed with the results of both the query.

  • After performing the UNION operation with both the tables, all the records from the comp1_employees table and comp2_employees table are displayed except for the duplicate data, i.e., employee_id -- 2 and 77 are duplicates. Hence, they are displayed only one time.

UNION ALL :

  • This operator combines all the records from both the queries.

  • Duplicate rows will be not be eliminated from the results obtained after performing the UNION ALL operation.

  • To successfully execute the operation of Union All, the number of columns and the data type must be the same in both tables.

The syntax of the UNION ALL operator is shown below:

SELECT expression_1, expression_2, ... , expression_n
FROM table_1
UNION ALL
SELECT expression_1, expression_2, ... , expression_n
FROM table_2

Now, let's take an example to clearly understand how the UNION ALL operator works.

SELECT * FROM comp1_employees 
UNION ALL
SELECT * FROM comp2_employees;

The result of the above query is shown below:

employee_idemployee_nameemployee_city
1Bhim ShekhSurat
2Mehul MohanGoa
3Palash YadavAhmedabad
4Ela ShikhaDelhi
5Mrinal ThakurBangalore
6Sitara VaniBangalore
7Mohak JainGurugram
8Adesh PatelJaipur
9Kunal TandonDelhi
10Romit SoniMumbai
1Sahdev RamiahRaipur
2Mehul MohanGoa
3Mohak JainGurugram
4Pragun SarikaChennai
5Pooja SrivastavaBangalore
6Vani ShekhawatDelhi
7Mohak JainGurugram
8Neera ShahAhmedabad
9Poonam OberoiDelhi
10Abhishek SainiRaipur
  • As shown in the above code snippet, there are two SELECT queries, and a UNION All operator is used. The first SELECT query will fetch the records from comp1_employees, and the second SELECT query will fetch the records from comp2_employees, and the UNION ALL operation is performed with the results of both the query.

  • After performing the UNION ALL operation with both tables, all the records from the comp1_employees table and comp2_employees table are displayed. Since it's a UNION ALL operation, all the records are displayed, including the duplicate words, which is not the case in the UNION operation.

3. INTERSECT:

  • It is used to combine two SELECT statements, but it only returns the records which are common from both SELECT statements.

  • To successfully execute the operation of INTERSECT, the number of columns and the data type must be the same in both tables.

  • After performing the INTERSECT operation, the data/records which are common in both the SELECT statements are returned.

The syntax of the INTERSECT operator is shown below:

SELECT expression_1, expression_2, ... , expression_n
FROM table_1
INTERSECT
SELECT expression_1, expression_2, ... , expression_n
FROM table_2

Now, let's take an example to clearly understand how the INTERSECT operator works.

SELECT * FROM comp1_employees 
INTERSECT
SELECT * FROM comp2_employees;

The result of the above query is shown below:

employee_idemployee_nameemployee_city
2Mehul MohanGoa
7Mohak JainGurugram
  • As shown in the above code snippet, there are two SELECT queries, and an INTERSECT operator is used. The first SELECT query will fetch the records from comp1_employees, and the second SELECT query will fetch the records from comp2_employees, and the INTERSECT operation is performed with the results of both the query.

  • After performing the INTERSECT operation with both tables, all the data/records that are common from the comp1_employees table and comp2_employees table are displayed. As you can see in the result above, the INTERSECT gets the results that exist in both queries. And the employee_id -- 2 and 77 exist in both tables; hence they are displayed in the final result.

4.MINUS/EXCEPT

  • It displays the rows which are present in the first query but absent in the second query with no duplicates.

  • The MINUS operator allows you to filter out the results which are present in the first query but absent in the second query.

  • To successfully execute the MINUS operation, the number of columns and the data type must be the same in both tables.

  • After performing the MINUS operation, the data/records which are not present in the second SELECT statement or query are displayed.

Note: The MINUS operator is supported only in Oracle databases. For other databases like SQLite, PostgreSQL, SQL server, you can use EXCEPT operator to perform similar operations.

The syntax of the EXCEPT operator is shown below:

SELECT expression_1, expression_2, ... , expression_n
FROM table_1
EXCEPT
SELECT expression_1, expression_2, ... , expression_n
FROM table_2

Let's take an example to clearly understand how the EXCEPT operator works in MySQL, which is similar to the MINUS operator.

SELECT * FROM comp1_employees 
EXCEPT
SELECT * FROM comp2_employees;

The result of the above query is shown below:

employee_idemployee_nameemployee_city
1Bhim ShekhSurat
3Palash YadavAhmedabad
4Ela ShikhaDelhi
5Mrinal ThakurBangalore
6Sitara VaniBangalore
8Adesh PatelJaipur
9Kunal TandonDelhi
10Romit SoniMumbai
  • As shown in the above code snippet, there are two SELECT queries, and an EXCEPT operator is used. The first SELECT query will fetch the records from comp1_employees, and the second SELECT query will fetch the records from comp2_employees, and the EXCEPT operation is performed with the results of both the query.

  • After performing the EXCEPT operation with both tables, all the data/records that are present in the comp1_employees table but not in the comp2_employees table are displayed. As you can see in the result above, the EXCEPT gets the results that are present only in the comp1_employees without any duplicates.

Difference Between UNION and UNION ALL

UNION and UNION ALL, both commands are used to combine the result of two or more SELECT Statements, but there are many difference in UNION and UNION ALL.

UNION vs UNION ALL
DescriptionUNIONUNION ALL
Duplication HandlingThe UNION removes duplicate rows from the result set, presenting only unique records.The UNION ALL remains all rows, including duplicates, without any elimination.
Performance ImpactThe UNION involves the additional step of identifying and eliminating duplicates, potentially impacting performance.The UNION ALL generally performs faster, as it does not have the overhead of duplicate removal.
Result Set StructureThe UNION creates a distinct result set with unique records.The UNION ALL produces a result set that includes all rows from the combined queries.
SyntaxThe syntax for both UNION and UNION ALL is similar, with the key difference lying in the choice of the operator.The syntax for both UNION and UNION ALL is similar, with the key difference lying in the choice of the operator.
Use CaseThe UNION is suitable when the emphasis is on unique records and the removal of duplicates is essential.The UNION ALL is preferred when retaining all rows, including duplicates, is acceptable or even desired.
Data VolumeFor large datasets, UNION may not be efficient due to its complex processing with duplicate checks.For large datasets, UNION ALL may be more efficient due to its simplified processing without duplicate checks.
Query OptimizationThe UNION may involve additional processing to identify and eliminate duplicates, potentially leading to a longer execution time.The UNION ALL, being less restrictive, may result in quicker query execution.
Resource UtilizationThe UNION consumes additional resources to identify and remove duplicate rows, impacting memory usage.The UNION ALL is generally more resource-efficient as it bypasses the duplicate elimination step.
CompatibilityBoth UNION and UNION ALL are widely supported across various relational database management systems (RDBMS).Both UNION and UNION ALL are widely supported across various relational database management systems (RDBMS).
Consideration for Result AccuracyIf eliminating duplicates is crucial for the accuracy of the result set, then UNION is the appropriate choice.When duplicate records are acceptable or necessary, and performance is a priority, UNION ALL is the preferred option.

Difference between JOIN and UNION in SQL

JOINUNION
JOIN combines data from many tables based on a matched condition between themSQL combines the result set of two or more SELECT statements.
It combines data into new columns.It combines data into new rows
The number of columns selected from each table may not be the same.The number of columns selected from each table should be the same.
Datatypes of corresponding columns selected from each table can be different.The data types of corresponding columns selected from each table should be the same.
It may not return distinct columns.It returns distinct rows.

Conclusion :

  • SQL supports various set operators, which you can use to filter out the desired data from the SELECT queries/tables.

  • The number of columns in the SELECT statement on which you want to apply the SQL set operators and the data type must be the same.

  • The UNION operator combines the result of two or more SELECT statements and eliminates the duplicate rows from the result.

  • UNION ALL is similar to UNION, but a difference is that UNION ALL does not eliminate the duplicates, and it displays all the data/records.

  • INTERSECT operator only returns the record, which is common from both the SELECT statements.

  • The MINUS operator combines the result of two statements and returns only those that are present in the first SELECT query statement.