SQL Theory Notes
SQL Theory Notes
SQL Theory Notes
What is Database?
Database is a collection of interrelated data.
What is DBMS?
DBMS (Database Management System) is software used to create, manage, and
organize databases.
What is RDBMS?
● RDBMS (Relational Database Management System) - is a DBMS based on
the concept of tables (also called relations).
● Data is organized into tables (also known as relations) with rows (records) and
columns (attributes).
● Eg - MySQL, PostgreSQL, Oracle etc.
What is SQL?
SQL is Structured Query Language - used to store, manipulate and retrieve data from
RDBMS.
(It is not a database, it is a language used to interact with database)
*Note - SQL keywords are NOT case sensitive. Eg: select is the same as SELECT in SQL.
1
SQL Data Types
In SQL, data types define the kind of data that can be stored in a column or variable.
*Note - CHAR is for fixed length & VARCHAR is for variable length strings. Generally,
VARCHAR is better as it only occupies necessary memory & works more efficiently.
We can also use UNSIGNED with datatypes when we only have positive values to
add. Eg - UNSIGNED INT
2. DDL (Data Definition Language) : Used to create, alter, and delete database objects
4. DCL (Data Control Language): Used to grant & revoke permissions. (GRANT,
REVOKE)
DDL commands enable you to create, modify, and delete database objects like
tables, indexes, constraints, and more.
● CREATE TABLE:
● ALTER TABLE:
● DROP TABLE:
○ Used to delete an existing table along with its data and structure.
○ Example: DROP TABLE employees;
3
● CREATE INDEX:
● DROP INDEX:
● CREATE CONSTRAINT:
● DROP CONSTRAINT:
● TRUNCATE TABLE:
○ Used to delete the data inside a table, but not the table itself.
○ Syntax – TRUNCATE TABLE table_name
DQL (Data Query Language) is a subset of SQL focused on retrieving data from databases.
The SELECT statement is the foundation of DQL and allows us to extract specific
columns from a table.
● SELECT:
4
Syntax: SELECT column1, column2, ... FROM table_name;
Here, column1, column2, ... are the field names of the table.
● WHERE:
= : Equal
> : Greater than
< : Less than
>= : Greater than or equal
<= : Less than or equal
<> : Not equal.
- The WHERE clause can be combined with AND, OR, and NOT operators.
- The AND and OR operators are used to filter records based on more than
one condition:
- The AND operator displays a record if all the conditions separated by AND are
TRUE.
Syntax:
5
SELECT column1, column2, ... FROM table_name WHERE condition1 AND condition2
AND condition3 ...;
Example:
● DISTINCT:
● LIKE:
The LIKE operator is used in a WHERE clause to search for a specified pattern in a column.
There are two wildcards often used in conjunction with the LIKE operator:
6
WHERE CustomerName LIKE 'a_%'
- Finds any values that start with "a" and are at least 2 characters in length
● IN:
● BETWEEN:
● IS NULL:
● AS:
● ORDER BY
The ORDER BY clause allows you to sort the result set of a query based on one or
more columns.
Basic Syntax:
- The ORDER BY clause is used after the SELECT statement to sort query results.
7
- Syntax: SELECT column1, column2 FROM table_name ORDER BY column1
[ASC|DESC];
- You can sort by multiple columns by listing them sequentially in the ORDER
BY clause.
- Rows are first sorted based on the first column, and for rows with equal values,
subsequent columns are used for further sorting.
- Example: SELECT first_name, last_name FROM employees ORDER BY
last_name, first_name;
Sorting by Expressions:
- By default, NULL values are considered the smallest in ascending order and the
largest in descending order.
- You can control the sorting behaviour of NULL values using the NULLS FIRST
or NULLS LAST options.
- Example: SELECT column_name FROM table_name ORDER BY column_name
NULLS LAST;
Sorting by Position:
- Instead of specifying column names, you can sort by column positions in the
ORDER BY clause.
- Example: SELECT product_name, price FROM products ORDER BY 2 DESC,
1 ASC;
● GROUP BY
The GROUP BY clause in SQL is used to group rows from a table based on one or
more columns.
Syntax:
8
- The GROUP BY clause follows the SELECT statement and is used to group
rows based on specified columns.
- Aggregation Functions:
○ Aggregation functions (e.g., COUNT, SUM, AVG, MAX, MIN) are often
used with GROUP BY to calculate values for each group.
○ Example: SELECT department, AVG(salary) FROM employees GROUP
BY department;
- Grouping by Multiple Columns:
○ You can group by multiple columns by listing them in the GROUP BY clause.
○ This creates a hierarchical grouping based on the specified columns.
○ Example: SELECT department, gender, AVG(salary) FROM
employees GROUP BY department, gender;
- HAVING Clause:
○ You can use both GROUP BY and ORDER BY in the same query to
control the order of grouped results.
○ Example: SELECT department, COUNT(*) FROM employees GROUP
BY department ORDER BY COUNT(*) DESC;
● AGGREGATE FUNCTIONS
These are used to perform calculations on groups of rows or entire result sets. They provide
insights into data by summarising and processing information.
- COUNT():
Counts the number of rows in a group or result set.
- SUM():
Calculates the sum of numeric values in a group or result set.
- AVG():
9
Computes the average of numeric values in a group or result set.
- MAX():
Finds the maximum value in a group or result set.
● INSERT:
● UPDATE:
● DELETE:
10
Data Control Language focuses on the management of access rights, permissions,
and security-related aspects of a database system.
DCL commands are used to control who can access the data, modify the data, or
DCL is an important aspect of database security, ensuring that data remains protected
and only authorised users have the necessary privileges.
There are two main DCL commands in SQL: GRANT and REVOKE.
1. GRANT:
The GRANT command is used to provide specific privileges or permissions to users or roles.
Privileges can include the ability to perform various actions on tables, views, procedures,
and other database objects.
Syntax:
GRANT privilege_type
ON object_name
TO user_or_role;
In this syntax:
2. REVOKE:
Syntax:
REVOKE privilege_type
ON object_name
11
FROM user_or_role;
In this syntax:
Example: Revoking the SELECT privilege on the "Employees" table from the "Analyst" user:
DCL plays a crucial role in ensuring the security and integrity of a database system.
By controlling access and permissions, DCL helps prevent unauthorised users from
tampering with or accessing sensitive data. Proper use of GRANT and REVOKE
commands ensures that only users who require specific privileges can perform certain
actions on database objects.
Transaction Control Language (TCL) deals with the management of transactions within
a database.
TCL commands are used to control the initiation, execution, and termination of
transactions, which are sequences of one or more SQL statements that are executed as a
single unit of work.
Transactions ensure data consistency, integrity, and reliability in a database by
grouping related operations together and either committing or rolling back changes
based on the success or failure of those operations.
There are three main TCL commands in SQL: COMMIT, ROLLBACK, and SAVEPOINT.
1. COMMIT:
The COMMIT command is used to permanently save the changes made during
a transaction.
12
It makes all the changes applied to the database since the last COMMIT or
ROLLBACK command permanent.
Once a COMMIT is executed, the transaction is considered successful, and the changes
UPDATE Employees
SET Salary = Salary * 1.10
WHERE Department = 'Sales';
COMMIT;
2. ROLLBACK:
BEGIN;
UPDATE Inventory
SET Quantity = Quantity - 10
WHERE ProductID = 101;
ROLLBACK;
3. SAVEPOINT:
The SAVEPOINT command creates a named point within a transaction, allowing you to set
a point to which you can later ROLLBACK if needed.
SAVEPOINTs are useful when you want to undo part of a transaction while preserving
other changes.
BEGIN;
13
UPDATE Accounts
SET Balance = Balance - 100
WHERE AccountID = 123;
UPDATE Accounts
SET Balance = Balance + 100
WHERE AccountID = 456;
TO before_withdrawal;
COMMIT;
Transaction Control Language (TCL) commands are vital for managing the integrity
and consistency of a database's data.
They allow you to group related changes into transactions, and in the event of errors,
either commit those changes or roll them back to maintain data integrity.
TCL commands are used in combination with Data Manipulation Language (DML) and other
SQL commands to ensure that the database remains in a reliable state despite unforeseen
errors or issues.
JOINS
In a DBMS, a join is an operation that combines rows from two or more tables based on
a related column between them.
Joins are used to retrieve data from multiple tables by linking them together using a
common key or column.
Types of Joins:
14
1. Inner Join
2. Outer Join
3. Cross Join
4. Self Join
An inner join combines data from two or more tables based on a specified condition,
known as the join condition.
The result of an inner join includes only the rows where the join condition is met in
all participating tables.
It essentially filters out non-matching rows and returns only the rows that have
matching values in both tables.
Syntax:
SELECT columns
FROM table1
INNER JOIN table2
ON table1.column = table2.column;
Here:
● columns refers to the specific columns you want to retrieve from the tables.
● table1 and table2 are the names of the tables you are joining.
● column is the common column used to match rows between the tables.
● The ON clause specifies the join condition, where you define how the tables
are related.
Customers Table:
CustomerID CustomerName
1 Alice
2 Bob
3 Carol
Orders Table:
103 2 Headphones
Result:
CustomerName Product
Alice Laptop
Bob Headphones
Carol Smartphone
2) Outer Join
Outer joins combine data from two or more tables based on a specified condition, just
like inner joins. However, unlike inner joins, outer joins also include rows that do not have
matching values in both tables.
Outer joins are particularly useful when you want to include data from one table even if
there is no corresponding match in the other table.
Types:
There are three types of outer joins: left outer join, right outer join, and full outer join.
A left outer join returns all the rows from the left table and the matching rows from the right
table.
16
If there is no match in the right table, the result will still include the left table's row with
NULL values in the right table's columns.
Example:
Result:
CustomerName Product
Alice Laptop
Bob Headphones
Carol Smartphone
NULL Monitor
In this example, the left outer join includes all rows from the Customers table.
Since there is no matching customer for the order with OrderID 103 (Monitor), the
result includes a row with NULL values in the CustomerName column.
A right outer join is similar to a left outer join, but it returns all rows from the right table
and the matching rows from the left table.
If there is no match in the left table, the result will still include the right table's row with
NULL values in the left table's columns.
Result:
17
CustomerName Product
Carol Smartphone
Bob Headphones
NULL Keyboard
Here, the right outer join includes all rows from the Orders table. Since there is no matching
order for the customer with CustomerID 4, the result includes a row with NULL values in
the CustomerName column.
A full outer join returns all rows from both the left and right tables, including matches and
non-matches.
If there's no match, NULL values appear in columns from the table where there's
no corresponding value.
Result:
CustomerName Product
Alice Laptop
Bob Headphones
Carol Smartphone
18
NULL Monitor
In this full outer join example, all rows from both tables are included in the result. Both
non-matching rows from the Customers and Orders tables are represented with
NULL values.
3) Cross Join
A cross join, also known as a Cartesian product, is a type of join operation in a Database
Management System (DBMS) that combines every row from one table with every row
from another table.
Unlike other join types, a cross join does not require a specific condition to match rows
between the tables. Instead, it generates a result set that contains all possible
combinations of rows from both tables.
Cross joins can lead to a large result set, especially when the participating tables have
many rows.
Syntax:
SELECT columns
FROM table1
CROSS JOIN table2;
In this syntax:
● columns refers to the specific columns you want to retrieve from the cross-
joined tables.
● table1 and table2 are the names of the tables you want to combine using a
cross join.
Students Table:
StudentID StudentName
1 Alice
19
2 Bob
CourseID CourseName
101 Maths
102 Science
Result:
StudentName CourseName
Alice Maths
Alice Science
Bob Maths
Bob Science
In this example, the cross join between the Students and Courses tables generates all
possible combinations of rows from both tables. As a result, each student is paired with
each course, leading to a total of four rows in the result set.
4) Self Join
This technique is useful when a table contains hierarchical or related data and you need
to compare or analyse rows within the same table.
20
Self joins are commonly used to find relationships, hierarchies, or patterns within a
single table.
In a self join, you treat the table as if it were two separate tables, referring to them
Syntax:
SELECT columns
FROM table1 AS alias1
JOIN table1 AS alias2 ON alias1.column = alias2.column;
In this syntax:
● columns refers to the specific columns you want to retrieve from the self-joined table.
● table1 is the name of the table you're joining with itself.
● alias1 and alias2 are aliases you assign to the table instances for differentiation.
● column is the column you use as the join condition to link rows from the same table.
Example: Consider an Employees table that contains information about employees and
their managers.
Employees Table:
1 Alice 3
2 Bob 3
3 Carol NULL
4 David 1
Result:
Employee Manager
21
Alice Carol
David Alice
In this example, the self join is performed on the Employees table to find the relationship
between employees and their managers. The join condition connects the ManagerID
column in the e1 alias (representing employees) with the EmployeeID column in the e2 alias
(representing managers).
SET OPERATIONS
Set operations in SQL are used to combine or manipulate the result sets of multiple
SELECT queries.
They allow you to perform operations similar to those in set theory, such as union,
intersection, and difference, on the data retrieved from different tables or queries.
Set operations provide powerful tools for managing and manipulating data, enabling you
to analyse and combine information in various ways.
● UNION
● INTERSECT
● EXCEPT (or MINUS)
● UNION ALL
1. UNION:
The UNION operator combines the result sets of two or more SELECT queries into a
single result set.
It removes duplicates by default, meaning that if there are identical rows in the result sets,
only one instance of each row will appear in the final result.
Example:
22
Customers Table:
1 Alice
2 Bob
Suppliers Table:
SupplierID SupplierName
101 SupplierA
102 SupplierB
UNION Query:
Result:
CustomerName
Alice
Bob
SupplierA
SupplierB
2. INTERSECT:
The INTERSECT operator returns the common rows that exist in the result sets of two or
more SELECT queries.
23
Example: Using the same tables as before.
Result:
CustomerName
In this example, there are no common names between customers and suppliers, so
the result is an empty set.
The EXCEPT operator (also known as MINUS in some databases) returns the distinct
rows that are present in the result set of the first SELECT query but not in the result set of
the second SELECT query.
Result:
CustomerName
Alice
Bob
In this example, the names "Alice" and "Bob" are customers but not suppliers, so
they appear in the result set.
4. UNION ALL:
The UNION ALL operator performs the same function as the UNION operator but does
not remove duplicates from the result set. It simply concatenates all rows from the
different result sets.
24
Example: Using the same tables as before.
Result:
CustomerName
Alice
Bob
SupplierA
SupplierB
25
Tables that are related by
Data Source Result sets of SELECT queries.
duplicates. conditions.
26
Useful for combining and Used to retrieve and relate data
Result sets may have different Result sets can have different
Result Set
column names, but data types and column names, data types, and
Structure
counts must match. counts.
SUB QUERIES
Subqueries, also known as nested queries or inner queries, allow you to use the result
of one query (the inner query) as the input for another query (the outer query).
Subqueries are often used to retrieve data that will be used for filtering, comparison,
or calculation within the context of a larger query.
They are a way to break down complex tasks into smaller, manageable steps.
Syntax:
SELECT columns
27
FROM table
WHERE column OPERATOR (SELECT column FROM table WHERE condition);
In this syntax:
Products Table:
1 Laptop 1000
2 Smartphone 500
3 Headphones 50
Orders Table:
101 1 2
102 3 1
For Example: Retrieve the product names and quantities for orders with a total cost
greater than the average price of all products.
Result:
ProductName Quantity
28
Laptop 2
Not used for combining rows; Combines rows from different tables
Combining Rows
used to filter or evaluate data. based on specified join conditions.
29