Info125 Notes1
Info125 Notes1
Info125 Notes1
And basically manipulate the dataset with all different forms of querying and updating.
DBMS is a general-purpose software system that facilitates the process of defining, constructing,
manipulating, and sharing databases among various users and applications.
DBMS provides efficient, reliable, convenient, and safe multi-user storage of, and access to data.
Data Model
Abstraction: suppressing details of data organization and storage, and highlighting the essential
features for an improved understanding of data.
Model: an abstraction of a real-world object or event
Useful in understanding complexities of the real-world environment
Example: Blue print vs actual product
Data abstraction - different users can look at the data at their desired level of detail
Data model - means to achieve data abstraction
Basic operations - insert, delete, modify, retrieve
Dynamic behaviour - the database designer can specify valid user-defined operations that are
allowed on the database objects.
Data model: a collection of concepts that can be used to describe the structure of a database.
Structure of a database: the data types, relationships, and constraints that apply to the data.
High-level (or conceptual) data models provide concepts that are close to the way
many users perceive data.
Example1: Entity-Relationship
They have Entity (eg.Star), Attribute(eg.colour) and relationship(eg.Is_on_label)
Example 2: Relational, Network and Hierarchal data models
Implementation (representational) data models provide concepts that fall between the
above two, used by commercial DBMS implementations (e.g. relational data models)
Low-level (or physical) models provide concepts that describe details of how data is
stored on the computer storage, such as disks.
These data models provide concepts to describe the details of how data is stored on the
computer storage (such as disk)
Examples: Record formats, record orderings, access path, index . . .
Database Schema
Schema:
database structure
data types
constraints on the database
Schema Diagram: An illustrative display of a database schema:
Schema Construct: a component (or an object) of the schema, such as STUDENT, COURSE.
Three-Schema Architecture
Schema can be defined at the 3 levels:
External schema describes the various user views. Typically
uses the same data model as the conceptual schema.
Conceptual schema describes the structure and constraints
for the whole database.
Typically uses conceptual data model.
Database State
Empty State: when the database has no data.
This is when we define a new database, and specify its database schema to the DBMS.
Initial State: when database is initially loaded with some data.
Every time the database is updated, we get another database state called current state
Valid State: is a state that satisfies the structure and constraints specified in the schema
DBMS refer to the schema whenever it needs to.
Database schema does not change frequently while Database state changes every time the
database is updated. DBMS stores the descriptions of the schema constructs and constraints (called
meta-data) in the DBMS catalog
Database Design
Requirements Collection and Analysis:
Data requirements: what kind of data is needed
o database designers interview users to understand their data requirements.
Functional requirements: what operations is performed
is a popular high-level conceptual data model, frequently used for the conceptual
design of database applications.
ER diagrams: diagrammatic notation associated with the Entity Relationship model.
Entity Relationship model describes the data by: focusing on mini-world, and not on
the database and not on the storage details
Main constructs in the ER model:
Entities represents a “thing” in the mini-world with an independent existence
Example: person, music, computer, address, employer, movie, product, student,
book, etc.
Relationships
Attributes : A property of an entity, entity type or a relationship type.
Example: name of a student, salary of an employee, color of a book, location of a
house, ...
Each attribute has a value set (or data type) such as integer, string, date, ...
Common Data Types
Name Syntax Description
Integer Integer 32-Bit signed integer values between -2^31 and 2^31
Double Double 64-Bit floating point values of approximate precision
Numeric Numeric(p,s A number with p digit before the decimal and s digitals after the decimal
)
Character Char(x) A textual string of the exact length x
Varying Character Varchar(x) A textual string of the maximum length x
Date Date Stores year, month, and day
Time time Stores hour, minute, and second values
Entity Type: a collection (or set) of entities that have the same attributes.
Attributes of an entity type are represented with ovals and are attached to their
entity type by straight lines.
Entity Set: the collection of all entities of a particular entity type in the database at a
point in time. Usually referred to using the same name as the entity type
Example: EMPLOYEE refers to both entity type and the set of all employee entities
(entity set).
Types of Attributes:
(a) Simple or Composite
Composite attribute can be composed of several components.
Example: FullName ( FirstName, LastName )
Simple attribute (atomic) cannot be divided into smaller parts.
Example: Social Security Number (SSN) or gender
Example: a street address can also be decomposed into street number, street name,
and apartment number
Multivalued attributes may have lower and upper bounds to constrain how many
values there can be assigned for an attribute
Example: an EMPLOYEE entity type can be identified with the key attribute of Ssn
(Social Security number).
Domain of Attribute
Value Sets (Domains) of Attributes specify which values the attributes can take
Example 1: Gender takes values from { female, male }
Example 2: Age can only take integers from 16 to 70
Data types: Boolean, integer, string, date, time, ...
Attribute A of Entity Set E whose Value Set is V can be defined as a Function
Function maps E to the Power Set P(V ) of V:
Mathematically: A : E -> P(V)
Power set is the set of all subsets
A is attribute
E is Entity set
P is Value Set of Attribute
The value of Attribute A for Entity e is A(e)
Example: name(e) = John
Relationship
In ER diagrams, Relationship Types are displayed as diamond-shaped boxes,
connected by straight lines to participating entity types.
Example: EMLPOYEE works_for DEPARTMENT
Cardinality Ratio
Main types of Constraints in binary relationship:
(a) Cardinality Ratio: specifies the max number of relationship instances that an entity can
participate.
example
example
DEPARTMENT : EMPLOYEE
is of cardinality ratio
Possible cardinality ratios for binary relationship types:
o 1:1 o 1:N
o N:1 o M:N
(min, max) notation: means that each entity e in the entity set E must participate in
at least min entries of the relationship, and at most max.
“1” Entity Type A can relate to “1..N” Entity Type B
Example: “1” DEPARTMENT can employ “1..N” EMPLOYEEs
Note: (min, max) notation has the values reversed compared to the 1-N relationship.
Note: indicates “no limit”
*
Min value of 1 indicates Total Participation
Participation Constraint
Main types of Constraints in binary relationship:
(b)Participation Constraint: specifies the minimum number of relationship instances
that an entity can participate in.
It is also called Minimum Cardinality Constraint
Participation constraints can be:
(a) Total (Existence dependency) is displayed as a double line connecting the
participating entity type to the relationship.
means that every entity in the total set of entities must participate in at least one
relationship in the relationship set. It specifies whether the existence of an entity
depends on its being related to another entity via the relationship type.
Example: if every EMPLOYEE must work for a DEPARTMENT, then,
participation of EMPLOYEE in works_for is a Total participation.
Key Attribute
Key Attribute: one or more attributes of an entity type whose values are distinct for
each individual entity in the entity set.
Key attribute are represented underlined inside oval.
Entity Type that has key attributes of its own is called Strong Entity Types.
Key attribute are represented underlined inside oval.
Entity Type that does not have key attributes of its own is called Weak Entity Types.
Weak Entity Type is represented by a double line box.
example:
Example: suppose CUSTOMER never receives two LOANs at the same Date. So Date can be
the partial key for LOAN.
Partial Key is underlined with a dashed line.
Owner (or identifying) Entity Type is an entity type that is related to Weak Entity
Types through combination of some of their attribute values.
Weak entity can not be identified without owner entity.
Ternary to Binary
Example: SUPPLY Ternary Relationship (s, j, p) Example: SUPPLY as Binary Relationships (s,j), (j,p),
(s,p)
Ternary relationship type represents different
information than do three binary relationship
types.
Inheritance Specialization
Entities in Subclass inherit attributes from Example: the process of defining an Army and
Superclass. Civilian subclasses of a Person superclass.
Subclass can have attributes that
superclass does not.
Specialization vs Generalisation
Specialization allows us to: Generalization: process of suppressing the
-define a set of subclasses of an entity type differences among several entity types, identify
-establish specific attributes for each subclass their common features, to generalize them into a
-establish specific relationship types between single superclass of them.
subclasses and other entity types -Generalization is a bottom-up conceptual
synthesis
-Generalization is the inverse process of the
Specialization
Example: CAR and TRUACK have several common attributes, such as Price, Vehicle_id, and
License_plate_no
Example: they can be generalized into a superclass entity type VEHICLE.
Example:
CAR, and TRUCK is the specialization of VEHICLE VEHICLE is the generalization of CAR and
TRUCK.
Notation
A relational schema R of degree n is R(A1 , ..., An)
Uppercase letters R, Q, S denote relation names
Lowercase letters q, r, s denote relation states
Lowercase letters t, u, v denote tuples
Relation (or relation state) r of the relation schema R(A1, ..., An) also denoted by
r(R), is a set of n-tuples
r = { t1, ..., tm }
Relational Model
Several attributes can have the same domain while attribute names indicate
different roles for domain.
Example: in a relation:
Home_numbers may refer to the role of Home_phone Office_phone may refer to
the office phone
Mobile_phone may refer to the role of mobile phone
Relation (or relation state) r of the relation schema R(A1, ..., An) also denoted by r(R), is a
set of n-tuples
Notation
Relation . Attribute
An n-tuple t in r(R) is denoted t=<v1,... , vn>
t[Ai] and t.Ai and t[i] refer to value vi for Ai in t
Schema-based constraints
(a) Domain constraints: specify that all attribute values must be atomic within each tuple
and obtained from their respective domain.
Data types associated with domains typically include standard numeric data types for
integers, characters, booleans, strings, date, time, timestamp, and even money, etc.
(b) Constraints on NULLs
(c)Entity integrity constraints
(d) Referential integrity constraints
(e)Key constraints: means that no two tuples can have the same combination of values for
all their attributes.
Relation is defined as a set of tuples, and all elements of a set are distinct. Hence, all tuples
in a relation must also be distinct.
Superkey (SK) is a selection of attributes that always have a unique combination of values
for all tuples.
Example
Superkey is a selection of attributes that always have a unique combination of values for all
tuples.
(e)Key constraints
Superkey: of the relation schema R(A1, ..., An) is a set {B1, ..., Bn} ⊆ {A1, ..., An}, such that
for any relational state r(R) we have that:
t[B1, ..., Bn] ≠ t’[B1, ..., Bn]
t1 [Ssn,Name] ≠ t2 [Ssn,Name] ≠ t3 [Ssn,Name]
Key (K) is a superkey when removing any attribute A from K leaves a set of attributes that is
not a superkey anymore.
If more keys exist, they are all called candidate keys. But the one that is used is called
Primary Key!
Key Constraints
It is common to designate one of the candidate keys as the Primary Key and
represented with underline.
This is the candidate key whose values are used to identify tuples in the relation.
Entity Integrity Constraint states that no Primary Key value can be NULL.
This is because the Primary Key value is used to identify individual tuples in a
relation and with a NULL value, we cannot identify some tuples.
Relational Database Schema S is a set of relational schemas S = {R1, R2, ..., Rn} and
a set of Integrity Constraints (IC).
Relational database State DB of S is a set of relational states DB={r1, r2, ..., rm} such
that ri is a state of Ri and ri relation states satisfy the Integrity Constraints (IC).
Referential integrity constraint states that a tuple in one relation that refers to
another relation must refer to an existing tuple in that relation (consistency)
Let R1 and R2 be relational schemas. Let FK be a selection of attributes from R1, and
let PK be the primary key in R2.
Operations:
Insert <‘Cecilia’, ‘F’, ‘Kolonsky’, ‘677678989’, Insert <‘Alicia’, ‘J’, ‘Zelaya’, ‘999887777’, ‘1960-04-05’,
‘1960-04-05’, ‘6357 Windy, Katy, TX’, F, 28000, ‘6357 Windy Lane, Katy, TX’, F, 28000,
NULL, 4> into EMPLOYEE. ‘987654321’, 4> into EMPLOYEE.
Result: accepted. This insertion satisfies all Result: rejected. This insertion violates the key
constraints. constraint since another tuple with the same Ssn exists
Insert can violate: in the EMPLOYEE relation
1. Domain constraints when giving an attribute
value that is not in the domain for that attribute Insert can violate:
2. Key constraints when giving an already 3. Entity integrity: when any part of the primary key of
existing key value the inserted tuple is NULL
Insert <‘Cecilia’, ‘F’, ‘Kolonsky’, NULL, ‘1960-04- Insert <‘Cecilia’, ‘F’, ‘Kolonsky’, ‘677678989’, ‘1960-04-
05’, ‘6357 Windy Lane, Katy, TX’, F, 28000, 05’, ‘6357 Windswept, Katy, TX’, F, 28000,
NULL, 4> into EMPLOYEE. ‘987654321’, 7> into EMPLOYEE.
Result: rejected. This is a violation of referential integrity
Result: rejected. This insertion violates entity constraint specified on Dno in EMPLOYEE as no
integrity constraint (NULL for the primary key referenced tuple exists in DEPARTMENT with
Ssn). Dnumber = 7
Insert can violate: Correction: DBMS could either ask the user to change
4. Referential integrity: when value of any the value of Dno to a valid value (or NULL), or to insert a
foreign key in inserted tuple does not exist in DEPARTMENT tuple with Dnumber = 7
referenced tuple If insertion violates a constraint, the default is to
reject.
But would be useful if the DBMS could provide a
reason to the user as to why the insertion was
rejected.
Another option is to attempt to correct the
reason for rejecting the insertion, but typically
not for violations caused by Insert
Delete
Delete is an operation that is used to destroy tuples from the table.
Delete the WORKS_ON tuple with Essn = Delete the EMPLOYEE tuple with Ssn = ‘999887777’.
‘999887777’ and Pno = 10. Result: not acceptable, because there are tuples in
Result: accepted. exactly one tuple is deleted. WORKS_ON that refer to this tuple. Hence, if the tuple
in EMPLOYEE is deleted, referential integrity violations
Delete can violate only referential integrity will result.
Example:
Possible responses from the DBMS: Acknowledge Correction: DBMS can automatically delete offending
Delete and leave database unchanged tuples from WORKS_ON with Essn = ‘999887777’.
Delete t2 and replace reference values in t1 with
NULL
Delete both t2 and t1 (and any tuples that
reference t1)
Update the Dno of the EMPLOYEE tuple with Ssn Update the Dno of the EMPLOYEE tuple with Ssn =
= ‘999887777’ to 1. ‘999887777’ to 7
Result: accepted Result: not acceptable. because it violates referential
Updating an attribute that is not part of a integrity.
primary key nor of a foreign key usually causes
no problems Update the Ssn of the EMPLOYEE tuple with Ssn =
Upon modifying a foreign key it must be verified ‘999887777’ to ‘987654321’.
that the new value refers to existing tuple or is Result: not acceptable. it violates primary key
set to NULL constraint by repeating a value that already exists as a
primary key in another tuple; it violates referential
DBMS needs to check and confirm that the new integrity constraints because there are other relations
value is of the correct data type and domain. that refer to the existing value of Ssn.
Modifying a primary key value is similar to
deleting one tuple and inserting another in its
place because we use the primary key to identify
tuples.
Types of Collections
Creating Domains
Numeric
Character-string
Bit-string
Boolean
Date
Timestamp
Interval
Database Design
(a)This lecture focuses on the logical database design process, which is also known as data
model mapping.
EER = ER + Subclass - Superclass
Specialization - Generalization
Category - Union type
mapping
(b) 2 steps: EER model constructs-- -> Relations
mapping
SELECT σ is used to chose a subset of the SELECT σ is used to chose a subset of the tuples from a
tuples from a relation that satisfy a selection relation that satisfy a selection condition
condition Example
σ (Employee)
Salary>30000
Equivalent to:
SELECT *
FROM EMPLOYEE
WHERE Salary>25000;
Question:
If |σc(R)| is the number of tuples in the resulting
relation, then: |σc(R)|≤|R|
PROJECT SELECT σ is used to retrieve rows
PROJECT π is used to chose a subset of the PROJECT π is used to retrieve columns
attribute values from all the tuples from a Nested SELECTs
relation σ <cond1>( σ<cond2>(---( σ<condn>(R))…)) = σ<cond1>
AND<cond2>AND…AND<condn>(R)
Union Operation
Union operators, same as in sets
Union: denoted R ∪ S, includes all the unique tuples that are in R, in S, or in both R
and S.
Use Operation in Relational Algebra to find SSN of employees in department 5 and their
supervisors
Answer
Definition of NATURAL JOIN requires that the two join attributes having the same name in
both relations.
If not the case, a renaming operation is applied first.
Additional Operations
Generalised Projection: extends the projection
operation by allowing functions of attributes to be
included in the projection list
Example: EMPLOYEE(Ssn, Salary, Deduction, Years_service)
Aggregate functions: used in basic statistical queries that aggregate date from the database
tuples . Example fig 2:
Good Design
Goals of the design activity: information preservation while minimum redundancy.
1st Normal Form 2nd Normal Form
All rows must be unique(no duplicates) Database must be in 1st Normal Form
Each cell must contain a single value (not a list) Non partial dependency - All non-prime attributes
Each value should be non divisible(can't be split should be fully functionally dependent on the
down further) candidate key
Example:
key for the relation is {Code, Manager} and the Functional
Dependencies are:
{Code, Manager} --> Type
Type --> Users
Users --> City
Code --> City, Type
Manager --> StartYear
3rd Normal Form Boyce-codd Normal Form
Database must be in 1st, and 2nd Normal Form Database must be in 1st, 2nd and 3rd Normal Form
No transitive dependency - All fields must only For any dependency A->B, A should be a super key
be determinable by the primary/composite key, A can not be a non prime attribute while B is prime
not by other keys attribute in A->B
non prime er avhengig av en annen non prime
er ikke 3rd normal form
mydb.commit()
Memory
Memory is used to store information within a computer, either programs or data.
Memory hierarchy by considering:
o Cost per storage unit
o Access speed
o Reliability
Memory hierarchy aims to keep the data that is:
o Most accessed on top (so it is accessible quickly)
o Least accessed at the bottom
Main Memory & Cache Memory refers as internal memory placed at the main
board. This memory communicates directly with CPU.
Secondary & Tertiary Memory refers as external memory (or auxiliary memory)
because it is not located at the main board (for back-up purpose).
Memory Hierarchy
Primary Storage level Main memory (DRAM: Dynamic RAM),
[cache] Cashe is a fast accessible data (static RAM) provides the main work area for the CPU
used by the CPU to speed up execution of for keeping program instructions data.
[main memory] program instruction. Advantage is its relative low cost, which
Advantage is that it is very fast memory continues to decrease Disadvantage is its
and needs least access time. volatility and lower speed compared with
Disadvantage is its very limited capacity cache
and high cost
Flash memory is high- density, and also Magnetic Disk is type of memory which is
[Flash Memory] high- performance memory, using a flat disc covered with magnetic coating,
electrically erasable programmable read- used to store programs or files
[Magnetic Disk] only memory technology (EEPROM). Advantage is that it is less expensive than
Advantage is the fast access speed RAM and can store large amounts of data,
Disadvantage is that an entire data block Disadvantage is that data access is slower
must be erased and written over than main memory
simultaneously
Optical Drives are a form of optical Magnetic Tapes are a type of magnetic
[Optical disk] removable storage, and example is DVD memory where one side of tape has
(Digital Versatile Disc). magnetic coated material and used
[Magnetic tapes] Advantage is is that it allows mass Advantage is very low cost that makes
replication them ideal for database backup
Disadvantage is the data is sequentially Disadvantage is that they are slow and
accessed. difficult to make updates on data
Memory Access
Random Access: every memory location can be accessed directly rather than accessed in
sequence.
Access time is independent of location or previous access.
Example: RAM
Sequential Access: start at the beginning and read through in order
Access time depends on location of data and previous location Example: tape
Double Buffer
o Double buffering: a technique whereby we can gain efficiency in terms of
performing the I/O operation between the disk and main memory.
o I/O operation is done in one buffer area concurrently while processing the data from
another buffer
o Using multiple buffers —> efficient performance
o Using multiple buffers can increase efficiency by allowing the CPU processes data in
one buffer while the disk transmits new data to another buffer.
o While one buffer is being read or written, the CPU can process data in the other
buffer.
o Done by an independent disk I/O processor can proceed to transfer a data block
between memory and disk in parallel to CPU processing.
Database Storage
o Databases typically store large amounts of data that must persist over long periods
of time, and hence the data is often referred to as persistent data.
o In contrasts, transient data persists for only a limited time during program
execution.
Records
o Data in a database is stored in the form of records.
o Record consists of a number of fields containing data values.
Files
A file is a sequence of records
o Fixed-length records file: every record in the file has exactly the same size in bytes
o Variable-length records file: the file records are of the same type, but one or more
of the fields are of varying size
Record Blocking
Spanned organisation: one record in several blocks
Unspanned organisation: no “spill-over” of records across blocks
Organising Files Methods for organizing records of a file retrieval & update is optimised:
Heap files : Sorted files :We can physically order the
o Simplest organization where records are records of a file based on the values of one of
placed in the file in the order in which they the fields (ordering field).
are inserted. This leads to an ordered file.
o Hence, the new records are inserted at the
end of file.
o As you may know, this organization is
called a Heap (Pile) file
Advantage:
Inserting a new record is efficient.
Disadvantage:
Searching for a record is expensive since it
If the ordering field is also a key field of the
involves a linear search through the file block by file, the field is guaranteed to have a unique
block.
value in each record, then the field is called
Deletion is expensive the ordering key.
Advantages:
o reading the records becomes efficient
o finding the next record from the
current one usually requires no
additional block accesses
o using a search condition based on the
value of the ordering key field results
in faster access with binary search
Disadvantages:
o no benefit when searching by non
ordering fields, inserting and
deleting is expensive (in time)
Open addressing :
checking the subsequent positions in
order until an empty slot is found
Multiple hashing :
a second (and third...) hash is applied to
the results of the first hash
Good Hash Function
Distribute records uniformly over the
address space to minimise collisions,
while not leaving many empty spaces
Hash table is better to be kept 70% to
90% full
Choose a prime number for m, it
distributes the hash addresses better
over the address space when hashing
function (mod) is used
External Hashing Dynamic Hashing
To suit the characteristics of disk storage, the target Hashing scheme described so far is called
address space is made of buckets, each of which holds static hashing because a fixed number of
multiple records. buckets m is allocated.
Hashing function maps a key into a relative bucket We are fixing the address space which can be
number rather than assigning an absolute block address a serious drawback for dynamic files.
to the bucket Example: if the number of records increases
Hashing for disk files is called external hashing to a lot, many collisions will result in and
Organising Files on Disk retrieval will be slowed down because of the
Cluster is a number of blocks that are consecutive long of overflow records.
on the storage medium. h(K) =K mod m
Bucket is either one disk block or a cluster of We need a way to make m a variable.
contiguous disk blocks. What if we take a h(K) not to produce the
number of the row in the table, but a binary
Extendible Hash Tables number.
d What if we use this binary number as the row
Extendible Hashing, uses an array of 2 bucket number!
addresses (called directory) with d factor (global Internal nodes: that have two
depth of directory).
Integer value corresponding to the first (high-
order) d bits of a hash value is used as an index to
the array to determine a directory entry pointers:
Address in that entry determines the bucket in
which the corresponding records are stored o left pointer corresponds to 0
Local depth d′ specifies the number of bits on bit (in hashed address)
which the bucket contents are based
When d = d′, the number of entries in the
directory doubles, if a bucket overflows. o right pointer corresponds to
Halving occurs if d > d′ for all the buckets after the 1 bit
some deletions occur.
Leaf nodes: that hold pointer to
SQL SELECT
SELECT column1, column2, ... selects the "CustomerName" and "City" columns from the
FROM table_name; "Customers" table:
SELECT CustomerName, City FROM Customers;
SELECT * FROM table_name; selects all the columns from the "Customers" table:
SELECT * FROM Customers;
SELECT DISTINCT column1, column2, ... selects only the DISTINCT values from the
FROM table_name; "Country" column in the "Customers" table:
SELECT DISTINCT Country FROM Customers;
Lists the number of different (distinct)
customer countries:
SELECT COUNT(DISTINCT Country) FROM
Customers;
ORDER BY Syntax
SELECT column1, column2, ... selects all customers from the "Customers" table,
FROM table_name sorted by the "Country" column:
ORDER BY column1, column2, ... ASC|DESC; SELECT * FROM Customers
ORDER BY Country;
1.Specify both the column names and the values 2.If you are adding values for all the columns of
to be inserted: the table, you do not need to specify the column
names in the SQL query. However, make sure
INSERT INTO table_name (column1, column2, the order of the values is in the same order as the
column3, ...) columns in the table. Here, the INSERT INTO
VALUES (value1, value2, value3, ...); syntax would be as follows:
INSERT INTO Customers (CustomerName, This will insert a new record, but only insert data
ContactName, Address, City, PostalCode, in the "CustomerName", "City", and "Country"
Country) columns (CustomerID will be updated
VALUES ('Cardinal', 'Tom B. Erichsen', 'Skagen automatically):
21', 'Stavanger', '4006', 'Norway');
INSERT INTO Customers (CustomerName,
City, Country)
VALUES ('Cardinal', 'Stavanger', 'Norway');
NULL VALUE
A field with a NULL value is a field with no value.If a field in a table is optional, it is
possible to insert a new record or update a record without adding a value to this field. Then,
the field will be saved with a NULL value.
It is not possible to test for NULL values with comparison operators, such as =, <, or <>.
We will have to use the IS NULL and IS NOT NULL operators instead.
SELECT column_names SELECT column_names
FROM table_name FROM table_name
WHERE column_name IS NULL; WHERE column_name IS NOT NULL;
lists all customers with a NULL value in the lists all customers with a value in the "Address"
"Address" field: field:
SELECT CustomerName, ContactName, SELECT CustomerName, ContactName, Address
Address FROM Customers
FROM Customers WHERE Address IS NOT NULL;
WHERE Address IS NULL;
UPDATE Customers
SET ContactName='Juan'
WHERE Country='Mexico';
SQL JOIN
(INNER) JOIN: Returns records that have matching values in both tables
LEFT (OUTER) JOIN: Returns all records from the left table, and the matched records from
the right table
RIGHT (OUTER) JOIN: Returns all records from the right table, and the matched records from
the left table
FULL (OUTER) JOIN: Returns all records when there is a match in either left or right table
SELECT column_name(s) selects all orders with customer information:
FROM table1 SELECT Orders.OrderID, Customers.CustomerName
INNER JOIN table2 FROM Orders
ON table1.column_name = table2.column_name; INNER JOIN Customers ON Orders.CustomerID =
Customers.CustomerID;
selects all orders with customer and shipper The LEFT JOIN keyword returns all records from
information: three tables the left table (table1), and the matching records from
SELECT Orders.OrderID, the right table (table2). The result is 0 records from
Customers.CustomerName, Shippers.ShipperName the right side, if there is no match.
FROM ((Orders
INNER JOIN Customers ON Orders.CustomerID = SELECT column_name(s)
Customers.CustomerID) FROM table1
INNER JOIN Shippers ON Orders.ShipperID = LEFT JOIN table2
Shippers.ShipperID); ON table1.column_name = table2.column_name;
select all customers, and any orders they might The RIGHT JOIN keyword returns all records from
have: the right table (table2), and the matching records
SELECT Customers.CustomerName, Orders.OrderID from the left table (table1). The result is 0 records
FROM Customers from the left side, if there is no match.
LEFT JOIN Orders ON Customers.CustomerID =
Orders.CustomerID SELECT column_name(s)
ORDER BY Customers.CustomerName; FROM table1
RIGHT JOIN table2
ON table1.column_name = table2.column_name;
will return all employees, and any orders they The FULL OUTER JOIN keyword returns all
might have placed: records when there is a match in left (table1) or right
SELECT Orders.OrderID, Employees.LastName, (table2) table records.
Employees.FirstName
FROM Orders SELECT column_name(s)
RIGHT JOIN Employees ON Orders.EmployeeID = FROM table1
Employees.EmployeeID FULL OUTER JOIN table2
ORDER BY Orders.OrderID; ON table1.column_name = table2.column_name
WHERE condition;
selects all customers, and all orders: A self join is a regular join, but the table is joined
SELECT Customers.CustomerName, Orders.OrderID with itself.
FROM Customers
FULL OUTER JOIN Orders ON SELECT column_name(s)
Customers.CustomerID=Orders.CustomerID FROM table1 T1, table1 T2
ORDER BY Customers.CustomerName; WHERE condition;
DATABASES
CREATE cann be used to create schemas, The DROP DATABASE statement is used to drop an
tables, and domains, etc. existing SQL database.
SQL Commands
SQL command that lists the databases that have been created: SHOW
DATABASES;