UNIT 1: Database System Concepts & Architecture: o o o o o o
UNIT 1: Database System Concepts & Architecture: o o o o o o
UNIT 1: Database System Concepts & Architecture: o o o o o o
Data Abstraction is a process of hiding unwanted or irrelevant details from the end user. It provides a
different view and helps in achieving data independence which is used to enhance the security of data.
The database systems consist of complicated data structures and relations. For users to access the data
easily, these complications are kept hidden, and only the relevant part of the database is made accessible
to the users through data abstraction.
Database Language
o A DBMS has appropriate languages and interfaces to
express database queries and updates.
o Database languages can be used to read, store and
update the data in the database.
Types of Database Language
1. Data Definition Language
o DDL stands for Data Definition Language. It is used to define database structure or pattern.
o It is used to create schema, tables, indexes, constraints, etc. in the database.
o Using the DDL statements, you can create the skeleton of the database.
o Data definition language is used to store the information of metadata like the number of tables and
schemas, their names, indexes, columns in each table, constraints, etc.
Here are some tasks that come under DDL:
o Create: It is used to create objects in the database.
o Alter: It is used to alter the structure of the database.
o Drop: It is used to delete objects from the database.
o Truncate: It is used to remove all records from a table.
o Rename: It is used to rename an object.
o Comment: It is used to comment on the data dictionary.
2. Data Manipulation Language
DML stands for Data Manipulation Language. It is used for accessing and manipulating data in a database.
It handles user requests.
Here are some tasks that come under DML:
o Select: It is used to retrieve data from a database.
o Insert: It is used to insert data into a table.
o Update: It is used to update existing data within a table.
o Delete: It is used to delete all records from a table.
o Merge: It performs UPSERT operation, i.e., insert or update operations.
o Call: It is used to call a structured query language or a Java subprogram.
o Explain Plan: It has the parameter of explaining data.
2
o Lock Table: It controls concurrency.
3. Data Control Language
o DCL stands for Data Control Language. It is used to retrieve the stored or saved data.
o The DCL execution is transactional. It also has rollback parameters.
(But in Oracle database, the execution of data control language does not have the feature of rolling back.)
Here are some tasks that come under DCL:
o Grant: It is used to give user access privileges to a database.
o Revoke: It is used to take back permissions from the user.
There are the following operations which have the authorization of Revoke:
CONNECT, INSERT, USAGE, EXECUTE, DELETE, UPDATE and SELECT.
4. Transaction Control Language
TCL is used to run the changes made by the DML statement. TCL can be grouped into a logical
transaction.
Here are some tasks that come under TCL:
o Commit: It is used to save the transaction on the database.
o Rollback: It is used to restore the database to original since the last Commit.
Network Model
A network model is nothing but a generalization of the hierarchical data model as this data model allows
many to many relationships therefore in this model a record can also have more than one parent.
5
The network model can be represented as a graph and hence it replaces the hierarchical tree with a graph
in which object types are the nodes and relationships are the edges.
For example -
Here you can see all the three departments are linked with
the director which was not possible in the hierarchical data
model. In the network model, there can be many possible
paths to reach a node from the root node (College is the
root node in the above case), therefore the data can be
accessed efficiently when compared to the hierarchical
data model. But, on the other hand, the process of insertion
and deletion of data is quite complex.
Object-Oriented Data model
As suggested by its name, the object-oriented data model is a combination of object-oriented programming,
and relational data model. In this data model, the data and their relationship are represented in a single
structure which is known as an object. Since data is stored as objects, we can easily store audio, video,
images, etc in the database which was very difficult and inconvenient to do in the relational model. As
shown in the image below two objects are connected with each other through links.
Aggregation
Aggregation is a process that represent a relationship between a whole
object and its component parts.
It abstracts a relationship between objects and viewing the relationship
as an object.
It is a process when two entity is treated as a single entity.
1. Primary key
o It is the first key used to identify one and only one instance of an entity uniquely. An entity can
contain multiple keys, as we saw in the PERSON table. The key which is most suitable from those
lists becomes a primary key.
o In the EMPLOYEE table, ID can be the primary key since it is unique for each employee. In the
EMPLOYEE table, we can even select License Number and Passport Number as primary keys
since they are also unique.
o For each entity, the primary key selection is based on requirements and developers.
8
2. Candidate key
o A candidate key is an attribute or set of attributes that can uniquely identify a tuple.
o Except for the primary key, the remaining attributes are considered a candidate key. The candidate
keys are as strong as the primary key.
For example: In the EMPLOYEE table, id is best suited for the primary key. The rest of the attributes, like
SSN, Passport Number, License Number, etc., are considered a candidate key.
3. Super Key
Super key is an attribute set that can uniquely identify a tuple. A super key is a superset of a candidate key.
4. Foreign key
o Foreign keys are the column of the table used to point to the primary key of another table.
o Every employee works in a specific department in a company, and employee and department are
two different entities. So we can't store the department's information in the employee table. That's
why we link these two tables through the primary key of one table.
o We add the primary key of the DEPARTMENT table, Department_Id, as a new attribute in the
EMPLOYEE table.
o In the EMPLOYEE table, Department_Id is the foreign key, and both the tables are related.
9
Integrity Constraints
o Integrity constraints are a set of rules. It is used to maintain the quality of information.
o Integrity constraints ensure that the data insertion, updating, and other processes have to be
performed in such a way that data integrity is not affected.
o Thus, integrity constraint is used to guard against accidental damage to the database.
Constraints that are applied in the data model is called Implicit constraints.
Constraints that are directly applied in the schemas of the data model, by specifying them in the DDL (Data
Definition Language). These are called as schema-based constraints or Explicit constraints.
Types of Integrity Constraint
1. Domain constraints
o Domain constraints can be defined as the definition of a valid set of values for an attribute.
o The data type of domain includes string, character, integer, time, date, currency, etc. The value of
the attribute must be available in the corresponding domain.
Example:
Authorization
Authorization is a privilege provided by the Database Administer. Users of the database can only view the
contents they are authorized to view. The rest of the database is out of bounds to them.
The different permissions for authorizations available are:
Primary Permission - This is granted to users publicly and directly.
Secondary Permission - This is granted to groups and automatically awarded to a user if he is a
member of the group.
Public Permission - This is publicly granted to all the users.
Context sensitive permission - This is related to sensitive content and only granted to a select
users.
The categories of authorization that can be given to users are:
System Administrator - This is the highest administrative authorization for a user. Users with this
authorization can also execute some database administrator commands such as restore or upgrade
a database.
System Control - This is the highest control authorization for a user. This allows maintenance
operations on the database but not direct access to data.
System Maintenance - This is the lower level of system control authority. It also allows users to
maintain the database but within a database manager instance.
System Monitor - Using this authority, the user can monitor the database and take snapshots of it.
Database Integrity
Data integrity in the database is the correctness, consistency and completeness of data. Data integrity is
enforced using the following three integrity constraints:
Entity Integrity - This is related to the concept of primary keys. All tables should have their own
primary keys which should uniquely identify a row and not be NULL.
11
Referential Integrity - This is related to the concept of foreign keys. A foreign key is a key of a
relation that is referred in another relation.
Domain Integrity - This means that there should be a defined domain for all the columns in a
database.
Availability
In order for an information system to be useful it must be available to authorized users. Availability
measures protect timely and uninterrupted access to the system. Some of the most fundamental threats to
availability are non-malicious in nature and include hardware failures, unscheduled software downtime and
network bandwidth issues. Malicious attacks include various forms of sabotage intended to cause harm to
an organization by denying users access to the information system.
The availability and responsiveness of a website is a high priority for many business. Disruption of website
availability for even a short time can lead to loss of revenue, customer dissatisfaction and reputation
damage. The Denial of Service (DoS) attack is a method frequently used by hackers to disrupt web
service. In a DoS attack, hackers flood a server with superfluous requests, overwhelming the server and
degrading service for legitimate users. Over the years, service providers have developed sophisticated
countermeasures for detecting and protecting against DoS attacks, but hackers also continue to gain in
sophistication and such attacks remain an ongoing concern.
Availability countermeasures to protect system availability are as far ranging as the threats to
availability. Systems that have a high requirement for continuous uptime should have significant hardware
redundancy with backup servers and data storage immediately available. For large, enterprise systems it is
common to have redundant systems in separate physical locations. Software tools should be in place to
monitor system performance and network traffic. Countermeasures to protect against DoS attacks include
firewalls and routers.
Relational Algebra
Relational algebra is a procedural query language, which takes instances of relations as input and yields
instances of relations as output. It uses operators to perform queries. An operator can be either unary or
binary. They accept relations as their input and yield relations as their output. Relational algebra is
performed recursively on a relation and intermediate results are also considered relations.
The fundamental operations of relational algebra are as follows -
Select
Project
Union
Set different
Cartesian product
Rename
Relational Calculus
In contrast to Relational Algebra, Relational Calculus is a non-procedural query language, that is, it tells
what to do but never explains how to do it.
Relational calculus exists in two forms -
Tuple Relational Calculus (TRC)
Domain Relational Calculus (DRC)
12
Following are some of the important differences between Relational Algebra and Relational Calculus.
Sr. No. Key Relational Algebra Relational Calculus
Objective Relational Algebra targets how to obtain Relational Calculus targets what result
2
the result. to obtain.
Order Relational Algebra specifies the order in Relational Calculus specifies no such
3
which operations are to be performed. order of executions for its operations.
Advantages of SQL
There are the following advantages of SQL:
High speed
Using the SQL queries, the user can quickly and efficiently retrieve a large amount of records from a
database.
No coding needed
In the standard SQL, it is very easy to manage the database system. It doesn't require a substantial amount
of code to manage the database system.
Well defined standards
Long established are used by the SQL databases that are being used by ISO and ANSI.
Portability
SQL can be used in laptop, PCs, server and even some mobile phones.
Interactive language
SQL is a domain language used to communicate with the database. It is also used to receive answers to
the complex questions in seconds.
Multiple data view
Using the SQL language, the users can make different views of the database structure.
Create Table
The CREATE TABLE statement is used to create a new table in a database. In that table, if you want to add
multiple columns, use the below syntax.
Syntax
1. CREATE TABLE table_name (
2. column1 datatype,
3. column2 datatype,
4. column3 datatype,
5. ....
6. );
The column parameters specify the names of the columns of the table.
The data type parameter specifies the type of data the column can hold (e.g. varchar, integer, date, etc.).
Create Table Example
1. CREATE TABLE Employee(
2. EmpId int,
3. LastName varchar(255),
4. FirstName varchar(255),
5. Address varchar(255),
6. City varchar(255)
7. );
The EmpId column is of type int and will hold an integer.
14
The LastName, FirstName, Address, and City columns are of type varchar and will hold characters and the
maximum length for these fields is 255 characters.
Update Table
The UPDATE statement is used to modify the existing records in a table.
Syntax
1. UPDATE table_name
2. SET column1 = value1, column2 = value2, ...
3. WHERE condition;
Example
1. UPDATE Employee
2. SET FirstName= 'KS', City= 'Pune'
3. WHERE EmpId= 1;
15
If the above query is executed then for EmpId= 1, "Firstname" and "City" column data will be
updated.
It is the WHERE clause that determines how many records will be updated.
1. UPDATE Employee
2. SET City='Pune'
Delete Statment in SQL
The DELETE statement is used to delete existing records in a table for a particular Record.
Syntax
1. DELETE FROM table_name WHERE condition;
Example
1. DELETE FROM Employee WHERE EmpId=1;
In Employee table EmpId = 1 record gets deleted.
Data redundancy occurs in a relational database when two or more rows or columns have the same value
or repetitive value leading to unnecessary utilization of the memory.
Student Table:
17