Dbms New
Dbms New
Dbms New
php
https://www.studytonight.com/dbms/overview-of-dbms.php
What is Data?
Data is nothing but facts and statistics stored or free flowing over a network, generally it's
raw and unprocessed. For example: When you visit any website, they might store you IP
address, that is data, in return they might add a cookie in your browser, marking you that
you visited the website, that is data, your name, it's data, your age, it's data.
Data becomes information when it is processed, turning it into something meaningful. Like,
based on the cookie data saved on user's browser, if a website can analyse that generally
men of age 20-25 visit us more, that is information, derived from the data collected.
What is a Database?
A Database is a collection of related data organised in a way that data can be easily
accessed, managed and updated. Database can be software based or hardware based,
with one sole purpose, storing data.
During early computer days, data was collected and stored on tapes, which were mostly
write-only, which means once data is stored on it, it can never be read again. They were
slow and bulky, and soon computer scientists realised that they needed a better solution to
this problem.
Larry Ellison, the co-founder of Oracle was amongst the first few, who realised the need
for a software based Database Management System.
What is DBMS?
A DBMS is a software that allows creation, definition and manipulation of database,
allowing users to store, process and analyse data easily. DBMS provides us with an
interface or a tool, to perform various operations like creating database, storing data in it,
updating data, creating tables in the database and a lot more.
DBMS also provides protection and security to the databases. It also maintains data
consistency in case of multiple users.
Here are some examples of popular DBMS used these days:
MySql
Oracle
SQL Server
IBM DB2
PostgreSQL
A DBMS makes it possible for end users to create, read, update and
delete data in a database. The DBMS essentially serves as an interface
between the database and end users or application programs, ensuring that
data is consistently organized and remains easily accessible.
The DBMS is perhaps most useful for providing a centralized view of data that
can be accessed by multiple users, from multiple locations, in a controlled
manner. A DBMS can limit what data the end user sees, as well as how that
end user can view the data, providing many views of a single database
schema. End users and software programs are free from having to
understand where the data is physically located or on what type of storage
media it resides because the DBMS handles all requests.
The DBMS can offer both logical and physical data independence. That
means it can protect users and applications from needing to know where data
is stored or having to be concerned about changes to the physical structure of
data (storage and hardware). As long as programs use the application
programming interface (API) for the database that is provided by the DBMS,
developers won't have to modify programs just because changes have been
made to the database.
Relational database management system (RDMS) - adaptable to most use cases, but
RDBMS Tier-1 products can be quite expensive.
NoSQL DBMS - well-suited for loosely defined data structures that may evolve over
time.
Advantages of a DBMS
Using a DBMS to store and manage data comes with advantages, but also overhead.
One of the biggest advantages of using a DBMS is that it lets end users and
application programmers access and use the same data while managing data integrity.
Data is better protected and maintained when it can be shared using a DBMS instead
of creating new iterations of the same data stored in new files for every new
application. The DBMS provides a central store of data that can be accessed by
multiple users in a controlled manner.
Data security
An efficient handler to balance the needs of multiple applications using the same
data
The ability to swiftly recover from crashes and errors, including restartability and
recoverability
A DBMS can also provide many views of a single database schema. A view defines
what data the user sees and how that user sees the data. The DBMS provides a level of
abstraction between the conceptual schema that defines the logical structure of the
database and the physical schema that describes the files, indexes and other physical
mechanisms used by the database. When a DBMS is used, systems can be modified
much more easily when business requirements change. New categories of data can be
added to the database without disrupting the existing system and applications can be
insulated from how data is structured and stored.
1. Data stored into Tables: Data is never directly stored into the database. Data is stored
into tables, created inside the database. DBMS also allows to have relationships
between tables which makes the data more meaningful and connected. You can easily
understand what type of data is stored where by looking at all the tables created in a
database.
2. Reduced Redundancy: In the modern world hard drives are very cheap, but earlier
when hard drives were too expensive, unnecessary repetition of data in database was a
big problem. But DBMS follows Normalisation which divides the data in such a way
that repetition is minimum.
3. Data Consistency: On Live data, i.e. data that is being continuosly updated and added,
maintaining the consistency of data can become a challenge. But DBMS handles it all
by itself.
4. Support Multiple user and Concurrent Access: DBMS allows multiple users to work
on it(update, insert, delete data) at the same time and still manages to maintain the data
consistency.
5. Query Language: DBMS provides users with a simple Query language, using which
data can be easily fetched, inserted, deleted and updated in a database.
6. Security: The DBMS also takes care of the security of data, protecting the data from
un-authorised access. In a typical DBMS, we can create user accounts with different
access permissions, using which we can easily secure our data by restricting user
access.
7. DBMS supports transactions, which allows us to better handle and manage data
integrity in real world applications where multi-threading is extensively used.
Data Isolation: Because data are scattered in various files, and files may be
in different formats, writing new application programs to retrieve the
appropriate data is difficult.
Duplication of data – Redundant data
Dependency on application programs – Changing files would lead to
change in application programs
Components of DBMS
The database management system can be divided into five major components, they are:
1. Hardware
2. Software
3. Data
4. Procedures
5. Database Access Language
Let's have a simple diagram to see how they all fit together to form a database management
system.
DBMS Components: Hardware
When we say Hardware, we mean computer, hard disks, I/O channels for data, and any
other physical component involved before any data is successfully stored into the memory.
When we run Oracle or MySQL on our personal computer, then our computer's Hard Disk,
our Keyboard using which we type in all the commands, our computer's RAM, ROM all
become a part of the DBMS hardware.
Users
Database Administrators: Database Administrator or DBA is the one who manages
the complete database management system. DBA takes care of the security of the
DBMS, it's availability, managing the license keys, managing user accounts and access
etc.
Application Programmer or Software Developer: This user group is involved in
developing and desiging the parts of DBMS.
End User: These days all the modern applications, web or mobile, store user data. How
do you think they do it? Yes, applications are programmed in such a way that they
collect user data and store the data on DBMS systems running on their server. End
users are the one who store, retrieve, update and delete data.
Database Languages
BY DINESH THAKUR Category: DBMS & RDBMS
A DBMS must provide appropriate languages and interfaces for each category of users to
express database queries and updates. Database Languages are used to create and
maintain database on computer. There are large numbers of database languages like
Oracle, MySQL, MS Access, dBase, FoxPro etc. SQL statements commonly used in
Oracle and MS Access can be categorized as data definition language (DDL), data
control language (DCL) and data manipulation language (DML).
In practice, the data definition and data manipulation languages are not two separate
languages. Instead they simply form parts of a single database language such as
Structured Query Language (SQL). SQL represents combination of DDL and DML, as
well as statements for constraints specification and schema evaluation.
Understanding DBMS
Architecture
A Database Management system is not always directly available for users and applications
to access and store data in it. A Database Management system can be centralised(all the
data stored at one location), decentralised(multiple copies of database at different
locations) or hierarchical, depending upon its architecture.
1-tier DBMS architecture also exist, this is when the database is directly available to the
user for using it to store data. Generally such a setup is used for local application
development, where programmers communicate directly with the database for quick
response.
Database Architecture is logically of two types:
1. 2-tier DBMS architecture
2. 3-tier DBMS architecture
Such an architecture provides the DBMS extra security as it is not exposed to the End User
directly. Also, security can be improved by adding security and authentication checks in the
Application layer too.
With Examples
If you are interested in learning about Instances,Schema and Subschema in DBMS? You
are absolutely at the right place.
Instances in DBMS
In simple words, it is the snapshot of the database taken at a particular moment. It can
also be described in more significant way as the collection of the information stored in
the database at that particular moment. Instance can also be called as the database state
or current set of occurrence due the fact that it is information that is present at the
current state.
Every time we update the state say we insert, delete or modify the value of the data item
in the record, it changes from one state to other. At the given time, each schema has its
own set of instances.
Lets take an example to understand in a much better way,
An organization with an employees database will have three different instances such as
production that is used to monitor the data right at that moment, per-production that is
used to test new functionality prior to release of production and the development that is
used by database developers to create new functionality.
Schema in DBMS
Definition of schema: Design of a database is called the schema. Schema is of
three types: Physical schema, logical schema and view schema.
DISCOUNTS
The former example shows the schema for stores displaying the name of the store, store
id,address,city and state in which it is located and the zip code of respective location.
The latter example is all about schema of discounts that clearly shows the type,id and
quality,thus we can now relate to the fact that schema only displays the record types
(entities) and names of data items(attributes) but does not show the relation among
the files.
Schema can be partitioned as logical schema and physical schema.
Look at the below diagram
Here,former part shows the logical schema which is concerned with the data structure
with exploring data structure offered to DBMS so that schema is very easy for the
computer to understand.
The latter part that is the physical schema is concerned with the way or the manner in
which conceptual database gets represented in the computer as it is stored in the
database.Physical schema is hidden behind the logical schema and thus can be be
modified without affecting the application programs
Database management system provides data definition language(DDL) and document
schema definition language(DSDL) to specify both logical and physical schema.
Hierarchical Model
Network Model
Entity-relationship Model
Relational Model
Hierarchical Model
This database model organises data into a tree-like-structure, with a single root, to which all
the other data is linked. The heirarchy starts from the Root data, and expands like a tree,
adding child nodes to the parent nodes.
In this model, a child node will only have a single parent node.
This model efficiently describes many real-world relationships like index of a book, recipes
etc.
In hierarchical model, data is organised into tree-like structure with one one-to-many
relationship between two different types of data, for example, one department can have
many courses, many professors and of-course many students.
Network Model
This is an extension of the Hierarchical model. In this model data is organised more like a
graph, and are allowed to have more than one parent node.
In this database model data is more related as more relationships are established in this
database model. Also, as the data is more related, hence accessing the data is also easier
and fast. This database model was used to map many-to-many data relationships.
This was the most widely used database model, before Relational Model was introduced.
Entity-relationship Model
In this database model, relationships are created by dividing object of interest into entity and
its characteristics into attributes.
Different entities are related using relationships.
E-R Models are defined to represent the relationships into pictorial form to make it easier for
different stakeholders to understand.
This model is good to design a database, which can then be turned into tables in relational
model(explained below).
Let's take an example, If we have to design a School Database, then Student will be
an entity with attributes name, age, address etc. As Address is generally complex, it can
be another entity with attributes street name, pincode, city etc, and there will be a
relationship between them.
Relationships can also be of different types. To learn about E-R Diagrams in details, click
on the link.
Relational Model
In this model, data is organised in two-dimensional tables and the relationship is
maintained by storing a common field.
This model was introduced by E.F Codd in 1970, and since then it has been the most widely
used database model, infact, we can say the only database model used around the world.
The basic structure of data in the relational model is tables. All the information related to a
particular type is stored in rows of that table.
Hence, tables are also known as relations in relational model.
In the coming tutorials we will learn how to design tables, normalize them to reduce data
redundancy and how to use Structured Query language to access data from tables.
Let's take an example to explain everything. For a School Management Software, we will
have to store Student information, Teacher information, Classes, Subjects taught in each
class etc.
ER Model: Attributes
If a Student is an Entity, then student's roll no., student's name, student's age,
student's gender etc will be its attributes.
An attribute can be of many types, here are different types of attributes defined in ER
database model:
1. Simple attribute: The attributes with values that are atomic and cannot be broken down
further are simple attributes. For example, student's age.
2. Composite attribute: A composite attribute is made up of more than one simple
attribute. For example, student's address will contain, house no., street
name, pincode etc.
3. Derived attribute: These are the attributes which are not present in the whole database
management system, but are derived using other attributes. For example, average age
of students in a class.
4. Single-valued attribute: As the name suggests, they have a single value.
5. Multi-valued attribute: And, they can have multiple values.
ER Model: Keys
If the attribute roll no. can uniquely identify a student entity, amongst all the students, then
the attribute roll no. will be said to be a key.
Following are the types of Keys:
1. Super Key
2. Candidate Key
3. Primary Key
ER Model: Relationships
When an Entity is related to another Entity, they are said to have a relationship. For
example, A ClassEntity is related to Student entity, becasue students study in classes,
hence this is a relationship.
Depending upon the number of entities involved, a degree is assigned to relationships.
For example, if 2 entities are involved, it is said to be Binary relationship, if 3 entities are
involved, it is said to be Ternary relationship, and so on.
In the next tutorial, we will learn how to create ER diagrams and design databases using ER
diagrams.
Weak Entity
A weak Entity is represented using double rectangular boxes. It is generally connected to
another entity.
ER Diagram: Entity
An Entity can be any object, place, person or class. In ER Diagram, an entity is
represented using rectangles. Consider an example of an Organisation- Employee,
Manager, Department, Product and many more can be taken as entities in an Organisation.
The yellow rhombus in between represents a relationship.
ER Diagram: Relationship
A Relationship describes relation between entities. Relationship is represented using
diamonds or rhombus.
There are three types of relationship that exist between Entities.
1. Binary Relationship
2. Recursive Relationship
3. Ternary Relationship
The above diagram represents that one student can enroll for more than one courses. And
a course can have more than 1 student enrolled in it.
ER Diagram: Recursive Relationship
When an Entity is related with itself it is known as Recursive Relationship.
1. Generalization
2. Specialization
3. Aggregration
Let's understand what they are, and why were they added to the existing ER Model.
Generalization
Generalization is a bottom-up approach in which two lower level entities combine to form a
higher level entity. In generalization, the higher level entity can also combine with other
lower level entities to make further higher level entity.
It's more like Superclass and Subclass system, but the only difference is the approach,
which is bottom-up. Hence, entities are combined to form a more generalised entity, in other
words, sub-classes are combined to form a super-class.
Specialization
Specialization is opposite to Generalization. It is a top-down approach in which one higher
level entity can be broken down into two lower level entity. In specialization, a higher level
entity may not have any lower-level entity sets, it's possible.
Aggregration
Aggregration is a process when relation between two entities is treated as a single entity.
In the diagram above, the relationship between Center and Course together, is acting as
an Entity, which is in relationship with another entity Visitor. Now in real world, if a Visitor or
a Student visits a Coaching Center, he/she will never enquire about the center only or just
about the course, rather he/she will ask enquire about both.
Rule zero
This rule states that for a system to qualify as an RDBMS, it must be able to manage
database entirely through the relational capabilities.
Rule 1: Information rule
All information(including metadata) is to be represented as stored data in cells of tables.
The rows and columns have to be strictly unordered.
1 Adam 34 13000
2 Alex 28 15000
3 Stuart 20 18000
4 Ross 42 19020
1 Adam 34 13000
Attribute Domain
When an attribute is defined in a relation(table), it is defined to hold only a certain type of
values, which is known as Attribute Domain.
Hence, the attribute Name will hold the name of employee for every tuple. If we save
employee's address there, it will be violation of the Relational database model.
Name
Adam
Alex
Ross
What is a Relation Schema?
A relation schema describes the structure of the relation, with the name of the relation(name
of table), its attributes and their names and type.
1. Key Constraints
2. Domain Constraints
3. Referential integrity Constraints
Key Constraints
We store data in tables, to later access it whenever required. In every table one or more
than one attributes together are used to fetch data from tables. The Key
Constraint specifies that there should be such an attribute(column) in a relation(table),
which can be used to fetch data for any tuple(row).
The Key attribute should never be NULL or same for two different row of data.
For example, in the Employee table we can use the attribute ID to fetch data for each of the
employee. No value of ID is null and it is unique for every row, hence it can be our Key
attribute.
Domain Constraint
Domain constraints refers to the rules defined for the values that can be stored for a certain
attribute.
Like we explained above, we cannot store Address of employee in the column for Name.
Similarly, a mobile number cannot exceed 10 digits.
1. Select
2. Project
3. Union
4. Set Different
5. Cartesian product
6. Rename
Where, σ represents the Select Predicate, r is the name of relation(table name in which you
want to look for data), and p is the prepositional logic, where we specify the conditions that
must be satisfied by the data. In prepositional logic, one can
use unary and binary operators like =, <, > etc, to specify the conditions.
Let's take an example of the Student table we specified above in the Introduction of
relational algebra, and fetch data for students with age more than 17.
σage > 17 (Student)
This will fetch the tuples(rows) from table Student, for which age will be greater than 17.
You can also use, and, or etc operators, to specify two conditions, for example,
σage > 17 and gender = 'Male' (Student)
This will return tuples(rows) from table Student with information of male students, of age
more than 17.(Consider the Student table has an attribute Gender too.)
Project Operation (∏)
Project operation is used to project only a certain set of attributes of a relation. In simple
words, If you want to see only the names all of the students in the Student table, then you
can use Project Operation.
It will only project or show the columns or attributes asked for, and will also remove
duplicate data from the columns.
Syntax: ∏ (r)
A1, A2...
Above statement will show us only the Name and Age columns for all the rows of data
in Studenttable.
Above operation will give us name of Students who are attending both regular classes and
extra classes, eliminating repetition.
Apart from these common operations Relational Algebra is also used for Join operations
like,
Natural Join
Outer Join
Theta join etc.
Putting it all together, if we want to use Tuple Relational Calculus to fetch names of
students, from table Student, with age greater than 17, then, for T being our tuple variable,
T.name | Student(T) AND T.age > 17
Domain Relational Calculus (DRC)
In domain relational calculus, filtering is done based on the domain of the attributes and not
based on the tuple values.
Syntax: { c1, c2, c3, ..., cn | F(c1, c2, c3, ... ,cn)}
where, c1, c2... etc represents domain of attributes(columns) and F defines the formula
including the condition for fetching the data.
For example,
{< name, age > | ∈ Student ∧ age > 17}
Again, the above query will return the names and ages of the students in the
table Student who are older than 17.
Points to Remember
Similarly we can generate relational database schema using the ER diagram. Following are
some key points to keep in mind while doing so:
1. Entity gets converted into Table, with all the attributes becoming fields(columns) in the
table.
2. Relationship between entities is also converted into table with primary keys of the
related entities also stored in it as foreign keys.
3. Primary Keys should be properly set.
4. For any relationship of Weak Entity, if primary key of any other entity is included in a
table, foriegn key constraint must be defined.
ntroduction to Database Keys
Keys are very important part of Relational database model. They are used to establish and
identify relationships between tables and also to uniquely identify any record or row of data
inside a table.
A Key can be a single attribute or a group of attributes, where the combination may act as a
key.
1 Akon 9876723452 17
2 Akon 9991165674 19
3 Bkon 7898756543 18
4 Ckon 8987867898 19
5 Dkon 9990080080 17
Confused? The first one is pretty simple as student_id is unique for every row of data,
hence it can be used to identity each row uniquely.
Next comes, (student_id, name), now name of two students can be same, but
their student_idcan't be same hence this combination can also be a key.
Similarly, phone number for every student will be unique, hence again, phone can also be a
key.
So they all are super keys.
Candidate Key
Candidate keys are defined as the minimal set of fields which can uniquely identify each
record in a table. It is an attribute or a set of attributes that can act as a Primary Key for a
table to uniquely identify each record in that table. There can be more than one candidate
key.
In our example, student_id and phone both are candidate keys for table Student.
A candiate key can never be NULL or empty. And its value should be unique.
There can be more than one candidate keys for a table.
A candidate key can be a combination of more than one columns(attributes).
Primary Key
Primary key is a candidate key that is most appropriate to become the main key for any
table. It is a key that can uniquely identify each record in a table.
For the table Student we can make the student_id column as the primary key.
Composite Key
Key that consists of two or more attributes that uniquely identify any record in a table is
called Composite key. But the attributes which together form the Composite key are not a
key independentely or individually.
In the above picture we have a Score table which stores the marks scored by a student in a
particular subject.
In this table student_id and subject_id together will form the primary key, hence it is a
composite key.
Non-key Attributes
Non-key attributes are the attributes or fields of a table, other than candidate
key attributes/fields in a table.
Non-prime Attributes
Non-prime Attributes are attributes other than Primary Key attribute(s)..
Normalization of Database
Database Normalization is a technique of organizing the data in the database.
Normalization is a systematic approach of decomposing tables to eliminate data
redundancy(repetition) and undesirable characteristics like Insertion, Update and Deletion
Anamolies. It is a multi-step process that puts data into tabular form, removing duplicated
data from the relation tables.
Normalization is used for mainly two purposes,
The video below will give you a good overview of Database Normalization. If you want you
can skip the video, as the concept is covered in detail, below the video.
In the table above, we have data of 4 Computer Sci. students. As we can see, data for the
fields branch, hod(Head of Department) and office_tel is repeated for the students who are
in the same branch in the college, this is Data Redundancy.
Insertion Anomaly
Suppose for a new admission, until and unless a student opts for a branch, data of the
student cannot be inserted, or else we will have to set the branch information as NULL.
Also, if we have to insert data of 100 students of same branch, then the branch information
will be repeated for all those 100 students.
These scenarios are nothing but Insertion anomalies.
Updation Anomaly
What if Mr. X leaves the college? or is no longer the HOD of computer science department?
In that case all the student records will have to be updated, and if by mistake we miss any
record, it will lead to data inconsistency. This is Updation anomaly.
Deletion Anomaly
In our Student table, two different informations are kept together, Student information and
Branch information. Hence, at the end of the academic year, if student records are deleted,
we will also lose the branch information. This is Deletion anomaly.
Normalization Rule
Normalization rules are divided into the following normal forms:
In the next tutorial, we will discuss about the First Normal Form in details.
To understand what is Partial Dependency and how to normalize a table to 2nd normal for,
jump to the Second Normal Form tutorial.
Here is the Third Normal Form tutorial. But we suggest you to first study about the second
normal form and then head over to the third normal form.
To learn about BCNF in detail with a very easy to understand example, head to Boye-Codd
Normal Form tutorial.
Here is the Fourth Normal Form tutorial. But we suggest you to understand other normal
forms before you head over to the fourth normal form
Our table already satisfies 3 rules out of the 4 rules, as all our column names are unique,
we have stored data in the order we wanted to and we have not inter-mixed different type of
data in columns.
But out of the 3 different students in our table, 2 have opted for more than 1 subject. And we
have stored the subject names in a single column. But as per the 1st Normal form each
column must contain atomic value.
101 Akon OS
101 Akon CN
102 Bkon C
By doing so, although a few values are getting repeated but values for the subject column
are now atomic for each record/row.
Using the First Normal Form, data redundancy increases, as there will be many columns
with same data in multiple rows but each row as a whole will be unique.
What is Second Normal Form?
For a table to be in the Second Normal Form, it must satisfy two conditions:
What is Partial Dependency? Do not worry about it. First let's understand what
is Dependency in a table?
What is Dependency?
Let's take an example of a Student table with columns student_id, name, reg_no(registration
number), branch and address(student's home address).
In this table, student_id is the primary key and will be unique for every row, hence we can
use student_id to fetch any row of data from this table
Even for a case, where student names are same, if we know the student_id we can easily
fetch the correct record.
subject_id subject_name
1 Java
2 C++
3 Php
1 10 1 70 Java Teacher
2 10 2 75 C++ Teacher
3 11 1 80 Java Teacher
In the score table we are saving the student_id to know which student's marks are these
and subject_id to know for which subject the marks are for.
Together, student_id + subject_id forms a Candidate Key(learn about Database Keys) for
this table, which can be the Primary key.
Confused, How this combination can be a primary key?
See, if I ask you to get me marks of student with student_id 10, can you get it from this
table? No, because you don't know for which subject. And if I give you subject_id, you
would not know for which student. Hence we need student_id + subject_id to uniquely
identify any row.
And our Score table is now in the second normal form, with no partial dependency.
2 10 2 75
3 11 1 80
Quick Recap
1. For a table to be in the Second Normal form, it should be in the First Normal form and it
should not have Partial Dependency.
2. Partial Dependency exists, when for a composite primary key, any attribute in the table
depends only on a part of the primary key and not on the complete primary key.
3. To remove Partial dependency, we can divide the table, remove the attribute which is
causing partial dependency, and move it to some other table where it fits in well.
Score Table
1 10 1 70
2 10 2 75
3 11 1 80
In the Score table, we need to store some more information, which is the exam name and
total marks, so let's add 2 more columns to the Score table.
1 Workshop 200
2 Mains 70
3 Practicals 30
103 C# P.Chash
As you can see, we have also added some sample data to the table.
In the table above:
One student can enrol for multiple subjects. For example, student with student_id 101, has opted
And, there can be multiple professors teaching one subject like we have for Java.
student_id p_id
101 1
101 2
and so on...
And, Professor Table
p_id professor subject
1 P.Java Java
2 P.Cpp C++
and so on...
And now, this relation satisfy Boyce-Codd Normal Form. In the next tutorial we will learn about
the Fourth Normal Form.
1. For a dependency A → B, if for a single value of A, multiple value of B exists, then the
table may have multi-valued dependency.
2. Also, a table should have at-least 3 columns for it to have a multi-valued dependency.
3. And, for a relation R(A,B,C), if there is a multi-valued dependency between, A and B,
then B and C should be independent of each other.
If all these conditions are true for any relation(table), it is said to have multi-valued
dependency.
Time for an Example
Below we have a college enrolment table with columns s_id, course and hobby.
1 Science Cricket
1 Maths Hockey
2 C# Cricket
2 Php Hockey
As you can see in the table above, student with s_id 1 has opted for two
courses, Science and Maths, and has two hobbies, Cricket and Hockey.
You must be thinking what problem this can lead to, right?
Well the two records for student with s_id 1, will give rise to two more records, as shown
below, because for one student, two hobbies exists, hence along with both the courses,
these hobbies should be specified.
1 Science Cricket
1 Maths Hockey
1 Science Hockey
1 Maths Cricket
s_id course
1 Science
1 Maths
2 C#
2 Php
And, Hobbies Table,
s_id hobby
1 Cricket
1 Hockey
2 Cricket
2 Hockey
ntroduction to SQL
Structure Query Language(SQL) is a database query language used for storing and
managing data in Relational DBMS. SQL was the first commercial language introduced for
E.F Codd's Relational model of database. Today almost all RDBMS(MySql, Oracle,
Infomix, Sybase, MS Access) use SQL as the standard database query language. SQL is
used to perform all types of data operations in RDBMS.
SQL Command
SQL defines following ways to manipulate data stored in an RDBMS.
Command Description
Command Description
Command Description
Command Description
Command Description
SQL: create command
create is a DDL SQL command used to create a table or a database in relational database
management system.
Creating a Database
To create a database in RDBMS, create command is used. Following is the syntax,
CREATE DATABASE <DB_NAME>;
Creating a Table
create command can also be used to create tables. Now when we create a table, we have
to specify the details of the columns of the tables too. We can specify
the names and datatypes of various columns in the create command itself.
Following is the syntax,
CREATE TABLE <TABLE_NAME>
(
column_name1 datatype1,
column_name2 datatype2,
column_name3 datatype3,
column_name4 datatype4
);
create table command will tell the database system to create a new table with the given
table name and column information.
Datatype Use
VARCHAR used for columns which will be used to store characters and integers, basically a
string.
CHAR used for columns which will store char values(single character).
TEXT used for columns which will store text which is generally long in length. For
example, if you create a table for storing profile information of a social networking
website, then for about me section you can have a column of type TEXT
SQL: ALTER command
alter command is used for altering the table structure, such as,
DROP command
DROP command completely removes a table from the database. This command will also destroy
the table structure and the data stored in it. Following is its syntax,
DROP TABLE table_name
Here is an example explaining it,
DROP TABLE student;
The above query will delete the Student table completely. It can also be used on Databases, to
delete the complete database. For example, to drop a database,
DROP DATABASE Test;
The above query will drop the database with name Test from the system.
RENAME query
RENAME command is used to set a new name for any existing table. Following is the syntax,
RENAME TABLE old_table_name to new_table_name
Here is an example explaining it.
RENAME TABLE student to students_info;
The above query will rename the table student to students_info.
Using INSERT SQL command
Data Manipulation Language (DML) statements are used for managing data in database.
DML commands are not auto-committed. It means changes made by DML command are
not permanent to database, it can be rolled back.
Talking about the Insert command, whenever we post a Tweet on Twitter, the text is stored
in some table, and as we post a new tweet, a new record gets inserted in that table.
INSERT command
Insert command is used to insert data into a table. Following is its general syntax,
INSERT INTO table_name VALUES(data1, data2, ...)
Lets see an example,
Consider a table student with the following fields.
101 Adam 15
101 Adam 15
102 Alex
101 Adam 15
102 Alex
103 chris 14
UPDATE command
UPDATE command is used to update any record of data in a table. Following is its general
syntax,
UPDATE table_name SET column_name = new_value WHERE some_condition;
WHERE is used to add a condition to any SQL query, we will soon study about it in detail.
101 Adam 15
102 Alex
103 chris 14
101 Adam 15
102 Alex 18
103 chris 14
In the above statement, if we do not use the WHERE clause, then our update query will update
age for all the columns of the table to 18.
101 Adam 15
102 Alex 18
103 Abhi 17
Using DELETE SQL command
When you ask any question in Studytonight's Forum it gets saved into a table. And using
the Deleteoption, you can even delete a question asked by you. How do you think that
works? Yes, using the Delete DML command.
Let's study about the syntax and the usage of the Delete command.
DELETE command
DELETE command is used to delete data from a table.
101 Adam 15
102 Alex 18
103 Abhi 17
102 Alex 18
Isn't DELETE same as TRUNCATE
TRUNCATE command is different from DELETE command. The delete command will delete all
the rows from a table whereas truncate command not only deletes all the records stored in
the table, but it also re-initializes the table(like a newly created table).
For eg: If you have a table with 10 rows and an auto_increment primary key, and if you
use DELETEcommand to delete all the rows, it will delete all the rows, but will not re-initialize
the primary key, hence if you will insert any row after using the DELETE command, the
auto_increment primary key will start from 11. But in case of TRUNCATE command, primary
key is re-initialized, and it will again start from 1.
COMMIT command
COMMIT command is used to permanently save any transaction into the database.
When we use any DML command like INSERT, UPDATE or DELETE, the changes made by these
commands are not permanent, until the current session is closed, the changes made by these
commands can be rolled back.
To avoid that, we use the COMMIT command to mark the changes as permanent.
Following is commit command's syntax,
COMMIT;
ROLLBACK command
This command restores the database to last commited state. It is also used
with SAVEPOINT command to jump to a savepoint in an ongoing transaction.
If we have used the UPDATE command to make some changes into the database, and realise that
those changes were not required, then we can use the ROLLBACK command to rollback those
changes, if they were not commited using the COMMIT command.
Following is rollback command's syntax,
ROLLBACK TO savepoint_name;
SAVEPOINT command
SAVEPOINT command is used to temporarily save a transaction so that you can rollback to that
point whenever required.
Following is savepoint command's syntax,
SAVEPOINT savepoint_name;
In short, using this command we can name the different states of our data in any table and then
rollback to that state using the ROLLBACK command whenever required.
id name
1 Abhi
2 Adam
4 Alex
Lets use some SQL queries on the above table and see the results.
INSERT INTO class VALUES(5, 'Rahul');
COMMIT;
SAVEPOINT A;
SAVEPOINT B;
SAVEPOINT C;
id name
1 Abhi
2 Adam
4 Alex
5 Abhijit
6 Chris
7 Bravo
Now let's use the ROLLBACK command to roll back the state of data to the savepoint B.
ROLLBACK TO B;
id name
1 Abhi
2 Adam
4 Alex
5 Abhijit
6 Chris
Now let's again use the ROLLBACK command to roll back the state of data to the savepoint A
ROLLBACK TO A;
id name
1 Abhi
2 Adam
4 Alex
5 Abhijit
Using GRANT and REVOKE
Data Control Language(DCL) is used to control privileges in Database. To perform any
operation in the database, such as for creating tables, sequences or views, a user needs
privileges. Privileges are of two types,
System: This includes permissions for creating session, table, etc and all types of other
system privileges.
Object: This includes permissions for any command or query to perform any operation
on the database tables.
GRANT: Used to provide any user access privileges or other priviliges for the database.
Syntax for WHERE clause
Here is how you can use the WHERE clause with a DELETE statement, or any other statement,
DELETE FROM table_name WHERE [condition];
The WHERE clause is used at the end of any SQL query, to specify a condition for execution.
Operator Description
= Equal to
!= Not Equal to
SQL LIKE clause
LIKE clause is used in the condition in SQL query with the WHERE clause. LIKE clause compares
data with an expression using wildcard operators to match pattern given in the condition.
Wildcard operators
There are two wildcard operators that are used in LIKE clause.
Example of LIKE clause
Consider the following Student table.
101 Adam 15
102 Alex 18
103 Abhi 17
101 Adam 15
102 Alex 18
103 Abhi 17
Using _ and %
SELECT * FROM Student WHERE s_name LIKE '_d%';
The above query will return all records from Student table where s_name contain 'd' as second
character.
101 Adam 15
Using % only
SELECT * FROM Student WHERE s_name LIKE '%x';
The above query will return all records from Student table where s_name contain 'x' as last
character.
102 Alex 18
ORDER BY Clause
Order by clause is used with SELECT statement for arranging retrieved data in sorted order.
The Order by clause by default sorts the retrieved data in ascending order. To sort the data
in descending order DESC keyword is used with Order by clause.
Syntax of Order By
SELECT column-list|* FROM table-name ORDER BY ASC | DESC;
Using default Order by
Consider the following Emp table,
Group By Clause
Group by clause is used to group the results of a SELECT query based on one or more columns. It
is also used with SQL functions to group the result from one or more tables.
Syntax for using Group by in a statement.
SELECT column_name, function(column_name)
FROM table_name
WHERE condition
GROUP BY column_name
name age
Rohan 34
Shane 29
Anu 22
Example of Group by in a Statement with WHERE clause
Consider the following Emp table
name salary
Rohan 6000
Shane 8000
Scott 9000
You must remember that Group By clause will always come at the end of the SQL query, just like
the Order by clause.
HAVING Clause
Having clause is used with SQL Queries to give more precise condition for a statement. It is
used to mention condition in Group by based SQL queries, just like WHERE clause is used
with SELECT query.
Syntax for HAVING clause is,
SELECT column_name, function(column_name)
FROM table_name
WHERE column_name condition
GROUP BY column_name
HAVING function(column_name) condition
The main objective of the above SQL query was to find out the name of the customer who
has had a previous_balance more than 3000, based on all the previous sales made to the
customer, hence we get the first row in the table for customer Alex
DISTINCT keyword
The distinct keyword is used with SELECT statement to retrieve unique values from the
table. Distinct removes all the duplicate records while retrieving records from any table in
the database.
Syntax for DISTINCT Keyword
SELECT DISTINCT column-name FROM table-name;
Example using DISTINCT Keyword
Consider the following Emp table. As you can see in the table below, there is
employee name, along with employee salary and age.
In the table below, multiple employees have the same salary, so we will be
using DISTINCT keyword to list down distinct salary amount, that is currently being paid to
the employees.
salary
5000
8000
10000
AND & OR operator
The AND and OR operators are used with the WHERE clause to make more precise conditions
for fetching data from database by combining more than one condition together.
AND operator
AND operator is used to set multiple conditions with the WHERE clause,
alongside, SELECT, UPDATE or DELETE SQL queries.
Example of AND operator
Consider the following Emp table
SELECT * FROM Emp WHERE salary < 10000 AND age > 25
OR operator
OR operator is also used to combine multiple conditions with WHERE clause. The only
difference between AND and OR is their behaviour.
When we use AND to combine two or more than two conditions, records satisfying all the
specified conditions will be there in the result.
But in case of OR operator, atleast one condition from the conditions specified must be
satisfied by any record to be in the resultset.
Example of OR operator
Consider the following Emp table
The above query will return records where either salary is greater than 10000 or age is
greater than 25.
1. If there is a bank in that particular city, that person must have an account in that bank.
2. If there is a course in the list of courses required to be graduated, that person must have
taken that course.
Do not worry if you are not clear with all this new things right away, we will try to expain as
we move on with this tutorial.
We shall see the second example, mentioned above, in detail.
Table 1: Course_Taken → It consists of the names of Students against the courses that
they have taken.
Student_Name Course
Robert Databases
David Databases
Course
Databases
Programming Languages
Unfortunately, there is no direct way by which we can express the division operator. Let's
walk through the steps, to write the query for the division operator.
Student_name
Robert
David
Hannah
Tom
Student_Name Course
Robert Databases
David Databases
David Programming Languages
Hannah Databases
Tom Databases
Student_Name Course
Hannah Databases
Tom Databases
Tom Programming Languages
Student_name
David
Hannah
Tom
Student_name
Robert
Hence we just learned, how different steps can lead us to the final answer. Now let us see
how to write all these 5 steps in one single query so that we do not have to create so many
tables.
SELECT DISTINCT x.Student_Name FROM Course_Taken AS x WHERE NOT
EXISTS(SELECT * FROM Course_Required AS y WHERE NOT
EXISTS(SELECT * FROM Course_Taken AS z
WHERE z.Student_name = x.Student_name
AND z.Course = y.Course ))
Student_name
Robert
This gives us the same result just like the 5 steps above
SQL Constraints
SQL Constraints are rules used to limit the type of data that can go into a table, to maintain
the accuracy and integrity of the data inside table.
Constraints can be divided into the following two types,
Constraints are used to make sure that the integrity of data is maintained in the database.
Following are the most used constraints that can be applied to a table.
NOT NULL
UNIQUE
PRIMARY KEY
FOREIGN KEY
CHECK
DEFAULT
NOT NULL Constraint
NOT NULL constraint restricts a column from having a NULL value. Once NOT
NULL constraint is applied to a column, you cannot pass a null value to that column. It
enforces a column to contain a proper value.
One important point to note about this constraint is that it cannot be defined at table level.
UNIQUE Constraint
UNIQUE constraint ensures that a field or column will only have unique values.
A UNIQUE constraint field will not have duplicate data. This constraint can be applied at
column level or table level.
Order_Detail Table
10 Order1 101
11 Order2 103
12 Order3 102
1. On Delete Cascade : This will remove the record from child table, if that value of foriegn
key is deleted from the main table.
2. On Delete Null : This will set all the values in that record of child table as NULL, for
which the value of foriegn key is deleted from the main table.
3. If we don't use any of the above, then we cannot delete data from the main table for
which data in child table exists. We will get an error if we try to do so.
1. Aggregate Functions
2. Scalar Functions
Aggregate Functions
These functions return a single value after performing calculations on a group of values.
Following are some of the frequently used Aggregrate functions.
AVG() Function
Average returns average value after calculating it from values in a numeric column.
Its general syntax is,
SELECT AVG(column_name) FROM table_name
avg(salary)
8200
COUNT() Function
Count returns the number of rows present in the table either based on some condition or
without condition.
Its general syntax is,
SELECT COUNT(column_name) FROM table-name
count(name)
2
Example of COUNT(distinct)
Consider the following Emp table
count(distinct salary)
FIRST() Function
First function returns first value of a selected column
Syntax for FIRST function is,
SELECT FIRST(column_name) FROM table-name;
Using FIRST() function
Consider the following Emp table
first(salary)
9000
LAST() Function
LAST function returns the return last value of the selected column.
Syntax of LAST function is,
SELECT LAST(column_name) FROM table-name;
Using LAST() function
Consider the following Emp table
last(salary)
8000
MAX() Function
MAX function returns maximum value from selected column of the table.
Syntax of MAX function is,
SELECT MAX(column_name) from table-name;
Using MAX() function
Consider the following Emp table
MAX(salary)
10000
MIN() Function
MIN function returns minimum value from a selected column of the table.
Syntax for MIN function is,
SELECT MIN(column_name) from table-name;
Using MIN() function
Consider the following Emp table,
MIN(salary)
6000
SUM() Function
SUM function returns total sum of a selected columns numeric values.
Syntax for SUM is,
SELECT SUM(column_name) from table-name;
Using SUM() function
Consider the following Emp table
SUM(salary)
41000
Scalar Functions
Scalar functions return a single value from an input value. Following are some frequently
used Scalar Functions in SQL.
UCASE() Function
UCASE function is used to convert value of string column to Uppercase characters.
Syntax of UCASE,
SELECT UCASE(column_name) from table-name;
UCASE(name)
ANU
SHANE
ROHAN
SCOTT
TIGER
LCASE() Function
LCASE function is used to convert value of string columns to Lowecase characters.
Syntax for LCASE is,
SELECT LCASE(column_name) FROM table-name;
anu
shane
rohan
scott
tiger
MID() Function
MID function is used to extract substrings from column values of string type in a table.
Syntax for MID function is,
SELECT MID(column_name, start, length) from table-name;
MID(name,2,2)
nu
ha
oh
co
ig
ROUND() Function
ROUND function is used to round a numeric field to number of nearest integer. It is used on
Decimal point values.
Syntax of Round function is,
SELECT ROUND(column_name, decimals) from table-name;
Using ROUND() function
Consider the following Emp table
ROUND(salary)
9001
8001
6000
10000
8000
SQL JOIN
SQL Join is used to fetch data from two or more tables, which is joined to appear as single
set of data. It is used for combining column from two or more tables by using values
common to both tables.
JOIN Keyword is used in SQL queries for joining two or more tables. Minimum required
condition for joining table, is (n-1) where n, is number of tables. A table can also join to
itself, which is known as, Self Join.
Types of JOIN
Following are the types of JOIN that we can use in SQL:
Inner
Outer
Left
Right
ID NAME
1 abhi
2 adam
4 alex
and the class_info table,
ID Address
1 DELHI
2 MUMBAI
3 CHENNAI
ID NAME ID Address
1 abhi 1 DELHI
2 adam 1 DELHI
4 alex 1 DELHI
1 abhi 2 MUMBAI
2 adam 2 MUMBAI
4 alex 2 MUMBAI
1 abhi 3 CHENNAI
2 adam 3 CHENNAI
4 alex 3 CHENNAI
As you can see, this join returns the cross product of all the records present in both the
tables.
ID NAME
1 abhi
2 adam
3 alex
4 anu
and the class_info table,
ID Address
1 DELHI
2 MUMBAI
3 CHENNAI
ID NAME ID Address
1 abhi 1 DELHI
2 adam 2 MUMBAI
3 alex 3 CHENNAI
Natural JOIN
Natural Join is a type of Inner join which is based on column having same name and same
datatype present in both the tables to be joined.
The syntax for Natural Join is,
SELECT * FROM
table-name1 NATURAL JOIN table-name2;
ID NAME
1 abhi
2 adam
3 alex
4 anu
and the class_info table,
ID Address
1 DELHI
2 MUMBAI
3 CHENNAI
1 abhi DELHI
2 adam MUMBAI
3 alex CHENNAI
In the above example, both the tables being joined have ID column(same name and same
datatype), hence the records for which value of ID matches in both the tables will be the
result of Natural Join of these two tables.
OUTER JOIN
Outer Join is based on both matched and unmatched data. Outer Joins subdivide further
into,
ID NAME
1 abhi
2 adam
3 alex
4 anu
5 ashish
and the class_info table,
ID Address
1 DELHI
2 MUMBAI
3 CHENNAI
7 NOIDA
8 PANIPAT
Left Outer Join query will be,
SELECT * FROM class LEFT OUTER JOIN class_info ON (class.id = class_info.id);
The resultset table will look like,
ID NAME ID Address
1 abhi 1 DELHI
2 adam 2 MUMBAI
3 alex 3 CHENNAI
ID NAME
1 abhi
2 adam
3 alex
4 anu
5 ashish
and the class_info table,
ID Address
1 DELHI
2 MUMBAI
3 CHENNAI
7 NOIDA
8 PANIPAT
1 abhi 1 DELHI
2 adam 2 MUMBAI
3 alex 3 CHENNAI
ID NAME
1 abhi
2 adam
3 alex
4 anu
5 ashish
and the class_info table,
ID Address
1 DELHI
2 MUMBAI
3 CHENNAI
7 NOIDA
8 PANIPAT
ID NAME ID Address
1 abhi 1 DELHI
2 adam 2 MUMBAI
3 alex 3 CHENNAI
4 anu null null
ID Name
1 abhi
2 adam
3 alex
4 anu
5 ashish
and the class_info table,
ID Address
1 DELHI
2 MUMBAI
3 CHENNAI
7 NOIDA
8 PANIPAT
Below is the Query to fetch data from both the tables using SQL Alias,
SELECT C.id, C.Name, Ci.Address from Class AS C, Class_info AS Ci where C.id = Ci.id;
and the resultset table will look like,
ID Name Address
1 abhi DELHI
2 adam MUMBAI
3 alex CHENNAI
SQL Alias seems to be quite a simple feature of SQL, but it is highly useful when you are
working with more than 3 tables and have to use JOIN on them
1. UNION
2. UNION ALL
3. INTERSECT
4. MINUS
UNION Operation
UNION is used to combine the results of two or more SELECT statements. However it will
eliminate duplicate rows from its resultset. In case of union, number of columns and
datatype must be same in both the tables, on which UNION operation is being applied.
Example of UNION
The First table,
ID Name
1 abhi
2 adam
The Second table,
ID Name
2 adam
3 Chester
ID NAME
1 abhi
2 adam
3 Chester
UNION ALL
This operation is similar to Union. But it also shows the duplicate rows.
ID NAME
1 abhi
2 adam
The Second table,
ID NAME
2 adam
3 Chester
ID NAME
1 abhi
2 adam
2 adam
3 Chester
INTERSECT
Intersect operation is used to combine two SELECT statements, but it only retuns the records
which are common from both SELECT statements. In case of Intersect the number of
columns and datatype must be same.
NOTE: MySQL does not support INTERSECT operator.
Example of Intersect
The First table,
ID NAME
1 abhi
2 adam
The Second table,
ID NAME
2 adam
3 Chester
ID NAME
2 adam
MINUS
The Minus operation combines results of two SELECT statements and return only those in the
final result, which belongs to the first set of the result.
Example of Minus
The First table,
ID NAME
1 abhi
2 adam
The Second table,
ID NAME
2 adam
3 Chester
ID NAME
1 abhi
Creating a Sequence
Syntax to create a sequence is,
CREATE SEQUENCE sequence-name
START WITH initial-value
INCREMENT BY increment-value
MAXVALUE maximum-value
CYCLE | NOCYCLE;
ID NAME
1 abhi
2 adam
4 alex
ID NAME
1 abhi
2 adam
4 alex
1 anu
Once you use nextval the sequence will increment even if you don't Insert any record into
the table.
SQL VIEW
A VIEW in SQL is a logical subset of data from one or more tables. View is used to restrict
data access.
Syntax for creating a View,
CREATE or REPLACE VIEW view_name
AS
SELECT column_name(s)
FROM table_name
WHERE condition
As you may have understood by seeing the above SQL query, a view is created using data
fetched from some other table(s). It's more like a temporary table created with data.
Creating a VIEW
Consider following Sale table,
SQL Query to Create a View from the above table will be,
CREATE or REPLACE VIEW sale_view
AS
SELECT * FROM Sale WHERE customer = 'Alex';
The data fetched from SELECT statement will be stored in another object called sale_view.
We can use CREATE and REPLACE seperately too, but using both together works better, as if
any view with the specified name exists, this query will replace it with fresh data.
Displaying a VIEW
The syntax for displaying the data in a view is similar to fetching data from a table using
a SELECTstatement.
SELECT * FROM sale_view;
Update a VIEW
UPDATE command for view is same as for tables.
Syntax to Update a View is,
UPDATE view-name SET VALUE
WHERE condition;
NOTE: If we update a view it also updates base table data automatically.
Read-Only VIEW
We can create a view with read-only option to restrict access to the view.
Syntax to create a view with Read-Only Access
CREATE or REPLACE FORCE VIEW view_name AS
SELECT column_name(s)
FROM table_name
WHERE condition WITH read-only;
The above syntax will create view for read-only purpose, we cannot Update or Insert data
into read-only view. It will throw an error.
Types of View
There are two types of view,
Simple View
Complex View