DBMS Aryan

Download as pdf or txt
Download as pdf or txt
You are on page 1of 33

Data

 Data is a collection of characters, numbers, and other symbols that


represents values of some situations or variables.
 Example: Name, age, gender, contact details, etc., of a person
 Online posts, messages, Images, graphics, animations, audio,
video, etc.
 Types: Structured and Unstructured
 Structured Data: Data which is organized and can be recorded in
a well defined format like tabular is called structured data.
 Unstructured data: which are not in the traditional tabular
structure is called unstructured data.
 Example: Web pages consisting of text as well as multimedia
contents. text documents, books, audio/video files, social media
messages etc.
 Information: It is a processed data.
Methods to store and manage data

1. Manual record keeping(Traditional method)


2. File system
3. DBMS
File system

 File is a container to store data in a computer.


 Files can be stored on the storage device of a computer
system.
 File system is a software that manage and organizes the files
in a storage medium. It controls how data is stored and
retrieved.
Limitations of a File System
 Difficulty in Access: Files stored on a computer can be accessed directly and
searched for desired data. But to access data of a file through software, we need
to write program.
 Data Redundancy: Redundancy means same data are duplicated in different
places (files). It leads to excess storage use.
 Data inconsistency: Data inconsistency occurs when same data maintained in
different places do not match.
 Data Isolation: no link or mapping between tables. data mapping is not
supported in file system.
 Data Dependence: Data are stored in a specific format or structure in a file.
Updating the structure of a data file requires modification in all the application
programs accessing that file.
 Controlled Data Sharing : Every user have different access control. Its difficult
to manage in a file system because of writing separate program for each user.
DBMS
 Limitations of file system can be overcome by storing the data in a
database where data are logically related.
 DBMS is a software that can be used for creating and managing
databases.
 Manage means: update/modify and retrieve data
 Example: MySQL, Oracle, SQL Server, Microsoft Access,
MongoDB.
 DBMS is a interface between the database and end users.
Retrieving data from a database through special type of commands
is called querying the database.
 A database is a collection of tables.
 Database Schema: Design of a database or it’s a skeleton of the database that
represents the structure (table names and columns), type of data each column can
hold, constraints, and the relationships among the tables.
 Data Constraint: Restrictions or limitations on the type of data. Roll number is
unique. It ensure accuracy and reliability of data in the database.
 Meta-data or Data Dictionary: Data about the data.
 Database Instance/state: state or snapshot of the database at any given time is the
database instance. A schema can have many instances at different times.
 Query: A query is a request to a database for obtaining information in a desired way.
It is written in Structured Query Language (SQL).
 Data Manipulation: Modification of database consists of three operations Insertion,
Deletion or Update.
 Database Engine: Database engine is the underlying component or set of programs
used by a DBMS to create database and handle various queries for data retrieval and
manipulation.
Limitations of DBMS

 Increased Complexity: for maintaining functionalities like security,


consistency, sharing and integrity.
 High Cost: The cost of implementing a DBMS system, training to users,
write application program is very high.
 Increased data vulnerability: increases the chances of loss of data due
to any failure.
 Hence the DBMS approach is usually not preferred when the database is
small, well defined, less frequently changed and used by few users.
Relational Data Model
 A data model describes the structure of the database, including how data are
defined and represented, relationships among data, and the constraints.
 Relational Data Model or Relational DBMS (RDBMS) is used to store data in
related tables.
 Developed by E.F Codd at IBM in 1970.
 RELATION: Tables are called relations
 TUPLE: Rows are called tuple or record. A tuple is a collection of attribute values
that makes a record unique.
 ATTRIBUTE: Column are called field or attribute. Characteristic or parameters for
data.
 DOMAIN: It is a set of values from which an attribute can take a value in each row.
In RollNumber domain is a set of integer values.
 DEGREE: The number of attributes in a relation.
 CARDINALITY: The number of tuples in a relation.
 A tuple is a unique entity whereas attribute values can be duplicate.
Three Important Properties of a Relation:
 Property 1: rules on an attribute
• Each attribute in a relation has a unique name.
• Sequence of attributes in a relation is immaterial.
 Property 2: rules on a tuple
• Each tuple in a relation is distinct.
• Sequence of tuples in a relation is immaterial.
 Property 3: rules on the state of a relation.
• All data values in an attribute must be from the same domain.
• Each data value associated with an attribute must be atomic (cannot be
further divisible into meaningful subparts). example, Phone has ten digit
numbers which is indivisible.
• No attribute can have many data values in one tuple. For example,
Guardian cannot specify multiple contact numbers under GPhone
attribute.
• A special value “NULL” is used to represent values that are unknown or
non-applicable/ no value to certain attributes. 0 is not null.
Keys in a Relational Database
 Key: It is a field or set of fields that uniquely distinguish each tuple of a
relation.
 Candidate Key: an attribute can be used to uniquely identify the tuples in
the relation is called candidate key. Multiple CK for a single table
 Primary Key: Out of one or more candidate keys, the attribute chosen which
uniquely identify the tuples in a relation is called the primary key. Exact one
PK for a single table.
 Remaining candidate keys are called the alternate keys.
 Composite Primary Key: primary key consisting of more than one attribute
is called Composite Primary key.
 Foreign Key: represent the relationship between two relations. A foreign key
is an attribute whose value is derived from the primary key of another
relation.
PK CK FK AK

Unique values unique no unique

No duplicate no allowed no
values

No null values no yes no

Single PK for one Multiple Multiple multiple


table
SQL
 SQL is a query language that can be used to access and manipulate
data from the database.
 SQL is case insensitive.
 Always end SQL statements with a semicolon (;).
Data types of SQL
 Data type of an attribute indicates the type of data value that an attribute
can have.
1. Numeric: int and float
2. Date and time: date, time,
3. String: char, varchar, text
 Char: CHAR is of fixed length, means it reserve all the spaces which u
defined in size. Size is 0 to 255 Char(size)
 Varchar(size): is a variable-length data type. Size is 0 to 65535
characters. VARCHAR (30) means a maximum of 30 characters can be
stored but the actual allocated bytes will depend on the length of entered
string.
 DATE: its format is 'YYYY-MM-DD‘.
Constraints
 Constraints are the certain types of restrictions/rules on the data values that an
attribute can have.
 It ensure correctness of data.
1. NOT NULL: Cannot have NULL values. NULL means missing/ unknown/not
applicable value.
2. UNIQUE : Ensures that all the values in a column are distinct/unique.
3. DEFAULT: A default value specified for the column if no value is provided.
4. CHECK: all values in a column satisfy certain condition.
5. PRIMARY KEY: Used to uniquely identify each row/record in a table.
6. FOREIGN KEY: used to ensure referential integrity of the data.
SQL Commands/Categories
 DDL: Data Definition Language. It allows us to write statements for
defining, modifying and deleting relation schemas/table
structure(create/alter/drop db, table, data types, constraints) is created.
Create, alter, drop.
 DML: Data Manipulation Language. It allows to insertion of new
data, removal of existing data or modification of existing data in the
database. Insert, update, delete
 DQL: Data Query Language. SELECT is used to retrieve data from
the tables in a database. Select
 Create:It define relations in the database and specify attributes for each
relation along with data type and constraint for each attribute.
CREATE TABLE tablename( attributename1 datatype
constraint,……attributenameN datatype constraint);
 Describe Table: It allows us to view the structure of an already created
table. Desc table_name;
 Alter : ALTER TABLE statement is used to make changes in the structure
of a table like adding, removing columns or changing/modify datatype or
column(s) or to add constraint in attribute.
 ALTER TABLE table_name ADD attribute_name DATATYPE;
 ALTER TABLE table_name ADD UNIQUE (attribute name);
 ALTER TABLE table_name MODIFY attribute DATATYPE;
 ALTER TABLE table_name CHANGE old_col_name new_col_name DT;
 ALTER TABLE table_name DROP attribute;
 ALTER TABLE table_name DROP PRIMARY KEY;
 DROP: use to remove a database/table permanently from the system. It
cannot be undone. DROP TABLE table_name;
 INSERT INTO statement is used to insert new records in a table.
1. INSERT INTO tablename VALUES(value 1, value 2,....);
2. INSERT INTO tablename (column1, column2, ...) VALUES (value1, value2,
...);
 SELECT attribute1, attribute2, ... FROM table_name WHERE condition;
 Select all the data available in a table: SELECT * FROM table_name;
 Renaming of columns in output: by using the alias 'AS‘.
 SELECT column_name as Name FROM tablename;
 Distinct Clause: The SELECT statement when combined with DISTINCT
clause, returns records without repetition. Or eliminate duplicate rows from the
result of s select statement.
 SELECT DISTINCT DeptId FROM EMPLOYEE;
 WHERE clause is used to retrieve data that meet specified condition(s).
 Also use and, or, not, between and, in, not in logical operators.
 SELECT * FROM EMPLOYEE WHERE Salary > 5000 OR DeptId= 20;
 SELECT * FROM EMPLOYEE WHERE DeptId IN ('D01', 'D02' , 'D04');
 SELECT Ename, DeptId FROM EMPLOYEE WHERE Salary BETWEEN
20000 AND 50000;
 ORDER BY CLAUSE: SELECT * FROM EMPLOYEE ORDER BY Salary;
 SELECT * FROM EMPLOYEE ORDER BY Salary DESC; in descending
 IS NULL: NULL is used to represent unknown values. 5 + NULL = NULL
 SELECT * FROM EMPLOYEE WHERE Bonus IS NULL;
Substring pattern matching

 LIKE operator that can be used with the WHERE clause to search
for a specified pattern in a column.
 Wild card characters:
1. % (per cent)- used to represent zero, one, or multiple characters
2. _ (underscore)- used to represent exactly a single character
 SELECT * FROM EMPLOYEE WHERE Ename like ‘s%';
 UPDATE: make changes in the value(s) of one/more columns.
 UPDATE table_name SET attribute1 = value1, attribute2 = value2,
...WHERE condition;
 DELETE statement is used to delete/remove one/more/all records
from a table. It can be undo.
 DELETE FROM table_name WHERE condition;
Functions
 Function is used to perform some particular task and it returns zero
or more values as a result.
 Types:
1. Single Row functions(Scalar functions): applied on a single
value and return a single value.
2. Multiple row/Aggregate functions: Applied on a set of records
as a whole, and return a single value for each column of the
records on which the function is applied.
Single row functions
 Numeric(Math) Functions: Math Functions accept numeric
value as input and return a numeric value as a result.
 String Functions: String Functions accept character value as
input and return either character or numeric values as output.
 Date Functions: Date and Time functions accept date and time
value as input and return numeric or string or Date and Time as
output.
1. Numeric function

 POWER(X,Y): SELECT POWER(2,3); o/p:8


 ROUND(N,D): SELECT ROUND(2912.564, 1); 2912.6
SELECT ROUND(283.2);Output: 283
D=0, then it rounds off the number to the nearest integer.
 MOD(A, B): Returns the remainder. SELECT MOD(21, 2);
 UPDATE INVENTORY SET FinalPrice=Price+Round(Price*12/100,1);
2. String functions
 UCASE(string) OR UPPER(string): converts string into uppercase.
Select Ucase(“Information”); Output: INFORMATION
 LOWER(string) OR LCASE(string)
 MID(string, pos, n) OR SUBSTRING(string, pos, n) OR SUBSTR(string, pos, n):
Returns a substring of size n starting from the specified position (pos) of the string. If
n is not specified, it returns the substring from the position pos till end of the string.
SELECT MID(“Informatics”, 3, 4); Output: form
 LENGTH(string)
 LEFT(string, N): Returns N number of characters from the left side of the string.
SELECT LEFT(“Computer”, 4); Output: Comp
 RIGHT(string, N)
 INSTR(string, substring): Returns the position of the first occurrence of the substring
in the given string. Returns 0, if the substring is not present in the string.
 LTRIM(string): Returns the given string after removing leading white space
characters. RTRIM(STR) and TRIM(STR)
SELECT LENGTH(“ DELHI”), LENGTH(LTRIM(“ DELHI”)); Output: 7 5
3. Date and Time Functions
 NOW(): It returns the current system date and time.
 DATE(): It returns the date part from the given date/ time
expression.
SELECT DATE(NOW()); Output: 2019-07-11
 MONTH(date): It returns the month in numeric form from the date.
SELECT MONTH(NOW()); Output: 7
 MONTHNAME(date): It returns the month name from the specified
date.
 YEAR(date): It returns the year from the date.
SELECT YEAR(“2003-10-03”);
 DAY(date): It returns the day part from the date.
SELECT DAY(“2003-03-24”); Output: 24
 DAYNAME(date): It returns the name of the day from the date.
Multiple row functions OR
Aggregate Functions
Aggregate Function
 MAX(column): Returns the largest value from the specified
column. SELECT MAX(Price) FROM INVENTORY;
 MIN(column):
 AVG(column): Returns the average of the values in the
specified column.
 SUM(column): Returns the sum of the values for the
specified column.
 COUNT(column): Returns the number of values in the
specified column ignoring/exclude the NULL values.

 COUNT(*): Returns the number of records in a table.


GROUP BY clause
 It groups the rows together that contain the same
values in a specified column.
 It is used in select statements to divide the table into
groups.
 We can use the aggregate functions (COUNT, MAX,
MIN, AVG and SUM) to work on the grouped values.
 Rules:
1. Used with select
2. It is placed before order by and after where.
3. While grouping, we include only those values in the
select list that either have the same value for a group
or an aggregate function.
HAVING Clause

 HAVING Clause in SQL is used to specify conditions


on the rows with GROUP BY clause.
 Used to place condition on groups.
 It is placed after group by.
 It decide which group will be the part of the result set.
 Aggregate function can not be used with where
clause, so we use having clause.
WHERE HAVING
Applicable on individual rows Applicable on groups
Not include aggregate function Include
Only used with group by clause
Cartesian Product (X)

 It combines tuples from two relations. The result is all


combinations of tuples from the underlying tables.
 Degree of the resulting relation=the sum of the degrees of both the
relations
 Cardinality of the resulting relation = the product of the
cardinality of both relations
 We have to specify the table names by separating commas in the
FROM clause.
 We can also use where clause for specifying a condition.
Join
 JOIN operation combines tuples from two tables on specified conditions on related
attribute.
 Related attributes is the primary key in one table and foreign key in another table.
 Condition: there is a common attribute between the two tables.
 CP+where->SELECT * FROM UNIFORM U, COST C WHERE U.UCode = C.UCode;
 Join-> SELECT * FROM UNIFORM U JOIN COST C ON U.Ucode=C.Ucode;
NATURAL JOIN

 There is an extension of JOIN operation called NATURAL


JOIN which works similar to JOIN clause in SQL but
removes the redundant attribute.
 This operator can be used to join the contents of two tables
if there is one common attribute in both the tables.
 N-1 joins are needed to combine N tables on equality
condition.

You might also like