Class 6

Download as pptx, pdf, or txt
Download as pptx, pdf, or txt
You are on page 1of 29

CS F212: Database Systems

Today’s Class
 Introduction
 DB System Architecture
 Data Models
 Relational Model

CS F212 Database Systems 1


Storage Management

• A storage manager is a program module that provides the


interface between the low-level data stored in the
database and the application programs and queries
submitted to the system.
• The storage manager is responsible for the following tasks:
• interaction with the file manager
• efficient storing, retrieving, and updating of data.
Query Processing
1. Parsing and translation
2. Optimization
3. Evaluation
Query Processing (Cont.)
• Alternative ways of evaluating a given query
• Equivalent expressions
• Different algorithms for each operation
• Cost difference between a good and a bad way of
evaluating a query can be enormous
• Need to estimate the cost of operations
• Depends critically on statistical information about
relations which the database must maintain
• Need to estimate statistics for intermediate results to
compute cost of complex expressions
Transaction Management

•A transaction is a collection of operations that performs a single logical


function in database application
time
Transaction 1

Transaction 1

Transaction 2

Conflicting read/write
Transaction Management (cont.)
 Transaction-management component ensures that
the database remains in a consistent (correct) state
despite system failures (e.g. power failures and
operating system crashes) and transaction failures.
 Concurrency-control manager controls the interaction
among the concurrent transactions, to ensure the
consistency of the database.
Database Administrator (DBA)
• Coordinates all the activities of the database system; the database
administrator has good understanding of the enterprise’s information
resources and needs.
• Database administrator’s duties include:
• Schema definition Primary job of a database
designer
• Specifying integrity constraints

• Storage structure and access method definition More system


oriented
• Schema and physical organization modification
• Granting user authority to access the database
• Monitoring performance and responding to changes in
requirements
Database Users
• Users are differentiated by the way they expected to interact with the
system
• Application programmers
• Develop applications that interact with DBMS through DML calls
• Sophisticated users
• form requests in a database query language
• mostly one-time ad hoc queries
• End users
• invoke one of the existing application programs (e.g., print monthly sales
report)
• Interact with applications through GUI
These layers
Structure of a DBMS must consider
concurrency
control and
recovery
• A typical DBMS has a layered Query Optimization
architecture. and Execution
• The figure does not show the
Relational Operators
concurrency control and recovery
components. Files and Access Methods
• This is one of several possible
Buffer Management
architectures; each system has its
own variations. Disk Space Management

DB
Overall System Architecture
User / DB Administrator
Transaction
Application
Commands
DDL Commands

Query Transaction DDL


Compiler Manager Compiler

Query plan

Execution Logging & Concurrency


Engine Recovery Control

Index, file and record


requests
Meta data Meta data
Index/file/ Statistics
Log pages
record Lock
manager Table
Data, metadata,
Page
indexes
Commands

Buffer
BUFFERS
Manager

Read/write pages

Storage
Manager
Architecture of
Modern DBMS
Application Architectures

Two-tier architecture: E.g. client programs using ODBC/JDBC to

communicate with a database


Three-tier architecture: E.g. web-based applications, and
applications built using “middleware”
Characteristics of a Modern DBMS
• Data independence and efficient access.
• Abstraction - hiding lower level details
• Efficient data access
• Indexing - Significant for very large databases
• Data integrity and security
• Application independent data integrity features
• Simpler Access control mechanisms - Views
• Uniform data administration.
• Concurrent access, recovery from crashes.
• Reduced application development time
• Many important tasks are handled by DBMS
Summary
• DBMS used to maintain, query large datasets.
• Benefits include recovery from system crashes,
concurrent access, quick application development,
data integrity and security.
• Levels of abstraction give data independence.
• A DBMS typically has a layered architecture.
• DBAs hold responsible jobs and are well-paid! 
• DBMS R&D is one of the broadest,
most exciting areas in CS.
Data Models

• Concepts that help to describe data types, relationships,


and constraints.

• High level/conceptual:
• Very close to user
• Entity
• Attribute
• Relationship
e.g. Entity Relationship model

• Low level / physical data model


• How data is stored and organized in storage media
• Access path
• Index
Data Models

• Representational/ Implementation
• Easily understood by end user
• Not far from data organization and storage
• Relational data model (network, hierarchical data models)

• Object data model


• Higher level implementation model closer to conceptual data model

• Self-describing data models


• Schema of data and data combined
Data Model : Basic Building Blocks

• Entity is anything about which data are to be


collected and stored
• Attribute is a characteristic of an entity
• Relationship describes an association among (two
or more) entities
• One-to-many (1:M) relationship
• Many-to-many (M:N or M:M) relationship
• One-to-one (1:1) relationship
Business Rules

• Brief, precise, and unambiguous description of a


policy, procedure, or principle within a specific
organization’s environment
• Apply to any organization that stores and uses data
to generate information
• Description of operations that help to create and
enforce actions within that organization’s
environment
Business Rules

• Must be rendered in writing


• Must be kept up to date
• Sometimes are external to the organization
• Must be easy to understand and widely
disseminated
• Describe characteristics of the data as viewed by
the company
Sources of Business Rules

• Company managers
• Policy makers
• Department managers
• Written documentation
• Procedures
• Standards
• Operations manuals
• Direct interviews with end users
History of Data Models
• Relational Model: proposed in 1970 by E.F. Codd (IBM), first
commercial system in 1981-82. Now in several commercial products
(DB2, ORACLE, SQL Server, SYBASE, INFORMIX).
• Network Model: the first one to be implemented by Honeywell in 1964-
65 (IDS System). Adopted heavily due to the support by CODASYL
(CODASYL - DBTG report of 1971). Later implemented in a large variety
of systems - IDMS (Cullinet - now CA), DMS 1100 (Unisys), IMAGE (H.P.),
VAX -DBMS (Digital Equipment Corp.).
• Hierarchical Data Model: implemented in a joint effort by IBM and
North American Rockwell around 1965. Resulted in the IMS family of
systems. The most popular model of its time. Other system based on
this model: System 2k (SAS inc.)
History of Data Models
• Object-oriented Data Model(s): several models have been proposed
for implementing in a database system. One set comprises models
of persistent O-O Programming Languages such as C++ (e.g., in
OBJECTSTORE or VERSANT), and Smalltalk (e.g., in GEMSTONE).
Additionally, systems like O2, ORION (at MCC - then ITASCA), IRIS (at
H.P.- used in Open OODB).
• Object-Relational Models: Combines the features of OO data model
and relational data model. Started with Informix Universal Server.
Exemplified in the latest versions of Oracle-11g, DB2, and SQL Server
etc. systems.
• Semi - structured Data Models : Data items of same type may have
different set of attributes. XML data model, JSON data objects.
Hierarchical Database Model
• Logically represented by an upside down tree
• Each parent can have many children
• Each child has only one parent
Network Database Model
• Each record can have multiple parents
• Composed of sets
• Each set has owner record and member record
• Member may have several owners
The Relational Model
Introduction
• Proposed by Edgar. F. Codd(1923-2003) in the early seventies. [ Turing
Award –1981 ]
• Most of the modern DBMS are relational.
• Simple and elegant model with a mathematical basis.
• Led to the development of a theory of data dependencies and
database design.
• Relational algebra operations
• crucial role in query optimization and execution.
• Laid the foundation for the development of
• Tuple relational calculus and then
• Database standard SQL
Basic Concepts

Entities and relationships are stored in tables


Relationships are captured by including key of
one table into another
Languages for manipulating the tables
All popular DBMSs today are based on
relational data model (or an extension of it, e.g.,
object-relational data model)
Why is it so good?

Simplicity, everybody knows how to manipulate tables


Tables are simple enough so that solutions to complicated
problems such as concurrency control and query
optimization can be obtained
It has a theoretical basis for the studying of database
design problems
Tables are logical concepts; physically tables can be stored
in different ways  support data independence
Terminology
Relation  table; denoted by R(A1, A2, ..., An) where R is a relation
name and (A1, A2, ..., An) is the relation schema of R
Attribute  column; denoted by Ai
Tuple  row
Attribute value  value stored in a table cell
Domain  legal type and range of values of an attribute
denoted by dom(Ai)
 Attribute: Age Domain: [0-100]
 Attribute: EmpName Domain: 50 alphabetic chars
 Attribute: Salary Domain: non-negative integer
Ideally, a domain can be defined in terms of another domain; e.g., the domain of
EmpName is PersonName. This is NOT allowed in most basic DBMSs.
However, most recent DBMSs allows this (object-relational) extension such as Oracle
11g.

You might also like