Elmasri 6e Ch02 SV

Download as ppt, pdf, or txt
Download as ppt, pdf, or txt
You are on page 1of 58

Chapter 2

Database System
Concepts and
Architecture

Copyright © 2011 Pearson Education, Inc. Publishing as Pearson Addison-Wesley


Chapter 2 Outline
 Data Models, Schemas, and Instances
 Three-Schema Architecture and Data
Independence
 Database Languages and Interfaces
 The Database System Environment
 Centralized and Client/Server Architectures
for DBMSs
 Classification of DBMSs

Copyright © 2011 Ramez Elmasri and Shamkant Navathe


Data abstraction?

Copyright © 2011 Ramez Elmasri and Shamkant Navathe


Data Models, Schemas, and
Instances
 Data abstraction
 Suppression of details of data organization and
storage
 Highlighting of the essential features for an
improved understanding of data

Copyright © 2011 Ramez Elmasri and Shamkant Navathe


Data model?

Copyright © 2011 Ramez Elmasri and Shamkant Navathe


Data Models, Schemas, and
Instances (cont'd.)
 Data model
 Collection of concepts that describe the structure of
a database
 Provides means to achieve data abstraction
 Basic operations
• Specify retrievals and updates on the database
 Certain constraints that the DB should obey
• Constraints specify some restrictions on valid data; these constraints must be enforced at all times

 Dynamic aspect or behavior of a database


application
• Allows the database designer to specify a set of valid operations allowed on database objects

Copyright © 2011 Ramez Elmasri and Shamkant Navathe


Categories of Data Models
 High-level (or conceptual) data models
 Close to the way many users perceive data
 (Entity-Relationship Model)

 Representational (or implementation) data models


 Easily understood by end users
 Also similar to how data organized in computer storage
 Relational Data Model-(Commonly SQL Standard is used to implement Relational Data Model for Relational DBs)

 Low-level (or physical) (or internal) data models


 Describe the details of how data is stored on computer
storage media
 Physical Data Models describes the access paths to make the search for particular database records efficient

Copyright © 2011 Ramez Elmasri and Shamkant Navathe


Categories of Data Models
(cont'd.)
 High-level (or conceptual) data models
 Entity
• Represents a real-world object or concept
 Attribute
• Represents some property of interest
• Further describes an entity
 Relationship among two or more entities
• Represents an association among the entities
 Entity-Relationship model

Copyright © 2011 Ramez Elmasri and Shamkant Navathe


Categories of Data Models
(cont'd.)
 Representational (or implementation) data models
 Relational data model
• Used most frequently in traditional commercial
DBMSs
• Commonly SQL Standard is used to implement Relational Data Model for Relational Databases

 Object data model


• New family of higher-level implementation data
models
• Closer to conceptual data models

Copyright © 2011 Ramez Elmasri and Shamkant Navathe


Categories of Data Models
(cont'd.)
 Low-level (or physical) (or internal) data models
 Physical data models
• Describe how data is stored as files in the computer
• Access path
• Structure that makes the search for particular database
records efficient
• Index
• Example of an access path
• Allows direct access to data using an index term or a
keyword

Copyright © 2011 Ramez Elmasri and Shamkant Navathe


Schemas, Instances, and
Database State
 Database schema
 Description of a database structure (data types, and constraints)
 Schema diagram
 Displays selected aspects of schema
 Schema construct (Entity, Record(type), Table)

 Each object (component) in the schema (e.g. STUDENT, COURSE)


 Database state or snapshot
 The actual data in a database at a particular
moment in time
 The term instance is applied to individual DB components, e.g. record instance, table instance, entity instance

Copyright © 2011 Ramez Elmasri and Shamkant Navathe


Schemas, Instances, and
Database State (cont'd.)

Copyright © 2011 Ramez Elmasri and Shamkant Navathe


Schemas, Instances, and
Database State (cont'd.)

Copyright © 2011 Ramez Elmasri and Shamkant Navathe


Schemas, Instances, and
Database State (cont'd.)
 Define a new database
 Specify database schema to the DBMS
 Initial state
 Populated or loaded with the initial data
 Valid state
 Satisfies the structure and constraints specified
in the schema

Copyright © 2011 Ramez Elmasri and Shamkant Navathe


Database Schema vs. Database
State
 Distinction
 The database schema changes very
infrequently. (Changes applied to schema as application requirements change)
 The database state changes every time the
database is updated.

 Schema is also called intension.


 State is also called extension.

Copyright © 2011 Ramez Elmasri and Shamkant Navathe


Three-Schema Architecture
and Data Independence
 External or view level
 Describes part of the database that a particular user group is
interested in
 Usually uses the same data model as the conceptual schema.

 Conceptual level
 Describes structure of the whole database for a community of
users
 Uses a conceptual or an implementation data model.

 Internal level
 Describes physical storage structure (access paths (e.g.
İndexes) of the database
 Typically uses a physical data model.

Copyright © 2011 Ramez Elmasri and Shamkant Navathe


Three-Schema Architecture
and Data Independence (cont'd.)

Copyright © 2011 Ramez Elmasri and Shamkant Navathe


Mapping?

Copyright © 2011 Ramez Elmasri and Shamkant Navathe


Three-Schema Architecture
and Data Independence
 Mappings among schema levels of a DBMS are
needed to transform requests and data. (Mappings: The
process of transforming requests and data between levels of three-schema architecture of DBMSs) (A DBMS uses additional
SW to accomplish these mappings by referring to the mapping information in the catalog)

 Programs refer to an external schema, and are mapped by the


DBMS to the internal schema for execution.
 Data extracted from the internal DBMS level is reformatted to
match the user’s external view (e.g. formatting the results of an
SQL query for display in a Web page)
 Three-Schema Architecture: Not explicitly used in
commercial DBMS products, but has been useful in
explaining database system organization (The two levels of
mappings create an overhead during compulation or execution of query or program, leading to inefficiencies in the DBMS.
Therefore, few DBMS have inplemented the full three-schema architecture.)

Copyright © 2011 Ramez Elmasri and Shamkant Navathe


Data Independence?

Copyright © 2011 Ramez Elmasri and Shamkant Navathe


Data Independence
 Capacity to change the schema at one level
of a database system
 Without having to change the schema at the next
higher level
• When a schema at a lower level is changed, only the mappings between this
schema and higher-level schemas need to be changed in a DBMS that fully
supports data independence.
• The higher-level schemas themselves are unchanged.

 Types:
 Logical
 Physical

Copyright © 2011 Ramez Elmasri and Shamkant Navathe


Data Independence
 Logical Data Independence:
 The capacity to change the conceptual schema
without having to change the external schemas
and their associated application programs.
 The application programs need not be changed since they refer to the external schemas

 Physical Data Independence:


 The capacity to change the internal schema
without having to change the conceptual schema.
• For example, the internal schema may be changed when certain file structures are
reorganized or new indexes are created to improve database performance
• Generally, physical data independence exists in most databases and file environments where physical details
such as the exact location of data on disk, and hardware details of storage encoding, placement, compression,
splitting, merging of records, and so on are hidden from the user. Applications remain unawaire of these
details.

Copyright © 2011 Ramez Elmasri and Shamkant Navathe


DBMS Languages?

Copyright © 2011 Ramez Elmasri and Shamkant Navathe


DBMS Languages
 View definition language (VDL)
• Specifies user views/mappings to conceptual schema

 Data definition language (DDL)


 Used by the DBA and database designers to specify the conceptual
schema of a database.
 In some DBMSs, the DDL is also used to define internal and external
schemas (views).
 In some DBMSs, separate storage definition language (SDL) and
view definition language (VDL) are used to define internal and
external schemas.

 Storage definition language (SDL)


• Specifies the internal schema (SDL is typically realized via DBMS commands (combination of
functions, parameters and specifications related to storage) provided to the DBA and database designers)

Copyright © 2011 Ramez Elmasri and Shamkant Navathe


DBMS Languages (cont'd.)
 Data manipulation language (DML)
• Allows retrieval, and updates (insertion, deletion, modification)
 High-level or nonprocedural DML
• Can be used on its own to specify complex database operations
concisely or can be embedded in a programming language (e.g. C, C+
+, Java,...) (A library of functions can also be provided to access the DBMS from a programming language)
• Set-at-a-time or set-oriented (e.g. SQL relational language) (specify what
data to retrieve rather than how to retrieve it)

 Low-level or procedural DML


• Must be embedded in a general-purpose programming language (e.g.
C, C++, Java,...)
• Record-at-a-time (Constructs such as looping are needed to retrieve multiple records, along with
positioning pointers.)

Copyright © 2011 Ramez Elmasri and Shamkant Navathe


DBMS Interfaces?

Copyright © 2011 Ramez Elmasri and Shamkant Navathe


DBMS Interfaces
 User-friendly interfaces
 Menu-based interfaces (generally desined for Web clients or browsing)
 Forms-based interfaces (generally desined for naïve or parametric users)
 Graphical user interfaces (Point and Click, Drag and Drop, etc.)
 Combinations of the above interfaces

 Natural language interfaces (requests in written English)

 Speech input and output


 Stand-alone query language interfaces (e.g. SQL*Plus in
ORACLE)

 Interfaces for the DBA (designed for creating user accounts, granting authorizations, setting
system parameters, changing schemas or access paths, ...)

Copyright © 2011 Ramez Elmasri and Shamkant Navathe


DBMS Programming Language
Interfaces
 Programmer interfaces for embedding DML in a
programming languages:
 Embedded Approach: e.g embedded SQL (for C, C++,
etc.), SQLJ (for Java)
 Procedure Call Approach: e.g. JDBC (Java DB Connectivity) for
Java, ODBC for other programming languages (Open DB
Connectivity- provides an application programming interface (API) which allows client-side
programs to call the DBMSs, as long as both client and server machines have the
necessary software installed)
 Database Programming Language Approach: e.g.
ORACLE has PL/SQL, a programming language based on
SQL; language incorporates SQL and its data types as
integral components

Copyright © 2011 Ramez Elmasri and Shamkant Navathe


The Database System
Environment
 DBMS component modules
 Buffer management
 Stored data manager
 DDL compiler
 Interactive query interface
• Query compiler
• Query optimizer
 Precompiler
 Runtime database processor
 System catalog
 Concurrency control system
 Backup and recovery system

Copyright © 2011 Ramez Elmasri and Shamkant Navathe


Copyright © 2011 Ramez Elmasri and Shamkant Navathe
Database System Utilities
 Loading
 Load existing data files (Includes data conversion tools)
 Backup
 Creates a backup copy of the database periodically
 Database storage reorganization
 Reorganize a set of database files into different file organizations
 Other functions, such as sorting files, data compression, etc.
 Performance monitoring
 Monitors database usage and provides statistics to the DBA
 Report generation utilities
 Other functions, user monitoring, interfacing with the
network, etc.

Copyright © 2011 Ramez Elmasri and Shamkant Navathe


Tools, Application Environments,
and Communications Facilities
 Application development environments and
CASE (computer-aided software engineering)
Tools (available to DB desiners)

 Examples:
• PowerBuilder (Sybase)
• JBuilder (Borland)
• JDeveloper 10G (Oracle)

 Communications software
 Allow users at location remote from the DB system site to access the
database through computer terminals, workstations, or personal
computers. (The integrated DBMS and data communications systems is called a DB/DC
system.)

Copyright © 2011 Ramez Elmasri and Shamkant Navathe


Tools, Application Environments,
and Communications Facilities
 Data dictionary (data repository) system
 Stores schema descriptions and other
information such as design decisions,
application program descriptions, user
information, usage standards, etc.
• Active data dictionary is accessed by DBMS
software and users/DBA.
• Passive data dictionary is accessed by users/DBA
only.

Copyright © 2011 Ramez Elmasri and Shamkant Navathe


Centralized Architectures for
DBMSs?

Copyright © 2011 Ramez Elmasri and Shamkant Navathe


Centralized and Client/Server
Architectures for DBMSs
 Centralized DBMSs Architecture
 Combines everything into single system(carried out
on one machine)including- DBMS software, hardware,

application programs, and user interface


processing software.
 User can still connect through a remote
terminal – however, all processing is done at
centralized site.

Copyright © 2011 Ramez Elmasri and Shamkant Navathe


Copyright © 2011 Ramez Elmasri and Shamkant Navathe
Client/Server Architectures for
DBMSs?

Copyright © 2011 Ramez Elmasri and Shamkant Navathe


Basic Client/Server Architectures
(cont'd.)
 Client
 User machine that provides user interface
capabilities to utilize associated servers and
local processing power to run local applications
 Server
 System containing both hardware and software
 Provides services to the client machines
• Such as file access, printing, archiving, or database
access

Copyright © 2011 Ramez Elmasri and Shamkant Navathe


Basic Client/Server Architectures
 Specialized Servers - Servers with specific
functionalities
 File server
• Maintains the files of the client machines.
 Printer server
• Connected to various printers; all print requests by the
clients are forwarded to this machine
 Web servers or e-mail servers
 DBMS server
 Clients can access the specialized servers as needed

Copyright © 2011 Ramez Elmasri and Shamkant Navathe


Copyright © 2011 Ramez Elmasri and Shamkant Navathe
Two-Tier Client/Server
Architectures for DBMSs
 Client
 Provide appropriate interfaces through a client software
module to access and utilize the various server resources.
• Clients may be diskless machines or PCs or Workstations with disks with
only the client software installed.
• Connected to the servers via some form of a network.
• (LAN: local area network, wireless network, etc.)
(Other variations of clients are possible: e.g., in some object DBMSs, more functionality is transferred to clients
including data dictionary functions, optimization and recovery across multiple servers, etc.)

 Server
 Provides database query and transaction services to the
clients

Copyright © 2011 Ramez Elmasri and Shamkant Navathe


Two-Tier Client/Server
Architectures (cont'd.)
 Applications running on clients can utilize an
Application Program Interface (API) to access
server databases (to call the DBMSs) via
standard interface such as:
 JDBC: Java Database Connectivity for Java programming access
 ODBC: Open Database Connectivity for other programming access
 Both client and server must install appropriate
client module and server module software for
ODBC or JDBC

(JDBC: Allows Java client programs to access one or more DBMSs through a standard interface)

Copyright © 2011 Ramez Elmasri and Shamkant Navathe


Three-Tier and n-Tier
Architectures for Web
Applications
 Application server or Web server
 Adds intermediate layer between client and the database
server
 Runs application programs and stores business rules

 Three-tier Architecture can enhance security


 Database server only accessible via middle tier
 Clients cannot directly access database server

 N-tier
 Divide the layers between the user and the stored data further
into finer components

Copyright © 2011 Ramez Elmasri and Shamkant Navathe


Copyright © 2011 Ramez Elmasri and Shamkant Navathe
Classification of Database
Management Systems?

Copyright © 2011 Ramez Elmasri and Shamkant Navathe


Classification of Database
Management Systems
 Data model
• Relational
• Object
• Object-relational
• Hierarchical and network (legacy)
• Native XML DBMS

Copyright © 2011 Ramez Elmasri and Shamkant Navathe


Classification of Database
Management Systems (cont'd.)
 Number of users
• Single-user
• Multiuser
 Number of sites
• Centralized
• Distributed
• Homogeneous
• Heterogeneous

Copyright © 2011 Ramez Elmasri and Shamkant Navathe


Classification of Database
Management Systems (cont'd.)
 Cost
• Open source (e.g. MySQL, PostgreSQL)
• Different types of licensing
• Different licensing options: site license, maximum number of
concurrent users (seat license), single user, etc.
• Commercial DBMS offer additional specialized modules, e.g. time-
series module, spatial data module, document module, XML module
• These offer additional specialized functionality when purchased
separately

 Types of access path options


 General or special-purpose

Copyright © 2011 Ramez Elmasri and Shamkant Navathe


History of Data Models
 Network Model
 Hierarchical Model
 Relational Model
 Object-oriented Data Models
 Object-Relational Models

Copyright © 2011 Ramez Elmasri and Shamkant Navathe


Network Model
 The first network DBMS was implemented by
Honeywell in 1964-65 (IDS System).
 Adopted heavily due to the support by
CODASYL (Conference on Data Systems
Languages) (CODASYL - DBTG report of
1971).
 Later implemented in a large variety of
systems - IDMS (Cullinet - now Computer
Associates), DMS 1100 (Unisys), IMAGE (H.P.
(Hewlett-Packard)), VAX -DBMS (Digital
Equipment Corp., next COMPAQ, now H.P.).

Copyright © 2011 Ramez Elmasri and Shamkant Navathe


Network Model

Copyright © 2011 Ramez Elmasri and Shamkant Navathe


Network Model
 Advantages:
 Network Model is able to model complex relationships and
represents semantics of add/delete on the relationships.
 Can handle most situations for modeling using record types
and relationship types.
 Language is navigational; uses constructs like FIND, FIND
member, FIND owner, FIND NEXT within set, GET, etc.
• Programmers can do optimal navigation through the database.
 Disadvantages:
 Navigational and procedural nature of processing
 Database contains a complex array of pointers that thread
through a set of records.
• Little scope for automated “query optimization”

Copyright © 2011 Ramez Elmasri and Shamkant Navathe


Hierarchical Data Model
 Initially implemented in a joint effort by IBM
and North American Rockwell around 1965.
Resulted in the IMS family of systems.
 IBM’s IMS product had (and still has) a very
large customer base worldwide
 Hierarchical model was formalized based
on the IMS system
 Other systems based on this model:
System 2k (SAS inc.)

Copyright © 2011 Ramez Elmasri and Shamkant Navathe


Hierarchical Data Model
 Advantages:
 Simple to construct and operate
 Corresponds to a number of natural hierarchically organized
domains, e.g., organization (“org”) chart
 Language is simple:
• Uses constructs like GET, GET UNIQUE, GET NEXT, GET
NEXT WITHIN PARENT, etc.
 Disadvantages:
 Navigational and procedural nature of processing
 Database is visualized as a linear arrangement of records
 Little scope for "query optimization"

Copyright © 2011 Ramez Elmasri and Shamkant Navathe


Relational Model
 Proposed in 1970 by E.F. Codd (IBM), first
commercial system in 1981-82.
 Now in several commercial products (e.g. DB2,
ORACLE, MS SQL Server, SYBASE, INFORMIX).
 Several free open source implementations, e.g.
MySQL, PostgreSQL
 Currently most dominant for developing database
applications.
 SQL relational standards: SQL-89 (SQL1), SQL-92
(SQL2), SQL-99, SQL3, …

Copyright © 2011 Ramez Elmasri and Shamkant Navathe


Object-oriented Data Models
 Several models have been proposed for
implementing in a database system.
 One set comprises models of persistent O-O
Programming Languages such as C++ (e.g., in
OBJECTSTORE or VERSANT), and Smalltalk (e.g.,
in GEMSTONE).
 Additionally, systems like O2, ORION (at MCC - then
ITASCA), IRIS (at H.P.- used in Open OODB).
 Object Database Standard: ODMG-93, ODMG-
version 2.0, ODMG-version 3.0.

Copyright © 2011 Ramez Elmasri and Shamkant Navathe


Object-Relational Models
 Most Recent Trend. Started with Informix
Universal Server.
 Relational systems incorporate concepts from
object databases leading to object-relational.
 Exemplified in the latest versions of Oracle-
10i, DB2, and SQL Server and other DBMSs.
 Standards included in SQL-99 and expected
to be enhanced in future SQL standards.

Copyright © 2011 Ramez Elmasri and Shamkant Navathe


Summary
 Concepts used in database systems
 Main categories of data models
 Schemas, Instances, and States
 Types of languages supported by DMBSs
 Interfaces provided by the DBMS
 DBMS classification criteria:
 Data model, number of users, number of sites,
access paths, cost

Copyright © 2011 Ramez Elmasri and Shamkant Navathe

You might also like