Fall 2013 - M359 - Week 2 Presentation
Fall 2013 - M359 - Week 2 Presentation
Fall 2013 - M359 - Week 2 Presentation
1975 report proposed a standard set of terminology and a three-level architecture for database systems (it never became a standard, and is rarely the basis for vendor implementations of database systems, but it does offer a useful mechanism for considering DB functionality). The architecture attempts to separate user views of the data from the systems physical representation of the data.
it uses, but the definition of how data is stored in a file will be used only by the DBMS, rather than by any user process.
3
Three-schema architecture
To satisfy these needs for definitions of data in a way which provides data independence, we introduce what is called the three-schema architecture (a general architecture applicable to any class of DBMS, not just relational DBMSs).
Three-schema architecture
External schema(s)
mapping
Mapping
5
Logical schema
Storage schema
Three-schema architecture
A logical schema
The central component in this architecture Defines the logical properties of data in a database i.e.
(DDL) can be used to define this logical representation. Ex: Create Table Customers (ID char(6), Name char(40),)
6
Three-schema architecture
A storage schema
Defines how a database is stored in files and accessed
(using indexes to help in searching particular columns). Can specify how data is to be stored only if the data is already defined by an associated logical schema (i.e. storage schema definitions must relate to the definitions in a logical schema).
The connection between the logical and storage schema is
known as a mapping.
A logical schema determines what data is available in a
database, while a storage schema determines where and how data is stored in files.
7
Three-schema architecture
An external schema
Defines data for a user process or, more specifically,
defines data from a database, which the user process retrieves or updates by interaction with a DBMS. Is concerned with data as it is presented outside the DBMS to user processes.
External schema can specify how data is available to a user
process only if that data has already been defined by a logical schema.
External schema definitions must relate to those in a logical
schema, and each external schema must have a mapping from its associated logical schema.
8
Three-schema architecture
For a relational DBMS, an external schema is defined as
logical schema By including only some of the columns and some of the rows of the logical table By including data derived from the data described by the logical schema.
The external schema is independent of the storage schema
Three-schema architecture
10
Three-schema architecture
Exercise 3.1 (p. 71) Explain why an external schema should be considered as defining a logical representation of data.
An external schema defines what data is required by a user process, and is represented in the same way as a logical schema, so it should be considered as defining a logical representation of data.
Three-schema architecture
Three-schema architecture provides data independence:
Logical data independence exists when a change to a
logical schema has no impact on a user process. This is possible because each user process is dependant only on an external schema.
Physical data independence exists when a change to a
storage schema has no impact on a user process. This is possible because such a change does not affect a logical schema, since the change can be taken into account by a change to the mapping for the storage schema.
12
Three-schema architecture
The three-schema architecture is a general model,
and so a storage schema is NOT required this is a weakness of such implementations of relational technology.
However, most DBMSs permit indexes to be created and
added to table descriptions when required. An index in a DBMS can be used to speed searches for indexed values. Indexes are not directly concerned with how data is stored, only with how it is accessed.
13
Three-schema architecture
Where do indexes fit in the three-schema architecture? An index adds an extra description of the way in which data content is stored, but does not affect the way in which the data is physically stored. Thus, Indexes are sometimes considered not to be part of a storage schema.
Indexes are definitely not part of a logical schema because
they are not concerned with any logical property of the data itself. Consequently, indexes do not appear to fit anywhere in the three-schema architecture!!
14
Three-schema architecture
Logical schema is concerned with tables whose data is
stored in a database, also known as base tables. Thus, a logical schema in a relational DB is the collection of the descriptions of all the properties of the base tables.
In the relational model, you can also have a derived table
known as view. A view is a table without stored data whose values are defined by evaluating a defined query involving one or more base tables or other views.
In the context of the three-schema architecture, the view
could be seen as part of an external schema because it is describing data to be used by a user process.
15
16
Data definition
This DBMS function is concerned with the definition of
properties of data and, more generally, with the creation and modification of a schema in which data definitions are kept.
A DBMS performs this function in response to a request from
a user process that gives users the capability to define data. Such a request includes a specification of some properties of data that a DBMS then incorporates within a schema. For example, include name & data type for columns. Create Table Customers (ID char(6), Name char(40),)
Relational DBMS allows new tables to be added at any time,
Data manipulation
This DBMS function provides the means to manipulate
specific data in a database, which includes retrieving and updating : adding, deleting or changing data.
A DBMS performs this function in response to a request from
a user process, which provides a specification of what data is to be retrieved or updated. Insert into Customers (ID char(6), Name char(40),)
An example of one kind of manipulation, derived data such as
counts, totals and averages can be calculated by a DBMS for display by a user process.
18
Transaction support
A transaction specifies a sequence of updates to be executed
Transfer of funds from one bank account to another: The transaction consists of two steps the first step removes the money from the source bank account; the second step places the money into the destination account.
19
Transaction support
The current transaction can be committed, that is, all the
data changes within it are made permanent, by issuing a COMMIT instruction at the end of the transaction.
The changes can be discarded by issuing a ROLLBACK
instruction, which has the effect of taking the database back to the state it was in before that transaction began.
A transaction is used to control sharing of data because
the result of a transaction execution is visible to other users of the system only at the end of the transaction.
20
Concurrency support
Concurrency needs to provide users with the perception
that they can access any data value whenever they want, without being affected by other users who want to access the same data at the same time.
If all users just want to retrieve that value, there is no
problem they all see the same value. For updating, there are potential problems when many users want to change the same data.
access a database at the same time in a way that prevents them interfering with each other.
21
Security management
A key responsibility for the DBMS is ensuring that the
including ensuring:
The integrity of the data The correctness of the processing of the data Only authorized users access the data
22
Constraint management
Constraints are properties of data. They are included in a
that data in a database never violates the constraint that is a constraint is enforced automatically.
Constraints are checked whenever a request to the DBMS
would result in changes to the data that might affect the constraint.
Many DBMSs allow constraint enforcement to be switched
OFF either temporarily or permanently (this control should be exercised with great caution).
23
Access control
Another security management controlling what users can do
RIGHTS. These rights are included in a schema as part of the definition of a database. Example: The right to retrieve only a specific table or to update values in a specific column in specific table.
24
way of reconstructing the database. Recovery is more than just using a backup copy of data from the night before.
ALL data up to the time of the failure has to be recovered!
Example: A bank will want to recover the effect of every transaction right up until the moment of the system failure.
An implicit part of this function is the continuous logging of all
changes, providing a copy of the updated data so that it can be used by the recovery process if needed (logging can often be switched on and off).
25
legislation) or from internal organization choice (improving the way a business operates).
There are two ways to react:
Restructuring Reorganization
26
Restructuring
Restructuring of the database occurs to meet changes in
requirements (if the definition of a table is changed to have an additional column, the storage of existing rows of that table must also be changed to allow for the new column).
This is performed by the restructure function of the DBMS,
so will lead to changes in the storage schema, and may involve changes to the external schemas to make parts of the restructured database available to user processes.
27
Reorganization
Reorganization applies to the storage schema and
database, and the user processes using the database, is important and falls below some threshold.
Porting and implementation maintenance: DBMS or
the underlying computer system undergoes changes that require the storage implementation to be revised.
28
Inside a DBMS
Normal users, designers and application developers should
NOT need to be aware of the internal components of the system; they should be concerned only with its functionality.
Knowledgeable users (responsible in achieving high
performance or specific security requirements), will need to know how to control the individual components.
There is usually a variety of tools available to database
administrators (DBAs) to enable them to interact effectively with the DBMS internal systems.
In the Sybase environment , the Sybase Central application
store data related to the DBMS and the databases, users, security requirements and so on.
These are referred to collectively as the DBMS system
of the data managed by the DBMS. Every user database created within the DBMS will have its description stored in the DBMS schema in various system tables.
30
(p. 78)
(m359 / m359 /ODBC Data Source name type/choose the name WaltonDSN) Explore and Enjoy
31
Expand this
32
33
34
35
36
37
38
39
40
41
One of the few places in Sybase Central where we can affect the internal physical storage schema (pages).
This option controls the effect of moving data between the hard disk and the main memory of a computer system as a series of pages.
42
43
44
45
46
47
Some personal computer packages will include a DBMS and a user process which together operate as a single unit. Such systems are examples of monolithic systems in which all the components are interdependent and cannot be separated.
distinguishable components which communicate with one another through clearly defined interfaces; (you were introduced to the Sybase DBMS and the end user tools ISQL and Sybase Central)
48
play the role of a client, with the DBMS playing the role of a server.
It is possible to have centralized clientserver systems in
which both the client and server exist in the same computer system. However, it is possible to distribute the client and server across a network (distributed system).
49
Clientmultiserver systems
Distributed systems Replicated systems Mobile database systems
50
Client-Server systems
Client is usually responsible for the user interface and the
51
Client-Server systems
A three-tier system is one in which the client is reduced to
implementing the business application logic (user process); this, in turn, interacts with a db server responsible for DB.
53
more clients to be connected to the server faster response time since the network is usually slower than local processing
54
which could be managed by different DBMSs. Here, the user process needs to serially connect to multiple database servers and combine the results in order to satisfy a particular requirement.
Suppose a universitys logical database is physically
split according to different geographic regions and there is a DBMS server in each regional office.
Each region physically stores data for only their
students in the university, such as a request to count all students on specific courses, or to change staff salaries globally;
56
that also allows a client to communicate with different servers. Most database environments support this way of working by providing SQL statements for connection management.
57
databases involved and which databases were responsible for which data items.
Distributed, replicated and mobile systems are designed
Users and the processes they interact with do not have to know anything about where data is stored nor how it is processed. This is referred to as location independence. Distributed DBMS Global logical schema Distribution schema vs. vs. vs. Local DBMS Logical schema Storage schema
59
integrity of the distributed database. It must be able to recover from any failure (computer or network).
3. There must be a security process (unauthorized access)
4. DDBMS should work the same way whether it uses a PC or
Golden rule in Distributed DBMS: Place data so that most processing is performed locally
61
Replicated systems
Replication provides copies of data that are distributed to
other database servers so that each user process can perform all processing locally.
Advantages:
Makes data available through local access Minimizes the use of remote connections Reduces dangers of remote system/connection failure
selected items within a database. Also, replicated items can be copied to every system within the replicated environment, or only to selected systems.
63
Replicated systems
A specialized replication service is required to ensure that copies
64
Replicated systems
A replication server, at each system, both sends and
receives data and so has two main roles: to receive or collect updates from its local database and pass copies on to other systems that require them to receive or collect updates from remote systems, via their replication servers, and apply them to the local database.
Who initiates a transfer of replicates?
Sender : Replicates are pushed (broadcast, subscript,..)
Receiver: Replicates must be pulled (demand-driven sys.)
65
Replicated systems
The replication server stores descriptions of any
even though there may be a system failure at the remote location that holds the primary copy of the data
A database designer will decide the most appropriate
data items to replicate on each system. He will also determine which forms of replication service to use.
66
in which parts of a global database are fragmented and/or copied to the mobile devices in much the same way that the fixed networked distributed and replicated systems operate.
In a mobile DBMS the assumption is that for most of the time
these distributed and replicated data sets are not accessible to the global environment. Such systems depend on the mobile components reconnecting periodically in order to exchange data with the other systems.
67
web pages to directly interact with the content held in the DBMS without the need for an independent application service layer.
XML-enabled DBMS: these has built-in capabilities to store
and retrieve complex XML documents without the need for additional supporting software.
68
data management:
What data is needed within the organization,
the technical data management division and the business users of the data.
In such a role it is a requirement that the DA understands the
business domain and its data needs and is aware of the technical possibilities and limitations of the systems that are requested to meet these needs.
A typical DA will oversee the planning and development of
new data systems and the maintenance and control of existing systems.
71
hardware systems.
Responsible for translating the logical database design into a
data to and from the DBMS interfaces as well as handling error conditions.
The AD will treat the database and DBMS as a given, that is,
bar-codes through a bar-code reader and has no need to know what is happening to the data captured by the reader other than knowing it is a requirement if the customers bill and check-out receipt is to be accurate.
75
security and protection designed to prevent widespread damage if permitted access boundaries are breach.
Usually, threats in DB systems are to the large and valuable
data sets and the real world assets that they represent.
76
is used for some particular set of tasks (e.g. Walton app.) Can only be used for one purpose - the application for which it was specially designed and constructed; the user cannot change it.
Database tool
A general-purpose package that can be used with many different
databases, one at a time, by any authorized user, to access any data available to that user within a database (e.g. Sybase Central & ISQL). Provides different ways in which data can be processed & displayed. Should offer flexibility to the user
77
retrieved data.
Application development tools: allow the production of stand-alone
application processes without the need to use conventional programming languages directly.
Management tools: focus on the control and management of the
Lets look in detail at the use of SQL within some types of user processes
79
conventional programming language (ex: Java) combined with other statements written using the DML specific to that DBMS.
The programming language being used is said to host the DML
80
2. Connect to Walton DB
3. Type the SQL DML statement:
SELECT title, author FROM book WHERE type = 'e-book
ISQL is an example of a direct entry database tool (generalpurpose software, intended for use with many DBs and does not contain any fixed embedded DBMS).
81
2. Connect to Walton DB
3. Open the file showhighvaluestock.sql located in
The use of such a pre-prepared file is a convenience. This allows you to prepare SQL statements using any text editor and without having to connect to the DBMS until you want to evaluate your queries.
82
83
entry and manipulation forms have database actions associated with them.
When a user selects a management function, that
or the specification of the data manipulation for their activities. However, the database tool allows you to examine and modify the specification which is not possible in the application process.
84
85
expressed using the database language, SQL: either embedded, directly entered by users, or stored and accessed in tool libraries.
An SQL statement has to be transferred to a DBMS for
execution. You may imagine this as a procedure call in which an SQL statement is just a parameter, as CALL execute_this_SQL (an SQL statement)
86
chooses to implement the details of each procedure call differently for their DBMS (different parameters and different formats..). Such an interface is said to be proprietary (or native) to that DBMS vendor.
An immediate problem with having different proprietary
DBMS interfaces is that any database tool which wants to interact with a DBMS via its interface has to express all interaction in terms of those proprietary procedures.
87
procedures has been defined for common use, known as Open DataBase Connectivity (ODBC), because they provide an open(standard) interface to a DBMS
its interaction with a DBMS in a single standardized way. Suppliers who want their DBMS to be accessible to database tools will supply an ODBC interface for their products. 88