

CHAPTER-1: INTRODUCTION

INTRODUCTION

This document describes the procedure required to migrate a database from Windows to Linux
using the RMAN Convert Database command. The Windows and Linux platforms share the same
endian format, which makes it possible to transfer the whole database and keeps the migration
process straightforward and simple. To migrate between platforms that have different endian
formats, Cross Platform Transportable Tablespaces (XTTS) must be used instead.
For IT managers, database migration has become one of the most routine and challenging facts
of life. Workload balancing, technology refresh, server and storage consolidation, data center
relocation, data classification, and mergers/acquisitions are all drivers to move data from one
storage device to another on a regular basis. You might think that any operation performed
routinely would become easy, but it is not as simple as it seems. Migration, that is, moving data
from one device to another (especially as the capacity of storage devices continues to grow) and
then redirecting all I/O to the new device, is an inherently disruptive process. With the increase
in the percentage of mission-critical data and the proportionate increase in data availability
demands, downtime, with its huge impact on a company's financial bottom line, becomes
unacceptable. In addition, business, technical and operational requirements impose challenging
restrictions on the migration process itself. Resource demands (staff, CPU cycles, and bandwidth)
and risks (application downtime, performance impact on production environments, technical
incompatibilities, and data corruption/loss) make migration one of IT's biggest challenges.

1.1 DATA

The term data means groups of information that represent the qualitative or quantitative
attributes of a variable or set of variables. Data (the plural of "datum", which is seldom used) are
typically the results of measurements and can be the basis of graphs, images, or observations of a
set of variables. Data are often viewed as the lowest level of abstraction from which information
and knowledge are derived. Raw data refers to a collection of numbers, characters, images or
other outputs from devices that convert physical quantities into symbols, in unprocessed form.
Such data is typically further processed by a human or input into a computer, stored and
processed there, or transmitted (output) to another human or computer (possibly through a data
cable). Raw data is a relative term; data processing commonly occurs in stages, and the
"processed data" from one stage may be considered the "raw data" of the next. The terms
information and knowledge are frequently used for overlapping concepts. The main difference is
in the level of abstraction being considered. Data is the lowest level of abstraction, information is
the next level, and knowledge is the highest of the three.

1.2 INFORMATION

Information as a concept bears a diversity of meanings, from everyday usage to technical
settings. Generally speaking, the concept of information is closely related to notions of
constraint, communication, control, data, form, instruction, knowledge, meaning, mental
stimulus, pattern, perception, and representation.

CHAPTER-2: INTRODUCTION TO DATABASE

INTRODUCTION TO DATABASE

DATABASE

A database consists of an organized collection of data for one or more uses. Databases consist of
software-based "containers" that are structured to collect and store information so users can
retrieve, add, update or remove such information in an automatic fashion. They do so through the
use of tables. Database programs are designed so that users can add or delete any information
needed. The structure of a database is tabular, consisting of rows and columns of information.

APPLICATIONS

Databases function in many applications, spanning virtually the entire range of computer
software. Databases have become the preferred method of storage for large multiuser
applications, where coordination between many users is needed. Even individual users find them
convenient, and many electronic mail programs and personal organizers are based on standard
database technology. Software database drivers are available for most database platforms so that
application software can use a common API to retrieve the information stored in a database.
Commonly used database APIs include JDBC and ODBC.

DATABASES IN NEW MEDIA

Within new media, databases collect items on which the user can carry out various operations
such as viewing, navigating, creating, and searching. Though there are various types of items
within the database, each item has the same significance. Unlike a narrative or film, the
collections are computerized and therefore may offer a unique experience with each view. This
form of data may present a unique presentation of what the world is like. Databases can be seen
as a symbolic form of the computer age.

2.1 COMPONENTS OF DATABASE

A database is developed with several components, all of which are linked to each other. Each has
its own role to play in the development and working of the database. Some of the main
components used in developing and working with a database are:

 COLUMNS

 ROWS

 TABLES

 KEYS

 FORMS

 REPORTS

 QUERY

Tables, rows and columns form the building blocks of a database. They store the data that we
want to save in our database.

2.1.1 COLUMNS

Columns are akin to fields, that is, individual items of data that we wish to store. A customer's
name, the price of a part, and the date of an invoice are all examples of columns. They are also
similar to the columns found in spreadsheets (the A, B, C, etc. along the top).

2.1.2 ROWS

Rows are akin to records as they contain data of multiple columns (like the 1, 2, 3, etc. in a
spreadsheet). Unlike file records, though, it is possible to extract only the columns you want to
make up a row of data. Old "records" that computers read forced the computer to read
everything, even if you only wanted a tiny portion of the record. In databases, a row can be made
up of as many or as few columns as you want. This makes reading data much more efficient:
you fetch only what you want.

2.1.3 TABLE

A table is a logical group of columns. For example, you may have a table that stores details of
customers' names and addresses. Another table would be used to store details of parts, and yet
another would be used for suppliers' names and addresses. It is the tables that make up the entire
database, and it is important that we do not duplicate data at all. Only key values are repeated
across tables (and even then, within some tables they must be unique).
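
As a minimal sketch (the table and column names here are illustrative assumptions, not taken from any particular system), a customers table could be defined in SQL as follows; each column definition becomes a field, and each inserted row becomes one record:

CREATE TABLE customers (
    customer_id   INTEGER PRIMARY KEY,   -- key used to identify each row
    customer_name VARCHAR(100),          -- column holding the customer's name
    city          VARCHAR(60)
);

INSERT INTO customers (customer_id, customer_name, city)
VALUES (1, 'Acme Supplies', 'Pune');     -- one row of data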

2.1.4 KEYS

A key is a single field or a combination of fields. Its purpose is to access or retrieve data rows
from a table according to the requirement. Keys are defined in tables to access or sequence the
stored data quickly and smoothly. They are also used to create links between different tables.

TYPES OF KEYS

The different types of keys are defined below.

PRIMARY KEY

A primary key is a value that can be used to identify a unique row in a table. Attributes are
associated with it. Examples of primary keys are Social Security numbers (associated with a
specific person) or ISBNs (associated with a specific book). In the relational model of data, a
primary key is a candidate key chosen as the main method of uniquely identifying a tuple in a
relation.

SECONDARY KEY

A field or combination of fields that is used as a basis for retrieval is known as a secondary key.
A secondary key is a non-unique field; one secondary key value may refer to many records.

COMPOSITE KEY

A primary key that consists of two or more attributes is known as a composite key.

CANDIDATE KEY

A candidate key is a field or combination of fields that can act as a primary key field for that
table to uniquely identify each record in that table.

ALTERNATE KEY

An alternate key is any candidate key which is not selected to be the primary key.

SUPER KEY

A super key is defined in the relational model as a set of attributes of a relation variable (relvar)
for which it holds that in all relations assigned to that variable there are no two distinct tuples
(rows) that have the same values for the attributes in this set. Equivalently a superkey can also be
defined as a set of attributes of a relvar upon which all attributes of the relvar are functionally
dependent.

FOREIGN KEY

A foreign key (FK) is a field or group of fields in a database record that points to a key field or
group of fields forming a key of another database record in some (usually different) table.
Usually a foreign key in one table refers to the primary key (PK) of another table. In this way,
references can be made to link information together, which is an essential part of database
normalization.
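
As a hedged sketch (all names are illustrative), the following SQL shows how a composite primary key and a foreign key are typically declared; the orders table references the customers table sketched earlier:

CREATE TABLE orders (
    order_id    INTEGER,
    line_no     INTEGER,
    customer_id INTEGER NOT NULL,
    order_date  DATE,
    PRIMARY KEY (order_id, line_no),                               -- composite primary key
    FOREIGN KEY (customer_id) REFERENCES customers (customer_id)  -- link to the parent table
);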

2.1.5 FORMS

Forms are used to enter or look at data. They are the screens that make up the application that
allows us to enter and retrieve data. Each "field" on a form usually relates to a column on the
database, but this is not always the case.

For example, suppose you have a field that shows a total of two or more items on a form. The
total would NOT be stored in the database. Why? Because if one of the columns were
incorrectly updated, you wouldn't know which one was incorrect. For example, suppose you had
3 columns on a table: A, B and C, where A is equal to B + C. Suppose the columns contain the
following values:

A = 11
B = 7
C = 5

Which column is wrong? You might think that A is wrong, but perhaps B should be equal to 6, or
maybe C should be 4. You just don't know. This is why derived items are not stored in the
database but can be shown on a form. The form should have some kind of validation so that the
user cannot enter "rubbish" into the database. Numbers must store numbers, dates should store
dates, and so on. Depending on the application, there can be some very complex validation.
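
As a brief, hedged illustration of the point above (table and column names are assumed), a derived total can be calculated whenever it is needed rather than stored, so it can never disagree with the columns it is derived from:

-- B and C are stored; A is computed at query time instead of being stored.
SELECT b, c, b + c AS a
FROM   invoice_lines;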

2.1.6 REPORTS

It is all very well placing all this data into the database, but if you cannot access it in a format
that you want (usually on paper), then it is extremely limited in use. Every database should have
some kind of reporting facility, and the reporting side of things is often what makes a good (or
bad) database application/package. Reports should be easy and quick to produce. Simple reports
should be so easy to produce that anyone can do it. More complex reports will need someone
with programming/database knowledge, as it is possible to produce really silly results.

2.1.7 QUERY

Queries are little programs that let the user ask questions, such as "Give me a list of all the
customers who ordered something last month" or "List all customers in a specific area". Queries
can be, and often are, the foundation of reports (in that a report can be based on the results of a
query). Depending on the database, queries can be constructed on screen, or you may even be
able to type in a query command directly. Queries often use a language called SQL (Structured
Query Language), a computer language that makes accessing databases simple. These, then, are
the general components that make up a database. Some basic design concepts for tables are
covered later in this chapter.
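
For instance, the question "give me a list of all the customers who ordered something last month" might be expressed in SQL roughly as follows (the table names, column names, and date range are assumptions for illustration):

SELECT DISTINCT c.customer_name
FROM   customers c
       JOIN orders o ON o.customer_id = c.customer_id
WHERE  o.order_date >= DATE '2024-05-01'   -- "last month" written as an explicit range
  AND  o.order_date <  DATE '2024-06-01';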

2.2 TYPES OF DATABASE

There are various kinds of databases in the computing world. They are differentiated by their
functionality and by how they model the data.

2.2.1 BASED ON FUNCTIONALITY

OPERATIONAL DATABASE

These databases store detailed data needed to support the operations of an entire organization.
They are also called subject-area databases (SADB), transaction databases, and production
databases. For example:

 Customer database
 Personnel database
 Inventory database
 Accounting database

ANALYTICAL DATABASE

Analytical databases (also known as OLAP, for On-Line Analytical Processing) are primarily
static, read-only databases which store archived, historical data used for analysis. For example, a
company might store sales records over the last ten years in an analytical database and use that
database to analyze marketing strategies in relation to demographics. On the web, you will often
see analytical databases in the form of inventory catalogs such as those used by online retailers
like Amazon.com. An inventory catalog analytical database usually holds descriptive information
about all available products in the inventory. Web pages are generated dynamically by querying
the list of available products in the inventory against some search parameters. The dynamically
generated page will display the information about each item (such as title, author, ISBN) which
is stored in the database.

DATA WAREHOUSE

A data warehouse stores data from current and previous years, extracted from the various
operational databases of an organization. It becomes the central source of data that has been
screened, edited, standardized and integrated so that it can be used by managers and other
end-user professionals throughout an organization. Data warehouses are characterized by being
slow to insert into but fast to retrieve from. Recent developments in data warehousing have led to
the use of a shared-nothing architecture to facilitate extreme scaling.

DISTRIBUTED DATABASE

These are databases of local work-groups and departments at regional offices, branch offices,
manufacturing plants and other work sites. These databases can include segments of both
common operational and common user databases, as well as data generated and used only at a
user’s own site.

END-USER DATABASE

These databases consist of a variety of data files developed by end-users at their workstations.
Examples of these are collections of documents in spreadsheets, word processing and even
downloaded files.

EXTERNAL DATABASE

These databases provide access to external, privately-owned data online, available for a fee to
end-users and organizations from commercial services. Access to a wealth of information from
external databases is available for a fee from commercial online services, and with or without
charge from many sources on the Internet.

HYPERMEDIA DATABASES ON THE WEB

These are a set of interconnected multimedia pages at a web site. They consist of a home page
and other hyperlinked pages of multimedia or mixed media such as text, graphics, photographic
images, video clips, audio, etc.

NAVIGATIONAL DATABASE

In navigational databases, queries find objects primarily by following references from other
objects. Traditionally navigational interfaces are procedural, though one could characterize some
modern systems like XPath as being simultaneously navigational and declarative.

IN-MEMORY DATABASES

In-memory databases primarily rely on main memory for computer data storage. This contrasts
with database management systems which employ a disk-based storage mechanism. Main
memory databases are faster than disk-optimized databases since the internal optimization
algorithms are simpler and execute fewer CPU instructions. Accessing data in memory provides
faster and more predictable performance than disk. In applications where response time is
critical, such as telecommunications network equipment that operates emergency systems, main
memory databases are often used.

DOCUMENT-ORIENTED DATABASES

Document-oriented databases are computer programs designed for document-oriented
applications. These systems may be implemented as a layer above a relational database or an
object database. As opposed to relational databases, document-based databases do not store data
in tables with uniformly sized fields for each record. Instead, they store each record as a
document that has certain characteristics. Any number of fields of any length can be added to a
document. Fields can also contain multiple pieces of data.

REAL-TIME DATABASES
A real-time database is a processing system designed to handle workloads whose state may
change constantly. This differs from traditional databases containing persistent data, mostly
unaffected by time. For example, a stock market changes rapidly and dynamically. Real-time
processing means that a transaction is processed fast enough for the result to come back and be
acted on right away. Real-time databases are useful for accounting, banking, law, medical
records, multi-media, process control, reservation systems, and scientific data analysis. As
computers increase in power and can store more data, real-time databases become integrated into
society and are employed in many applications.

2.2.2 MODEL BASED

HIERARCHICAL DATABASES

As its name implies, the Hierarchical Database Model defines hierarchically arranged data.
Perhaps the most intuitive way to visualize this type of relationship is to picture an upside-down
tree of data. In this tree, a single table acts as the "root" of the database from which other
tables "branch" out. You will be instantly familiar with this relationship because that is how all
Windows-based directory management systems (like Windows Explorer) work. Relationships in
such a system are thought of in terms of children and parents, such that a child may only have
one parent but a parent can have multiple children. Parents and children are tied together by
links called "pointers" (perhaps physical addresses inside the file system). A parent will have a
list of pointers to each of its children.

This child/parent rule assures that data is systematically accessible. To get to a low-level table,
you start at the root and work your way down through the tree until you reach your target. Of
course, as you might imagine, one problem with this system is that the user must know how the
tree is structured in order to find anything.

NETWORK DATABASES

In many ways, the Network Database model was designed to solve some of the more serious
problems with the Hierarchical Database Model. Specifically, the Network model solves the
problem of data redundancy by representing relationships in terms of sets rather than hierarchy.
The model had its origins in the Conference on Data Systems Languages (CODASYL) which
had created the Data Base Task Group to explore and design a method to replace the hierarchical
model. The network model is very similar to the hierarchical model actually. In fact, the
hierarchical model is a subset of the network model. However, instead of using a single-parent
tree hierarchy, the network model uses set theory to provide a tree-like hierarchy with the
exception that child tables were allowed to have more than one parent. This allowed the network
model to support many-to-many relationships. Visually, a Network Database looks like a
hierarchical Database in that you can see it as a type of tree. However, in the case of a Network
Database, the look is more like several trees which share branches. Thus, children can have
multiple parents and parents can have multiple children.

Nevertheless, though it was a dramatic improvement, the network model was far from perfect.
Most profoundly, the model was difficult to implement and maintain. Most implementations of
the network model were used by computer programmers rather than real users. What was needed
was a simple model which could be used by real end users to solve real problems.

2.3 ARCHITECTURE OF DATABASE

A number of database architectures exist. Many databases use a combination of strategies.


Online Transaction Processing (OLTP) systems often use a "row-oriented" or an "object-
oriented" data store architecture, whereas data-warehouse and other retrieval-focused
applications like Google's Big Table, or bibliographic database (library catalog) systems, may use
a column-oriented DBMS architecture. Document-oriented, XML, knowledge bases, as well as
frame databases and RDF stores (also known as triple stores), may also use a combination of
these architectures in their implementation. Not all databases have or need a database schema
("schema-less databases"). Over many years, general-purpose database systems have dominated
the database industry. These offer a wide range of functions, applicable to many, if not most,
circumstances in modern data processing. They have been enhanced with extensible data types
(pioneered in the PostgreSQL project) to allow development of a very wide range of
applications. There are also other types of databases which cannot be classified as relational
databases. Most notable is the object database management system, which stores language
objects natively without using a separate data definition language and without translating into a
separate storage schema. Unlike relational systems, these object databases store the relationship
between complex data types as part of their storage model in a way that does not require runtime
calculation of related data using relational algebra execution algorithms.

2.3.1 THREE LEVEL DATABASE ARCHITECTURE

DATA AND RELATED STRUCTURES

Data are actually stored as bits, or numbers and strings, but it is difficult to work with data at this
level. It is necessary to view data at different levels of abstraction.

Schema:

 Description of data at some level. Each level has its own schema.

We will be concerned with three forms of schemas:

 Physical
 Conceptual
 External

PHYSICAL DATA LEVEL

The physical schema describes details of how data is stored: files, indices, etc., on the random
access disk system. It also typically describes the record layout of files and the type of files
(hash, B-tree, flat). Early applications worked at this level and explicitly dealt with these details,
e.g., minimizing physical distances between related data and organizing the data structures
within the file (blocked records, linked lists of blocks, etc.).

Problem:

 Routines are hardcoded to deal with physical representation.


 Changes to data structures are difficult to make.
 Application code becomes complex since it must deal with details.
 Rapid implementation of new features very difficult.

CONCEPTUAL DATA LEVEL

Hides details of the physical level.

 In the relational model, the conceptual schema presents data as a set of tables.

The DBMS maps data access between the conceptual and physical schemas automatically.

 Physical schema can be changed without changing application:

 DBMS must change mapping from conceptual to physical.
 Referred to as physical data independence.

EXTERNAL DATA LEVEL

In the relational model, the external schema also presents data as a set of relations. An external
schema specifies a view of the data in terms of the conceptual level. It is tailored to the needs of
a particular category of users. Keeping portions of stored data hidden from some users begins to
implement a level of security and simplifies the view for those users.

Examples:

 Students should not see faculty salaries.


 Faculty should not see billing or payment data.

Information that can be derived from stored data might be viewed as if it were stored.

 GPA not stored, calculated when needed.

Applications are written in terms of an external schema. The external view is computed when
accessed; it is not stored. Different external schemas can be provided to different categories of
users. Translation from the external level to the conceptual level is done automatically by the
DBMS at run time. The conceptual schema can be changed without changing applications:

 Mapping from external to conceptual must be changed.


 Referred to as conceptual data independence
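
A hedged sketch of an external schema in SQL terms (all object names are assumed): a view tailored to one category of users exposes a derived GPA while hiding other stored data, and nothing extra is stored because the view is computed when accessed:

CREATE VIEW student_transcript AS
SELECT s.student_id,
       s.student_name,
       AVG(e.grade_points) AS gpa          -- derived when queried, not stored
FROM   students s
       JOIN enrollments e ON e.student_id = s.student_id
GROUP  BY s.student_id, s.student_name;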

WHAT IS A DATA MODEL?


Well, essentially a data model is a "description" of both a container for data and a methodology
for storing and retrieving data from that container. Actually, there isn't really a data model
"thing". Data models are abstractions, oftentimes mathematical algorithms and concepts. You
cannot really touch a data model. But nevertheless, they are very useful. The analysis and design
of data models has been the cornerstone of the evolution of databases. As models have advanced,
so has database efficiency.

2.4 DATABASE MODELS

Besides differentiating databases according to function, databases can also be differentiated
according to how they model the data. Before the 1980s, the two most commonly used database
models were the hierarchical and network systems. Let's take a quick look at these two models
and then move on to the more current models.

2.4.1 FLAT MODEL

The flat (or table) model consists of a single, two-dimensional array of data elements, where all
members of a given column are assumed to be similar values, and all members of a row are
assumed to be related to one another. For instance, a table might have columns for name and
password as part of a system security database. Each row would have the specific password
associated with an individual user. Columns of the table often have a type associated with them,
defining them as character data, date or time information, integers, or floating point numbers.
This may not strictly qualify as a data model, as defined above.

2.4.2 HIERARCHICAL MODEL

In a hierarchical model, data is organized into a tree-like structure, implying a single upward link
in each record to describe the nesting, and a sort field to keep the records in a particular order in
each same-level list. Hierarchical structures were widely used in the early mainframe database
management systems, such as the Information Management System (IMS) by IBM, and now
describe the structure of XML documents. This structure allows a 1:N relationship between two
types of data. It is very efficient for describing many relationships in the real world: recipes,
tables of contents, the ordering of paragraphs/verses, and any nested and sorted information.
However, the hierarchical structure is inefficient for certain database operations when a full path
(as opposed to an upward link and sort field) is not also included for each record. Parent-child
relationship: a child may only have one parent, but a parent can have multiple children. Parents
and children are tied together by links called "pointers". A parent will have a list of pointers to
each of its children.

2.4.3 NETWORK MODEL

The network model (defined by the CODASYL specification) organizes data using two
fundamental constructs, called records and sets. Records contain fields (which may be organized
hierarchically, as in the programming language COBOL). Sets (not to be confused with
mathematical sets) define one-to-many relationships between records: one owner, many
members. A record may be an owner in any number of sets, and a member in any number of sets.
The network model is a variation on the hierarchical model, to the extent that it is built on the
concept of multiple branches (lower-level structures) emanating from one or more nodes (higher-
level structures), while the model differs from the hierarchical model in that branches can be
connected to multiple nodes. The network model is able to represent redundancy in data more
efficiently than in the hierarchical model. The operations of the network model are navigational
in style: a program maintains a current position, and navigates from one record to another by
following the relationships in which the record participates. Records can also be located by
supplying key values. Although it is not an essential feature of the model, network databases
generally implement the set relationships by means of pointers that directly address the location
of a record on disk. This gives excellent retrieval performance, at the expense of operations such
as database loading and reorganization. Most object databases use the navigational concept to
provide fast navigation across networks of objects, generally using object identifiers as "smart"
pointers to related objects. Objectivity/DB, for instance, implements named 1:1, 1:many, many:1
and many:many relationships that can cross databases. Many object databases also
support SQL, combining the strengths of both models.

2.4.4 RELATIONAL MODEL

The relational model was introduced by E.F. Codd in 1970 as a way to make database
management systems more independent of any particular application. It is a mathematical model
defined in terms of predicate logic and set theory. The products that are generally referred to as
relational databases in fact implement a model that is only an approximation to the mathematical
model defined by Codd. Three key terms are used extensively in relational database models:
Relations, attributes, and domains. A relation is a table with columns and rows. The named
columns of the relation are called attributes, and the domain is the set of values the attributes are
allowed to take. All relations (and, thus, tables) in a relational database have to adhere to some
basic rules to qualify as relations. First, the ordering of columns is immaterial in a table. Second,
there can't be identical tuples or rows in a table. And third, each tuple will contain a single value
for each of its attributes.

A relational database contains multiple tables, each similar to the one in the "flat" database
model. One of the strengths of the relational model is that, in principle, any value occurring in
two different records (belonging to the same table or to different tables), implies a relationship
among those two records. Yet, in order to enforce explicit integrity constraints, relationships
between records in tables can also be defined explicitly, by identifying or non-identifying parent-
child relationships characterized by assigning cardinality (1:1, (0)1:M, M:M). Tables can also
have a designated single attribute or a set of attributes that can act as a "key", which can be used
to uniquely identify each tuple in the table. A key that can be used to uniquely identify a row in a
table is called a primary key. Keys are commonly used to join or combine data from two or more
tables. For example, an Employee table may contain a column named Location which contains a
value that matches the key of a Location table. Keys are also critical in the creation of indexes,
which facilitate fast retrieval of data from large tables. Any column can be a key, or multiple
columns can be grouped together into a compound key. It is not necessary to define all the keys
in advance; a column can be used as a key even if it was not originally intended to be one. A key
that has an external, real-world meaning (such as a person's name, a book's ISBN, or a car's serial
number) is sometimes called a "natural" key. If no natural key is suitable (think of the many
people named Brown), an arbitrary or surrogate key can be assigned (such as by giving
employees ID numbers). In practice, most databases have both generated and natural keys,
because generated keys can be used internally to create links between rows that cannot break,
while natural keys can be used, less reliably, for searches and for integration with other
databases. (For example, records in two independently developed databases could be matched up
by social security number, except when the social security numbers are incorrect, missing, or
have changed.)
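
As a hedged example of the Employee/Location case mentioned above (the table and column names are assumed), a key value in one table is used to join to the matching key of another table:

SELECT e.employee_name, l.location_name
FROM   employees e
       JOIN locations l ON l.location_id = e.location_id;   -- join on the shared key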

2.4.5 DIMENSIONAL MODEL

The dimensional model is a specialized adaptation of the relational model used to represent data
in data warehouses in a way that data can be easily summarized using OLAP queries. In the
dimensional model, a database consists of a single large table of facts that are described using
dimensions and measures. A dimension provides the context of a fact (such as who participated,
when and where it happened, and its type) and is used in queries to group related facts together.
Dimensions tend to be discrete and are often hierarchical; for example, the location might
include the building, state, and country. A measure is a quantity describing the fact, such as
revenue. It's important that measures can be meaningfully aggregated - for example, the revenue
from different locations can be added together. In an OLAP query, dimensions are chosen and
the facts are grouped and added together to create a summary. The dimensional model is often
implemented on top of the relational model using a star schema, consisting of one table
containing the facts and surrounding tables containing the dimensions. Particularly complicated
dimensions might be represented using multiple tables, resulting in a snowflake schema. A data
warehouse can contain multiple star schemas that share dimension tables, allowing them to be
used together. Coming up with a standard set of dimensions is an important part of dimensional
modeling.
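
A minimal OLAP-style sketch against a star schema (the fact and dimension table names are assumptions): the facts are grouped by a dimension attribute and the measure is aggregated:

SELECT d.country,
       SUM(f.revenue) AS total_revenue     -- measure summed per dimension value
FROM   sales_fact f
       JOIN location_dim d ON d.location_id = f.location_id
GROUP  BY d.country;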

2.4.6 OBJECT-ORIENTED MODELS

In recent years, the object-oriented paradigm has been applied to database technology, creating a
new programming model known as object databases. These databases attempt to bring the
database world and the application programming world closer together, in particular by ensuring
that the database uses the same type system as the application program. This aims to avoid the
overhead (sometimes referred to as the impedance mismatch) of converting information between
its representation in the database (for example as rows in tables) and its representation in the
application program (typically as objects). At the same time, object databases attempt to
introduce the key ideas of object programming, such as encapsulation and polymorphism, into
the world of databases. A variety of ways have been tried for storing objects in a database.
Some products have approached the problem from the application programming end, by making
the objects manipulated by the program persistent. This also typically requires the addition of
some kind of query language, since conventional programming languages do not have the ability
to find objects based on their information content. Others have attacked the problem from the
database end, by defining an object-oriented data model for the database, and defining a database
programming language that allows full programming capabilities as well as traditional query.

Object databases suffered because of a lack of standardization: although standards were defined
by ODMG, they were never implemented well enough to ensure interoperability between
products. Nevertheless, object databases have been used successfully in many applications:
usually specialized applications such as engineering databases or molecular biology databases
rather than mainstream commercial data processing. However, object database ideas were picked
up by the relational vendors and influenced extensions made to these products and indeed to the
SQL language.

2.5 DATABASE DESIGN

Database design is the process of producing a detailed data model of a database. This logical data
model contains all the needed logical and physical design choices and physical storage
parameters needed to generate a design in a Data Definition Language, which can then be used to
create a database. A fully attributed data model contains detailed attributes for each entity. The
term database design can be used to describe many different parts of the design of an overall
database system. Principally, and most correctly, it can be thought of as the logical design of the
base data structures used to store the data. In the relational model these are the tables and views.

In an object database the entities and relationships map directly to object classes and named
relationships. However, the term database design could also be used to apply to the overall
process of designing, not just the base data structures, but also the forms and queries used as part
of the overall database application within the database management system (DBMS). The
process of doing database design generally consists of a number of steps which will be carried
out by the database designer. Usually, the designer must:

* Determine the relationships between the different data elements.

* Superimpose a logical structure upon the data on the basis of these relationships

2.5.1 The Design Process

The design process consists of the following steps:

 Determine the purpose of your database - This helps prepare you for the remaining
steps.
 Find and organize the information required - Gather all of the types of information
you might want to record in the database, such as product name and order number.
 Divide the information into tables - Divide your information items into major
entities or subjects, such as Products or Orders. Each subject then becomes a table.
 Turn information items into columns - Decide what information you want to store
in each table. Each item becomes a field, and is displayed as a column in the table.
For example, an Employees table might include fields such as Last Name and Hire
Date.
 Specify primary keys - Choose each table’s primary key. The primary key is a
column that is used to uniquely identify each row. An example might be Product ID
or Order ID.
 Set up the table relationships - Look at each table and decide how the data in one
table is related to the data in other tables. Add fields to tables or create new tables to
clarify the relationships, as necessary.

 Refine your design - Analyze your design for errors. Create the tables and add a few
records of sample data. See if you can get the results you want from your tables.
Make adjustments to the design, as needed.
 Apply the normalization rules - Apply the data normalization rules to see if your
tables are structured correctly. Make adjustments to the tables, as needed (a brief
sketch follows this list).
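
As a rough, hedged sketch of these steps (all names are assumed), repeating product details inside every order line would violate the normalization rules, so the information is divided into two tables linked by a key:

-- Product details are stored once and referenced by key from the order lines.
CREATE TABLE products (
    product_id   INTEGER PRIMARY KEY,
    product_name VARCHAR(100),
    unit_price   NUMERIC(10,2)
);

CREATE TABLE order_items (
    order_id   INTEGER,
    product_id INTEGER REFERENCES products (product_id),
    quantity   INTEGER,
    PRIMARY KEY (order_id, product_id)      -- composite primary key for one order line
);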

2.5.2 TYPES OF DATABASE DESIGN

CONCEPTUAL SCHEMA

Once a database designer is aware of the data which is to be stored within the database, they
must then determine where dependencies exist within the data. Sometimes when data is changed
you may also be changing other data that is not visible. For example, in a list of names and
addresses, assuming a situation where multiple people can have the same address but one person
cannot have more than one address, the address is dependent upon the name: if the name is
different, then the associated address may be different too. The reverse does not hold, however,
since one attribute can change without the other changing.

LOGICALLY STRUCTURING DATA

Once the relationships and dependencies amongst the various pieces of information have been
determined, it is possible to arrange the data into a logical structure which can then be mapped
into the storage objects supported by the database management system. In the case of relational
databases the storage objects are tables which store data in rows and columns. Each table may
represent an implementation of either a logical object or a relationship joining one or more
instances of one or more logical objects. Relationships between tables may then be stored as
links connecting child tables with parents. Since complex logical relationships are themselves
tables they will probably have links to more than one parent.

PHYSICAL DATABASE DESIGN

The physical design of the database specifies the physical configuration of the database on the
storage media. This includes detailed specification of data elements, data types, indexing options
and other parameters residing in the DBMS data dictionary. It is the detailed design of a system
that includes modules as well as the database's hardware and software specifications.

2.6 DATABASE MANAGEMENT SYSTEM

A Database Management System (DBMS) consists of software that organizes the storage of data.
A DBMS controls the creation, maintenance, and use of the database storage structures of
organizations and of their users. It allows organizations to place control of organization-wide
database development in the hands of Database Administrators (DBAs) and other specialists. In
large systems, a DBMS allows users and other software to store and retrieve data in a structured
way. Database management systems are usually categorized according to the database model that
they support, such as the network, relational or object model. The model tends to determine the
query languages that are available to access the database. One commonly used query language
for the relational database is SQL, although SQL syntax and function can vary from one DBMS
to another. A common query language for the object database is OQL; although not all vendors
of object databases implement it, the majority do. A great deal of the internal engineering of a
DBMS is independent of the data model, and is concerned with managing factors such as
performance, concurrency, integrity, and recovery from hardware failures. In these areas there are
large differences between products. The database holds the records, fields, and cells with data.
The DBMS is the set of "tools" used to manipulate the data in the database. An example of a
DBMS is SQL Server. The DBMS is the database engine itself; this is the core, the actual
program that contains the logic to execute your SQL statements and that performs the actual data
access based upon these commands.

2.6.1 COMPONENTS OF DBMS

 DBMS Engine accepts logical requests from the various other DBMS subsystems,
converts them into physical equivalents, and actually accesses the database and data
dictionary as they exist on a storage device.
 Data Definition Subsystem helps users create and maintain the data dictionary and
define the structure of the files in a database.
 Data Manipulation Subsystem helps users add, change, and delete information in a
database and query it for valuable information. Software tools within the data
manipulation subsystem are most often the primary interface between users and the
information contained in a database. It allows users to specify their logical information
requirements.
 Application Generation Subsystem contains facilities to help users develop
transaction-intensive applications. It usually requires that the user perform a detailed series of
tasks to process a transaction. It facilitates easy-to-use data entry screens,
programming languages, and interfaces.
 Data Administration Subsystem helps users manage the overall database
environment by providing facilities for backup and recovery, security management,
query optimization, concurrency control, and change management.

2.6.2 PRIMARY TASKS OF DBMS PACKAGES

 Database Development: used to define and organize the content, relationships, and
structure of the data needed to build a database.
 Database Interrogation: can access the data in a database for information retrieval and
report generation. End users can selectively retrieve and display information and
produce printed reports and documents.
 Database Maintenance: used to add, delete, update, correct, and protect the data in a
database.

 Application Development: used to develop prototypes of data entry screens, queries,
forms, reports, tables, and labels for a prototyped application, or to use a 4GL
(fourth-generation language) or application generator to develop program code.

2.6.3 DBMS FEATURES AND CAPABILITIES

Alternatively, and especially in connection with the relational model of database management,
the relation between attributes drawn from a specified set of domains can be seen as being
primary. For instance, the database might indicate that a car that was originally "red" might fade
to "pink" in time, provided it was of some particular "make" with an inferior paint job. Such
higher entity relationships provide information on all of the underlying domains at the same
time, with none of them being privileged above the others. Thus, the DBMSs of today roll
together frequently needed services or features of attribute management. By externalizing such
functionality to the DBMS, applications effectively share code with each other and are relieved
of much internal complexity. Features commonly offered by database management systems:

QUERY ABILITY 

Querying is the process of requesting attribute information from various perspectives and
combinations of factors. Example: "How many 2-door cars in Texas are green?" A
database query language and report writer allow users to interactively interrogate the
database, analyze its data and update it according to the user's privileges on the data.
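
The example question above might be posed to a relational DBMS roughly like this (the cars table and its columns are assumptions for illustration):

SELECT COUNT(*)
FROM   cars
WHERE  doors = 2
  AND  state = 'TX'
  AND  color = 'green';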

BACKUP AND REPLICATION 

Copies of attributes need to be made regularly in case primary disks or other equipment
fails. A periodic copy of attributes may also be created for a distant organization that
cannot readily access the original. A DBMS usually provides utilities to facilitate the process
of extracting and disseminating attribute sets. When data is replicated between database servers,
so that the information remains consistent throughout the database system and users cannot tell
or even know which server in the DBMS they are using, the system is said to exhibit replication
transparency.

RULE ENFORCEMENT 

Often one wants to apply rules to attributes so that the attributes are clean and reliable. For
example, we may have a rule that says each car can have only one engine associated with it
(identified by Engine Number). If somebody tries to associate a second engine with a given car,
we want the DBMS to deny such a request and display an error message. However, with changes
in the model specification such as, in this example, hybrid gas-electric cars, rules may need to
change. Ideally such rules should be able to be added and removed as needed without significant
data layout redesign.
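
A hedged sketch of such a rule in SQL (the table and constraint names are assumed): a unique constraint lets the DBMS reject a second engine for the same car, and the rule can later be dropped without redesigning the data layout:

ALTER TABLE engines
  ADD CONSTRAINT one_engine_per_car UNIQUE (car_id);   -- a second engine row for a car is rejected

-- If hybrid cars with two engines are later allowed, the rule is simply removed:
ALTER TABLE engines
  DROP CONSTRAINT one_engine_per_car;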

SECURITY 

Often it is desirable to limit who can see or change particular attributes or groups of attributes.
This may be managed directly on an individual basis, or by the assignment of individuals and
privileges to groups, or (in the most elaborate models) through the assignment of individuals and
groups to roles which are then granted entitlements.
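
As a hedged sketch of role-based security (the role, table, and user names are assumed), privileges are granted to a role and the role is then granted to individuals:

CREATE ROLE clerk;
GRANT SELECT, UPDATE ON cars TO clerk;   -- the role carries the entitlements
GRANT clerk TO alice;                    -- the individual is assigned to the role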

COMPUTATION 

There are common computations requested on attributes such as counting, summing, averaging,
sorting, grouping, cross-referencing, etc. Rather than have each computer application implement
these from scratch, they can rely on the DBMS to supply such calculations.

CHANGE AND ACCESS LOGGING 

Often one wants to know who accessed what attributes, what was changed, and when it was
changed. Logging services allow this by keeping a record of access occurrences and changes.

AUTOMATED OPTIMIZATION 

If there are frequently occurring usage patterns or requests, some DBMS can adjust themselves
to improve the speed of those interactions. In some cases the DBMS will merely provide tools to
monitor performance, allowing a human expert to make the necessary adjustments after
reviewing the statistics collected.

2.7 RELATIONAL DATABASE MANAGEMENT SYSTEM

RDBMS stands for Relational Database Management System. RDBMS data is structured in
database tables, fields and records. Each RDBMS table consists of database table rows, and each
database table row consists of one or more database table fields. An RDBMS stores the data in a
collection of tables, which might be related by common fields (database table columns). An
RDBMS also provides relational operators to manipulate the data stored in the database tables,
and most RDBMSs use SQL as the database query language. Edgar Codd introduced the
relational database model. Many modern DBMSs do not conform to Codd's definition of an
RDBMS, but nonetheless they are still considered to be RDBMSs. Among the most popular
RDBMSs are MS SQL Server, DB2, Oracle and MySQL.

2.7.1 RDBMS COMPONENTS

INTERFACE DRIVERS - A user or application program initiates either schema modification or
content modification. These drivers are built on top of SQL. They provide methods to prepare
statements, execute statements, fetch results, etc. Examples include DDL, DCL, DML, ODBC,
and JDBC. Some vendors provide language-specific proprietary interfaces; for example, MySQL
and Firebird provide drivers for PHP and Python.

SQL ENGINE - This component interprets and executes the SQL query. It comprises three
major components (compiler, optimizer, and execution engine).

TRANSACTION ENGINE - Transactions are sequences of operations that read or write
database elements, which are grouped together.

RELATIONAL ENGINE - Relational objects such as Table, Index, and Referential Integrity
constraints are implemented in this component.

STORAGE ENGINE - This component stores and retrieves data records. It also provides a
mechanism to store metadata and control information such as undo logs, redo logs, lock
tables, etc.

2.8 DATABASE - ADVANTAGES & DISADVANTAGES



ADVANTAGES

 Reduced data redundancy


 Reduced updating errors and increased consistency
 Greater data integrity and independence from applications programs
 Improved data access to users through use of host and query languages
 Improved data security

 Reduced data entry, storage, and retrieval costs
 Facilitated development of new application programs

DISADVANTAGES

 Database systems are complex, difficult, and time-consuming to design


 Substantial hardware and software start-up costs
 Damage to database affects virtually all applications programs
 Extensive conversion costs in moving from a file-based system to a database
system
 Initial training required for all programmers and users

CHAPTER-3: ORACLE ARCHITECTURAL COMPONENTS AND DATABASE CREATION

3.1 OVERVIEW OF PRIMARY COMPONENTS

Overview of Primary Components

1. Oracle Server:- There are several files, processes, and memory structures in an
Oracle server; however, not all of them are used when processing a SQL
statement. Some are used to improve the performance of the database, ensure that
the database can be recovered in the event of a software or hardware error, or
perform other tasks necessary to maintain the database. The Oracle server consists
of an Oracle instance and an Oracle database.
2. Oracle Instance:- An Oracle instance is the combination of the background
processes and memory structure. The instance must be started to access the data in
the database. Every time an instance is started, a System Global Area (SGA) is
allocated and Oracle background processes are started. Background processes
perform functions on behalf of the invoking process. They consolidate functions
that would otherwise be handled by multiple Oracle programs running for each
user. The background processes perform input/output (I/O) and monitor other
Oracle processes to provide increased parallelism for better performance and
reliability.
3. Oracle Database:- An Oracle database consists of operating system files, also
known as database files, that provide the actual physical storage for database
information. The database files are used to ensure that the data is kept consistent
and can be recovered in the event of a failure of the instance.
4. Other Key Files:- Non-database files are used to configure the instance,
authenticate privileged users, and recover the database in the event of a disk
failure.
5. User and Server Processes:- The user and server processes are the primary processes
involved when a SQL statement is executed; however, other processes may help
the server complete the processing of the SQL statement.
6. Other Processes:- Many other processes exist that are used by options within
Oracle, such as Advanced Queuing, Real Application Clusters, Shared Server,
Advanced Replication, and so on.

3.2 ORACLE SERVER

An Oracle Server:

 Is a database management system that provides an open, comprehensive, integrated
approach to information management.
 Consists of an Oracle Instance and an Oracle Database.

The Oracle server can run on different computers in one of the following ways:
1. Client-Application Server-server
2. Client-Server
3. Host-Based

1. Client-Application Server-Server:- (Three-Tier) Users access the database from
their personal computers (client) through an application server, which is used for
the application’s requirements.
2. Client-Server:- (Two-Tier) Users access the database from their personal
computer (client) over a network, and the database sits on a separate computer
(server).
3. Host-Based:- Users are connected directly to the same computer on which the
database resides.

3.3 ORACLE INSTANCE

An Oracle Instance:

 Is a means to access an Oracle database


 Always opens one and only one database
 Consists of memory and background process structures

An Oracle instance consists of the System Global Area (SGA) memory structure and the
background processes used to manage a database. An instance is identified by using methods
specific to each operating system. The instance can open and use only one database at a time.

3.4 ESTABLISHING A CONNECTION AND CREATING A SESSION

Connection to an Oracle Instance:

 Establishing a user connection


 Creating a session

Connection to an Oracle Instance

Before users can submit SQL statements to an Oracle database, they must connect to an instance.
 The user starts a tool such as SQL*Plus or runs an application developed using a tool
such as Oracle Forms. This application or tool is executed as a user process.
 In the most basic configuration, when a user logs on to the Oracle Server, a process is
created on the computer running the Oracle Server. This process is called a server
process. The server process communicates with the Oracle instance on behalf of the
user process that runs on the client. The server process executes SQL statements on
behalf of the user.

Connection

A connection is a communication pathway between a user process and an Oracle Server. A
database user can connect to an Oracle Server in one of three ways:

1. The user logs on to the operating system running the Oracle instance and
starts an application or tool that accesses the database on that system. The
communication pathway is established using the interprocess
communication mechanisms available on the host operating system.
2. The user starts the application or tool on a local computer and connects
over a network to the computer running the Oracle instance. In this
configuration, called client-server, network software is used to
communicate between the user and the Oracle Server.
3. In a three-tier connection, the user’s computer communicates over the
network to an application or a network server, which is connected through
a network to the machine running the Oracle instance. For example, the
user runs a browser on a network to use an application residing on an NT
server that retrieves data from an Oracle database running on a UNIX
host.

Session

A session is a specific connection of a user to an Oracle server. The session starts when the user
is validated by the Oracle Server, and it ends when the user logs out or when there is an
abnormal termination. For a given database user, many concurrent sessions are possible if the
user logs on from many tools, applications, or terminals at the same time. Except for some
specialized database administration tools, starting a database session requires that the Oracle
Server be available for use.
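
As an illustrative sketch of a client-server connection (the user name, password, and net service name are assumptions), a session could be started from SQL*Plus on a client machine:

$ sqlplus hr/hr_password@orclpdb
SQL> SELECT sysdate FROM dual;   -- executed by the server process on behalf of this session
SQL> EXIT                        -- logging out ends the session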

3.5 ORACLE DATABASE

An Oracle database:
 Is a collection of data that is treated as a unit
 Consists of three file types

An Oracle Database

The general purpose of a database is to store and retrieve information. An Oracle database has a
logical and a physical structure. The physical structure of the database is the set of operating
system files in the database. An Oracle database consists of three file types:
1. Data files containing the actual data in the database.
2. Redo logs containing a record of changes made to the database to enable recovery
of the data in case of failures.
3. Control files containing information necessary to maintain and verify database
integrity.

Other Key File Structures

The Oracle Server also uses files that are not part of the database:
 The parameter file defines the characteristics of an Oracle instance. For example, it
contains parameters that size some of the memory structures in the SGA.
 The password file authenticates users privileged to start up and shut down an Oracle instance.
 Archived redo log files are offline copies of the redo log files that may be necessary to
recover from media failures.

3.6 PHYSICAL STRUCTURE

The physical structure includes three types of files:


 Control files
 Data files
 Redo log files

Other key files exist and are required to start up and use a database, for example: parameter
files, configuration files, password files and so on. However, the physical structure of an Oracle
database includes only three types of files: control files, data files, and redo log files.

44
3.7 MEMORY STRUCTURE

Oracle’s memory structure consists of two memory areas known as:


1. System Global Area (SGA): Allocated at instance startup, and is a fundamental
component of an Oracle instance
2. Program Global Area (PGA): Allocated when the server process is started

SGA is used to store database information that is shared by database processes. It contains data
and control information for the Oracle Server and is allocated in the virtual memory of the
computer where Oracle resides.

45
The following statement can be used to view SGA memory allocation
SHOW SGA;

Sizing the SGA

The size of the SGA is determined by several initialization parameters. The parameters that most
affect the SGA are:
DB_CACHE_SIZE: The size of the cache of standard blocks.
LOG_BUFFER: The number of bytes allocated for the redo log buffer cache.
SHARED_POOL_SIZE: The size in bytes of the area devoted to shared SQL and PL/SQL.
LARGE_POOL_SIZE: The size of the large pool; the default size is zero.

Dynamic SGA

A dynamic SGA implements an infrastructure that allows the SGA configuration to change
without shutting down the instance. This then allows the size of the database buffer cache, shared
pool, and large pool to change without shutting down the instance. Conceivably, the database
buffer cache, shared pool, and large pool could initially be under-configured and would grow and
shrink depending upon their respective workloads, up to a maximum of
SGA_MAX_SIZE.
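
As a small sketch of this behaviour (the values shown are illustrative and must stay within
SGA_MAX_SIZE), the limit can be inspected and a component resized without restarting the instance:

SQL> SHOW PARAMETER sga_max_size
SQL> ALTER SYSTEM SET shared_pool_size = 64M;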

3.7.1 System Global Area (SGA)

The SGA consists of several memory structures:

 Shared pool
 Database Buffer Cache
 Redo Log buffer
 Other structures (for example, lock and latch management, statistical data)

46
There are two additional memory structures that can be configured within the SGA:

 Large Pool
 Java Pool

 SGA is dynamic
 Sized by the SGA_MAX_SIZE parameter
 Allocated and tracked in granules by SGA components
 Contiguous virtual memory allocation
 Granule size based on total estimated SGA_MAX_SIZE

Unit of Allocation

A granule is a unit of contiguous virtual memory allocation. The size of a granule depends on the
estimated total SGA size whose calculation is based on the value of the parameter
SGA_MAX_SIZE
- 4 MB if estimated SGA size is < 128 MB
- 16 MB otherwise

3.7.2 SHARED POOL

 Used to store:
 Most recently executed SQL statements
 Most recently used data definitions

 It consists of two key performance-related memory structures:


 Library Cache
 Data Dictionary Cache

47
Sized by the parameter
SHARED_POOL_SIZE

The shared pool environment contains both fixed and variable structures. The fixed structures
remain relatively the same size, whereas the variable structures grow and shrink based on user
and program requirements. The actual sizing for the fixed and variable structures is based on an
initialization parameter and the work of an Oracle internal algorithm.

Sizing the Shared Pool

Since the shared pool is used for objects that can be shared globally, such as reusable SQL
execution plans; PL/SQL packages, procedures, and functions; and cursor information, it must be
sized to accommodate the needs of both the fixed and variable areas. Memory allocation for the
shared pool is determined by the SHARED_POOL_SIZE initialization parameter. It can be
dynamically resized using ALTER SYSTEM SET. After performance analysis, this can be
adjusted but the total SGA size cannot exceed SGA_MAX_SIZE.
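
For example, assuming enough free space remains below SGA_MAX_SIZE, the shared pool could be
checked and resized dynamically as follows (the 96M value is illustrative):

SQL> SHOW PARAMETER shared_pool_size
SQL> ALTER SYSTEM SET SHARED_POOL_SIZE = 96M;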

3.7.3 LIBRARY CACHE


 Stores information about the most recently used SQL and PL/SQL statements
 Enables the sharing of commonly used statements
 Is managed by a least recently used (LRU) algorithm

 Consists of two structures:
 Shared SQL area
 Shared PL/SQL area
 Size determined by the Shared Pool sizing.

The library cache size is based on the sizing defined for the shared pool. Memory is allocated
when a statement is parsed or a program unit is called. If the size of the shared pool is too small,
statements are continually reloaded into the library cache, which affects performance. The library
cache is managed by a least recently used (LRU) algorithm. As the cache fills, less recently used
execution paths and parse trees are removed from the library cache to make room for the new
entries. If the SQL or PL/SQL statements are not reused, they eventually are aged out.
The library cache consists of two structures:

 Shared SQL:- The shared SQL stores and shares the execution plan and parse tree for
SQL statements run against the database. The second time that an identical SQL
statement is run, it is able to take advantage of the parse information available in the
shared SQL to expedite its execution. To ensure that SQL statements use a shared SQL
area whenever possible, the text, schema, and bind variables must be exactly the same.
 Shared PL/SQL:- The shared PL/SQL area stores and shares the most recently executed
PL/SQL statements. Parsed and compiled program units and procedures are stored in this
area.
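
To illustrate the text-matching rule, the first two statements below can share the same shared SQL
area, while the third cannot because its text differs in case (EMP is an assumed sample table):

SELECT ename FROM emp WHERE deptno = 10;
SELECT ename FROM emp WHERE deptno = 10;
select ename from emp where deptno = 10;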

3.7.4 DATA DICTIONARY CACHE

 A collection of the most recently used definitions in the database


 Includes information about database files, tables, indexes, columns, users, privileges, and
other database objects
 During the parse phase, the server process looks in the data dictionary for information to
resolve object names and validate access
 Caching data dictionary information into memory improves response time on queries and
DML

 Size determined by the Shared Pool sizing.

The data dictionary cache is also referred to as the dictionary cache or row cache, caching data
dictionary information into memory improves response time. Information about the database is
stored in the data dictionary tables. When this information is needed by the database, the data
dictionary tables are read, and the data that is returned is stored in the data dictionary cache.

Sizing the Data Dictionary

The overall size is dependent on the size of the shared pool and is managed internally by the database.
If the data dictionary cache is too small, then the database has to query the data dictionary tables
repeatedly for information needed by the database. These queries are called recursive calls and
are slower than the queries that are handled by the data dictionary cache.
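
A rough indication of dictionary cache efficiency can be obtained from the V$ROWCACHE view; a high
proportion of GETMISSES to GETS suggests the shared pool may be undersized. A minimal sketch:

SQL> SELECT SUM(gets) "Gets", SUM(getmisses) "Misses"
  2  FROM v$rowcache;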

3.7.5 DATABASE BUFFER CACHE

 Stores copies of data blocks that have been retrieved from the data files
 Enables great performance gains when you obtain and update data
 Managed through an LRU algorithm
 DB_BLOCK_SIZE determines the primary block size.

When a query is processed, the Oracle server process looks in the database buffer cache for any
blocks it needs. If the block is not found in the database buffer cache, the server process reads the
block from the data files and places a copy in the database buffer cache. Because subsequent
requests for the same block may find it in memory, they may not require physical reads. The
Oracle server uses a least recently used algorithm to age out buffers
that have not been accessed recently to make room for new blocks in the database buffer cache.

50
 Database Buffer Cache consists of independent sub-caches
 DB_CACHE_SIZE
 DB_KEEP_CACHE_SIZE
 DB_RECYCLE_CACHE_SIZE
 Can be dynamically resized

ALTER SYSTEM SET DB_CACHE_SIZE=96M;

 DB_CACHE_ADVICE set to gather statistics for predicting different cache size behavior
 Statistics displayed by V$DB_CACHE_ADVICE

Sizing the Buffer Cache

The size of each buffer in the buffer cache is equal to the size of an Oracle block, and it is specified by the
DB_BLOCK_SIZE parameter. The database buffer cache consists of independent sub-caches for
buffer pools and for multiple block sizes. The parameter DB_BLOCK_SIZE determines the
primary block size, which is used for the SYSTEM tablespace.
Three parameters define the sizes of the buffer caches:

51
1. DB_CACHE_SIZE:- Sizes the default buffer cache only; it always exists and
cannot be set to zero.
2. DB_KEEP_CACHE_SIZE:- Sizes the keep buffer cache, which is used to retain blocks in
memory that are likely to be reused.
3. DB_RECYCLE_CACHE_SIZE:- Sizes the recycle buffer cache, which is used to
eliminate blocks from memory that have little chance of being reused.

Multiple Block Sizes

An Oracle database can be created with a standard block size and up to four non-standard block
sizes. Non-standard block sizes can have any power-of-two values between 2 KB and 32 KB.
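
Before a tablespace with a non-standard block size can be created, a corresponding buffer sub-cache
must be configured. A minimal sketch, with illustrative names, sizes and paths:

SQL> ALTER SYSTEM SET DB_16K_CACHE_SIZE = 16M;
SQL> CREATE TABLESPACE demo_16k
  2  DATAFILE '/u01/oradata/demo_16k01.dbf' SIZE 100M
  3  BLOCKSIZE 16K;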

Data Buffer Advisory Parameter

The buffer cache advisory feature enables and disables statistics gathering for predicting
behavior with different cache sizes. The information provided by these statistics can help the DBA
size the buffer cache optimally for a given workload. The buffer cache advisory information is
collected and displayed through the V$DB_CACHE_ADVICE view.
The buffer cache advisory is enabled via the initialization parameter DB_CACHE_ADVICE.
It is a dynamic parameter that can be changed via ALTER SYSTEM. Three values (OFF, ON, READY) are
available.

DB_CACHE_ADVICE Parameter Values

OFF:- Advisory is turned off and the memory for the advisory is not allocated
ON:- Advisory is turned on and both CPU and memory overhead are incurred.
READY:- Advisory is turned off but the memory for the advisory remains allocated.
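
For example, the advisory could be switched on and its predictions for the default pool queried
roughly as follows:

SQL> ALTER SYSTEM SET DB_CACHE_ADVICE = ON;
SQL> SELECT size_for_estimate, estd_physical_reads
  2  FROM v$db_cache_advice
  3  WHERE name = 'DEFAULT' AND advice_status = 'ON';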

52
3.7.6 REDO LOG BUFFER
 Records all changes made to the database blocks
 Primary purpose is recovery
 Changes recorded within are called redo entries
 Redo entries contain information to reconstruct or redo changes
 Size defined by LOG_BUFFER

The redo log buffer cache is a circular buffer that contains changes made to data file blocks.
This information is stored in redo entries. Redo entries contain the information necessary to
reconstruct or redo the changes made by INSERT, UPDATE, DELETE, CREATE, ALTER, or DROP
operations.

Sizing the Redo Log Buffer Cache

The size of the redo log buffer cache is defined by the initialization parameter LOG_BUFFER.
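
The current setting can be checked, and pressure on the buffer monitored, roughly as in this sketch;
a steadily growing value of 'redo buffer allocation retries' may indicate the buffer is too small:

SQL> SHOW PARAMETER log_buffer
SQL> SELECT name, value
  2  FROM v$sysstat
  3  WHERE name = 'redo buffer allocation retries';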

53
3.7.7 LARGE POOL
 An optional area of memory in the SGA
 Relieves the burden placed on the Shared Pool
 Used for:
 Session memory (UGA) for the Shared Server
 I/O server processes
 Backup and restore operations for RMAN
 Parallel execution message buffers
 PARALLEL_AUTOMATIC_TUNING set to TRUE
 Does not use an LRU list
 Sized by LARGE_POOL_SIZE

When users connect through the shared server, Oracle needs to allocate additional space in the
shared pool for storing information about the connections between the user processes, dispatchers,
and servers. The large pool relieves the burden on areas within the shared pool. The shared pool
does not have to give up memory for caching SQL parse trees in favor of shared server session
information, I/O, and backup and recovery processes. The performance gain is from the
reduction of overhead from increasing and shrinking of the shared SQL cache.

Backup and Restore:- Recovery Manager (RMAN) uses the large pool when the
BACKUP_DISK_IO=n and BACKUP_TAPE_IO_SLAVES=TRUE parameters are set. If the
large pool is configured but is not large enough, the allocation of memory from the large pool
fails. RMAN writes an error message to the alert log file and does not use I/O slaves for backup
or restore.

Sizing the Large Pool

The large pool is sized in bytes defined by the LARGE_POOL_SIZE parameter.
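
For instance, assuming the SGA has room for it, a 32 MB large pool (an illustrative value) could be
set dynamically:

SQL> ALTER SYSTEM SET LARGE_POOL_SIZE = 32M;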

54
3.7.8 JAVA POOL

 Services parsing requirements for JAVA commands


 Required if installing and using JAVA
 Sized by JAVA_POOL_SIZE parameter

The Java pool is an optional setting but is required if installing and using Java. Its size is set, in
bytes, using the JAVA_POOL_SIZE parameter.

3.7.9 PROGRAM GLOBAL AREA (PGA)

 Memory reserved for each user process connecting to an Oracle database


 Allocated when a process is created
 De-allocated when the process is terminated
 Used by only one process

55
The Program Global Area or Process Global Area (PGA) is a memory region that
contains data and control information for a single server process or a single background
process. The PGA is allocated when a process is created and de-allocated when the
process is terminated. In contrast to the SGA, which is shared by several processes, the
PGA is an area that is used by only one process. In a dedicated server configuration, the
PGA includes these components:
 Sort Area: - Used for any sorts that may be required to process the SQL
statement.
 Session Information: - Includes user privileges and performance statistics for the
session.
 Cursor State: - Indicates the stage in the processing of the SQL statements that
are currently used by the session.
 Stack Space: - Contains other session variables.
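
Under automatic PGA memory management, the overall PGA target and its current usage can be
examined roughly as follows:

SQL> SHOW PARAMETER pga_aggregate_target
SQL> SELECT name, value
  2  FROM v$pgastat
  3  WHERE name IN ('aggregate PGA target parameter', 'total PGA allocated');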

3.7.10 PROCESS STRUCTURE

Oracle takes advantage of various types of processes:


 USER PROCESS: Started at the time a database user requests connection to the Oracle
server
 SERVER PROCESS: Connects to the Oracle Instance and is started when a user
establishes a session
 BACKGROUND PROCESSES: Started when an Oracle Instance is started

3.7.10.1 USER PROCESS


 A program that requests interaction with the Oracle server
 Must first establish a connection
 Does not interact directly with the Oracle server

56
A database user who needs to request information from the database must first make a
connection with the Oracle server. The connection is requested using a database interface tool,
such as SQL*Plus, which begins the user process. The user process does not interact directly
with the Oracle server. Rather, it generates calls through the user program interface (UPI), which
creates a session and starts a server process.

3.7.10.2 SERVER PROCESS


 A program that directly interacts with the Oracle server
 Fulfills calls generated and returns results
 Can be dedicated or Shared Server

57
Once a user has established a connection, a server process is started to handle the user process's
requests. A server process can be either a dedicated server process or a shared server process. In a
dedicated server environment, the server process handles the requests of a single user process and is
terminated when that user process disconnects. In a shared server environment, a server process handles
requests from several user processes. The server process communicates with the Oracle server using the
Oracle Program Interface (OPI).

3.7.11 BACKGROUND PROCESSES

Maintains and enforces relationships between physical and memory structures


 Mandatory background process:
 DBWn
 PMON
 CKPT
 LGWR
 SMON
 Optional background processes:
 ARCn

 LMDn
 RECO
 CJQ0
 LMON
 Snnn
 Dnnn
 Pnnn
 LCKn
 QMNn

3.7.11.1 DATABASE WRITER (DBWn)

DBWn writes when:


 Check point occurs
 Dirty buffers reach threshold
 There are no free buffers
 Timeout occurs
 RAC ping request is made
 Tablespace OFFLINE
 Tablespace READ ONLY
 Table DROP or TRUNCATE
 Tablespace BEGIN BACKUP

The server process records changes to rollback and data blocks in the buffer cache. Database
Writer (DBWn) writes the dirty buffers from the database buffer cache to the data files. It
ensures that a sufficient number of free buffers (buffers that can be overwritten when server
processes need to read in blocks from the data files) are available in the database buffer cache.
Database performance is improved because server processes make changes only in the buffer
cache.

59
3.7.11.2 LOG WRITER (LGWR)

LGWR writes:
 At commit
 When one-third full
 When there is 1 MB of redo
 Every three seconds
 Before DBWn writes

Because the redo is needed for recovery, LGWR confirms the commit only after the redo is
written to disk.
LGWR can also call on DBWn to write to the data files.

60
3.7.11.3 SYSTEM MONITOR (SMON)

Responsibilities:
 Instance recovery
 Rolls forward changes in redo logs
 Opens database for user access
 Rolls back uncommitted transactions
 Coalesces free space
 De-allocates temporary segments

If the Oracle instance fails, any information in the SGA that has not been written to disk is lost.
For example, the failure of the operating system causes an instance failure. After the loss of the
instance, the background process SMON automatically performs instance recovery when the
database is reopened. Instance recovery consists of the following steps:
1. Rolling forward to recover data that has not been recorded in the data files but that has been
recorded in the online redo log. This data has not been written to disk because of the loss of
the SGA during instance failure. During this process, SMON reads the redo log files and
applies the changes recorded in the redo log to the data blocks. Because all committed
transactions have been written to the redo logs, this process completely recovers these
transactions.
2. Opening the database so that users can log on. Any data that is not locked by unrecorded
transactions is immediately available.
3. Rolling back uncommitted transactions. They are rolled back by SMON or by the
individual server processes as they access locked data.

3.7.11.4 PROCESS MONITOR (PMON)

Cleans up after failed processes by:


 Rolling back the transaction
 Releasing locks
 Releasing other resources
 Restarting dead dispatchers

62
3.7.11.5 CHECKPOINT (CKPT)

Responsible for:
 Signaling DBWn at checkpoints
 Updating data files headers with checkpoint information
 Updating control files with checkpoint information

63
An event called a checkpoint occurs when the Oracle background process DBWn writes all the
modified database buffers in the SGA, including both committed and uncommitted data, to the
data files.
Checkpoints are implemented for the following reasons:
 Checkpoints ensure that data blocks in memory that change frequently are written to
data files regularly. Because of the least recently used algorithm of DBWn, a data block
that changes frequently might never qualify as the least recently used block and thus
might never be written to disk if checkpoints did not occur.
 Because all database changes up to the checkpoint have been recorded in the data files,
redo log entries before the checkpoint no longer need to be applied to the data files if
instance recovery is required. Therefore, checkpoints are useful because they can
expedite instance recovery.

At a checkpoint, the following information is written:


 Checkpoint number into the data file headers
 Checkpoint number, log sequence number, archive log name, and system change number
into the control files.

CKPT does not write data blocks to disk or redo blocks to the online redo logs.

3.7.11.6 ARCHIVER (ARCn)

 Optional background process


 Automatically archives online redo logs when ARCHIVELOG mode is set
 Preserves the record of all changes made to the database

64
The Archive Process

All other background processes are optional, depending on the configuration of the database;
however, one of them, ARCn, is crucial to recovering a database after the loss of a disk. As
online redo log files fill, the Oracle server begins writing to the next online redo log file. The
process of switching from one redo log to another is called a log switch. The ARCn process
initiates backing up, or archiving, of the filled log group at every log switch. It automatically
archives the online redo log before the log can be reused, so that all of the changes made to the
database are preserved. This enables the DBA to recover the database to the point of failure, even
if a disk drive is damaged.

Archiving Redo Log Files

One of the important decisions that a DBA has to make is whether to configure the database to
operate in ARCHIVELOG or in NOARCHIVELOG mode.

NOARCHIVELOG MODE:- In NOARCHIVELOG mode, the online redo log files are
overwritten each time a log switch occurs. LGWR does not overwrite a redo log group until the
checkpoint for that group is complete. This ensures that committed data can be recovered if there
is an instance crash. During the instance crash, only the SGA is lost. There is no loss of disks,
only memory. For example, an operating system crash causes an instance crash.
ARCHIVELOG MODE:- If the database is configured to run in ARCHIVELOG mode,
inactive group of filled online redo log files must be archived before they can be used again.
Since changes made to the database are recorded in the online redo log files, the database
administrator can use the physical backup of the data files and the archived online redo log files
to recover the database without losing any committed data because of any single point of failure,
including the loss of a disk. Usually, a production database is configured to run in
ARCHIVELOG mode.
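
As a general sketch (not specific to this project's scripts), a database is usually switched into
ARCHIVELOG mode with a sequence like the following, run as SYSDBA:

SQL> SHUTDOWN IMMEDIATE
SQL> STARTUP MOUNT
SQL> ALTER DATABASE ARCHIVELOG;
SQL> ALTER DATABASE OPEN;
SQL> ARCHIVE LOG LIST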

65
3.7.12 LOGICAL STRUCTURE

 Dictates how the physical space of a database is used


 Hierarchy consisting of tablespaces, segments, extents, and blocks

A logical structure hierarchy exists as follows:


 An Oracle database is a group of tablespaces.
 A tablespace may consist of one or more segments.
 A segment is made up of extents.
 An extent is made up of logical blocks.
 A block is the smallest unit for read and write operations.

3.7.13 PROCESSING SQL STATEMENTS

 Connect to an instance using:


 User process
 Server process

66
 The Oracle server components that are used depend on the type of SQL statement:
 Queries return rows
 DML statements log changes
 Commit ensures transaction recovery
 Some Oracle server components do not participate in SQL statements processing.

A data manipulation language (DML) statement requires only two phases of processing,
1. Parse is the same as the parse phase used for processing a query.
2. Execute requires additional processing to make data changes.

DML Execute Phase

To execute a DML statement:


 If the data and rollback blocks are not already in the buffer cache, the server process
reads them from the data files into the buffer cache.
 The server process places locks on the rows that are to be modified. The rollback block is
used to store the before-image of the data, so that the DML statements can be rolled back
if necessary.
 The data block records the new values of the data.

Processing a DML Statement

The server process records the before-image to the undo block and updates the data block.
Both of these changes are done in the database buffer cache. Any changed blocks in the buffer
cache are marked as dirty buffers, that is, buffers that are not the same as the corresponding block
on the disk.

The processing of a DELETE or INSERT command uses similar steps. The before-image for a
DELETE contains the column values in the deleted rows, and the before-image of an INSERT
contains the row's location information.

67
3.7.14 DATABASE CREATION

Before creating the database, we have to define some parameters according to the computer's
configuration so that performance improves and less I/O is used. If we do not configure the
parameters, Oracle automatically takes the default values.

Db_name=lalit
Db_block_size=8192
Control_files=’G:\oracle\lalit\control.ctl’
Shared_pool_size=200m
Sga_target=300m
Undo_management=auto
Undo_tablespace=undo1

Here the locations are given as per the user's needs. The user can choose any location to store the
parameter file. If no location is chosen, the parameter file is stored in the default location, the
DATABASE folder under the Oracle home. The parameter file is saved with the name
"init<database_name>.ora". When we start creating the database, first the control files are created
and then the database is created.
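
Assuming the parameter file above is in its default location, a minimal CREATE DATABASE statement
consistent with those parameters might look like the following sketch; the datafile names, sizes and
the temporary tablespace are illustrative assumptions, not part of the original configuration:

SQL> STARTUP NOMOUNT
SQL> CREATE DATABASE lalit
  2  DATAFILE 'G:\oracle\lalit\system01.dbf' SIZE 300M
  3  SYSAUX DATAFILE 'G:\oracle\lalit\sysaux01.dbf' SIZE 200M
  4  UNDO TABLESPACE undo1 DATAFILE 'G:\oracle\lalit\undo01.dbf' SIZE 100M
  5  DEFAULT TEMPORARY TABLESPACE temp TEMPFILE 'G:\oracle\lalit\temp01.dbf' SIZE 50M
  6  LOGFILE GROUP 1 ('G:\oracle\lalit\redo01.log') SIZE 50M,
  7  GROUP 2 ('G:\oracle\lalit\redo02.log') SIZE 50M;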

68
Along with this, the parameter file is generated with the non-default values we have already set.

69
However, some base tables in Oracle cannot be changed by a DBA directly, so views of those tables have to be created:
@?/rdbms/admin/catalog.sql

The above command creates the data dictionary views; the following script must then be run to build
the standard PL/SQL packages and procedures that make the dictionary information usable in a
user-understandable form:
@?/rdbms/admin/catproc.sql

70
CHAPTER-4: INTRODUCTION TO DATABASE
MIGRATION

71
INTRODUCTION TO DATA MIGRATION

DATA MIGRATION

Data Migration is the process of transferring data between storage types, formats, or computer
systems. Data migration is usually performed programmatically to achieve an automated
migration, freeing up human resources from tedious tasks. It is required when organizations or
individuals change computer systems or upgrade to new systems, or when systems merge (such
as when the organizations that use them undergo a merger/takeover). To achieve an effective data
migration procedure, data on the old system is mapped to the new system providing a design for
data extraction and data loading. The design relates old data formats to the new system's formats
and requirements. Programmatic data migration may involve many phases but it minimally
includes data extraction where data is read from the old system and data loading where data is
written to the new system. If a decision has been made to provide a set input file specification for
loading data onto the target system, this allows a pre-load 'data validation' step to be put in place,
interrupting the standard E(T)L process. Such a data validation process can be designed to
interrogate the data to be migrated, to ensure that it meets the pre-defined criteria of the target
environment, and the input file specification. An alternative strategy is to have on-the-fly data
validation occurring at the point of loading, which can be designed to report on load rejection
errors as the load progresses. However, in the event that the extracted and transformed data
elements are highly 'integrated' with one another, and the presence of all extracted data in the
target system is essential to system functionality, this strategy can have detrimental, and not
easily quantifiable effects. After loading into the new system, results are subjected to data
verification to determine whether data was accurately translated, is complete, and supports
processes in the new system. During verification, there may be a need for a parallel run of both
systems to identify areas of disparity and forestall erroneous loss. Automated and manual data
cleaning is commonly performed in migration to improve data quality, eliminate redundant or
obsolete information, and match the requirements of the new system. Data migration phases
(design, extraction, cleansing, load, verification) for applications of moderate to high complexity
are commonly repeated several times before the new system is deployed.

72
4.1 DATA MIGRATION CATEGORIES

Data is stored on various media in files or databases, is generated and consumed by software
applications which in turn support business processes. The need to migrate data can be driven by
multiple business requirements and the approach taken to the migration depends on those
requirements. The following major migration categories are proposed on this basis.

STORAGE MIGRATION

A business may choose to rationalize the physical media to take advantage of more efficient
storage technologies. This will result in having to move physical blocks of data from one tape or
disk to another, often using virtualization techniques. The data format and content itself will not
usually be changed in the process and can normally be achieved with minimal or no impact to
the layers above.

DATABASE MIGRATION

Similarly, it may be necessary to move from one database vendor to another, or to upgrade the
version of database software being used. The latter case is less likely to require a physical data
migration, but this can happen with major upgrades. In these cases a physical transformation
process may be required since the underlying data format can change significantly.

APPLICATION MIGRATION

Changing application vendor (for instance, to a new CRM or ERP platform) will inevitably involve
substantial transformation as almost every application or suite operates on its own specific data
model. Further, to allow the application to be sold to the widest possible market, commercial off-
the-shelf packages are generally configured for each customer using metadata. Application
programming interfaces (APIs) are supplied to protect the integrity of the data they have to
handle. Use of the API is normally a condition of the software warranty, although a waiver may
be allowed if the vendor's own or certified partner professional services and tools are used.

BUSINESS PROCESS MIGRATION

Business processes operate through a combination of human and application systems actions,
often orchestrated by business process management tools. When these change they can require
the movement of data from one store, database or application to another to reflect the changes to
the organization and information about customers, products and operations. Examples of such
migration drivers are Mergers and acquisitions, business optimization and reorganization to
attack new markets or respond to competitive threat.

4.2 DATABASE MIGRATION

Like data migration it may be necessary to move from one database vendor to another, or to
upgrade the version of database software being used. The latter case is less likely to require a
physical data migration, but this can happen with major upgrades. In these cases a physical
transformation process may be required since the underlying data format can change
significantly. This may or may not affect behavior in the applications layer, depending largely on
whether the data manipulation language or protocol has changed - but modern applications are
written to be agnostic to the database technology so that a change from Oracle to MySQL, DB2
or SQL Server should only require a testing cycle to be confident that both functional and non-
functional performance has not been adversely affected. Carefully examine the database and the
data. Determine what you are trying to accomplish with the database. Only then can you plan
what the new database environment should look like. Database migration is necessary when a
company upgrades its database or system software, either from one version to another or from
one program to an entirely different program. Software can be specifically written, using entire
programs or just scripts, to facilitate data migration. Such upgrades or program switches can take
place as a result of regular company practices or as a result of directives mandated in the wake of
a company takeover.

4.3 WINDOWS TO LINUX MIGRATION GUIDE

For many companies, migrating from Windows to Linux makes sense. The reasons are
compelling: greater stability and reliability, lower cost, access to application source code, greater
security, and conformity with open standards, according to numerous independent studies and
industry experts. But for those charged with spearheading the actual migration, the project may
seem difficult, unwieldy, or untenable. They may fear unanticipated technological roadblocks as
well as employees reticent to give up the tools they are used to using. Before launching into a
migration, make sure your reasons for migrating are business-driven. "You'll fail if you start a
migration from the ideological side instead," notes Stefan Werden, leader of the Linux
architecture team for Novell SUSE for Europe, the Middle East, and Africa. To make that
decision, conduct a thorough cost-benefit analysis, projected out several years, based on what it
would take to migrate and what it would cost not to migrate--not only financial costs, but other
costs, such as the amount of overall control you want over your organization's software
requirements. With a Windows-to-Linux migration, "nobody can force you to upgrade or deny
you a security fix. You always call the shots, so you are guaranteed to save money and retain
more control in the long run," says Nicholas Petreley, a Linux analyst at IT market intelligence
firm Evans Data Corp. of Santa Cruz, Calif. Of course, reducing costs can be another driving
factor in the decision-making process. According to Forrester Research, 68% of companies in
North America consider lowering overall operating costs a top priority, and Linux is one means
to that end. To achieve success in a Windows to Linux desktop migration, experts advise
planning, planning, and more planning. Before starting, take these STEPS:

GET EXECUTIVE BUY-IN- This step is crucial. Without executive support, your project may
be doomed to failure. By gathering this support before launching your project, you can be sure the
project will be supported, both financially and otherwise.

75
EXAMINE EXISTING ENVIRONMENT - Also, consider whether it makes sense to invest in
new hardware, or whether you will be migrating with existing hardware.

CHOOSE LINUX VERSION - It doesn't matter much whether you settle on Red Hat, SUSE,
Mandrake, Gentoo, Debian, or another flavor altogether. It's more important that your migration
team is comfortable with the Linux distribution chosen.

IDENTIFY THE EXPERTS -Often, these include transactional workers and developers –
employees who rely on basic applications like Web browsers, productivity suites, and email –
and not workers who routinely use powerful programs like Photoshop or Macintosh-based
applications. "It's about identifying needs or pockets where you can save the greatest amount of
money with minimal disruption," notes Susan Jeffries, marketing manager for the Java Desktop
System at Sun Microsystems Inc.

CONSIDER USING OUTSIDE CONSULTANTS- Unless you or members of your team have
attempted such a migration before, it can be well worth the money to hire an outside consultant.
Doing so can save countless hours of frustration. "When you run into a problem like not being
able to get something to print, someone familiar with Linux will be able to fix it in 10 minutes. If
you don't have such a person, it could take a week to fix the problem," Petreley says. Consultants
charge in various ways--by the hour, or by the project. A reasonable hourly price range for
general consulting with an experienced systems engineer costs $125 to $225 per hour. Total cost
depends on geographic location, the cost of consultants in the area, and the number of
employees, environment and number of applications to migrate. Typically, a migration
consulting service may begin with a workshop, proof of concept and architecture and
implementation review.

Here are some tips you can use to enhance your Windows to Linux migration experience,
along with solutions to some daunting challenges.

76
CHOOSING A LINUX DISTRIBUTION

When it comes to choosing a Linux distribution, there is a dizzying array of choices, including
the de facto standard Red Hat, Mandrake, SUSE, a Debian variant, or Fedora, a Red Hat-
and community-supported open source project to build a complete, general purpose operating
system exclusively from free software. Businesses usually can't lose by going with a de facto
standard, which today is SUSE, Red Hat, or Fedora. SUSE is backed by Novell, while the Java
Desktop System is supported by Sun. Both are solid companies, and both distributions also are
solid choices. SUSE can make a great deal of sense for companies used to managing their
desktop environment with systems management tools like ZenWorks and SMS. But for
businesses that have deployed Red Hat servers and have experience with the Red Hat network,
extending their IT infrastructure with Red Hat desktops also can make good sense.

CHOOSING A WINDOW MANAGER

Unlike Windows, Linux supports GNOME and KDE, as well as a wide variety of window
managers. Red Hat makes GNOME and KDE look and work alike. While GNOME has the
fewest features and is least likely to confuse users with options, KDE is more feature-rich and
powerful, but is so configurable that some users could spend too much time working with it.

DATABASE MIGRATION

Carefully examine the database and the data. Determine what you are trying to accomplish with
the database. Only then can you plan what the new database environment should look like.
Microsoft Access databases can be ported fairly easily to MySQL Database Server and
PostgreSQL. In situations where migration problems occur, some enterprising vendors are
creating solutions, such as Versora's ProgressionDB, which deals with some of the most common
conversion issues. Another option is Shareplex from Quest Software Inc. of Irvine, Calif. This
data replication software, mostly used to migrate Oracle to Linux, provides support for thousands
of rows per second for load distribution, disaster recovery, and migrations.

4.4 ORACLE RECOVERY MANAGER (RMAN)

Oracle Recovery Manager satisfies the most pressing demands of performant, manageable
backup and recovery, for all Oracle data formats. A complete high availability and disaster
recovery strategy requires dependable data backup, restore, and recovery procedures. Oracle
Recovery Manager (RMAN) provides a comprehensive foundation for efficiently backing up and
recovering the Oracle database. It is designed to work intimately with the server, providing
block-level corruption detection during backup and restore. RMAN optimizes performance and
space consumption during backup with file multiplexing and backup set compression, and
integrates with Oracle Secure Backup, as well as third party media management products, for
tape backup. RMAN takes care of all underlying database procedures before and after backup or
restore, freeing dependency on OS and SQL*Plus scripts. It provides a common interface, via
command line and Enterprise Manager, for backup tasks across different host operating systems
and offers features not available through user-managed methods, such as parallelization of
backup/restore data streams, backup files retention policy, and detailed history of all backups.

Database Migration from Windows to Linux Using RMAN

This section describes the procedure required to migrate a database from Windows to Linux using
the RMAN Convert Database command. Both Windows and Linux platforms have the same
endian format, which makes it possible to transfer the whole database, making the migration
process very straightforward and simple. To migrate between platforms that have a different
endian format, Cross Platform Transportable Tablespaces (XTTS) needs to be used instead.

78
4.5 LIST OF STEPS NEEDED TO COMPLETE THE MIGRATION

1. Check platform compatibility between source and target OS

2. Start the database in read only mode

3. Check database readiness for transport from Windows to Linux using


DBMS_TDB.CHECK_DB

4. Check if there are any external objects

5. Execute the Rman Convert database command

6. Copy converted datafiles, generated Transport Script and Parameter File to Linux

7. Edit the init.ora for the new database

8. Edit the Transport Script and Parameter File changing the windows paths to Linux Paths

9. Execute the Transport Script

10. Change the Database ID

11. Check database integrity

4.6 COMMAND NOTATION

desc V$db_transportable_platform
Step 1-->shutdown database
Step 2-->startup mount
Alter database open read only

Step 3-->connect to rman and run a script

rman> convert database new database vikas1
transport script 'E:\trans'
to platform 'Linux IA (32-bit)';

//this script generates a converted file for every datafile in the database folder, plus a parameter file and
//a transport script on the E: drive; copy these files and paste them on the Linux machine.

Step 4-->copy the parameter file to the default location and edit it;


Step 5-->edit the transport script file (change to the new paths).
Step 6-->open SQL*Plus and run the transport script.
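
The commands above cover steps 1 to 6 of the list in section 4.5. For the last steps (changing the
database ID and checking integrity), a hedged sketch using the DBNEWID utility (nid) and an RMAN
validation could look like the following; the connect string sys/oracle@vikas1 is an illustrative
assumption. With the database mounted, nid changes the DBID; afterwards the database is mounted
again, opened with RESETLOGS, and validated:

SQL> STARTUP MOUNT
$ nid TARGET=sys/oracle@vikas1
SQL> STARTUP MOUNT
SQL> ALTER DATABASE OPEN RESETLOGS;
RMAN> BACKUP VALIDATE CHECK LOGICAL DATABASE;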

4.7 SYSTEM REQUIREMENTS

HARDWARE REQUIREMENT

RAM: 512 MB (DDR2) or higher
CPU: 2.86 GHz or higher
MOTHERBOARD: D945 GNT or higher version
HARD DISK: 10 GB free or as per requirement
KEYBOARD: TVS Gold
MOUSE: optical mouse
MONITOR: as per requirement

SOFTWARE REQUIREMENT

OPERATING SYSTEM: Windows 98, 2000, NT, XP (32-bit); Linux Red Hat 5
DATABASE: Oracle 9i, 10g
TOOL: Oracle Recovery Manager
FRONT END: SQL Navigator
BACK END: Oracle 10g

80
CHAPTER-5: DATA DESIGN

81
DATA DESIGN

5.1 DATA FLOW DIAGRAM

A data-flow diagram (DFD) is a graphical representation of the "flow" of data through an


information system. DFDs can also be used for the visualization of data processing (structured
design). On a DFD, data items flow from an external data source or an internal data store to an
internal data store or an external data sink, via an internal process. A DFD provides no
information about the timing of processes, or about whether processes will operate in sequence
or in parallel. It is therefore quite different from a flowchart, which shows the flow of control
through an algorithm, allowing a reader to determine what operations will be performed, in what
order, and under what circumstances, but not what kinds of data will be input to and output from
the system, nor where the data will come from and go to, nor where the data will be stored (all of
which are shown on a DFD). Data-flow diagrams (DFDs) are one of the three essential
perspectives of the structured-systems analysis and design method SSADM. The sponsor of a
project and the end users will need to be briefed and consulted throughout all stages of a system's
evolution. With a data-flow diagram, users are able to visualize how the system will operate,
what the system will accomplish, and how the system will be implemented. The old system's
dataflow diagrams can be drawn up and compared with the new system's data-flow diagrams to
draw comparisons to implement a more efficient system. Data-flow diagrams can be used to
provide the end user with a physical idea of where the data they input ultimately has an effect
upon the structure of the whole system from order to dispatch to report. How any system is
developed can be determined through a data-flow diagram.

DEVELOPING A DATA-FLOW DIAGRAM

In the course of developing a set of leveled data-flow diagrams the analyst/designer is forced to
address how the system may be decomposed into component sub-systems, and to identify the
transaction data in the data model. There are different notations to draw data-flow diagrams,
defining different visual representations for processes, data stores, data flows, and external
entities. Data flow diagrams ("bubble charts") are directed graphs in which the nodes specify
processing activities and the arcs specify data items transmitted between processing nodes.

TOP-DOWN APPROACH

1. The system designer makes "a context level DFD" or Level 0, which shows the
"interaction" (data flows) between "the system" (represented by one process) and "the
system environment" (represented by terminators).
2. The system is "decomposed in lower-level DFD (Level 1)" into a set of "processes, data
stores, and the data flows between these processes and data stores".

83
3. Each process is then decomposed into an "even-lower-level diagram containing its sub
processes".
4. This approach "then continues on the subsequent sub processes", until a necessary and
sufficient level of detail is reached which is called the primitive process (aka chewable in
one bite).

DFD is also a virtually designable diagram that technically or diagrammatically describes the
inflow and outflow of data or information that is provided by the external entity.

 In Level 0 the diagram does not contain any data stores.

EVENT PARTITIONING APPROACH

This level shows the overall context of the system and its operating environment and shows the
whole system as just one process. It does not usually show data stores, unless they are "owned"
by external systems, e.g. are accessed by but not maintained by this system, however, these are
often shown as external entities.

84
LEVEL 1 (HIGH LEVEL DIAGRAM)

A Level 1 Data flow diagram for the same system. This level (level 1) shows all processes at the
first level of numbering, data stores, external entities and the data flows between them. The
purpose of this level is to show the major high-level processes of the system and their
interrelation. A process model will have one, and only one, level-1 diagram. A level-1 diagram
must be balanced with its parent context level diagram, i.e. there must be the same external
entities and the same data flows, these can be broken down to more detail in the level 1, example
the "inquiry" data flow could be split into "inquiry request" and "inquiry results" and still be
valid.

85
LEVEL 2 (LOW LEVEL DIAGRAM)

A Level 2 Data flow diagram showing the "Process Enquiry" process for the same system. This
level is decomposition of a process shown in a level-1 diagram, as such there should be a level-2
diagram for each and every process shown in a level-1 diagram. In this example processes 1.1,
1.2 & 1.3 are all children of process 1, together they wholly and completely describe process 1,
and combined must perform the full capacity of this parent process. As before, a level-2 diagram
must be balanced with its parent level-1 diagram.

5.2 ENTITY-RELATIONSHIP MODEL

In software engineering, an entity-relationship model (ERM) is an abstract and conceptual


representation of data. Entity-relationship modeling is a database modeling method, used to
produce a type of conceptual schema or semantic data model of a system, often a relational
database, and its requirements in a top-down fashion. Diagrams created by this process are called
entity-relationship diagrams, ER diagrams, or ERDs. The first stage of information system
design uses these models during the requirements analysis to describe information needs or the
type of information that is to be stored in a database. The data modeling technique can be used to
describe any ontology (i.e. an overview and classifications of used terms and their relationships)
for a certain area of interest. In the case of the design of an information system that is based on a
database, the conceptual data model is, at a later stage (usually called logical design), mapped to
a logical data model, such as the relational model; this in turn is mapped to a physical model
during physical design. Note that sometimes, both of these phases are referred to as "physical
design”. There are a number of conventions for entity-relationship diagrams (ERDs). The
classical notation mainly relates to conceptual modeling. There are a range of notations
employed in logical and physical database design, such as IDEF1X

THE BUILDING BLOCKS: ENTITIES, RELATIONSHIPS, AND ATTRIBUTES

Two related entities

An entity with an attribute

A relationship with an attribute

An entity may be defined as a thing which is recognized as being capable of an independent


existence and which can be uniquely identified. An entity is an abstraction from the complexities
of some domain. When we speak of an entity we normally speak of some aspect of the real world
which can be distinguished from other aspects of the real world. An entity may be a physical
object such as a house or a car, an event such as a house sale or a car service, or a concept such
as a customer transaction or order. Although the term entity is the one most commonly used,
following Chen we should really distinguish between an entity and an entity-type. An entity-type
is a category. An entity, strictly speaking, is an instance of a given entity-type. There are usually
many instances of an entity-type. Because the term entity-type is somewhat cumbersome, most
people tend to use the term entity as a synonym for this term. A relationship captures how two or
more entities are related to one another. Relationships can be thought of as verbs, linking two or
more nouns. Examples: an owns relationship between a company and a computer, a supervises
relationship between an employee and a department, a performs relationship between an artist
and a song, a proved relationship between a mathematician and a theorem.

DIAGRAMMING CONVENTIONS

Entity sets are drawn as rectangles, relationship sets as diamonds. If an entity set participates in a
relationship set, they are connected with a line. Attributes are drawn as ovals and are connected
with a line to exactly one entity or relationship set. Cardinality constraints are expressed as
follows:

 a double line indicates a participation constraint, totality or surjectivity: all entities in the
entity set must participate in at least one relationship in the relationship set;
 an arrow from entity set to relationship set indicates a key constraint, i.e. injectivity: each
entity of the entity set can participate in at most one relationship in the relationship set;
 A thick line indicates both, i.e. bijectivity: each entity in the entity set is involved in
exactly one relationship.
 An underlined name of an attribute indicates that it is a key: two different entities or
relationships with this attribute always have different values for this attribute.

Attributes are often omitted as they can clutter up a diagram; other diagram techniques often list
entity attributes within the rectangles drawn for entity sets

88
CHAPTER-6: CODING

89
The Windows Database

This exercise was done using a small default 10.2.0.4 database

SQL> select bytes, file_name from dba_data_files;


BYTES FILE_NAME
---------- ------------------------------------------------------------
440401920 G:\TECHNOLOGY\ORCLW\DATAFILE\O1_MF_SYSTEM_44QYBFPF_.DBF
246415360
G:\TECHNOLOGY\ORCLW\DATAFILE\O1_MF_UNDOTBS1_44QYCFQW_.DBF
136314880 G:\TECHNOLOGY\ORCLW\DATAFILE\O1_MF_SYSAUX_44QYCZ3D_.DBF
5242880 G:\TECHNOLOGY\ORCLW\DATAFILE\O1_MF_USERS_44QYDHY7_.DBF
SQL> select name from v$controlfile;
NAME
----------------------------------------------------------------------------------
G:\TECHNOLOGY\ORCLW\CONTROLFILE\O1_MF_44QY9SXR_.CTL
C:\ORACLE\PRODUCT\10.2.0\FLASH_RECOVERY_AREA\ORCLW\CONTROLFILE\O1_
MF_44QY9TPX_.CTL
SQL> select member from v$logfile;
MEMBER
----------------------------------------------------------------------------------
G:\TECHNOLOGY\ORCLW\ONLINELOG\O1_MF_1_44QY9VJL_.LOG
C:\ORACLE\PRODUCT\10.2.0\FLASH_RECOVERY_AREA\ORCLW\ONLINELOG\O1_MF
_1_44QY9ZZ7_.LOG
G:\TECHNOLOGY\ORCLW\ONLINELOG\O1_MF_2_44QYB14V_.LOG
C:\ORACLE\PRODUCT\10.2.0\FLASH_RECOVERY_AREA\ORCLW\ONLINELOG\O1_MF
_2_44QYB5L1_.LOG
G:\TECHNOLOGY\ORCLW\ONLINELOG\O1_MF_3_44QYB6OY_.LOG

C:\ORACLE\PRODUCT\10.2.0\FLASH_RECOVERY_AREA\ORCLW\ONLINELOG\O1_MF
_3_44QYBC2F_.LOG

90
1. Check platform compatibility between source and target OS

You need to check the platforms to be sure they have the same endian format; also you need to
save the platform_name string to use it later as part of the convert database command.

SQL> select * from V$DB_TRANSPORTABLE_PLATFORM


2 where PLATFORM_NAME='Microsoft Windows IA (32-bit)' or
3 PLATFORM_NAME like 'Linux%'
4/
PLATFORM_ID PLATFORM_NAME ENDIAN_FORMAT
----------- --------------------------------------------- --------------
7 Microsoft Windows IA (32-bit) Little
10 Linux IA (32-bit) Little
11 Linux IA (64-bit) Little
13 Linux x86 64-bit Little

2. Start the database in read only mode

In order to execute dbms_tdb.check_db, the database must be opened in read only mode.

SQL> shutdown immediate;


Database closed.
Database dismounted.
ORACLE instance shut down.
SQL> startup mount;
ORACLE instance started.
Total System Global Area 167772160 bytes
Fixed Size 1295608 bytes
Variable Size 71305992 bytes

Database Buffers 88080384 bytes
Redo Buffers 7090176 bytes
Database mounted.
SQL> alter database open read only;
Database altered.

3. Check database readiness for transport from Windows to Linux

If the execution of dbms_tdb.check_db does not return any exceptions, that means the database is
ready for transport to the target platform.

SQL> set serveroutput on


SQL> declare
2 db_ready boolean;
3 begin
4 db_ready:= dbms_tdb.check_db('Linux IA (32-bit)');
5 end;
6/
PL/SQL procedure successfully completed

4. Check if there are any external objects

If there are any external objects, take note of them; they will need to be taken care of manually.

SQL> set serveroutput on


SQL> declare
2 external boolean;
3 begin
4 /* value of external is ignored, but with SERVEROUTPUT set to ON
5 * dbms_tdb.check_external displays report of external objects
6 * on console */
7 external := dbms_tdb.check_external;
8 end;
9/
The following directories exist in the database:
SYS.DATA_PUMP_DIR, SYS.ORACLE_OCM_CONFIG_DIR, SYS.ADMIN_DIR,
SYS.WORK_DIR
PL/SQL procedure successfully completed.

5. Using the RMAN CONVERT DATABASE COMMAND

Having successfully executed the check steps, and with the database open in read only
mode, the convert database command can be executed with RMAN.
In this example, the db_file_name_convert '<source-path>' '<target-path>' option is not used
because the database is using Oracle Managed Files (OMF); when using OMF, RMAN generates
and displays the new filenames in the output of the convert database command. At the end of the
convert process, RMAN displays information about how to complete the conversion on the
target platform.

C:\Documents and Settings\avargas>Rman target sys/oracle@orclw nocatalog


Recovery Manager: Release 10.2.0.4.0 - Production on Thus May 19 17:26:22 2008
Copyright (c) 1982, 2007, Oracle. All rights reserved.
connected to target database: ORCLW (DBID=1718464921)
using target database control file instead of recovery catalog
RMAN> CONVERT DATABASE NEW DATABASE 'orcllnx'
2> transport script 'G:\TECHNOLOGY\ORCLLNX\transportscript'
3> to platform 'Linux IA (32-bit)';
Starting convert at 19-MAY-11
allocated channel: ORA_DISK_1
channel ORA_DISK_1: sid=154 devtype=DISK

Directory SYS.DATA_PUMP_DIR found in the database
Directory SYS.ORACLE_OCM_CONFIG_DIR found in the database

Directory SYS.ADMIN_DIR found in the database


Directory SYS.WORK_DIR found in the database
User SYS with SYSDBA and SYSOPER privilege found in password file
channel ORA_DISK_1: starting datafile conversion
input datafile fno=00001
name=G:\TECHNOLOGY\ORCLW\DATAFILE\O1_MF_SYSTEM_44QYBFPF_.DBF
converted
datafile=G:\TECHNOLOGY\ORCLW\DATAFILE\O1_MF_SYSTEM_44TM3OPF_.DBF
channel ORA_DISK_1: datafile conversion complete, elapsed time: 00:00:55
channel ORA_DISK_1: starting datafile conversion
input datafile fno=00002
name=G:\TECHNOLOGY\ORCLW\DATAFILE\O1_MF_UNDOTBS1_44QYCFQW_.DBF
converted
datafile=G:\TECHNOLOGY\ORCLW\DATAFILE\O1_MF_UNDOTBS1_44TM5F98_.DBF
channel ORA_DISK_1: datafile conversion complete, elapsed time: 00:00:35
channel ORA_DISK_1: starting datafile conversion
input datafile fno=00003
name=G:\TECHNOLOGY\ORCLW\DATAFILE\O1_MF_SYSAUX_44QYCZ3D_.DBF
converted
datafile=G:\TECHNOLOGY\ORCLW\DATAFILE\O1_MF_SYSAUX_44TM6JTB_.DBF
channel ORA_DISK_1: datafile conversion complete, elapsed time: 00:00:25
channel ORA_DISK_1: starting datafile conversion
input datafile fno=00004
name=G:\TECHNOLOGY\ORCLW\DATAFILE\O1_MF_USERS_44QYDHY7_.DBF
converted
datafile=G:\TECHNOLOGY\ORCLW\DATAFILE\O1_MF_USERS_44TM7BD5_.DBF
channel ORA_DISK_1: datafile conversion complete, elapsed time: 00:00:01

Run SQL script G:\TECHNOLOGY\ORCLLNX\TRANSPORTSCRIPT on the target platform
to create
database
Edit init.ora file
C:\ORACLE\PRODUCT\10.2.0\DB_1\DATABASE\INIT_00JIGSKL_1_0.ORA. This
PFILE will be used to create the database on the target platform
To recompile all PL/SQL modules, run utlirp.sql and utlrp.sql on the target platform
To change the internal database identifier, use DBNEWID Utility
Finished backup at 19-MAY-11

6. Copy Converted Datafiles, Generated Transport Script and Parameter File to Linux


We need to copy the generated files over to the Linux server; they include all converted
datafiles, the transport script and the generated pfile. If needed, create at this point the
directories you will need on the Linux server for the dump destinations and the flash
recovery area, for example:

mkdir –p /oradisk/oracle/app/admin/ORCLLNX/adump
mkdir –p /oradisk/oracle/app/admin/ORCLLNX/bdump
mkdir –p /oradisk/oracle/app/admin/ORCLLNX/cdump
mkdir –p /oradisk/oracle/app/admin/ORCLLNX/udump
mkdir –p /oradisk/database/ORCLLNX/FRA

The converted datafiles, for example O1_MF_SYSTEM_44TM3OPF_.DBF, are then moved to the ORCLLNX directory:

G:\TECHNOLOGY\ORCLW\DATAFILE>move *44TM* ..\..\ORCLLNX


G:\TECHNOLOGY\ORCLW\DATAFILE\O1_MF_SYSTEM_44TM3OPF_.DBF
G:\TECHNOLOGY\ORCLW\DATAFILE\O1_MF_UNDOTBS1_44TM5F98_.DBF
G:\TECHNOLOGY\ORCLW\DATAFILE\O1_MF_SYSAUX_44TM6JTB_.DBF
G:\TECHNOLOGY\ORCLW\DATAFILE\O1_MF_USERS_44TM7BD5_.DBF

G:\TECHNOLOGY\ORCLW\DATAFILE>move
C:\ORACLE\PRODUCT\10.2.0\DB_1\DATABASE\INIT_00JIGSKL_1_0.ORA
..\..\ORCLLNX\initORCLLNX.ora

Check that all required files are located on the ORCLLNX directory

G:\TECHNOLOGY\ORCLW\DATAFILE>dir ..\..\ORCLLNX
Volume in drive G is TECHNOLOGY
Volume Serial Number is 1877-B4EA

Directory of G:\TECHNOLOGY\ORCLLNX

19/05/11 05:27 PM <DIR>


19/05/11 05:27 PM <DIR>
19/05/11 05:31 PM 2,616 TRANSPORTSCRIPT
19/05/11 05:31 PM 440,410,112 01_MF_SYSTEM_44TM30PF_.DBF
19/05/11 05:31 PM 246,423,552 01_MF_UNDOTBS1_44TM5F98_.DBF
19/05/11 05:31 PM 146,808,832 01_MF_SYSAUX_44TM6JTB_.DBF
19/05/11 05:31 PM 5,251,072 01_MF_USERS_44TM7BDS_.DBF
19/05/11 05:31 PM 1,556 initORCLLNX.ora
6 File(s) 838,897,740 bytes
2 Dir(s) 18,968,444,928 bytes free
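Once staged, the files have to be transferred to the Linux server. Any transfer method (FTP, a
shared mount, or an SSH copy tool) works; as a minimal sketch, assuming the PuTTY pscp client is
available on the Windows server and the Linux host is reachable under the hypothetical name
linuxsrv:

G:\TECHNOLOGY>pscp G:\TECHNOLOGY\ORCLLNX\*.* oracle@linuxsrv:/oradisk/database/ORCLLNX/

After the copy, verify on Linux that the four converted datafiles, the TRANSPORTSCRIPT and
initORCLLNX.ora are all present under /oradisk/database/ORCLLNX.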

7. Edit init.ora for new database

The RMAN CONVERT command executed on Windows generated a parameter file that
needs to be edited before it can be used on the target Linux server.

The pfile generated by RMAN:

# Please change the values of the following parameters:


control_files = "C:\ORACLE\PRODUCT\10.2.0\DB_1\DATABASE\CF_DORCLLNX_
ID-1718464921_00JIGSKL"
db_create_file_dest = "C:\ORACLE\PRODUCT\10.2.0\DB_1\DATABASE\TECHNOLOGY"
db_recovery_file_dest =
"C:\ORACLE\PRODUCT\10.2.0\DB_1\DATABASE\flash_recovery_area"
db_recovery_file_dest_size= 2147483648
audit_file_dest = "C:\ORACLE\PRODUCT\10.2.0\DB_1\DATABASE\ADUMP"
background_dump_dest = "C:\ORACLE\PRODUCT\10.2.0\DB_1\DATABASE\BDUMP"
user_dump_dest = "C:\ORACLE\PRODUCT\10.2.0\DB_1\DATABASE\UDUMP"
core_dump_dest = "C:\ORACLE\PRODUCT\10.2.0\DB_1\DATABASE\CDUMP"
db_name = "ORCLLNX"
# Please review the values of the following parameters:
__shared_pool_size = 62914560
__large_pool_size = 4194304
__java_pool_size = 4194304
__streams_pool_size = 0
__db_cache_size = 88080384
remote_login_passwordfile= "EXCLUSIVE"
db_domain = ""
dispatchers = "(PROTOCOL=TCP) (SERVICE=orclwXDB)"
# The values of the following parameters are from source database:
processes = 150

sga_target = 167772160
db_block_size = 8192
compatible = "10.2.0.3.0"
db_file_multiblock_read_count= 16
undo_management = "AUTO"

undo_tablespace = "UNDOTBS1"
job_queue_processes = 10
open_cursors = 300
pga_aggregate_target = 16777216

The pfile edited to be used on Linux:

# Please change the values of the following parameters:


control_files = '/oradisk/database/ORCLLNX/orcllnx_control1.ctl' ,
'/oradisk/database/ORCLLNX/orcllnx_control2.ctl'
db_create_file_dest = '/oradisk/database/ORCLLNX'
db_recovery_file_dest = '/oradisk/database/ORCLLNX/FRA'
db_recovery_file_dest_size= 2147483648
audit_file_dest = '/oradisk/oracle/app/admin/ORCLLNX/adump'
background_dump_dest = '/oradisk/oracle/app/admin/ORCLLNX/bdump'
user_dump_dest = '/oradisk/oracle/app/admin/ORCLLNX/udump'
core_dump_dest = '/oradisk/oracle/app/admin/ORCLLNX/cdump'
db_name = 'ORCLLNX'
# Please review the values of the following parameters:
__shared_pool_size = 62914560
__large_pool_size = 4194304
__java_pool_size = 4194304
__streams_pool_size = 0
__db_cache_size = 88080384
remote_login_passwordfile= 'EXCLUSIVE'
db_domain = ''
dispatchers = '(PROTOCOL=TCP) (SERVICE=orcllnxXDB)'
# The values of the following parameters are from source database:
processes = 150
sga_target = 167772160

db_block_size = 8192
compatible = '10.2.0.3.0'
db_file_multiblock_read_count= 16
undo_management = 'AUTO'
undo_tablespace = 'UNDOTBS1'
job_queue_processes = 10
open_cursors = 300
pga_aggregate_target = 16777216
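Note that the transport script in the next step expects the pfile under $ORACLE_HOME/dbs. As a
small sketch of the remaining preparation on the Linux server (paths follow the Oracle home used
in this example; the orapwd step is optional and the password shown is only a placeholder):

# set the instance name used throughout this example
export ORACLE_SID=ORCLLNX
# place the edited pfile where the transport script expects it
cp /oradisk/database/ORCLLNX/initORCLLNX.ora /oradisk/oracle/app/oracle/product/10.2.0/db_1/dbs/initORCLLNX.ora
# optional: recreate a password file to match remote_login_passwordfile=EXCLUSIVE
orapwd file=/oradisk/oracle/app/oracle/product/10.2.0/db_1/dbs/orapwORCLLNX password=change_me entries=5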

8. Edit the Transport Script

Before running the transport script on the target Linux server we need to edit it to set the
correct paths for the datafiles, controlfiles and dump directories; we may also want to change
the values of some tuning parameters. The script generated by RMAN:

-- The following commands will create a new control file and use it
-- to open the database.
-- Data used by Recovery Manager will be lost.
-- The contents of online logs will be lost and all backups will
-- be invalidated. Use this only if online logs are damaged.
-- After mounting the created controlfile, the following SQL
-- statement will place the database in the appropriate
-- protection mode:
-- ALTER DATABASE SET STANDBY DATABASE TO MAXIMIZE PERFORMANCE
STARTUP NOMOUNT
PFILE='C:\ORACLE\PRODUCT\10.2.0\DB_1\DATABASE\INIT_00JIGSKL_1_0.ORA'
CREATE CONTROLFILE REUSE SET DATABASE "ORCLLNX" RESETLOGS
NOARCHIVELOG
MAXLOGFILES 16
MAXLOGMEMBERS 3

MAXDATAFILES 100
MAXINSTANCES 8
MAXLOGHISTORY 292
LOGFILE
GROUP 1 SIZE 50M,
GROUP 2 SIZE 50M,
GROUP 3 SIZE 50M
DATAFILE
'C:\ORACLE\PRODUCT\10.2.0\DB_1\DATABASE\DATA_D-ORCLW_I-1718464921_TSSYSTEM_FNO-1_07JIGSKL',
'C:\ORACLE\PRODUCT\10.2.0\DB_1\DATABASE\DATA_D-ORCLW_I-1718464921_TSUNDOTBS1_FNO-2_08JIGSMD',
'C:\ORACLE\PRODUCT\10.2.0\DB_1\DATABASE\DATA_D-ORCLW_I-1718464921_TSSYSAUX_FNO-3_09JIGSNG',
'C:\ORACLE\PRODUCT\10.2.0\DB_1\DATABASE\DATA_D-ORCLW_I-1718464921_TSUSERS_FNO-4_0AJIGSOA'
CHARACTER SET AL32UTF8
;
-- Database can now be opened zeroing the online logs.
ALTER DATABASE OPEN RESETLOGS;
-- Commands to add tempfiles to temporary tablespaces.
-- Online tempfiles have complete space information.
-- Other tempfiles may require adjustment.
ALTER TABLESPACE TEMP ADD TEMPFILE
SIZE 20971520 AUTOEXTEND ON NEXT 655360 MAXSIZE 33554431M;
-- End of tempfile additions.
--

set echo off
prompt ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
prompt * Your database has been created successfully!
prompt * There are many things to think about for the new database. Here
prompt * is a checklist to help you stay on track:
prompt * 1. You may want to redefine the location of the directory objects.
prompt * 2. You may want to change the internal database identifier (DBID)
prompt * or the global database name for this database. Use the
prompt * NEWDBID Utility (nid).
prompt ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
SHUTDOWN IMMEDIATE
STARTUP UPGRADE
PFILE='C:\ORACLE\PRODUCT\10.2.0\DB_1\DATABASE\INIT_00JIGSKL_1_0.ORA'
@@?/rdbms/admin/utlirp.sql
SHUTDOWN IMMEDIATE
STARTUP
PFILE='C:\ORACLE\PRODUCT\10.2.0\DB_1\DATABASE\INIT_00JIGSKL_1_0.ORA'
-- The following step will recompile all PL/SQL modules.
-- It may take several hours to complete.
@@?/rdbms/admin/utlrp.sql
set feedback 6;
The script edited to be run on Linux:

STARTUP NOMOUNT
PFILE='/oradisk/oracle/app/oracle/product/10.2.0/db_1/dbs/initORCLLNX.ora'
CREATE CONTROLFILE REUSE SET DATABASE "ORCLLNX" RESETLOGS
NOARCHIVELOG
MAXLOGFILES 16
MAXLOGMEMBERS 3

MAXDATAFILES 100
MAXINSTANCES 8
MAXLOGHISTORY 292
LOGFILE
GROUP 1 SIZE 50M,
GROUP 2 SIZE 50M,
GROUP 3 SIZE 50M
DATAFILE
'/oradisk/database/ORCLLNX/O1_MF_SYSTEM_44TM3OPF_.DBF',
'/oradisk/database/ORCLLNX/O1_MF_UNDOTBS1_44TM5F98_.DBF',
'/oradisk/database/ORCLLNX/O1_MF_SYSAUX_44TM6JTB_.DBF',
'/oradisk/database/ORCLLNX/O1_MF_USERS_44TM7BD5_.DBF'
CHARACTER SET AL32UTF8
;
-- Database can now be opened zeroing the online logs.
ALTER DATABASE OPEN RESETLOGS;
-- Commands to add tempfiles to temporary tablespaces.
-- Online tempfiles have complete space information.
-- Other tempfiles may require adjustment.
ALTER TABLESPACE TEMP ADD TEMPFILE
SIZE 20971520 AUTOEXTEND ON NEXT 655360 MAXSIZE 33554431M;
-- End of tempfile additions.
--
set echo off
prompt ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
prompt * Your database has been created successfully!
prompt * There are many things to think about for the new database. Here
prompt * is a checklist to help you stay on track:
prompt * 1. You may want to redefine the location of the directory objects.
prompt * 2. You may want to change the internal database identifier (DBID)

prompt * or the global database name for this database. Use the
prompt * NEWDBID Utility (nid).
prompt ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
SHUTDOWN IMMEDIATE
STARTUP UPGRADE
PFILE='/oradisk/oracle/app/oracle/product/10.2.0/db_1/dbs/initORCLLNX.ora'
@@?/rdbms/admin/utlirp.sql
SHUTDOWN IMMEDIATE
STARTUP
PFILE='/oradisk/oracle/app/oracle/product/10.2.0/db_1/dbs/initORCLLNX.ora';
-- The following step will recompile all PL/SQL modules.
-- It may take several hours to complete.
@@?/rdbms/admin/utlrp.sql
set feedback 6;

9. Execute the Transport Script

This is the log of the transport script execution:


avargas-pc:/oradisk/database/ORCLLNX> sqlplus / as sysdba
SQL*Plus: Release 10.2.0.4.0 - Production on Wed Jun 11 10:11:54 2008
Copyright (c) 1982, 2007, Oracle. All Rights Reserved.
Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
SQL> @transport
SQL> STARTUP NOMOUNT
PFILE='/oradisk/oracle/app/oracle/product/10.2.0/db_1/dbs/initORCLLNX.ora';
ORACLE instance started.
Total System Global Area 167772160 bytes

Fixed Size 1266368 bytes
Variable Size 71306560 bytes
Database Buffers 88080384 bytes
Redo Buffers 7118848 bytes
SQL> CREATE CONTROLFILE REUSE SET DATABASE "ORCLLNX" RESETLOGS
NOARCHIVELOG
2 MAXLOGFILES 16
3 MAXLOGMEMBERS 3
4 MAXDATAFILES 100
5 MAXINSTANCES 8
6 MAXLOGHISTORY 292
7 LOGFILE
8 GROUP 1 SIZE 50M,
9 GROUP 2 SIZE 50M,
10 GROUP 3 SIZE 50M
11 DATAFILE
12 '/oradisk/database/ORCLLNX/O1_MF_SYSTEM_44TM3OPF_.DBF',
13 '/oradisk/database/ORCLLNX/O1_MF_UNDOTBS1_44TM5F98_.DBF',
14 '/oradisk/database/ORCLLNX/O1_MF_SYSAUX_44TM6JTB_.DBF',
15 '/oradisk/database/ORCLLNX/O1_MF_USERS_44TM7BD5_.DBF'
16 CHARACTER SET AL32UTF8
17 ;
Control file created.
SQL>
SQL> -- Database can now be opened zeroing the online logs.
SQL> ALTER DATABASE OPEN RESETLOGS;
Database altered.
SQL>
SQL> -- Commands to add tempfiles to temporary tablespaces.
SQL> -- Online tempfiles have complete space information.
SQL> -- Other tempfiles may require adjustment.

SQL> ALTER TABLESPACE TEMP ADD TEMPFILE
2 SIZE 20971520 AUTOEXTEND ON NEXT 655360 MAXSIZE 33554431M;
Tablespace altered.
SQL> -- End of tempfile additions.
SQL> --
SQL>
SQL> set echo off
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
* Your database has been created successfully!
* There are many things to think about for the new database. Here
* is a checklist to help you stay on track:
* 1. You may want to redefine the location of the directory objects.
* 2. You may want to change the internal database identifier (DBID)
* or the global database name for this database. Use the
* NEWDBID Utility (nid).
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Database closed.
Database dismounted.
ORACLE instance shut down.
ORACLE instance started.
Total System Global Area 167772160 bytes
Fixed Size 1266368 bytes
Variable Size 71306560 bytes
Database Buffers 88080384 bytes
Redo Buffers 7118848 bytes
Database mounted.
Database opened.
SQL> WHENEVER SQLERROR EXIT;
SQL>
SQL> DOC
DOC>#######################################################################

DOC>#######################################################################
DOC> The following statement will cause an "ORA-01722: invalid number"
DOC> error if the database was not opened in UPGRADE mode
DOC>
DOC> If you encounter this error, execute "SHUTDOWN", "STARTUP UPGRADE" and
DOC> re-execute utlirp.sql
DOC>#######################################################################
DOC>#######################################################################
DOC>#
SQL> SELECT TO_NUMBER('MUST_BE_OPEN_UPGRADE') FROM v$instance
2 WHERE status != 'OPEN MIGRATE';
no rows selected
SQL> Rem Store object numbers of all valid PL/SQL-based functional indexes
SQL> DROP TABLE utlirp_enabled_func_indexes;
Table dropped.
SQL> CREATE TABLE utlirp_enabled_func_indexes AS
2 SELECT obj# FROM ind$
3 WHERE bitand(property, 2048) != 0 AND bitand(flags, 1024) = 0;
Table created.
>>>>>>>>>>>>>> many output lines followed

These are the final lines of the logfile:


PL/SQL procedure successfully completed.
SQL>
SQL> DOC
DOC> the following query reports the number of objects that have compiled
DOC> with errors (objects that compile with errors have status set to 3 in
DOC> obj$). If the number is higher than expected, please examine the error
DOC> messages reported with each object (using SHOW ERRORS) to see if they
DOC> point to system misconfiguration or resource constraints that must be
DOC> fixed before attempting to recompile these objects.

DOC>#
SQL> select COUNT (*) "OBJECTS WITH ERRORS" from obj$ where status = 3;
OBJECTS WITH ERRORS
-------------------
0
SQL>
SQL>
SQL> DOC
DOC> the following query reports the number of errors caught during
DOC> recompilation. If this number is non-zero, please query the error
DOC> messages in the table UTL_RECOMP_ERRORS to see if any of these errors
DOC> are due to misconfiguration or resource constraints that must be
DOC> fixed before objects can compile successfully.
DOC>#
SQL> select COUNT (*) "ERRORS DURING RECOMPILATION" from utl_recomp_errors;
ERRORS DURING RECOMPILATION
---------------------------
0
SQL>
SQL>
SQL> Rem =====================================================================
SQL> Rem Run component validation procedure
SQL> Rem =====================================================================
SQL>
SQL> SET serveroutput on
SQL> EXECUTE dbms_registry_sys.validate_components;
PL/SQL procedure successfully completed.
SQL> SET serveroutput off
SQL>

SQL>
SQL> Rem ===========================================================================
SQL> Rem END utlrp.sql
SQL> Rem ===========================================================================
SQL> set feedback 6;

10. Change database identifier

To change the database identifier you need to use the DBNEWID utility (nid). First mount the
database from SQL*Plus, then run nid from the operating system command line:

sqlplus "/ as sysdba"


startup mount
exit
To change the DBID:
cd $ORACLE_HOME/bin
./nid target=/
To verify the DBID and database name:
SELECT dbid, name FROM v$database;

DBID Change, Execution Log:

avargas-pc:~/app/oracle/product/10.2.0/db_1/bin> sqlplus / as sysdba


SQL*Plus: Release 10.2.0.4.0 - Production on Wed Jun 11 10:25:09 2008
Copyright (c) 1982, 2007, Oracle. All Rights Reserved.

Connected to an idle instance.
SQL> startup mount;
ORACLE instance started.

Total System Global Area 167772160 bytes


Fixed Size 1266368 bytes
Variable Size 71306560 bytes
Database Buffers 88080384 bytes
Redo Buffers 7118848 bytes
Database mounted.
SQL> exit
Disconnected from Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 -
Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
avargas-pc:~/app/oracle/product/10.2.0/db_1/bin> ./nid target=/
DBNEWID: Release 10.2.0.4.0 - Production on Wed Jun 11 10:25:50 2008
Copyright (c) 1982, 2007, Oracle. All rights reserved.
Connected to database ORCLLNX (DBID=1718464921)
Connected to server version 10.2.0
Control Files in database:
/oradisk/database/ORCLLNX/orcllnx_control1.ctl
/oradisk/database/ORCLLNX/orcllnx_control2.ctl
Change database ID of database ORCLLNX? (Y/[N]) => Y
Proceeding with operation
Changing database ID from 1718464921 to 1179074095
Control File /oradisk/database/ORCLLNX/orcllnx_control1.ctl - modified
Control File /oradisk/database/ORCLLNX/orcllnx_control2.ctl - modified
Datafile /oradisk/database/ORCLLNX/O1_MF_SYSTEM_44TM3OPF_.DBF - dbid changed
Datafile /oradisk/database/ORCLLNX/O1_MF_UNDOTBS1_44TM5F98_.DBF - dbid
changed
Datafile /oradisk/database/ORCLLNX/O1_MF_SYSAUX_44TM6JTB_.DBF - dbid changed

Datafile /oradisk/database/ORCLLNX/O1_MF_USERS_44TM7BD5_.DBF - dbid changed
Datafile /oradisk/database/ORCLLNX/ORCLLNX/datafile/o1_mf_temp_44yxofkr_.tmp
- dbid changed
Control File /oradisk/database/ORCLLNX/orcllnx_control1.ctl - dbid changed
Control File /oradisk/database/ORCLLNX/orcllnx_control2.ctl - dbid changed
Instance shut down
Database ID for database ORCLLNX changed to 1179074095.
All previous backups and archived redo logs for this database are unusable.
Database is not aware of previous backups and archived logs in Recovery Area.
Database has been shutdown, open database with RESETLOGS option.
Successfully changed database ID.
DBNEWID - Completed succesfully.
avargas-pc:~/app/oracle/product/10.2.0/db_1/bin> sqlplus / as sysdba
SQL*Plus: Release 10.2.0.4.0 - Production on Wed Jun 11 10:28:22 2008
Copyright (c) 1982, 2007, Oracle. All Rights Reserved.
Connected to an idle instance.
SQL> startup mount;
ORACLE instance started.
Total System Global Area 167772160 bytes
Fixed Size 1266368 bytes
Variable Size 71306560 bytes
Database Buffers 88080384 bytes
Redo Buffers 7118848 bytes
Database mounted.
SQL> alter database open resetlogs;
Database altered.
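With the database reopened, the DBID change can be confirmed from SQL*Plus using the query
mentioned earlier; the new DBID reported by DBNEWID above was 1179074095. Since DBNEWID
invalidates all previous backups, this is also the moment to schedule a fresh full backup.

SELECT dbid, name FROM v$database;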

11. Check database integrity

SQL> select tablespace_name from dba_tablespaces;


TABLESPACE_NAME

------------------------------
SYSTEM
UNDOTBS1
SYSAUX
TEMP
USERS
SQL> select file_name from dba_data_files;
FILE_NAME
-------------------------------------------------------------------------------
/oradisk/database/ORCLLNX/O1_MF_USERS_44TM7BD5_.DBF
/oradisk/database/ORCLLNX/O1_MF_SYSAUX_44TM6JTB_.DBF
/oradisk/database/ORCLLNX/O1_MF_UNDOTBS1_44TM5F98_.DBF
/oradisk/database/ORCLLNX/O1_MF_SYSTEM_44TM3OPF_.DBF

Checking component status after transport:

SQL> SELECT COMP_NAME, STATUS FROM DBA_REGISTRY;


COMP_NAME STATUS
-------------------------------------- ------
Oracle XML Database VALID
Oracle Expression Filter VALID
Oracle Rules Manager VALID
Oracle Workspace Manager VALID
Oracle interMedia VALID
Oracle Database Catalog Views VALID
Oracle Database Packages and Types VALID
JServer JAVA Virtual Machine VALID
Oracle XDK VALID
Oracle Database Java Packages VALID
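In addition to the registry components, it is worth confirming that the recompilation left no
schema objects invalid; a quick check using the standard dictionary views (not part of the
generated transport script):

SELECT owner, object_type, COUNT(*) AS invalid_count
FROM dba_objects
WHERE status = 'INVALID'
GROUP BY owner, object_type
ORDER BY owner, object_type;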

CHAPTER-7: DATABASE MIGRATION TESTING

DATABASE MIGRATION TESTING

The approach to testing data and content migrations relies upon sampling, where some subset of
random data or content is selected and inspected to ensure the migration was completed "as
designed". Those who have tested migrations using this approach are familiar with the typical
iterative test, debug and retest method, where subsequent executions of the testing process reveal
different error conditions as new samples are reviewed. Sampling works, but it relies on an
acceptable level of error and an assumption about repeatability. An acceptable level of error
implies that less than 100% of the data will be migrated without error, and the level of error
is inversely proportional to the number of samples tested (refer to sampling standards such as
ANSI/ASQ Z1.4). As for the assumption of repeatability, the fact that many migrations require
four, five or more iterations of testing with differing results implies that one of the key tenets of
sampling is not upheld, i.e., "non-conformities occur randomly and with statistical independence."

o We have to check database integrity.

o We have to check the referential integrity.

o Whether the data is loaded into the target database correctly.

o Check the constraints (see the dictionary query sketch after this list).
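For the constraint and referential-integrity checks listed above, the Oracle data dictionary can be
queried directly on the migrated database; a minimal sketch that lists any constraints left
disabled or not validated:

SELECT owner, table_name, constraint_name, constraint_type, status, validated
FROM dba_constraints
WHERE status = 'DISABLED'
   OR validated = 'NOT VALIDATED'
ORDER BY owner, table_name;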

Even with these shortcomings, sampling has a role in a well-defined testing strategy, but what are
the other testing options? The following lists testing options by phase of the migration
process:

7.1 PRE-MIGRATION TESTING

These tests occur early in the migration process, before any migration, even migration for testing
purposes, is completed. The pre-migration testing options include:

 Verify scope of source systems and data with user community and IT. Verification should
include data to be included as well as excluded and, if applicable, tied to the specific
queries being used for the migration.
 Define the source to target high-level mappings for each category of data or content and
verify that the desired type has been defined in the destination system.
 Verify destination system data requirements such as the field names, field type,
mandatory fields, valid value lists and other field-level validation checks.
 Using the source to destination mappings, test the source data against the requirements of
the destination system. For example, if the destination system has a mandatory field,
ensure that the corresponding source field is not null, or if the destination system field has a
list of valid values, test that the corresponding source fields contain only these valid values
(a SQL sketch of such checks follows this list).
 Test the fields that uniquely link source and target records and ensure that there is a
definitive mapping between the record sets.
 Test source and target system connections from the migration platform.
 Test the migration tool configuration against the migration specification, which can often be
completed via black-box testing on a field-by-field basis. Done carefully, this testing can also
be used to verify that the migration specification’s mappings are complete and accurate.
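A minimal sketch of the field-level pre-migration checks described above, assuming a hypothetical
source table CUSTOMERS whose EMAIL column is mandatory in the destination and whose STATUS
column must match a destination valid-value list (the names are illustrative, not objects from
this migration):

-- rows that would violate a NOT NULL rule in the destination
SELECT COUNT(*) AS missing_email FROM customers WHERE email IS NULL;

-- rows whose STATUS falls outside the destination's valid-value list
SELECT status, COUNT(*) AS occurrences
FROM customers
WHERE status NOT IN ('ACTIVE', 'INACTIVE', 'PENDING')
GROUP BY status;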

FORMAL DESIGN REVIEW

Conduct a formal design review of the migration specification when the pre-migration testing is
nearly complete, or during the earliest stages of the migration tool configuration.

The specification should include:

 A definition of the source systems


 The source system’s data sets and queries
 The mappings between the source system fields and the destination system
 Number of source records
 Number of source system records created per unit time (to be used to define the
migration timing and downtime window)

 Identification of supplementary sources
 Data cleansing requirements
 Performance requirements
 Testing requirements

The formal design review should include representatives from the appropriate user communities,
IT and management. The outcome of a formal design review should include a list of open issues,
the means to close each issue, approval of the migration specification, and a process to keep the
specification in sync with the migration tool configuration (which tends to change continuously
until the production migration).

7.2 POST-MIGRATION TESTING

Once a migration has been executed, additional end-to-end testing can be performed. Expect a
significant number of errors to be identified during the initial test runs, although these will be
minimized if sufficient pre-migration testing has been executed. Post-migration testing is
typically performed in a test environment and includes:

 Test the throughput of the migration process (number of records per unit time). This
testing will be used to verify that the planned downtime is sufficient. For planning
purposes, consider the time to verify that the migration process was completed
successfully.
 Compare Migrated Records to Records Generated by the Destination System – Ensure
that migrated records are complete and of the appropriate context.
 Summary Verification – There are several techniques that provide summary information,
including record counts and checksums. Here, the number of records migrated is
compiled from the destination system and then compared to the number of records in the
source. This approach provides only summary information; if an issue exists, it
does not often provide insight into the issue’s root cause (a record-count and checksum
sketch follows this list).
 Compare Migrated Records to Sources – Tests should verify that field values are
migrated as per the migration specification. In short, source values and the field-level
mappings are used to calculate the expected results at the destination. This testing can be
completed using sampling where appropriate; if the migration includes data that poses
significant business or compliance risk, 100% of the migrated data can be verified using
an automated testing tool.
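As a sketch of the summary verification described above, record counts and a coarse checksum can
be compared between a source table and its migrated copy; CUSTOMERS, its CUSTOMER_ID key and
the database link SRC_LINK are illustrative assumptions, not objects from this migration:

-- counts on both sides should match exactly
SELECT (SELECT COUNT(*) FROM customers@src_link) AS source_rows,
       (SELECT COUNT(*) FROM customers)          AS target_rows
FROM dual;

-- a coarse checksum over the key column; a mismatch indicates content drift
SELECT (SELECT SUM(ORA_HASH(customer_id)) FROM customers@src_link) AS source_checksum,
       (SELECT SUM(ORA_HASH(customer_id)) FROM customers)          AS target_checksum
FROM dual;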

The advantages of the automated approach include the ability to identify errors that are less
likely to occur (the proverbial needles in a haystack). Additionally, as an automated testing tool
can be configured in parallel with the configuration of the migration tool, the ability to test 100%
of the migrated data is available immediately following the first test migration. When compared
to sampling approaches, it is easy to see that automated testing saves significant time and
minimizes the typical iterative test, debug and retest found with sampling.

USER ACCEPTANCE TESTING

Functional subtleties related to the co-mingling of migrated data and data created in the
destination system may be difficult to identify early in the migration process. User acceptance
testing provides an opportunity for the user community to interact with legacy data in the
destination system prior to production release, and most often, this is the first such opportunity
for the users. Attention should be given to reporting, downstream feeds, and other system
processes that rely on migrated data.

CHAPTER-8: SCREENSHOTS

CHAPTER-9: CONCLUSION OF PROJECT

CONCLUSION

Database migration is a complex project that has to be thoroughly analyzed. Determine the
migration type and the necessary tools and resources to complete it. The various approaches have
to be fully reviewed and understood before choosing one. Migration can be done in-house or
outsourced to a consulting practice. Consider using third-party solutions to avoid hand-coding
migration scripts, which is time consuming and rather complex if data cleansing is involved. In
addition, these solutions can reduce or eliminate downtime and facilitate replication between the
systems. Outsourcing brings in best practices and the appropriate tools to complete the project.
This project on database migration is a value-add exercise that enables you to identify any issues
with your database migration before the project is signed off as complete. The combination of
Oracle Recovery Manager, Oracle JDeveloper Application Migration Assistant, and Oracle Database
Migration Verifier provides a comprehensive solution for database and application migration to
the Oracle platform. The table below summarizes the relative effort of each migration task and
the corresponding tool support.

Migration Task                     % of DB Effort   Best Product                              % of Task
Migration Analysis                 7%               Migration Analyzer                        100%
Schema and Content Conversion      3%               Schema & Data Migrator                    100%
Application Component Conversion   37%              SQL Converter & Runtime Query Converter   95%
Environment/Utility Conversion     1%               SQL Converter                             50%
Product Test Planning              6%               Migration Tester                          50%
Product Test Execution             39%              Migration Tester                          66%
Deployment                         7%               N/A                                       0%

Appendix 1

INDEX

A
Analytical database 11
Architecture of database 16

B
Backup And Replication 31
Business Migration 73

C
Command Notation 78
Coding 88
Choosing Linux 76

D
Data migration 71
Database migration 73
DBMS Features 30
Data Flow Diagram 81

E
External database 12
External data level 18

K
Key 7

R
Relational Model 22
RDBMS 32

S
Step of migration 78
System requirement 79

T
Three-Level Architecture 17
Task of DBMS Package 30
Tables 7

W
Window Manager 76
Appendix 2
GLOSSARY

A
Attribute - A single data item related to a database object.

C
Candidate Key - A candidate key is a combination of attributes that can be uniquely used to
identify a database record without any extraneous data.
Column- Database tables are composed of individual columns corresponding to the attributes of
the object.

D
Data - Data consists of a series of facts or statements that may have been collected, stored,
processed and/or manipulated but have not been organized or placed into context.
DB2- DB2 is a relational database system developed by IBM Corporation, originally for use on
large mainframe computer systems.

E
ER Diagram- An entity-relationship (ER) diagram is a specialized graphic that illustrates the
interrelationships between entities in a database.

K
Key - A database key is an attribute utilized to sort and/or identify data in some manner.

R
Report- A database report presents information retrieved from a table or query in a preformatted,
attractive manner.

S
Schema - A database schema is a collection of metadata that describes the relations in a database.

REFERENCES

[1]. Van der Hoeven, Jeffrey, Brian Lohman, and Remco Verdegem. "Emulation for Digital
Preservation in Practice: The Results." International Journal of Digital Curation 2.2:
121-132.
[2]. Miura, Gregory. "Pushing the Boundaries of Traditional Heritage Policy: Maintaining
Long-term Access to Multimedia Content." IFLA Journal 33: 323-326.
[3]. Brodie and Stonebraker. Migrating Legacy Systems.
[4]. Callan, Steve. "Database Migration – A Planned Approach."

LINKS

[1]. www.databasejournal.com/database migration


[2]. www.wikisearch.com/databaseconcepts
