The Rapid Application and Database Development RAD
The Rapid Application and Database Development RAD
The Rapid Application and Database Development RAD
net/publication/227307185
CITATIONS READS
6 72
5 authors, including:
Some of the authors of this publication are also working on these related projects:
All content following this page was uploaded by Antje Raab-Düsterhöft on 01 June 2014.
Abstract. In this paper we present a tool for database design which supports designers eciently
and informally to achieve correct and ecient databases.
1 Introduction
The performance of a database (especially eciency and consistency) heavily depends on design decisions.
In order to achieve an eective behaviour of the database, database designers are requested to nd the
best structure and the simplest basic database operations. The result of the database design process
depends on the professionality of the designer and the quality of the support by a database design
system. Therefore development of a comfortable database design system is an important task.
In this paper we want to present a database design system which is adaptable to a designer. It contains
components which enable even novice or unskilled users the design of correct databases. An extensive
support of database designers in choosing design strategies and checking correctness of design steps is
contained in the approach. These components also use natural language to acquire information about
databases and discuss design decisions by means of examples. Another part of the workbench transforms
designed databases into equivalent and more ecient databases.
The dierent components in the workbench work closely together. The designer does not need to enter
an information twice, all components communicate via a DataDictionary. Therefore, the designer can
decide which support he/she wants to use for every design task.
In the system we use a special extension of the entity-relationship model which can be used to represent
structural, semantic and behavioural information. The workbench is currently implemented in a joint
project of dierent groups [BOT90] in Cottbus, Dresden, Munster and Rostock. We want to explain
main parts of the workbench in this tool.
3 Overview of RADD
The RADD (Rapid Application and Database Development) developed in our groups does not require
the user to understand the theory, the implementational restrictions and the programming problems in
order to design a database scheme. A novice designer can create a database design successfully using the
system. These tools are based on an extended entity-relationship model. The entity-relationship model
is extended to the Higher-order Entity-Relationship Model (HERM) by adding structural constructs and
using integrity constraints and operations.
Basically, the system in Figure 1 consists of three major components:
{ Design Tools: Since designers require dierent kinds of representation the design tools support graph-
ical, procedural and logical techniques for application specication.
Graphical Editor: The system is based on an extended entity-relationship model which allows the
user to specify graphically the structure of an application, the integrity constraints which are
valid in the given application and the processes, operations and transactions which are necessary
for the given application. This extension requires an easy-to-handle and advanced support for
graphics.
Customizer/Strategy support: The user interface is adapted to skills, abilities and intentions of
the database designer. This tool allows customization of the user interface. It controls user steps
and corrects the user interface according to designer actions.
The designer is supported in choosing an appropriate database design strategy. Based on the
chosen design strategy this tool controls and veries design steps. Especially, completeness and
consistency of a given specication is testied.
Acquisition support: Acquisition of specications can be supported by dierent strategies. This
tool uses learning approaches for acquisition of structure, semantics and operations. The user
interface of this tool is an example discussion.
Natural language support: The designer who is able to express properties of his application based
on natural language can be supported by moderated dialogues. During such dialogues the designer
renes his/her current design. The system validates whether the specication meets certain
completeness requirements. The system RADD supports German language in a specic manner.
The structure and semantics of German sentences can be used for the extraction of structural,
semantic, operational and behavioural specication.
Version manager and reverse/reengineering tool: The design system stores versions of current, pre-
vious and sample specication. These specications can be partially or completely included into
current specications or can be used for replacing parts of current specications. This tool enables
the designer to browse through given specication and to reuse already existing specications.
{ Optimization: In the tool there is a component which tries to nd for a drafted database an equivalent
and more ecient database.
Behavior estimation: Based on frequency, priority and semantics of operations the complexity of
the current database can be estimated in dependence of implementational techniques used by a
chosen class of DBMS.
Behavior optimization: Based on the results of behaviour estimation this tool discusses with a
designer various possibilities for redesign and improvement of database behaviour. Improvement
includes modication and optimization of database schemata, their corresponding integrity con-
straints and operations.
{ Translator: This component translates the result of the design process into the language of a specic
database management system that is used in the given application. This tool is developed on the
University at Munster (Germany) and therefore not described in this paper.
?User Interactions
Graphical Editor/ Main Menu
? 6 ? 6 ? 6 ? 6
Customizer/Stra- Acquisition Natural Lan- Version manager,
tegy Support Support guage Support reengineering
Design Tools
6optimized databases
designed databases
?
Behaviour Optimization
? 6
designed
databases
Behaviour Estimation
Optimization
optimized databases
? ?
Translator
?translated databases
Figure 1: Overview of the RADD workbench
In the next sections we want to represent some main parts of the workbench more extensively. In section
4 we show how designers can be supported in the design steps individually. Section 5 describes a part of
the acquisition support. We chose the informal and ecient acquisition if semantic constraints because
this is a dicult task in database design and because semantic acquisition is NP complete. The next
section demonstrates the natural language support.
Section 7 represents in which manner database drafts can be transformed into equivalent optimal
databases. Therefore, behaviour estimation and behaviour optimization are demonstrated. In section
8 we give a conclusion of the paper.
4 User Guidance
There is a variety of database design strategies proposed in the literature (e.g. [Ris88, FlH89, BCN92,
FuN86, Leo92]). Most approaches propose that the designer is mainly using one strategy. However,
observations on database designers show that they use dierent strategies depending on the application
properties, their skills and abilities and the chosen platform for implementation. This change in strategy
necessitates a support for database designers. Further, database designers change their strategy according
to the reached stage of the scheme. Any change in strategy makes the design task more complex. Since
database designers normally delay some of the design decisions, any change in strategy requires deliberate
adaptation. For this reason, experienced database designers need a strategy support as well. Therefore,
a user guidance tool is currently developed and included into RADD. This tool supports the designer
{ in developing and applying his/her own strategy,
{ in discussing decisions in designer teams,
{ in tracing delayed design decisions, and
{ in customizing the design workbench.
It is based on
{ a framework for deriving design strategies,
{ a user model, and
{ a customizable graphical environment.
The Framework for Adaptable Database Design Strategies. Each design strategy needs to sup-
port a designer during a consistent development of database schemes. Thus, a strategy should be con-
tent preserving, constraint preserving, minimality preserving and update simplicity preserving [Yao85].
Analysing known design strategies, primitive steps can be extracted. These primitives are the basis for
the strategy support in RADD. The designer can choose his/her own strategy and compose the strategy
from primitives. The system RADD supports this choice based on the user model, user preferences, and
characteristics of the application. On this basis, the designer can switch among bottom-up, top-down,
modular, inside-out and other strategies. The controller derives graph grammar rules for the maintenance
of consistency and for the specic support of designers. The variety of dierent strategies is based on
the dimensions of database design which are design directions like the top-down or bottom-up approach
or mixed strategies, control mechanism of design strategies and the modularity concept. Strategy consul-
tancy and error control are included in the user guidance tool. Besides the primitives each design strategy
is related with a set of checkpoints. The design of the unit `borrow' in our library example involves a
rough specication on the interface. The checkpoint after this step examines the interface description
and the skeleton of the library example.
The Type of the User Modelling Component. The model of the user is represented explicitly within
the design environment. The customizer makes the system adaptable { that is, the user can make choices
among various options that eect the system's behaviour, e.g. the user may enable or disable dierent
design strategies or restrict the output of types with respect to a certain type of cardinalities. Beyond
the customizer the adaptive component automatically acquires knowledge about the users (designers),
updates this knowledge over time, and uses the knowledge to adapt to the user's requirements. In this
way it is a dynamic user model. Our user model contains knowledge about individual users (a set of
proles, see below) and knowledge about classes of users (e.g. users experienced in relational database
design).
Figure 2: Screendump of the Graphical Editor, Strategy Support, and Customizer
The Sort of Information in our User Model. The user of the database design environment is
classied regarding his properties, his capabilities, his preferences (kinds of input, output and dialog)
and his system knowledge, his application knowledge and his knowledge about design concepts and
design strategies, and he will be supported with respect to this classication.
How is the User Information Acquired? The users knowledge and actions are analysed with the
aim to propose the most appropriate design strategy before the scheme transformation process starts and
the most likely next design step after each user design action. The user analysis is divided into two parts:
the direct analysis (user's answers in an interrogation) and the indirect analysis (user's actions in the
design environment, e.g. selection of types in the diagram, or selection of commands/ design primitives).
In this way, the analysis contains both explicit and implicit acquisition of user model information.
Representing User Characteristics. Individual users are represented as a collection of frames. Gen-
eral frames store user-specic information which is long-term static information with respect to the user
classication. Action frames store information on the use of particular actions. A user action results
in the execution of a scheme transformation applying one design primitive. Each design primitive is
described by a starting subscheme and a resulting subscheme and has a graphical representation and im-
plementation. Therefore, action frames, the so called user proles, contain small design-state-dependent
patterns of the user behaviour.
When user information is needed (timing of adaptation) the system determines what are the properties
of the current design state and collects proles which agree with this. The collection of all these pieces
of information is then accumulated in one large user model called the \focus".
How is the Information Used? User adaptation includes the derivation of the appropriate adaptation
forms (e.g. context-sensitive help, design primitive, design strategy), the dimension of adaptation (e.g.
the degree of support and extent of help) and the extent and style of information the user is able to
consume about adaptation as well as the realization of the user adaptation. It is a learning approach,
the system learns from earlier design steps, evaluates them and implies the most likely next design step.
Keys, functional dependencies, exclusion dependencies and cardinality constraints can also be inquired
by an example discussion.
Ecient Acquisition of Unknown Semantic Constraints: The set of unknown constraints which must be
checked can be very large. The number of independent functional dependencies and keys of a relation is
O(2n ) where n is the number of attributes of a relation. The number of unary inclusion and exclusion
dependencies is O(n2 ) where n is the number of attributes of the whole database.
Therefore, not all unknown constraints can be checked one after the other. Heuristic rules are necessary
which estimate the probability of the valididity of unknown constraints. These heuristics can use much
vague information about the database which re
ects background knowledge of the designer. Structural
information, already known semantic information, sample relations and sample transactions (if they are
known) are utilized in the heuristic rules.
We want to show only some heuristics.
From attribute names keys are sometimes derivable if the substrings -name-, -number-, -id-, -#- occur in
the names. Similar attribute names can indicate existing inclusion or exclusion dependencies or foreign
keys. The same values in the instances also point to inclusion and exclusion dependencies or foreign keys.
From transactions specied on the databases we can conclude which attributes are probably keys, and
which functional and inclusion dependencies must hold.
These heuristics try to exploit the background knowledge of the designer that is already implicitely
contained in the database.
First, those constraints which seem to be valid are inquired with the example discussion.
There is another possibility to speed up acquisition of semantic constraints. First those candidates for
semantic constraints are inquired from which most other unknown constraints are derivable.
In that way unknown semantic constraints are inquired in an informal and ecient way. The designer
needs not to be able to enter formal semantic constraints, but with this tool he can determine valid
constraints. These constraints are necessary in the optimization of databases.
Interpreting the semantics of the designer input we are using the two-step model that contains the word
semantics and the semantics of the sentence. Verbs form the backbone of the sentences. We have tried
to nd a classication of verb semantics that can be applied to all verbs in the German language. This
classication is, at this stage, independent of the domain to be analysed. To identify the meaning of
sentences we have used the model of semantic roles. The units in a sentence are seen to full certain
functional roles corresponding to the verb classication.
Example. 'The user borrows a book with a borrowing-slip'
semantic analysis:
verbtype: verb of movement
subject: the user
object: a book
locative:? (additional question)
temporal: ? (additional question)
mode of movement: with a borrowing-slip
Pragmatic Component The aim of the pragmatic interpretation is the mapping of the natural language
input onto HERM model structures using the results of the syntactic and semantic analyses. We handle
this transformation as a compiler process. An attribute grammar with common rules and heuristics as
semantic functions form the basis of the transformation. Common rules will be used for analysing the
syntax tree structures of the syntactic analysis.
The heuristics are expressed in contextfree and contextsensitive rules and represent assumption needed
for the transformation of natural language phrases into HERM structures. (E.g. The close relation of
nouns and entities or the close relation of verbs and relationships.)
Example. Heuristic: Sentences with forms of the verb 'be' and an object are is-a classications.
Rule: N(X),subject(X),V(be),N(Y),object(Y) ! entity(X),entity(Y),super(X,Y).
Procedures for extracting semantic information are also started from the transformation grammar if
special words are identied. Within the knowledge base these words will be marked.
Example. Heuristics: Special words which have the character of numbers will be key candidates. E.g.
ISBN, street number, registration number.
Information on behaviour can be best gained from a knowledge base. Special words indicate the occurence
of processes. If such words are recognized a process classication will be applied in order to capture the
according post, main and preprocesses.
Example. 'The user borrows a book with a borrowing-slip.'
process analysis: borrow - synonym: lending
lending: preprocess: obtaining, registration
lending: postprocess: returning
The advantage of this strategy is the possiblity to connect not only word classes (e.g. nouns, verbs,
adjectives) with HERM structures but also phrases of the sentences (e.g. genitive phase) and HERM
structures.
7 Behavioural Optimization
Traditional database design is still based on waterfall approaches. The designer starts with requirement
analysis, designs the conceptual scheme, normalizes this scheme and translates it into the logical scheme.
Later, during physical design tuning of logical schemes becomes necessary. During the tuning phase
operational behaviour is considered in detail. Processing requirements (planning, control, operations)
are now taken into account. However, operational information is available during conceptual design.
There are several reasons why processing information needs to be considered during conceptual design:
{ Eciency is one of the main database processing requirements. Low storage complexity and low
operational complexity is an objective of database design. It can be reached using dierent methods.
Normalization is one approach during conceptual design. Tuning is the main approach during logical
and physical design. E.g. search operations on book information and update operations on 'borrow'
are used more frequently than adding new readers to the library database.
{ As the normalization examples in [Ull82] show, an algorithm that derives a certain normal form can
generate dierent normalized data schemes from one and the same set of attributes and functional
dependencies. We need to choose the scheme with the `best' operational behaviour. In our example,
any normalization which separates 'borrow' is less ecient than those which allows to treat 'borrow'
as a unit.
{ A `good' conceptual scheme does not provide a good physical realization automatically. Additionally,
for some of those physical schemes { derived with the help of statical aspects only { it can be shown
that a more optimal conceptual base scheme exists which is not directly derivable from the given
conceptual scheme. If this scheme is the basis of the database system then external views cannot be
directly supported. E.g. the book information view should be directly derivable from the conceptual
schema.
{ Integrity enforcement aims at deriving mechanisms for simple extended operations which never vio-
late integrity constraints in the case that the database is correct before application of the operation.
For instance, removing a reader from the library requires the return of all books etc. lent by this
reader. The remove operation is a complex operation which can be derived from structural and
semantical knowledge obtained during database design.
{ Further, methods of dependency protection at the physical level are of another nature than those at
the conceptual level: integrity constraints result in operational dependencies used for maintenance of
the database system. E.g., simple access is supported by implementational approaches like surrogate
keys, identiers or keys like the ISBN number. The last key is not practical for the library user.
Therefore modern database design approaches advice inclusion of processing information into the con-
ceptual design phase as well. This is possible:
{ During conceptual design an important part of the knowledge which is used for tuning schemes is
already available [CoG93, Sha92, Su85].
{ The internal database representation is based on well-known data structuring techniques. The com-
plexity of corresponding operational support is well-known [KoS91, Wie87].
{ There exists a general mechanism [SST94, ScT94] which allows the computation of integrity main-
tening procedures for the database implementation.
Therefore, RADD enhances the conceptual database design phase by considering operational behaviour
too. Elimination of operational bottlenecks and optimization can already be performed at the time
of conceptual design. This provides a more application-oriented database development. This does not
require complete logical modelling during conceptual design. Only those steps are considered which are
necessary for the estimation of operational behaviour.
This approach is more complex. Thus, we have developed a tool for behavioural estimation of physical
behaviour and another tool for behavioural optimization.
Behavioural Estimation: The estimizer is based on the following framework.
+ Since the trigger approach is not powerful enough we use the approach [ScT93] for the construction
of consistent update operations and extended transactions. It is based on linguistic re
ection. Type-
safe linguistic re
ection came up with the development of the ADABTPL language which has been
used for deriving correct database transactions [StS90, StS91].
+ Two dierent cost estimation models are implemented based on characteristics of chosen DBMS.
* The basic model is based on the conceptual information: structure, semantics, frequency and priority
of operations.
* The enhanced model is using further information on class size, index formats and on implementation
strategies
+ The tool is currently extended by an expert system which is used to discuss with the designer
certain bottlenecks in accordance with the chosen application scenario. We have included into the
system dierent tuning rules known for some of the major DBMS. The designer introduces the main
operational requirement (e.g. frequency and priority of operations and transactions). The system
generates better design decisions and displays it for further discussion.
Behavioural Optimizer: Based on analysis of operational behaviour generated by the estimizer, the
behavioural optimizer discusses with the designer various alternatives in the conceptual schema.
The tool proposes to him/her alternative schemes with estimations of operational behaviour. The designer
now can choose one of the alternative schemes. If he/she chooses one of the proposed schemes the RADD
workbench computes the changed scheme on the basis of a replacement algorithm. To enable judgement
of operational tness on the conceptual level, transformation heuristics are used for derivation from the
conceptual schemes.
The designer can choose further which scheme is kept for interfaces in database applications: the original
scheme or the optimized scheme. But, the optimized scheme is used internally and for the translation
input. If the designer chooses the original scheme then the tool generates the corresponding adapters for
the treatment of the original scheme, too.
The `Optimizer' is implemented in Standard-ML as a collection of abstract data types which are providing
the representation of entity and relationship types, data dependencies, data manipulation operations
and transactions. The system will be extended in order to include also more complex transactions and
parametric queries.
8 Conclusion
The workbench RADD currently under development is intended to become a third generation design
system. Convenient design is supported by an advanced and powerful user interface. RADD allows the
specication of structure, semantics and behaviour in consistent manner. The database designer can use
dierent design strategies and is supported even if the chosen strategy is to be changed. Since users prefer
expressing their application on the basis of natural languages the design system RADD is able to extract
structural, semantical and operational specication of an application from sentences stated in natural
languages. The natural language interface provides an ecient support for this facility. Another impor-
tant advantage of the workbench is that the database designer gets estimations on operational behaviour
during conceptual database design and is supported during optimization of conceptual schemes.
References
[AlT92] M.Altus, B.Thalheim. Design by Units and its Graphical Implementation. In: Kurzfassungen des 4.
GI-Workshops \Grundlagen von Datenbanken", technical report ECRC-92-13, Barsinghausen, 1992.
[Alt94] M.Altus. A User-Centered Database Design Environment. In: The Next Genaration of Case Tools,
Proceedings of the fth Workshop on NGCT, Utrecht, The Netherlands. 1994.
[BCN92] C. Batini, S. Ceri, and S. Navathe, Conceptual database design, An entity-relationship approach. Ben-
jamin Cummings, Redwood, 1992.
[RoR89] A. Rosenthal and D. Reiner, Database design tools: Combining theory, guesswork, and user interaction.
Proc. 8th ER-Conference, 1989
[BDT94] Edith Buchholz, Antje Dusterhoft, Bernhard Thalheim, Exploiting Knowledge Gained from Natural
Language for EER Database Design, Technische Universitat Cottbus, Reihe Informatik I-10/1994, Ger-
many
[BOT90] P. Bachmann, W. Oberschelp, B. Thalheim, and G. Vossen. The design of RAD: Towards an interactive
toolbox for database design. RWTH Aachen, Fachgruppe Informatik, Aachener Informatik-Berichte, 90-
28, 1990.
[CoG93] P. Corrigan and M. Gurry, ORACLE Performance Performance. O'Reilly & Associates, Inc., 1993.
[FlH89] C.C. Fleming and B. von Halle, Handbook of relational database design. Addison-Wesley, Reading, 1989.
[FuN86] A.L. Furtado and E.J. Neuhold, Formal techniques for database design, Springer, Heidelberg, 1986.
[Kok90] A.J.Kok. User Modelling for Data Retrieval Applications. Vrije Universiteit Amsterdam, Faculteit
Wiskunde en Informatica. 1990.
[Leo92] M. Leonard, Database design theory. Macmillan, Houndsmills, 1992.
[MaR 92] Heikki Mannila, Kari-Jouko Raiha, The Design of Relational Databases, Addison- Wesley 1992
[KoS91] H.F. Korth and A. Silberschatz, Database System Concepts. McGraw-Hill, 1991.
[Ris88] N. Rishe. Database Design Fundamentals. Prentice-Hall, Englewood-Clis, 1988.
[RoS87] L.A. Rowe and M.R. Stonebreaker. The POSTGRES Data Model. In Proceedings of the Thirteenth
International Conference on Very Large Data Bases, pages 83 { 96, Brighton, UK, September 1987.
[ScT93] K.-D. Schewe and B. Thalheim, Fundamental Concepts of Object Oriented Concepts. Acta Cybernetica,
11, 4, 1993, 49 { 81.
[ScT94] K.D. Schewe and B. Thalheim, Achieving Consistence in Active Databases. Proc. Ride-Ads, 1994
[Sha92] D.E. Shasha, Database Tuning { A Principle Approach. Prentice Hall, 1992.
[SiM 81] A.M. Silva, M.A. Melkano, A method for helping discover the dependencies of a relation, In Advance
in Database Theory, eds H. Gallaire, J. Hinker, J.-M. Nicolas, Plenum Publ. 1981, S 115-133
[SST94] K.-D. Schewe, D. Stemple and B. Thalheim, Higher-level genericity in object-oriented databases. Proc.
COMAD (eds. S. Chakravarthy and P. Sadanandan), Bangalore, 1994
[StG 88] Veda C. Storey, Robert C. Goldstein, Methodology for Creating User Views in Database Design, ACM
Transactions on Database Systems, Sept. 1988, pp 305-338
[StS90] D. Stemple and T. Sheard, Construction and calculus of types for database systems. Advances in
Database Programming Languages (eds. F. Bancilhon, P. Buneman), Addison-Wesley, 3 - 22, 1990.
[StS91] D. Stemple and T. Sheard, A recursive base for database programming primitives. Next Generation
Information System Technology (eds. J.W. Schmidt, A.A. Stognij), LNCS 504, 311 - 332, 1991.
[StT94] M. Steeg and B. Thalheim. Detecting Bottlenecks & Computing better Operational Behavior on Con-
ceptual Data Schemes, October 1994. (submitted).
[Su85] S.S. Su, Processing-Requirement Modeling and Its Application in Logical Database Design. In Principles
of Database Design (ed. S.B. Yao), 1: Logical Organization, 151 -173, 1985.
[TAA94] B.Thalheim, M.Albrecht, M.Altus, E.Buchholz, A.Dusterhoft, K.-D.Schewe. The Intelligent Toolbox
for Database Design RAD (in german). GI-Tagung, Kassel, Datenbankrundbrief, Ausgabe 13, Mai 1994,
p.28{30.
[Tha92] B. Thalheim, The database design system (DB )2 . Database - 92. Proc. Third Australian Database
Conference, Research and Practical Issues in Databases, (eds. B. Srinivasan, J. Zeleznikow), World
Scientic, 279{292, 1992.
[Tha93] B.Thalheim. Database Design Strategies. Computer Science Institute, Cottbus Technical University,
D-03013 Cottbus, 1993.
[Ull82] J. D. Ullman Principals of Database Systems. Computer Science Press, Rockville, MD, 1982.
[Wie87] G. Wiederhold, File Organization for Database Design. McGraw-Hill, 1987.
[Yao85] S.B. Yao (ed.) Principles of database design, Volume I: Logical organizations. Prentice-Hall, 1985.
This article was processed using the LATEX macro package with LLNCS style