Tejas 22-10-24


3. Advanced DBMS (IT13)


Unit 1: Database Design and SQL Query Processing

1. What is an ER diagram? Draw an ER diagram for an online bookstore.

An ER (Entity-Relationship) diagram is a visual representation of the relationships between entities
(objects or concepts represented in a database). It illustrates how data is structured and
how entities interact with each other. The key components of an ER diagram are:

 Entity: A real-world object or concept (like "Customer" or "Book").
 Attributes: Characteristics or properties of entities (like "Name" for Customer, "Price" for Book).
 Relationships: How entities are related to each other (like a Customer "buys" a Book).

ER Diagram for an Online Bookstore

Entities for an online bookstore might include:

1. Customer: A person who buys books.
o Attributes: Customer_ID, Name, Email, Address, Phone
2. Book: The books available in the store.
o Attributes: Book_ID, Title, Author, ISBN, Price, Stock
3. Order: Represents a customer’s purchase.
o Attributes: Order_ID, Date, Total_Amount, Payment_Method
4. Payment: Details of payment for an order.
o Attributes: Payment_ID, Payment_Date, Amount, Payment_Type
5. Shopping_Cart: Temporary storage of a customer's selected books before purchase.
o Attributes: Cart_ID, Date_Created
6. Order_Item: Represents the books in a specific order.
o Attributes: Quantity, Price_Per_Unit
7. Category: The classification of books.
o Attributes: Category_ID, Category_Name
2. Explain Codd's rules for relational databases.

Codd’s Rules for relational databases, proposed by Dr. Edgar F. Codd in 1985, are a set of thirteen rules
(numbered 0 to 12) that define what is required for a database management system (DBMS) to be
considered relational. These rules provide a framework for relational database design, ensuring that
databases are structured in a consistent, flexible, and efficient way.

Codd’s 12 Rules (Including Rule 0)

Rule 0: Foundation Rule

The system must qualify as a relational database management system (RDBMS) by using relational
techniques for data storage, retrieval, and manipulation. This rule ensures that all other rules are applied.

Rule 1: Information Rule

All data must be stored in table format, with rows and columns. The data should be accessible by table
name, column name, and primary key values. This rule formalizes the table structure.

Rule 2: Guaranteed Access Rule

Every piece of data (atomic value) in a relational database must be accessible without ambiguity by using a
combination of the table name, primary key, and column name. This guarantees direct access to any data.

Rule 3: Systematic Treatment of Null Values

The database must support null values (missing or inapplicable information) in a systematic way. Nulls
should be treated independently from other data and should not affect the database's integrity.

Rule 4: Active Online Catalog

The database must have an online, accessible catalog (metadata) that describes the database's structure. The
catalog should be queryable using the same query language as the database itself.

Rule 5: Comprehensive Data Sub-Language Rule

The database must support a comprehensive language that allows for data definition, manipulation,
querying, and transactions. SQL (Structured Query Language) is a widely used example of such a language.

Rule 6: View Updating Rule

Any view (virtual table) that is theoretically updatable must be updatable through the system. Views should
not be limited to read-only access: changes made through a view should be reflected in the underlying tables.

Rule 7: High-Level Insert, Update, Delete

The database must support insert, update, and delete operations at the level of a set of records, rather than
requiring operations on one record at a time. This allows for bulk data manipulation.

Rule 8: Physical Data Independence

Changes to the physical storage of data (e.g., how files are organized on disk) should not affect how data is
accessed or structured at the logical level. This ensures that users and programs are insulated from physical
changes.
Rule 9: Logical Data Independence

Changes to the logical structure of the database (such as adding or removing fields or tables) should not
require rewriting application programs. This enables the database structure to evolve without affecting user
interaction.

Rule 10: Integrity Independence

The database must have integrity constraints (such as primary keys, foreign keys, and uniqueness
constraints) that are independent of the application. These rules should be stored within the database and
enforced by the DBMS.

Rule 11: Distribution Independence

The database should work in a distributed environment (across multiple physical locations), but users should
not notice any differences whether the data is stored on one machine or distributed across many.

Rule 12: Non-Subversion Rule

If a system provides a lower-level language (like record-at-a-time access), it should not be able to bypass the
integrity rules or security constraints defined at the relational level. This ensures that all operations go
through the relational interface.

3. What are the steps involved in converting ER diagrams into relational tables?

1. Convert Entities into Tables

 Each entity in the ER diagram is converted into a table (relation) in the relational schema.
 The attributes of the entity become the columns of the table.
 The primary key of the entity becomes the primary key of the table.

2. Convert Relationships into Tables

 One-to-One (1:1) Relationship: In most cases, you can either:
o Merge the two entities into a single table, or
o Add a foreign key in one of the tables to reference the other.
 One-to-Many (1:N) Relationship: Add a foreign key on the "many" side that references the
primary key of the "one" side.
 Many-to-Many (M:N) Relationship: Create a separate junction table whose primary key combines the
primary keys of both entities, plus columns for any attributes of the relationship itself.
3. Map Attributes to Columns

 Each attribute of an entity in the ER diagram becomes a column in the corresponding table.
 Ensure that any composite attributes are broken down into their component attributes and added as
separate columns.
 Derived attributes (attributes that can be calculated from other attributes) are generally not stored as
columns in the table; they are calculated as needed.

4. Handle Multivalued Attributes

 Multivalued attributes (those that can have multiple values for a single entity, like a person having
multiple phone numbers) are typically represented by creating a separate table.
 This new table will have a foreign key that references the primary key of the original table and a
column for the multivalued attribute.
5. Convert Weak Entities

 Weak entities (entities that depend on another entity for identification) are represented by creating a
table, and the table must include a foreign key that references the parent entity.
 The primary key of a weak entity table is usually a combination of the primary key of the related
strong entity and some partial key of the weak entity.

6. Enforce Integrity Constraints

 Primary Key Constraints: Ensure that each table has a primary key to uniquely identify records.
 Foreign Key Constraints: Define foreign keys to maintain referential integrity between related
tables.
 Unique Constraints: Apply unique constraints where applicable (e.g., Email should be unique for
Customer).
 Not Null Constraints: Specify attributes that cannot be null, such as the primary key.

7. Normalization (Optional)

After mapping entities and relationships into tables, you may need to normalize the database to reduce
redundancy and ensure that the tables are in the appropriate normal forms (e.g., 1NF, 2NF, 3NF).
Normalization eliminates data anomalies by breaking down tables into smaller, more structured tables.
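The mapping steps above can be sketched as DDL, run here through Python's sqlite3. The table and column names are illustrative examples (loosely based on the bookstore entities), not a fixed schema from the text:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Steps 1 and 3: the Customer entity becomes a table; attributes become columns.
cur.execute("""
CREATE TABLE Customer (
    Customer_ID INTEGER PRIMARY KEY,
    Name        TEXT NOT NULL,            -- not-null constraint (step 6)
    Email       TEXT UNIQUE               -- unique constraint (step 6)
)""")

# Step 4: a multivalued attribute (phone numbers) becomes its own table
# with a foreign key back to the owning entity.
cur.execute("""
CREATE TABLE Customer_Phone (
    Customer_ID INTEGER REFERENCES Customer(Customer_ID),
    Phone       TEXT,
    PRIMARY KEY (Customer_ID, Phone)
)""")

# Step 5: a weak entity (Order_Item) takes a composite key built from the
# parent's primary key plus its own partial key.
cur.execute("CREATE TABLE Book (Book_ID INTEGER PRIMARY KEY, Title TEXT)")
cur.execute("""
CREATE TABLE Order_Item (
    Order_ID       INTEGER,
    Book_ID        INTEGER REFERENCES Book(Book_ID),
    Quantity       INTEGER,
    Price_Per_Unit REAL,
    PRIMARY KEY (Order_ID, Book_ID)       -- composite key (step 5)
)""")

print([r[0] for r in cur.execute(
    "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name")])
# ['Book', 'Customer', 'Customer_Phone', 'Order_Item']
```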

4. Write SQL queries to perform JOIN operations on two tables.

1. Customers:

Customer_ID Name City


1 Alice New York
2 Bob Chicago
3 Charlie Los Angeles
4 David Houston

2. Orders:

Order_ID Order_Date Customer_ID Total_Amount


101 2024-01-01 1 500
102 2024-01-02 2 300
103 2024-01-03 1 250
104 2024-01-04 3 150

1. INNER JOIN:

The INNER JOIN returns rows that have matching values in both tables. If there’s no match, the row is
excluded from the result.
Code:

SELECT Customers.Name, Orders.Order_ID, Orders.Total_Amount

FROM Customers

INNER JOIN Orders

ON Customers.Customer_ID = Orders.Customer_ID;
Result:
Name Order_ID Total_Amount

Alice 101 500

Bob 102 300

Alice 103 250

Charlie 104 150

2. LEFT JOIN (or LEFT OUTER JOIN):

The LEFT JOIN returns all rows from the left table (Customers), and the matched rows from the right table
(Orders). If there is no match, the result is NULL on the right side.

Code:

SELECT Customers.Name, Orders.Order_ID, Orders.Total_Amount

FROM Customers

LEFT JOIN Orders

ON Customers.Customer_ID = Orders.Customer_ID;
Result:
Name Order_ID Total_Amount

Alice 101 500

Bob 102 300

Alice 103 250

Charlie 104 150

David NULL NULL

3. RIGHT JOIN (or RIGHT OUTER JOIN):

The RIGHT JOIN returns all rows from the right table (Orders), and the matched rows from the left table
(Customers). If there’s no match, NULL values are returned for the left table.
Code:

SELECT Customers.Name, Orders.Order_ID, Orders.Total_Amount

FROM Customers

RIGHT JOIN Orders

ON Customers.Customer_ID = Orders.Customer_ID;
Result:
Name Order_ID Total_Amount

Alice 101 500

Bob 102 300

Alice 103 250

Charlie 104 150

4. FULL JOIN (or FULL OUTER JOIN):

The FULL JOIN returns all rows when there is a match in either left or right table. If there’s no match,
NULL values are returned for the columns that don’t have a match.

Code:

SELECT Customers.Name, Orders.Order_ID, Orders.Total_Amount

FROM Customers

FULL OUTER JOIN Orders

ON Customers.Customer_ID = Orders.Customer_ID;
Result:
Name Order_ID Total_Amount

Alice 101 500

Bob 102 300

Alice 103 250

Charlie 104 150

David NULL NULL

5. CROSS JOIN:

The CROSS JOIN returns the Cartesian product of the two tables: every row in the first table is
combined with every row in the second table. With 4 customers and 4 orders this produces 4 × 4 = 16 rows
(only the first rows are shown below).

SELECT Customers.Name, Orders.Order_ID

FROM Customers

CROSS JOIN Orders;


Result:
Name Order_ID

Alice 101

Alice 102

Alice 103

Alice 104

Bob 101

Bob 102

Unit 2: Transaction and Concurrency Control

5. Define ACID properties of a transaction with real-life examples


ACID properties refer to a set of four essential properties that ensure database transactions are processed
reliably. These properties are Atomicity, Consistency, Isolation, and Durability.

1. Atomicity: All-or-Nothing

Atomicity ensures that a transaction is treated as a single, indivisible unit of work. Either all operations
within the transaction are executed successfully, or none are. If any part of the transaction fails, the entire
transaction is rolled back, leaving the database in its previous state.

Real-Life Example: Bank Transfer

Imagine you are transferring $100 from your checking account to your savings account. This transaction
involves two steps:

1. Deduct $100 from the checking account.


2. Add $100 to the savings account.

If the system crashes after deducting the $100 but before adding it to the savings account, the transaction is
incomplete and inconsistent. Atomicity guarantees that the deduction is rolled back, so the $100 is never lost.
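This all-or-nothing transfer can be sketched with Python's sqlite3; the account names, balances, and the simulated crash are made up for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance REAL)")
conn.execute("INSERT INTO accounts VALUES ('checking', 500.0), ('savings', 100.0)")
conn.commit()

def transfer(conn, src, dst, amount, fail_midway=False):
    """Move `amount` between accounts as one atomic transaction."""
    try:
        conn.execute("UPDATE accounts SET balance = balance - ? WHERE name = ?",
                     (amount, src))
        if fail_midway:                  # simulate a crash between the two steps
            raise RuntimeError("system crashed mid-transfer")
        conn.execute("UPDATE accounts SET balance = balance + ? WHERE name = ?",
                     (amount, dst))
        conn.commit()                    # both updates become permanent together
    except Exception:
        conn.rollback()                  # atomicity: undo the partial deduction

transfer(conn, "checking", "savings", 100, fail_midway=True)

# The failed transfer left both balances untouched.
balances = dict(conn.execute("SELECT name, balance FROM accounts ORDER BY name"))
print(balances)   # {'checking': 500.0, 'savings': 100.0}
```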

2. Consistency: Maintaining Valid State

Consistency ensures that a transaction brings the database from one valid state to another, maintaining the
integrity and predefined rules of the database.
Real-Life Example: E-Commerce Purchase

When a customer purchases an item from an online store, several consistency rules must be enforced:

 The item’s stock quantity should decrease.
 The payment amount should match the total order price.
 The customer’s account should be updated correctly.

3. Isolation: Transactions Don’t Interfere with Each Other

Isolation ensures that multiple transactions occurring at the same time do not affect each other. Intermediate
states during a transaction are invisible to other transactions. This prevents conflicts caused by concurrent
transactions trying to access or modify the same data.

Real-Life Example: Simultaneous Ticket Bookings

Imagine two people are booking the last ticket for a concert at the same time:

1. Person A checks ticket availability and sees one ticket remaining.
2. Person B also checks and sees the same available ticket.

Isolation ensures that only one of the two booking transactions can commit; the other sees the seat as
already sold and fails, instead of both customers being issued the same ticket.

4. Durability: Permanent Once Committed

Durability ensures that once a transaction is successfully committed, the changes are permanent and will
survive system crashes or power failures. The data is written to non-volatile storage, ensuring it remains
intact even if the system restarts.

Real-Life Example: Booking a Flight

After you book a flight and the transaction is completed (e.g., you get a confirmation email and a ticket),
durability ensures that even if the airline’s booking system crashes or undergoes maintenance, your
booking information is not lost.

6. What is a serializable schedule? Explain with an example.

A serializable schedule in database systems refers to a schedule (or sequence) of transactions that ensures
the same result as if the transactions were executed in some serial order (one after the other), even though
they are executed concurrently. In other words, a serializable schedule produces the same outcome as a
serial schedule, where transactions are run sequentially without any overlap.
Concurrency control in databases allows multiple transactions to execute simultaneously, which improves
performance and resource utilization. However, this concurrency can lead to issues such as lost updates,
dirty reads, or inconsistent data. Serializability ensures that even with concurrent execution, the outcome
is equivalent to one that would be produced if the transactions were executed one by one in some order.

Example of Serializable Schedule

Let's say we have two transactions, T1 and T2, operating on the same data item, X. Here are the actions of
each transaction:

 T1: Reads the value of X, modifies it, and writes it back.
 T2: Reads the value of X, modifies it, and writes it back.

A schedule in which T2 reads X only after T1 has written it is serializable: it produces the same final
value as running T1 and then T2 serially. By contrast, if both transactions read the original X before
either writes, one update overwrites the other (a lost update); no serial order produces that result, so
the schedule is not serializable.
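The difference can be simulated in Python; plain variables stand in for the data item X, so this is a sketch of the interleavings, not real DBMS scheduling:

```python
# Each transaction reads X, adds an amount, and writes X back.
def run_serial(x):
    # Serial order: T1 completes, then T2 runs.
    t1 = x; x = t1 + 10          # T1: read, modify, write
    t2 = x; x = t2 + 20          # T2: read, modify, write
    return x

def run_interleaved_bad(x):
    # Non-serializable interleaving: both read the original X before either writes.
    t1 = x                       # T1 reads
    t2 = x                       # T2 reads (interleaved)
    x = t1 + 10                  # T1 writes
    x = t2 + 20                  # T2 writes, overwriting T1's update
    return x

print(run_serial(100))           # 130: both updates applied
print(run_interleaved_bad(100))  # 120: T1's update is lost
```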
7. Discuss timestamp-based protocols for concurrency control.

Timestamp-based protocols are a type of concurrency control mechanism used in database systems to
ensure that transactions execute in a way that guarantees serializability, preserving data consistency and
preventing conflicts between concurrent transactions. These protocols assign a unique timestamp to each
transaction and use this information to manage the order in which transactions can access and modify shared
data items.

 Timestamps:

 Each transaction is assigned a unique timestamp when it starts. This timestamp is used to determine
the order in which transactions should be executed.
 Timestamps can be based on the system clock or generated as sequential numbers to ensure
uniqueness.
 Two important timestamps are maintained for each data item:
o Read Timestamp (RTS): The timestamp of the most recent transaction that successfully read
the data item.
o Write Timestamp (WTS): The timestamp of the most recent transaction that successfully
wrote to the data item.

 Basic Idea:

 Transactions are ordered by their timestamps. A transaction with an earlier timestamp is considered
"older" and should be executed before a transaction with a later timestamp.
 The protocol ensures that transactions execute in a way that respects this logical ordering, preventing
any "younger" transaction from affecting the operations of an "older" one.
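The basic timestamp-ordering checks described above can be sketched as follows. This is a deliberate simplification: a real implementation also restarts aborted transactions with new timestamps and may apply refinements such as Thomas's write rule:

```python
class DataItem:
    def __init__(self):
        self.rts = 0   # Read Timestamp: newest transaction that read the item
        self.wts = 0   # Write Timestamp: newest transaction that wrote the item

def can_read(item, ts):
    """A transaction with timestamp `ts` may read the item only if no
    younger transaction has already written it."""
    if ts < item.wts:
        return False               # would read a value written "in its future": abort
    item.rts = max(item.rts, ts)
    return True

def can_write(item, ts):
    """A write is rejected if a younger transaction has already read or
    written the item."""
    if ts < item.rts or ts < item.wts:
        return False               # too late: abort and restart
    item.wts = ts
    return True

x = DataItem()
print(can_write(x, ts=5))   # True: first writer
print(can_read(x, ts=3))    # False: older T3 cannot read T5's write
print(can_read(x, ts=7))    # True: younger T7 can
```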

8. Write SQL queries to demonstrate transaction commit and rollback.

CREATE TABLE Accounts (
Account_ID INT PRIMARY KEY,
Account_Name VARCHAR(100),
Balance DECIMAL(10, 2)
);

INSERT INTO Accounts (Account_ID, Account_Name, Balance)
VALUES (1, 'Alice', 1000.00),
(2, 'Bob', 1500.00);
Commit:
BEGIN TRANSACTION;

UPDATE Accounts
SET Balance = Balance - 200
WHERE Account_ID = 1;

UPDATE Accounts
SET Balance = Balance + 200
WHERE Account_ID = 2;
COMMIT;

Rollback:

BEGIN TRANSACTION;
UPDATE Accounts
SET Balance = Balance - 300
WHERE Account_ID = 1;

UPDATE Accounts
SET Balance = Balance + 300
WHERE Account_ID = 2;

ROLLBACK;
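The same commit/rollback pair can be exercised end-to-end through Python's sqlite3 driver, which opens the transaction implicitly, so no explicit BEGIN is needed:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE Accounts (
    Account_ID INT PRIMARY KEY, Account_Name TEXT, Balance REAL)""")
conn.execute("INSERT INTO Accounts VALUES (1, 'Alice', 1000.00), (2, 'Bob', 1500.00)")
conn.commit()

# Committed transfer: both updates become permanent together.
conn.execute("UPDATE Accounts SET Balance = Balance - 200 WHERE Account_ID = 1")
conn.execute("UPDATE Accounts SET Balance = Balance + 200 WHERE Account_ID = 2")
conn.commit()

# Rolled-back transfer: both updates are undone together.
conn.execute("UPDATE Accounts SET Balance = Balance - 300 WHERE Account_ID = 1")
conn.execute("UPDATE Accounts SET Balance = Balance + 300 WHERE Account_ID = 2")
conn.rollback()

print(list(conn.execute(
    "SELECT Account_ID, Balance FROM Accounts ORDER BY Account_ID")))
# [(1, 800.0), (2, 1700.0)]  -- only the committed transfer survives
```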

9. Explain how deadlocks occur in databases. List two methods to avoid deadlocks.
A deadlock in a database occurs when two or more transactions are waiting for each other to release
locks on resources, but none of them can proceed. This creates a circular waiting situation where
each transaction is waiting for a resource that the other holds, leading to a state where none of the
transactions can complete. In other words, the transactions are stuck because each one is holding a
lock on a resource that the other needs.
Deadlocks occur in databases primarily due to:

 Mutual exclusion: Resources can only be locked by one transaction at a time.
 Hold and wait: Transactions hold resources while waiting for additional resources.
 No preemption: Transactions are not forced to release resources they are holding.
 Circular wait: A closed chain of transactions exists where each one is waiting for a resource held by
the next transaction in the chain.

Methods to Avoid Deadlocks

Several strategies can be employed to prevent or avoid deadlocks in database systems. Below are two
common methods:

1. Timeout-Based Deadlock Avoidance

This method avoids deadlocks by setting a timeout period for transactions. If a transaction is unable to
acquire the necessary locks within the specified timeout period, it is assumed to be in a deadlock situation
and is automatically aborted. The system then rolls back the transaction, releases the locks it holds, and
allows other transactions to proceed.

2. Deadlock Prevention using Ordering of Resources (Wait-Die and Wound-Wait Schemes)

This method avoids deadlocks by imposing a total order on how transactions can acquire locks on
resources. The Wait-Die and Wound-Wait schemes are two variations of this approach, both relying on
transaction timestamps to break circular wait conditions.

Wait-Die Scheme:

 Older transactions can wait for younger transactions to release locks, but younger transactions are aborted
if they need to wait for an older transaction.

Wound-Wait Scheme:

 Older transactions can "wound" (abort) younger transactions to acquire the locks they need, but younger
transactions are forced to wait if they need to acquire resources held by older transactions.
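Both schemes reduce to a timestamp comparison when a transaction requests a lock that another one holds; a minimal sketch (smaller timestamp = older transaction):

```python
def wait_die(requester_ts, holder_ts):
    """Wait-Die: an older requester waits; a younger requester dies (aborts)."""
    return "wait" if requester_ts < holder_ts else "abort"

def wound_wait(requester_ts, holder_ts):
    """Wound-Wait: an older requester wounds (aborts) the lock holder;
    a younger requester waits."""
    return "wound holder" if requester_ts < holder_ts else "wait"

print(wait_die(requester_ts=1, holder_ts=5))    # 'wait'         (older waits)
print(wait_die(requester_ts=5, holder_ts=1))    # 'abort'        (younger dies)
print(wound_wait(requester_ts=1, holder_ts=5))  # 'wound holder' (older preempts)
print(wound_wait(requester_ts=5, holder_ts=1))  # 'wait'         (younger waits)
```

In both schemes the aborted transaction restarts with its original timestamp, so it eventually becomes the oldest and cannot be aborted again, which prevents starvation.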

10. What are the states of a transaction? Describe each state briefly.

1. Active State

 Description: This is the initial state of a transaction, where it is executing its operations (such as
reading or writing data). During this stage, the transaction is working with the database and acquiring
necessary resources like locks.

2. Partially Committed State

 Description: After the transaction has executed its final statement but before the changes have been
made permanent, the transaction enters the partially committed state. This means the system has
performed all operations but hasn’t yet committed them to the database.
3. Committed State

 Description: Once a transaction successfully completes and the changes made to the database are
permanently saved, it enters the committed state. At this point, the transaction has been
successfully completed, and its results are now visible to other transactions.

4. Failed State

 Description: A transaction enters the failed state if it encounters an error during its execution (e.g., a
constraint violation, system failure, or insufficient resources). In this state, the transaction can no
longer proceed, and it needs to be aborted or rolled back.

5. Aborted State

 Description: When a transaction cannot proceed due to a failure or if the transaction is rolled back, it
enters the aborted state. At this point, all the changes made by the transaction are undone, and the
database is restored to its previous consistent state.

6. Terminated State

 Description: A transaction enters the terminated state after it has either committed or aborted. At
this point, the transaction is finished, and no further operations will be performed.
4. Business Statistics (MT11)
Unit 1: Introduction to Business Statistics
1. Define statistics and explain its role in business decision-making.
Statistics is a branch of mathematics that involves the collection, analysis, interpretation,
presentation, and organization of data. It provides tools and methodologies to make sense of
numerical data, identify patterns, and draw meaningful conclusions. In essence, statistics is used to
understand variability in data and make informed decisions based on that understanding.
Role of Statistics in Business Decision-Making

1. Data-Driven Insights

Statistics help businesses collect and analyze data from various sources (e.g., customer surveys, sales
records, market research) to derive valuable insights. These insights guide businesses in understanding
customer behavior, identifying growth opportunities, and improving products or services.

2. Forecasting and Predictive Analysis

Statistical techniques such as regression analysis, time series analysis, and trend analysis are used to forecast
future business conditions like sales, demand, revenue, or market trends. This helps businesses prepare for
the future and align their strategies accordingly.

3. Quality Control

Statistics play a critical role in maintaining and improving the quality of products and services. Techniques
like Statistical Process Control (SPC) and Six Sigma use statistical tools to monitor and control
production processes, ensuring that they meet specified standards and reduce defects.

4. Risk Management

By analyzing historical data, businesses can identify potential risks and their impact. Statistical models such
as probability distributions and Monte Carlo simulations help businesses estimate the likelihood of risks and
develop strategies to mitigate them.

5. Marketing and Customer Insights

Statistics allow businesses to measure the effectiveness of their marketing campaigns by tracking metrics
such as conversion rates, customer retention, and return on investment (ROI). Additionally, statistical
analysis of customer data enables businesses to segment their markets and target specific customer groups
more effectively.

6. Operational Efficiency

Statistical analysis can optimize business operations by identifying areas for improvement. Techniques like
data mining and process analysis help businesses streamline their operations, reduce costs, and enhance
productivity.
7. Financial Analysis and Investment Decisions

In finance, statistics are used to analyze market trends, portfolio performance, and economic indicators.
Businesses rely on statistical models to assess investment risks, calculate the expected returns, and make
informed financial decisions.

8. Performance Measurement and Benchmarking

Statistics provide businesses with tools to measure their performance against industry benchmarks or
competitors. Key performance indicators (KPIs) such as sales growth, profit margins, or customer
satisfaction scores are often analyzed statistically to assess how well a company is doing.

2. Differentiate between qualitative and quantitative data with examples.

1. Qualitative Data

Qualitative data is non-numerical and typically represents characteristics, descriptions, or qualities that
can’t easily be measured with numbers. It deals with descriptions, observations, and words rather than
measurements or amounts.

Examples:

 Customer feedback: "The product is easy to use," or "I find the customer service to be very helpful."
 Gender: Male, female, non-binary.
 Colors: Red, blue, green.
 Types of cuisine: Italian, Mexican, Chinese.

2. Quantitative Data

Quantitative data is numerical and represents measurable quantities or amounts. It can be counted or
expressed in numbers, and it allows for mathematical operations and statistical analysis.

Examples:

 Survey results indicating the number of customers who prefer a product.
 Data on sales figures, such as total sales in a quarter ($50,000 in Q1).

3. Explain the different types of scales of measurement.

1. Nominal Scale

 Definition: The nominal scale is the simplest form of measurement, used to categorize data without
any quantitative value or order.

2. Ordinal Scale

 Definition: The ordinal scale categorizes data into ordered categories, where the order matters but
the differences between categories are not quantifiable.

3. Interval Scale
 Definition: The interval scale is a numerical scale where the distance between values is meaningful,
but there is no true zero point.

4. Ratio Scale

 Definition: The ratio scale is the most informative scale, with all the properties of the interval scale,
plus a true zero point that indicates the absence of the quantity being measured.

4. What is the difference between descriptive and inferential statistics?

Descriptive Statistics

 Definition: Descriptive statistics involve summarizing and organizing data to describe its main
features. It provides a way to present and understand the basic characteristics of a dataset.
 Focuses on describing the data at hand.
 Utilizes measures such as mean, median, mode, standard deviation, and range.
 Often includes graphical representations like charts, graphs, and tables.

Inferential Statistics

 Definition: Inferential statistics involves making predictions or inferences about a population based
on a sample of data. It allows researchers to draw conclusions beyond the immediate data.
 Focuses on generalizing from a sample to the wider population.
 Utilizes techniques such as hypothesis testing, confidence intervals, and regression analysis.
 Results carry a stated level of confidence or margin of error, since conclusions extend beyond the
observed data.
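Python's standard library illustrates the contrast: descriptive measures summarize the sample itself, while an inferential step (here, an approximate 95% confidence interval for the mean, using the normal critical value 1.96) generalizes beyond it. The sales figures are hypothetical:

```python
import math
import statistics as st

# Hypothetical sample: daily sales (in units) for 10 days.
sales = [52, 48, 55, 60, 47, 53, 58, 49, 51, 57]

# Descriptive statistics: describe the data at hand.
mean = st.mean(sales)
median = st.median(sales)
stdev = st.stdev(sales)          # sample standard deviation
print(f"mean={mean}, median={median}, stdev={stdev:.2f}")

# Inferential statistics: estimate the population mean from the sample.
# Approximate 95% confidence interval: mean +/- 1.96 * s / sqrt(n).
margin = 1.96 * stdev / math.sqrt(len(sales))
print(f"95% CI for the mean: ({mean - margin:.2f}, {mean + margin:.2f})")
```

The first line is a claim about these 10 days only; the second is a hedged claim about all days the sample is assumed to represent.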
