EM4218E - Chapter 6
LEARNING OBJECTIVES
CONTENTS
8/05/2023
6.2. DBMS
Designing a database
• Entity Relationship Diagram (ERD)
• One-to-one
• One-to-many
• Many-to-many
• Data Normalization
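The three relationship types and the idea of normalization can be sketched with Python's built-in `sqlite3` module. This is only an illustration under assumed names: the `customer`, `loyalty_card`, `purchase`, `product`, and `purchase_item` tables and their rows are hypothetical and do not come from the course material.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked

conn.executescript("""
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL
);
-- One-to-one: the UNIQUE foreign key allows at most one card per customer.
CREATE TABLE loyalty_card (
    card_id     INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL UNIQUE REFERENCES customer(customer_id)
);
-- One-to-many: one customer can have many purchases.
CREATE TABLE purchase (
    purchase_id INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customer(customer_id)
);
-- Normalization: each product name is stored once, in its own table.
CREATE TABLE product (
    product_id INTEGER PRIMARY KEY,
    name       TEXT NOT NULL
);
-- Many-to-many: a junction table links purchases and products.
CREATE TABLE purchase_item (
    purchase_id INTEGER NOT NULL REFERENCES purchase(purchase_id),
    product_id  INTEGER NOT NULL REFERENCES product(product_id),
    quantity    INTEGER NOT NULL,
    PRIMARY KEY (purchase_id, product_id)
);
""")

# Hypothetical sample rows.
conn.execute("INSERT INTO customer VALUES (1, 'Ana')")
conn.execute("INSERT INTO loyalty_card VALUES (10, 1)")
conn.executemany("INSERT INTO purchase VALUES (?, ?)", [(100, 1), (101, 1)])
conn.executemany("INSERT INTO product VALUES (?, ?)", [(1, "Helmet"), (2, "Lock")])
conn.executemany(
    "INSERT INTO purchase_item VALUES (?, ?, ?)",
    [(100, 1, 1), (100, 2, 2), (101, 1, 1)],
)


def second_card_rejected(connection):
    """Return True if a second card for customer 1 violates the one-to-one rule."""
    try:
        connection.execute("INSERT INTO loyalty_card VALUES (11, 1)")
    except sqlite3.IntegrityError:
        return True
    return False


purchases_for_customer = conn.execute(
    "SELECT COUNT(*) FROM purchase WHERE customer_id = 1"
).fetchone()[0]

products_in_purchase_100 = [
    row[0]
    for row in conn.execute(
        "SELECT p.name FROM purchase_item AS pi "
        "JOIN product AS p ON p.product_id = pi.product_id "
        "WHERE pi.purchase_id = 100 ORDER BY p.name"
    )
]
```

The junction table is the usual way to represent a many-to-many relationship in a relational design, because a single foreign key column can only point to one row.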
Nonrelational Databases
• Nonrelational database management systems use a more flexible data model
and are designed for managing large data sets across many distributed machines
and for easily scaling up or down.
• They are useful for accelerating simple queries against large volumes of structured and
unstructured data (web, social media, graphics, and other forms of data).
Cloud Databases and Distributed Databases
• Cloud-based data management services have special appeal for businesses
seeking database capabilities at a lower cost than in-house database products.
MySQL, Microsoft Azure SQL Database, Oracle Database, PostgreSQL, Amazon Aurora, MariaDB
• A distributed database is one that is stored in multiple physical locations.
Google
Blockchain is a distributed database technology that enables firms and organizations to
create and verify transactions on a network nearly instantaneously without a central authority.
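How transactions can be verified without a central authority is easier to see in a small sketch. The example below is a deliberate simplification and an assumption about presentation, not part of the course material: it shows only the hash linking between blocks that makes tampering detectable, and it leaves out the network, replication, and consensus that a real blockchain needs. The function names `block_hash`, `add_block`, and `is_valid` are hypothetical.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first block


def block_hash(index, data, prev_hash):
    """Hash a block's contents together with the previous block's hash."""
    payload = json.dumps(
        {"index": index, "data": data, "prev_hash": prev_hash}, sort_keys=True
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def add_block(chain, data):
    """Append a new block that points back to the current last block."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS_HASH
    index = len(chain)
    chain.append(
        {
            "index": index,
            "data": data,
            "prev_hash": prev_hash,
            "hash": block_hash(index, data, prev_hash),
        }
    )


def is_valid(chain):
    """Recompute every hash and check every link; any edit breaks the chain."""
    prev_hash = GENESIS_HASH
    for position, block in enumerate(chain):
        if block["index"] != position or block["prev_hash"] != prev_hash:
            return False
        expected = block_hash(block["index"], block["data"], block["prev_hash"])
        if block["hash"] != expected:
            return False
        prev_hash = block["hash"]
    return True


chain = []
add_block(chain, {"from": "A", "to": "B", "amount": 10})
add_block(chain, {"from": "B", "to": "C", "amount": 4})
valid_before = is_valid(chain)

chain[0]["data"]["amount"] = 1000  # tamper with an earlier transaction
valid_after = is_valid(chain)
```

Because each block stores the hash of the one before it, any participant holding a copy can run the same check independently.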
• Big Data
• Big data are data sets so large and complex that traditional data
processing software cannot collect, manage, and process them in
a reasonable amount of time. These large data sets can include structured,
unstructured, and semi-structured data.
• Big data is typically characterized by the 3 Vs:
• Volume: Extremely large data volume
• Variety: Various types of data
• Velocity: The speed at which data needs to be processed and analyzed
• Business Intelligence Infrastructure
• Analytical Tools
• Big Data
• Business Intelligence Infrastructure
• Business intelligence infrastructure is the technology infrastructure used to build and deploy
business intelligence (BI) systems. It provides tools and services to collect, manage, analyze, and
display business information from different data sources, enabling users to make better decisions
based on information that has been processed and analyzed.
• A data warehouse is a database that stores current and historical data of potential interest to
decision makers throughout the company.
• A data mart is a subset of a data warehouse in which a summarized or highly focused portion of
the organization’s data is placed in a separate database for a specific population of users.
• Hadoop
• Hadoop is an open-source software framework managed by the Apache Software Foundation that enables
distributed parallel processing of huge amounts of data across inexpensive computers
• Hadoop consists of several key services, including the Hadoop Distributed File System (HDFS) for data
storage and MapReduce for high-performance parallel data processing
• In-Memory Computing
• Analytic Platform
• Analytical Tools
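The MapReduce idea mentioned under Hadoop can be illustrated with a word count written in plain Python. This is a single-machine sketch of the programming model only; it does not use Hadoop or HDFS, and the function names and sample text are hypothetical. In a real cluster the map and reduce steps would run in parallel on different machines.

```python
from collections import defaultdict


def map_phase(chunk):
    """Map: turn one chunk of text into (word, 1) pairs."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(pairs):
    """Shuffle: group all values that share the same key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Reduce: combine the values for each key into a single result."""
    return {key: sum(values) for key, values in grouped.items()}


# Each string stands in for a block of data stored on a different machine.
chunks = ["big data big", "data lake", "big lake"]

mapped = [pair for chunk in chunks for pair in map_phase(chunk)]
word_counts = reduce_phase(shuffle(mapped))
```

Since each chunk is mapped independently, adding machines lets the same job handle more data, which is the point of the distributed design.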
• Big Data
• Business Intelligence Infrastructure
• Data Warehouses and Data Marts
• Hadoop
• In-Memory Computing
• Another way of facilitating big data analysis is to use in-memory computing, which relies
primarily on a computer’s main memory (RAM) for data storage
• Analytic Platforms
• Commercial database vendors have developed specialized high-speed analytic platforms using
both relational and nonrelational technology that are optimized for analyzing large data sets.
• Analytic platforms feature preconfigured hardware-software systems that are specifically
designed for query processing and analytics
• A data lake is a repository for raw unstructured or structured data that for the most part
has not yet been analyzed; the data can be accessed in many ways.
• Analytical Tools
Technologies to Access Information
• Some data points are missing or not recorded. This can happen for a variety of reasons:
data entry errors, data corruption, or missing information.
• Data is incorrect or contains errors. This happens due to human error or technical glitches.
• Data is recorded differently across different sources or does not match up with other data
in a dataset.
• The same data is recorded multiple times, taking up unnecessary storage space and
increasing processing time.
• Data is no longer relevant or up to date, which can lead to incorrect analysis and
decision-making.
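Some of the data quality problems described above can be detected automatically. The sketch below checks a small list of records for missing values, repeated values, and stale entries. The record layout, the field names, the sample rows, and the one-year age threshold are all assumptions made for illustration.

```python
from datetime import date

# Hypothetical customer records.
records = [
    {"id": 1, "email": "a@example.com", "updated": date(2023, 4, 1)},
    {"id": 2, "email": None, "updated": date(2023, 3, 15)},
    {"id": 3, "email": "a@example.com", "updated": date(2019, 1, 10)},
]


def find_missing(rows, field):
    """Return ids of rows where the field is absent or empty."""
    return [row["id"] for row in rows if row.get(field) in (None, "")]


def find_duplicates(rows, field):
    """Return ids of rows whose field value already appeared in an earlier row."""
    seen = set()
    duplicates = []
    for row in rows:
        value = row.get(field)
        if value in (None, ""):
            continue  # missing values are reported separately
        if value in seen:
            duplicates.append(row["id"])
        else:
            seen.add(value)
    return duplicates


def find_outdated(rows, as_of, max_age_days):
    """Return ids of rows last updated more than max_age_days before as_of."""
    return [
        row["id"] for row in rows if (as_of - row["updated"]).days > max_age_days
    ]


missing_email = find_missing(records, "email")
duplicate_email = find_duplicates(records, "email")
outdated = find_outdated(records, as_of=date(2023, 5, 8), max_age_days=365)
```

Incorrect or inconsistent values are harder to catch this way, since they usually require a trusted reference source to compare against.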
Individual Assignment
Sylvester’s Bike Shop, located in San Francisco, California, sells road, mountain,
hybrid, leisure, and children’s bicycles. Currently, Sylvester’s purchases bikes from
three suppliers but plans to add new suppliers in the near future.
You are assigned to:
1. Build a simple relational database to manage information about Sylvester’s
suppliers and products.
2. Once you have built the database, perform the following activities.
• Prepare a report that identifies the five most expensive bicycles. The report should list the
bicycles in descending order from most expensive to least expensive, the quantity on hand for
each, and the markup percentage for each.
• Prepare a report that lists each supplier, its products, the quantities on hand, and associated
reorder levels. The report should be sorted alphabetically by supplier. For each supplier, the
products should be sorted alphabetically.
• Prepare a report listing only the bicycles that are low in stock and need to be reordered. The
report should provide supplier information for the items identified.
• Write a brief description of how the database could be enhanced to further improve
management of the business. What tables or fields should be added? What additional reports
would be useful?
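One possible starting point for the database and the three reports is sketched below with Python's built-in `sqlite3` module. Everything in it is an assumption rather than a required design: the table and column names, the sample suppliers and bicycles, the markup formula (selling price minus purchase cost, divided by purchase cost), and the rule that a bicycle needs reordering when its quantity on hand is at or below its reorder level.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE supplier (
    supplier_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL
);
CREATE TABLE product (
    product_id       INTEGER PRIMARY KEY,
    name             TEXT NOT NULL,
    purchase_cost    REAL NOT NULL,
    selling_price    REAL NOT NULL,
    quantity_on_hand INTEGER NOT NULL,
    reorder_level    INTEGER NOT NULL,
    supplier_id      INTEGER NOT NULL REFERENCES supplier(supplier_id)
);
""")

# Hypothetical sample data.
conn.executemany(
    "INSERT INTO supplier VALUES (?, ?)", [(1, "Supplier B"), (2, "Supplier A")]
)
conn.executemany(
    "INSERT INTO product VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        (1, "Road 100", 400.0, 600.0, 5, 3, 1),
        (2, "Mountain 200", 500.0, 800.0, 2, 4, 1),
        (3, "Kids 50", 100.0, 150.0, 10, 5, 2),
        (4, "Hybrid 300", 300.0, 450.0, 1, 2, 2),
        (5, "Leisure 150", 200.0, 300.0, 6, 2, 2),
        (6, "Road 900", 1000.0, 1500.0, 4, 2, 1),
    ],
)

# Report 1: five most expensive bicycles with quantity and markup percentage.
top_five = conn.execute("""
    SELECT name, quantity_on_hand,
           ROUND((selling_price - purchase_cost) * 100.0 / purchase_cost, 1)
    FROM product
    ORDER BY selling_price DESC
    LIMIT 5
""").fetchall()

# Report 2: suppliers and their products, both sorted alphabetically.
by_supplier = conn.execute("""
    SELECT s.name, p.name, p.quantity_on_hand, p.reorder_level
    FROM supplier AS s
    JOIN product AS p ON p.supplier_id = s.supplier_id
    ORDER BY s.name, p.name
""").fetchall()

# Report 3: bicycles at or below their reorder level, with supplier details.
low_stock = conn.execute("""
    SELECT p.name, p.quantity_on_hand, p.reorder_level, s.name
    FROM product AS p
    JOIN supplier AS s ON s.supplier_id = p.supplier_id
    WHERE p.quantity_on_hand <= p.reorder_level
    ORDER BY p.name
""").fetchall()
```

Keeping suppliers in their own table means a new supplier can be added without changing the product table, which matches the shop's plan to add suppliers later.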
EM4218E – Management Information System @ Assoc. Prof. Pham Thi Thanh Hong