DATABASE
GROUP 4:
1. Issa Hassan Ndana
2. Alausa Babatunde
3. Olayiwola Folorunso Olayiwola
CLIENT/SERVER SYSTEM
PRESENTED BY: ISSA HASSAN NDANA UIL/PG2022/0269
INTRODUCTION
• Storing Data and the Database
– Data is information in its simplest form: raw facts that are meaningless until related together in
some fashion so as to become meaningful.
– All data on a computer is stored in one kind of database or another.
[Diagram: clients running applications send a query to the database server and receive the query result.]
• Client/Server database computing can be defined as the logical partitioning of the user interface,
database management, and business logic between the client computer and the server computer.
• Business logic can be located on the server, on the client, or mixed between the two.
• The following are the reasons for its popularity:
– Affordability
– Speed
– Adaptability
– Simplified data access
CLIENT/SERVER DATABASE ARCHITECTURE
Various types of available Client/Server Database Architecture
1. Process-per-client architecture
[Diagram: each client is served by its own dedicated server process, and the processes share access to the database.]
2. Multi-threaded architecture
[Diagram: all clients are served by a single multi-threaded server process.]
3. Hybrid architecture
[Diagram: clients connect to a pool of multi-threaded server processes that share access to the database.]
[Diagram: database middleware stack, from top to bottom: Client Front-end, API, Database Middleware (Database Translator, Network Translator), Network Protocol.]
DISTRIBUTED DBMS
1. The DBMS must provide distributed database transparency features such as:
– Distribution transparency
– Transaction transparency
– Failure transparency
– Performance transparency
– Heterogeneity transparency
2. Interaction between client and server during the processing of an SQL query might proceed as
follows (a minimal sketch follows the steps):
The client parses the user query and decomposes it into a number of independent site
queries. Each site query is sent to the appropriate server site.
Each server processes its local query and sends the resulting relation to the client site.
The client site combines the results of the subqueries to produce the result of the
originally submitted query.
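As a rough illustration of this interaction, the minimal Python sketch below simulates the client-side decomposition and combination steps; the site names, the SITE_DATA dictionary, and the execute_at_site helper are hypothetical stand-ins for real server sites and network calls.

```python
# Hypothetical sketch of distributed query processing (all names illustrative).
# The client splits a query into independent site queries, each server site
# processes its local query, and the client combines the partial relations.

# Stand-in for each server site's local data (in reality, separate DBMSs).
SITE_DATA = {
    "site_A": [("Ada", "Lagos"), ("Bola", "Ibadan")],
    "site_B": [("Chidi", "Abuja")],
}

def execute_at_site(site, predicate):
    """Simulates a server site processing its local query."""
    return [row for row in SITE_DATA[site] if predicate(row)]

def distributed_query(predicate):
    """Client side: send one sub-query per site, then union the results."""
    partial_relations = [execute_at_site(site, predicate) for site in SITE_DATA]
    combined = []
    for relation in partial_relations:
        combined.extend(relation)  # combine sub-results into the final answer
    return combined

if __name__ == "__main__":
    # e.g. "SELECT * FROM customers WHERE city <> 'Lagos'" decomposed per site
    print(distributed_query(lambda row: row[1] != "Lagos"))
```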
3. In a typical DDBMS, it is customary to divide the software modules into three levels:
L1: The server software is responsible for local data management at a site, much like
centralized DBMS software.
L2: The client software is responsible for most of the distribution functions; it accesses data
distribution information from the DBMS catalog and processes all requests that require
access to more than one site. It also handles all user interfaces.
L3: The communication software provides the communication primitives that are used
by the client to transmit commands and data among the various sites as needed.
Classes of client/server processing:
– Host-based processing
– Server-based processing
– Client-based processing
– Cooperative processing
CLIENT SERVICES
• Responsible for managing the user interface.
• Provides presentation services.
• Accepts and checks the syntax of user inputs. User input and final output, if any, are presented
at the client workstation.
• Acts as a consumer of services provided by one or more server processors.
SERVER SERVICES
• Some of the main operations that the server performs are listed below (a minimal sketch follows the list):
– Accepts and processes database requests from clients.
– Checks authorization.
– Ensures that integrity constraints are not violated.
– Performs query/update processing and transmits responses to clients.
– Maintains the system catalog.
– Provides concurrent database access.
– Provides recovery control.
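A minimal sketch of this request path, using Python's built-in sqlite3 module as a stand-in for a real server DBMS; the AUTHORIZED_USERS set and the handle_request function are illustrative assumptions, not an actual server API.

```python
import sqlite3

# Toy model of a database server's request path: check authorization,
# process the query, and transmit a response (or roll back on error).
AUTHORIZED_USERS = {"alice", "bob"}

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (owner TEXT, balance REAL)")
conn.execute("INSERT INTO accounts VALUES ('alice', 100.0)")
conn.commit()

def handle_request(user, sql, params=()):
    if user not in AUTHORIZED_USERS:                 # checks authorization
        return {"ok": False, "error": "not authorized"}
    try:
        rows = conn.execute(sql, params).fetchall()  # query/update processing
        conn.commit()
        return {"ok": True, "rows": rows}            # response to the client
    except sqlite3.Error as exc:                     # simplified recovery control
        conn.rollback()
        return {"ok": False, "error": str(exc)}

print(handle_request("alice", "SELECT * FROM accounts"))
print(handle_request("mallory", "SELECT * FROM accounts"))
```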
DATA WAREHOUSE
PRESENTED BY: ALAUSA BABATUNDE UIL/PG2022/0096
Introduction to Data Warehouse
A data warehouse is a large, centralized repository of integrated data from various sources
within an organization. A data warehouse is used to store historical and current data in a single
place, which is used for creating analytical reports and gaining business insights.
A data warehouse is a type of data management system that is designed for efficient querying,
reporting, and analysis of business data to support decision-making processes.
Data warehouses possess several key characteristics that distinguish them from other types of
databases. These characteristics are crucial for supporting the analytical and reporting needs of
organizations.
1. Integrated Data:
Data warehouses integrate data from various sources within an organization. This integration
ensures that data is consistent, standardized, and can be easily compared and analyzed.
2. Time-Variant:
Data warehouses store historical data, allowing users to analyze changes and trends over
time.
3. Non-Volatile:
Once data is loaded into a data warehouse, it is typically not updated or deleted. Instead,
changes are tracked through the addition of new records.
4. Scalability:
Scalability ensures that the data warehouse can support increasing volumes of data as an
organization grows.
The purpose of a data warehouse is to provide decision-makers with the necessary information
and insights to make informed and strategic decisions.
Databases are optimized for transactional processing and the efficient management of
day-to-day operations, whereas data warehouses are designed to support analytical processing
and decision-making by providing a consolidated and historical view of data. They
complement each other in an organization's data architecture, serving different but
complementary purposes.
In general, there are key differences in terms of their design, functionality, and use cases.
Below are the main differences between a database and a data warehouse:
1. Purpose:
Database: Primarily designed for transactional processing and the efficient management
of day-to-day operations. It supports functions like data insertion, deletion, and modification.
Data Warehouse: Designed for analytical processing and to support decision-making
processes. It focuses on providing a consolidated and historical view of data for reporting and
analysis.
2. Data Structure:
Database: Typically follows a normalized data structure to minimize redundancy and maintain data
integrity. Emphasizes transactional consistency.
Data Warehouse: Often follows a denormalized or partially denormalized data structure optimized for
analytical querying. Emphasizes query performance and ease of analysis.
3. Data Volume:
Database: Typically holds the current operational data needed for day-to-day transactions.
Data Warehouse: Accumulates large volumes of historical data drawn from many sources.
4. Schema Design:
Database: Often follows a normalized schema design to reduce redundancy and ensure consistency in
transactional data.
Data Warehouse: May use a star schema, snowflake schema, or other dimensional models to facilitate
efficient analytical querying.
5. Data Load and Refresh:
Database: Real-time or near-real-time data updates are common for transactional systems.
Data Warehouse: Typically involves periodic batch loading of data, with scheduled refreshes to
maintain historical records (a minimal sketch of such a batch load follows this list).
6. Performance Optimization:
Database: Optimized for transactional processing, with a focus on maintaining data consistency and
concurrency control.
Data Warehouse: Optimized for analytical processing, with a focus on query performance.
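The periodic batch loading mentioned under item 5 can be sketched as follows. This is a toy illustration using Python's built-in sqlite3 module with made-up table names, not a production ETL pipeline; it also shows the non-volatile, time-variant idea of appending date-stamped records rather than overwriting history.

```python
import sqlite3
from datetime import date

# Illustrative sketch: a scheduled batch load copies a snapshot of
# operational data into the warehouse, appending date-stamped records.
operational = sqlite3.connect(":memory:")
warehouse = sqlite3.connect(":memory:")

operational.execute("CREATE TABLE orders (id INTEGER, amount REAL)")
operational.executemany("INSERT INTO orders VALUES (?, ?)",
                        [(1, 250.0), (2, 90.5)])

warehouse.execute(
    "CREATE TABLE orders_history (load_date TEXT, id INTEGER, amount REAL)")

def batch_load(run_date):
    """One scheduled refresh: extract from the operational DB, then load
    into the warehouse with the load date attached (history is kept)."""
    rows = operational.execute("SELECT id, amount FROM orders").fetchall()
    warehouse.executemany(
        "INSERT INTO orders_history VALUES (?, ?, ?)",
        [(run_date, oid, amt) for oid, amt in rows])
    warehouse.commit()

batch_load(date.today().isoformat())
print(warehouse.execute("SELECT * FROM orders_history").fetchall())
```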
The schema defines the structure of the data stored in the warehouse and plays a crucial
role in determining how easily and quickly users can query and analyze the data.
A data warehouse may use a star schema, snowflake schema, a hybrid approach, or other
dimensional models to facilitate efficient analytical querying.
Star Schema - A star schema is a type of data modeling technique used in data warehouses to represent
data in a structured way. In a star schema, data is organized into a central fact table that is connected to
one or more dimension tables, forming a star-like structure.
The fact table: the central table in a star schema is known as the fact table, and it contains the
quantitative data (also referred to as measures or metrics) that are of interest to the user or organization.
The dimension tables in a star schema contain the descriptive attributes that give context to the
quantitative data stored in the fact table. These tables contain information that is used to filter, group,
and aggregate the quantitative data in the fact table (a minimal sketch follows).
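As a minimal sketch of a star schema, the following snippet uses Python's built-in sqlite3 module to create one fact table and two dimension tables (all table and column names are illustrative) and then runs a typical filter-group-aggregate query over them.

```python
import sqlite3

# A tiny star schema: a central fact table (fact_sales) referencing two
# dimension tables (dim_product, dim_store). Names are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT, category TEXT);
CREATE TABLE dim_store   (store_id   INTEGER PRIMARY KEY, city TEXT);
CREATE TABLE fact_sales  (           -- quantitative measures live here
    product_id INTEGER REFERENCES dim_product(product_id),
    store_id   INTEGER REFERENCES dim_store(store_id),
    quantity   INTEGER,
    revenue    REAL
);
""")
conn.execute("INSERT INTO dim_product VALUES (1, 'Rice 5kg', 'Food')")
conn.execute("INSERT INTO dim_store VALUES (10, 'Ilorin')")
conn.execute("INSERT INTO fact_sales VALUES (1, 10, 3, 45.0)")

# Dimensions are used to filter, group, and aggregate the fact measures.
query = """
SELECT p.category, s.city, SUM(f.revenue) AS total_revenue
FROM fact_sales f
JOIN dim_product p ON f.product_id = p.product_id
JOIN dim_store   s ON f.store_id   = s.store_id
GROUP BY p.category, s.city
"""
print(conn.execute(query).fetchall())
```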
DATA MINING
Introduction
Data mining is one of the most useful techniques for helping entrepreneurs, researchers, and individuals
to extract valuable information from huge sets of data. Data mining is also called Knowledge
Discovery in Databases (KDD). The knowledge discovery process includes data cleaning, data
integration, data selection, data transformation, data mining, pattern evaluation, and knowledge
presentation.
This note covers the main topics of data mining, such as applications, data mining vs machine
learning, data mining tools, social media data mining, data mining techniques, clustering in data
mining, challenges in data mining, etc.
What is Data Mining?
Data mining is the process of extracting information from huge sets of data to identify patterns,
trends, and useful data that allow a business to make data-driven decisions.
In other words, data mining is the process of investigating hidden patterns in information from
various perspectives and categorizing them into useful data, which is collected and assembled in
particular areas such as data warehouses, analyzed efficiently with data mining algorithms, and used
to aid decision-making, eventually cutting costs and generating revenue.
Data mining is the act of automatically searching large stores of information to find trends and
patterns that go beyond simple analysis procedures. Data mining utilizes complex mathematical
algorithms to segment the data and evaluate the probability of future events. Data mining is also
called Knowledge Discovery of Data (KDD).
Data mining is a process used by organizations to extract specific data from huge databases to solve
business problems. It primarily turns raw data into useful information.
Data mining is similar to data science: it is carried out by a person, in a specific situation, on a
particular data set, with an objective. This process includes various types of services such as text
mining, web mining, audio and video mining, pictorial data mining, and social media mining. It is
done through software that may be simple or highly specialized. By outsourcing data mining, all the
work can be done faster and with lower operating costs. Specialized firms can also use new
technologies to collect data that is impossible to locate manually. There are tonnes of information
available on various platforms, but very little knowledge is accessible. The biggest challenge is to
analyze the data to extract the important information that can be used to solve a problem or for
company development. There are many powerful instruments and techniques available to mine data
and find better insights from it.
Relational Database
A relational database is a collection of multiple data sets formally organized by tables, records, and
columns, from which data can be accessed in various ways without having to reorganize the database
tables. Tables convey and share information, which facilitates data searchability, reporting, and
organization.
Data warehouses
A Data Warehouse is the technology that collects the data from various sources within the organization
to provide meaningful business insights. The huge amount of data comes from multiple places such as
Marketing and Finance. The extracted data is utilized for analytical purposes and helps in decision-
making for a business organization. The data warehouse is designed for the analysis of data rather than
transaction processing.
Data Repositories
The data repository generally refers to a destination for data storage. However, many IT professionals
use the term more specifically to refer to a particular kind of setup within an IT structure.
For example, a group of databases, where an organization has kept various kinds of information.
Object-Relational Database
A combination of an object-oriented database model and relational database model is called an object-
relational model. It supports Classes, Objects, Inheritance, etc.
One of the primary objectives of the Object-relational data model is to close the gap between the
Relational database and the object-oriented model practices frequently utilized in many programming
languages, for example, C++, Java, C#, and so on.
Transactional Database
A transactional database refers to a database management system (DBMS) that can undo a database
transaction if it is not performed appropriately. Even though this was a unique capability long ago,
today most relational database systems support transactional database activities (a minimal sketch
follows).
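A minimal sketch of this undo capability, using Python's built-in sqlite3 module; the accounts table and the simulated failure are illustrative.

```python
import sqlite3

# If anything fails partway through a transaction, rollback() restores
# the database to its state before the transaction started.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance REAL)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)",
                 [("alice", 100.0), ("bob", 50.0)])
conn.commit()

try:
    # Transfer 30 from alice to bob as one atomic unit of work.
    conn.execute("UPDATE accounts SET balance = balance - 30 WHERE name = 'alice'")
    conn.execute("UPDATE accounts SET balance = balance + 30 WHERE name = 'bob'")
    raise RuntimeError("simulated failure before the transaction completes")
except RuntimeError:
    conn.rollback()  # the half-finished transfer is undone

# Balances are unchanged because the transaction was rolled back.
print(conn.execute("SELECT name, balance FROM accounts").fetchall())
```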
Advantages of Data Mining
1. The data mining technique enables organizations to obtain knowledge-based data.
2. Data mining enables organizations to make lucrative modifications in operation and
production.
3. Compared with other statistical data applications, data mining is cost-efficient.
4. Data mining helps the decision-making process of an organization.
5. It facilitates the automated discovery of hidden patterns as well as the prediction of
trends and behaviors.
6. It can be introduced in new systems as well as existing platforms.
7. It is a quick process that makes it easy for new users to analyze enormous amounts of
data in a short time.
Disadvantages of Data Mining
1. There is a probability that organizations may sell useful customer data to other
organizations for money. For example, American Express has reportedly sold the credit card
purchase data of its customers to other organizations.
2. Many data mining analytics software packages are difficult to operate and need advanced
training to work with.
3. Different data mining instruments operate in distinct ways due to the different
algorithms used in their design. Therefore, the selection of the right data mining tool is
a very challenging task.
4. Data mining techniques are not always precise, and so may lead to severe consequences
in certain conditions.
Applications of Data Mining
Data mining is primarily used by organizations with intense consumer demands, such as retail,
communication, financial, and marketing companies, to determine prices, consumer preferences,
product positioning, and the impact on sales, customer satisfaction, and corporate profits. Data mining
enables a retailer to use point-of-sale records of customer purchases to develop products and
promotions that help the organization attract customers.
1. Business Understanding
It focuses on understanding the project goals and requirements from a business point of view, then
converting this information into a data mining problem, and afterwards designing a preliminary plan
to accomplish the target.
Tasks:
o Determine business objectives
o Assess the situation
o Determine data mining goals
o Produce a project plan
Determine business objectives
o Reveal the significant factors that, at the start, can impact the outcome of the project.
Assess the situation
o This requires a more detailed analysis of facts about all the resources, constraints, assumptions,
and other factors that ought to be considered.
Produce a project plan
o The project plan should define the expected set of steps to be performed during the rest of the
project, including an initial selection of techniques and tools.
2. Data Understanding
Data understanding starts with an initial data collection and proceeds with activities to become
familiar with the data, identify data quality issues, find initial insights into the data, or detect
interesting subsets that suggest hypotheses about concealed information.
Tasks:
o Collect initial data
o Describe data
o Explore data
Collect initial data
o If various information sources are acquired, integration is an additional issue to address, either
here or at the subsequent data preparation stage.
Describe data
o It examines the "gross" or "surface" characteristics of the information obtained.
Explore data
o It addresses data mining questions that can be resolved by querying, visualizing, and
reporting, including the distribution of important characteristics and the results of simple
aggregations.
o Its findings may feed into the transformation and other necessary data preparation steps.
3. Data Preparation
o It usually takes more than 90 percent of the project time.
o It covers all activities needed to build the final data set from the original raw data.
o Data preparation tasks are likely to be performed several times and not in any prescribed order.
Tasks (a minimal sketch of these five tasks follows this subsection)
o Select data
o Clean data
o Construct data
o Integrate data
o Format data
Select data
o It decides which data will be used for the evaluation.
o The data selection criteria include relevance to the data mining objectives, quality, and
technical constraints such as limits on data volume or data types.
o It covers the selection of attributes as well as the selection of records in a table.
Clean data
o It may involve the selection of clean subsets of the data, the insertion of suitable defaults, or
more ambitious methods, such as estimating missing data by modeling.
Construct data
o It comprises constructive data preparation operations, such as generating derived attributes,
creating entirely new records, or producing transformed values of existing attributes.
Integrate data
o Integrating data refers to the methods whereby data is combined from multiple tables or
records to create new records or values.
Format data
o Formatting data refers mainly to syntactic modifications made to the data that do not alter
its meaning but may be required by the modeling tool.
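The five tasks above can be sketched in a few lines of Python, assuming the pandas library is available; all column names and values are made up for illustration.

```python
import pandas as pd

# Toy data standing in for raw operational extracts.
orders = pd.DataFrame({"cust_id": [1, 2, 2], "amount": [100.0, None, 40.0]})
customers = pd.DataFrame({"cust_id": [1, 2], "region": ["North", "South"]})

selected = orders[["cust_id", "amount"]]                            # select data
cleaned = selected.fillna({"amount": selected["amount"].median()})  # clean data
cleaned["amount_band"] = pd.cut(cleaned["amount"],                  # construct data:
                                bins=[0, 50, 1000],                 # a derived attribute
                                labels=["low", "high"])
integrated = cleaned.merge(customers, on="cust_id")                 # integrate data
integrated["region"] = integrated["region"].str.upper()             # format data
print(integrated)
```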
4. Modeling
In modeling, various modeling techniques are selected and applied, and their parameters are calibrated
to optimal values. Some techniques have particular requirements on the form of the data; therefore,
stepping back to the data preparation phase is sometimes necessary.
Tasks
o Select modeling technique
o Build model
o Assess model
Build model
o To create one or more models, we run the modeling tool on the prepared data set.
Assess model
o The analyst interprets the models according to domain expertise, the data mining success
criteria, and the required design.
o This task assesses the success of the application of modeling and discovery techniques from a
technical standpoint.
o The analyst contacts business analysts and domain specialists later to discuss the outcomes of
the data mining in the business context.
5. Evaluation
o It evaluates the model thoroughly and reviews the steps executed to build the model, to
ensure that the business objectives are properly achieved.
o The main objective of the evaluation is to determine whether some significant business issue
has not been adequately considered.
o At the end of this phase, a decision on the use of the data mining results should be reached.
Tasks
o Evaluate results
o Review process
Evaluate results
o It tests the model on test applications in the actual implementation, when time and budget
limitations permit, and also assesses the other data mining results produced.
o It unveils additional difficulties, suggestions, or information for future directions.
Review process
o The review process does a more detailed assessment of the data mining engagement to
determine whether there is a significant factor or task that has somehow been overlooked.
o It reviews quality assurance issues.
o It decides whether to complete the project and move on to deployment, or whether to initiate
further iterations or set up new data mining projects; this includes analysis of the resources
and budget that influence the decisions.
6. Deployment
Deployment refers to how the data mining outcomes need to be utilized in the business.
Tasks
o Plan deployment
o Review project
Plan deployment
o To deploy the data mining outcomes into the business, this task takes the evaluation results
and works out a strategy for deployment.
o It also covers documenting the procedure for later deployment.
Review project
o Review project assesses what went right and what went wrong, what was done well, and
what needs to be improved.
THE KDD PROCESS
1. Building up an understanding of the application domain
This is the initial preliminary step. It sets the scene for understanding what should be done with the
various decisions like transformation, algorithms, representation, etc. The individuals who are in
charge of a KDD venture need to understand and characterize the objectives of the end-user and the
environment in which the knowledge discovery process will occur (which involves relevant prior
knowledge).
2. Choosing and creating a data set on which discovery will be performed
Once the objectives are defined, the data that will be utilized for the knowledge discovery process
should be determined. This incorporates discovering what data is accessible, obtaining important
data, and afterwards integrating all the data for knowledge discovery into one set, covering the
attributes that will be considered for the process. This step is important because data mining learns
and discovers from the accessible data; this is the evidence base for building the models. If some
significant attributes are missing, the entire study may be unsuccessful; in this respect, the more
attributes that are considered, the better. On the other hand, organizing, collecting, and operating
advanced data repositories is expensive, so there is a trade-off against the opportunity for best
understanding the phenomena. This trade-off is one aspect where the interactive and iterative nature
of KDD takes place: one begins with the best available data set and later expands it and observes the
effect in terms of knowledge discovery and modeling.
3. Preprocessing and cleansing
In this step, data reliability is improved. It incorporates data cleaning, for example, handling missing
values and removing noise or outliers. It might involve complex statistical techniques or the use of a
data mining algorithm in this context. For example, when one suspects that a specific attribute is
unreliable or has much missing data, this attribute could become the target of a supervised data mining
algorithm: a prediction model for the attribute is created, and the missing values can then be predicted
(a minimal sketch of this idea follows). The extent to which one pays attention to this level relies on
numerous factors. Regardless, studying these aspects is significant and often revealing in itself with
respect to enterprise data frameworks.
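A minimal sketch of this idea, assuming pandas and scikit-learn are available; the column names and the choice of linear regression are illustrative.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

# Estimate missing values by modeling: train on the rows where the
# attribute is present, then predict it where it is missing.
df = pd.DataFrame({
    "age":    [25, 32, 47, 51, 38, 29],
    "income": [30_000, 42_000, None, 80_000, 55_000, None],
})

known = df[df["income"].notna()]    # rows usable as training data
missing = df[df["income"].isna()]   # rows whose attribute must be predicted

model = LinearRegression().fit(known[["age"]], known["income"])
df.loc[df["income"].isna(), "income"] = model.predict(missing[["age"]])
print(df)
```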
4. Data Transformation
In this stage, appropriate data for data mining is prepared and developed. Techniques here
incorporate dimension reduction (for example, feature selection and extraction, and record sampling)
and attribute transformation (for example, discretization of numerical attributes and functional
transformations). This step can be essential for the success of the entire KDD project, and it is
typically very project-specific. For example, in medical assessments, the quotient of attributes may
often be the most significant factor, rather than each attribute by itself. In business, we may need to
consider effects beyond our control as well as efforts and transient issues, for example, studying the
cumulative effect of advertising. However, even if we do not use the right transformation at the start,
we may obtain a surprising effect that hints at the transformation required in the next iteration. Thus,
the KDD process feeds back into itself and prompts an understanding of the transformation required.
5. Prediction and description
We are now ready to decide on which kind of data mining to use, for example, classification,
regression, clustering, etc. This mainly relies on the KDD objectives, and also on the previous steps.
There are two significant objectives in data mining: the first one is prediction, and the second one is
description. Prediction is usually referred to as supervised data mining, while descriptive data mining
incorporates the unsupervised and visualization aspects of data mining. Most data mining techniques
depend on inductive learning, where a model is built explicitly or implicitly by generalizing from an
adequate number of training examples. The fundamental assumption of the inductive approach is that
the trained model applies to future cases. The technique also takes into account the level of
meta-learning for the specific set of accessible data.
6. Selecting the data mining algorithm
Having chosen the technique, we now decide on the strategy. This stage includes choosing a
particular method to be used for searching patterns, which may involve multiple inducers. For
example, when weighing precision against understandability, the former is better with neural
networks, while the latter is better with decision trees. For each strategy of meta-learning, there are
several possibilities for how it can succeed. Meta-learning focuses on clarifying what causes a data
mining algorithm to be successful or not for a specific problem. Thus, this methodology attempts to
understand the conditions under which a data mining algorithm is most suitable. Each algorithm also
has parameters and strategies of learning, such as ten-fold cross-validation or another division of the
data for training and testing.
7. Utilizing the data mining algorithm
At last, the implementation of the data mining algorithm is reached. In this stage, we may need to run
the algorithm several times until a satisfying outcome is obtained, for example, by tuning the
algorithm's control parameters, such as the minimum number of instances in a single leaf of a decision
tree (a minimal sketch of such a parameter sweep follows).
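A minimal sketch of such repeated runs, assuming scikit-learn is available; the Iris data set and the chosen parameter values are illustrative.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Run the algorithm several times while tuning a control parameter:
# here, the minimum number of instances in a single leaf of a decision tree.
X, y = load_iris(return_X_y=True)

best = None
for min_leaf in (1, 5, 10, 20):
    scores = cross_val_score(
        DecisionTreeClassifier(min_samples_leaf=min_leaf, random_state=0),
        X, y, cv=10)                       # ten-fold cross-validation
    print(f"min_samples_leaf={min_leaf}: accuracy={scores.mean():.3f}")
    if best is None or scores.mean() > best[1]:
        best = (min_leaf, scores.mean())   # keep the most satisfying outcome

print("best parameter:", best)
```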
8. Evaluation
In this step, we assess and interpret the mined patterns and rules with respect to the objectives defined
in the first step, and judge their reliability. Here we consider the preprocessing steps with regard to
their impact on the data mining algorithm's results; for example, adding a feature in step 4 and
repeating from there. This step focuses on the comprehensibility and utility of the induced model. In
this step, the discovered knowledge is also documented for further use. The last part is the usage of,
and overall feedback on, the discovery results acquired by data mining.
9. Using the discovered knowledge
Now, we are prepared to incorporate the knowledge into another system for further action. The
knowledge becomes active in the sense that we may make changes to the system and measure the
effects. The success of this step determines the effectiveness of the whole KDD process. There are
numerous challenges in this step, such as losing the "laboratory conditions" under which we have
worked. For example, the knowledge was discovered from a certain static snapshot (usually a set of
data), but now the data becomes dynamic: data structures may change, certain quantities may become
unavailable, and the data domain might be modified, such as an attribute acquiring a value that was
not expected previously.
Data Mining vs Machine Learning
Data mining relates to extracting information from a large quantity of data. Data mining is a technique
for discovering different kinds of patterns that are inherent in a data set and which are precise, new,
and useful. Data mining works as a subset of business analytics and is similar to experimental studies.
Data mining has its origins in databases and statistics.
Machine learning involves algorithms that automatically improve through data-based experience.
Machine learning is a way to find new algorithms from experience. Machine learning includes the
study of algorithms that can automatically extract information from data. Machine learning utilizes
data mining techniques and other learning algorithms to construct models of what is happening behind
certain data so that it can predict future outcomes.
Data mining and machine learning are areas that have influenced each other; although they have many
things in common, they have different ends.
Data mining is performed on certain data sets by humans to find interesting patterns between the items
in the data set. Data mining uses techniques created by machine learning for predicting results, while
machine learning is the capability of a computer to learn from a mined data set.
Machine learning algorithms take the information that represents the relationships between items in
data sets and create models in order to predict future results. These models are nothing more than
actions that will be taken by the machine to achieve a result.
What is Data Mining?
Data mining is the method of extracting data or previously unknown data patterns from huge sets of
data. Hence, as the term suggests, we "mine for specific data" from the large data set. Data mining,
also called the knowledge discovery process, is a field of science that is used to determine the
properties of data sets. Gregory Piatetsky-Shapiro coined the term "Knowledge Discovery in
Databases" (KDD) in 1989. The term "data mining" appeared in the database community around
1990. Huge sets of data collected from data warehouses or complex data sets such as time series,
spatial data, etc. are processed in order to extract interesting correlations and patterns between the
data items. For machine learning algorithms, the output of a data mining algorithm is often used as
input.
What is Machine Learning?
Machine learning is related to the development and design of a machine that can learn by itself from a
specified set of data to obtain a desirable result without being explicitly coded. Hence machine
learning implies 'a machine which learns on its own'. Arthur Samuel, an American pioneer in the
areas of computer gaming and artificial intelligence, coined the term machine learning in 1959. He
said that "it gives computers the ability to learn without being explicitly programmed."
Machine learning is a technique that creates complex algorithms for large-scale data processing and
provides outcomes to its users. It utilizes complex programs that can learn through experience and
make predictions.
The algorithms improve themselves through the frequent input of training data. The aim of machine
learning is to understand data and build models from it that can be understood and used by humans.
Machine learning algorithms are divided into two types:
1. Unsupervised Learning
2. Supervised Learning
1. Unsupervised Machine Learning:
Unsupervised learning does not depend on labeled training data sets to predict results; instead, it uses
direct techniques such as clustering and association to predict the results. Training data sets are
defined as input for which the output is known.
2. Supervised Machine Learning:
As the name implies, supervised learning involves the presence of a supervisor acting as a teacher.
Supervised learning is a learning process in which we teach or train the machine using data that is
well labeled, meaning that some of the data is already tagged with the correct answers. After that, the
machine is provided with new sets of data so that the supervised learning algorithm analyzes the
training data and produces an accurate result from the labeled data. A minimal sketch contrasting the
two types follows.
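A minimal sketch contrasting the two types, assuming scikit-learn is available; the Iris data set, KMeans, and the k-nearest-neighbors classifier are illustrative choices.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

# Unsupervised: clustering uses only the inputs X; no labels are given.
clusters = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

# Supervised: the model is trained on labeled data (X_train, y_train),
# then evaluated on new data it has not seen before.
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
clf = KNeighborsClassifier().fit(X_train, y_train)

print("clusters found (no labels used):", sorted(set(clusters)))
print("supervised test accuracy:", clf.score(X_test, y_test))
```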
Major Differences between Data Mining and Machine Learning
1. Two components are used to introduce data mining techniques: the first one is the database, and the
second one is machine learning. The database provides data management techniques, while machine
learning provides methods of data analysis. Machine learning methods, in turn, are introduced through
algorithms.
2. Data mining uses more data to obtain helpful information, and that specific data will help to
predict some future results. For example, a marketing company may use last year's data to predict this
year's sales; machine learning, by contrast, does not depend as much on stored data, since it relies on
algorithms. Many transportation companies, such as OLA and UBER, use machine learning
techniques to calculate the ETA (Estimated Time of Arrival) for rides.
3. Data mining is not capable of self-learning. It follows predefined guidelines and will provide the
answer to a specific problem, whereas machine learning algorithms are self-adjusting and can alter
their rules according to the situation, find the solution to a specific problem, and resolve it in their
own way.
4. The main and most important difference between data mining and machine learning is that data
mining cannot work without the involvement of humans, whereas in machine learning human effort
is only involved when the algorithm is defined; after that, it concludes everything on its own. Once
implemented, it can be used indefinitely, which is not possible in the case of data mining.
5. As machine learning is an automated process, the results produced by machine learning can be
more precise than those of data mining.
6. Data mining utilizes a database, a data warehouse server, a data mining engine, and pattern
evaluation techniques to obtain useful information, whereas machine learning utilizes neural
networks, predictive models, and automated algorithms to make decisions.
Data Mining vs Machine Learning
Origin
  Data Mining: Traditional databases with unstructured data.
  Machine Learning: It has an existing algorithm and data.
Meaning
  Data Mining: Extracting information from a huge amount of data.
  Machine Learning: Introducing new information from data as well as previous experience.
History
  Data Mining: In 1930, it was known as knowledge discovery in databases (KDD).
  Machine Learning: The first program, i.e., Samuel's checker-playing program, was established in 1950.
Responsibility
  Data Mining: Data mining is used to obtain the rules from the existing data.
  Machine Learning: Machine learning teaches the computer how to learn and comprehend the rules.
Abstraction
  Data Mining: Data mining abstracts from the data warehouse.
  Machine Learning: Machine learning reads from the machine.
Techniques involved
  Data Mining: Data mining is more of a research activity using techniques like machine learning.
  Machine Learning: It is a self-learned and trained system doing the task precisely.
Scope
  Data Mining: Applied in limited fields.
  Machine Learning: It can be used in a vast area.
DATABASE IN E-COMMERCE
PRESENTED BY: OLAYIWOLA FOLORUNSO OLAYIWOLA UIL/PG2022/0349
1.0 INTRODUCTION
As technology continues to evolve, it is clear that e-commerce continues to evolve and help businesses
as well. Before the internet and its technologies were invented, customers physically entered physical
shops to purchase goods and request services. Today, customers can purchase products online from
their own homes and even obtain overseas products. In 2020, there were estimated to be 12-24 million
e-commerce sites globally.
E-commerce (Electronic Commerce) itself is a service that is used to carry out transactions; in other
words, e-commerce refers to a setting in which a website provides space to carry out various forms of
online transactions, in the form of trading activities and purchases, while also relying on internet
facilities, which play an important role in supporting the process. The development of technology is
having an increasingly strong impact on the systematic performance of e-commerce, providing
opportunities for good communication and interaction between people around the world and offering
efficiency across various domains.
In e-commerce, of course, it is impossible to do without the use of databases: the database has a role
as a container to store all forms of data related to the transactions carried out on the e-commerce
website. Processing the various forms of data in the database accelerates the acquisition of
information, which can improve service to customers. Because a database works by keeping related
records consisting of many interconnected fields, it suits e-commerce, which performs many types of
activities, such as sales, purchase planning, exchanges between businesspeople, and the internal
processes that companies use to support the running of the business.
Therefore, using databases in e-commerce can make it easier to access and store all forms of data and
information contained in e-commerce. In addition, databases also provide a safeguard for all forms of
operational activity, so that the e-commerce operator can act swiftly and appropriately in dealing with
problems. A database uses tables, rows, and columns, similar to spreadsheets, to organize and retrieve
data.
1. User Management
This structure means that a user table contains all user details, while a user payment table stores the
users' payment details. This structure provides more granular control of the data.
2. Product Management
This structure means that separate tables (the discount table, the product inventory table, and the
product category table) are connected to the product table through database relationships; this
approach provides the greatest level of flexibility for the database system. A minimal sketch of such
a schema follows.
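A minimal sketch of such a schema, using Python's built-in sqlite3 module; every table and column name here is a hypothetical illustration of the structures described above.

```python
import sqlite3

# Illustrative e-commerce schema: user management plus product management
# with discount, inventory, and category tables linked by relationships.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE user (
    user_id INTEGER PRIMARY KEY, name TEXT, email TEXT UNIQUE);
CREATE TABLE user_payment (              -- payment details kept separately
    payment_id INTEGER PRIMARY KEY,
    user_id INTEGER REFERENCES user(user_id),
    provider TEXT, account_no TEXT);
CREATE TABLE product_category (category_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE discount (discount_id INTEGER PRIMARY KEY, percent REAL);
CREATE TABLE product (
    product_id INTEGER PRIMARY KEY, name TEXT, price REAL,
    category_id INTEGER REFERENCES product_category(category_id),
    discount_id INTEGER REFERENCES discount(discount_id));
CREATE TABLE product_inventory (
    product_id INTEGER REFERENCES product(product_id), quantity INTEGER);
""")
print("tables created:",
      [r[0] for r in conn.execute(
          "SELECT name FROM sqlite_master WHERE type='table'")])
```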
3. Shopping Process
The reason why many e-commerce sites use databases is that databases provide many benefits. A few
of the benefits of using a database in e-commerce include the following:
Avoid human error
Analyze data in various ways
Carry out inventory management more efficiently
Help users have easier access to the required data
Help businesses protect data so that only authorized users can access it
Help businesses restore and back up data that has been damaged or contains an error
Help pinpoint potential customers and business opportunities
Help businesses evolve and adapt to the marketing environment
DATABASE ADMINISTRATOR (DBA)
Responsibilities
Installation, configuration, and upgrading of database server software and related products.
Evaluate database features and database-related products.
Establish and maintain sound backup and recovery policies and procedures.
Take care of database design and implementation.
Implement and maintain database security (create and maintain users and roles, assign
privileges).
Database tuning and performance monitoring.
Application tuning and performance monitoring.
Set up and maintain documentation and standards.
Plan growth and changes (capacity planning).
Work as part of a team and provide 24/7 support when required.
Do general technical troubleshooting and give consultation.
Database recovery.
Types
There are three types of DBAs:
1. Systems DBAs (also referred to as physical DBAs, operations DBAs, or production support
DBAs): focus on the physical aspects of database administration such as DBMS installation,
configuration, patching, upgrades, backups, restores, refreshes, performance optimization,
maintenance, and disaster recovery.
2. Development DBAs: focus on the logical and development aspects of database administration
such as data model design and maintenance, DDL (data definition language) generation, SQL
writing and tuning, coding stored procedures, collaborating with developers to help choose the
most appropriate DBMS feature/functionality and other pre-production activities.
3. Application DBAs: usually found in organizations that have purchased third-party application
software such as ERP (enterprise resource planning) and CRM (customer relationship
management) systems. Examples of such application software include Oracle Applications,
Siebel and PeopleSoft (both now part of Oracle Corp.), and SAP. Application DBAs straddle
the fence between the DBMS and the application software and are responsible for ensuring that
the application is fully optimized for the database and vice versa. They usually manage all the
application components that interact with the database and carry out activities such as
application installation and patching, application upgrades, database cloning, building and
running data cleanup routines, data load process management, etc.
Databases play a critical role in web app development; the database is one of the most important
aspects of building an application. It is necessary that you have good knowledge of databases before
using them in your application. Database design plays a key role in the operation of your website and
provides you with information regarding transactions, data integrity, and security issues. In this
section, you will learn the role of databases in web application development. You will also learn about
the most popular web app databases and how to connect databases to web applications.
In the early days of computing, databases were synonymous with files on disk. The term is still
commonly used this way, for example when people refer to their hard drive as their "main database".
Data is the foundation of a web application. It is used to store user information, session data, and other
application data. The database is the central repository for all of this data. Web applications use a
variety of databases to store data such as flat files, relational databases, object-relational databases, and
NoSQL databases. Each type of database has its own advantages and disadvantages when it comes to
storing and retrieving data.
A database is a collection of data and information that is stored in an organized manner for easy
retrieval. The primary purpose of a database is to store, retrieve, and update information. A database
can be used to store data related to any aspect of business operations.
Databases can be very large, containing millions of records, or very small, containing just a few
records or even a single record. They may be stored on hard disks or other media, or they may exist
only in memory. In the early days of computing, databases were stored on tape drives or punch cards.
Today they're stored on hard drives, flash memory cards, and other media.
Databases are designed to ensure that the data they contain is organized and easily retrievable. A
database management system (DBMS) is the software used to create and maintain a database.
The role of the database in a web application is very important. The web application interacts with the
database to store and retrieve data. The database is used to store all the information that the users need
to keep. For example, if you are developing a shopping cart website, it will contain product details,
customer details, order details, etc. In this case, you need to store this information in a database so that
it can be used later on; a minimal sketch follows.
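A minimal sketch of a web application storing and later retrieving order data, using Python's built-in sqlite3 module; the schema and function names are hypothetical, and a real site would use a persistent server-side database rather than an in-memory one.

```python
import sqlite3

# Toy shopping-cart storage: the web app writes orders at checkout and
# reads them back later to render a customer's order history page.
db = sqlite3.connect(":memory:")
db.execute("""CREATE TABLE orders (
    order_id INTEGER PRIMARY KEY, customer TEXT, product TEXT, qty INTEGER)""")

def place_order(customer, product, qty):
    """Called by the web app when a user checks out."""
    db.execute("INSERT INTO orders (customer, product, qty) VALUES (?, ?, ?)",
               (customer, product, qty))
    db.commit()

def order_history(customer):
    """Called later to render the user's 'my orders' page."""
    return db.execute("SELECT product, qty FROM orders WHERE customer = ?",
                      (customer,)).fetchall()

place_order("ada@example.com", "Laptop sleeve", 2)
print(order_history("ada@example.com"))
```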
Why Do Web App Developers Need a Database?
The first thing one should know when it comes to databases is the need. There are huge numbers of
businesses out there, whose revenue depends on the success and future of their database. You see, a
database is extremely important for online companies and businesses as well. These days databases are
used for various purposes like managing financial records, setting up customer profiles, keeping
inventory and ordering information, etc. But what does all this mean?
Most modern web applications are based on a database. The database stores information about the
users, products, orders, and more. A database is an important component of any web application
because it provides a central location for storing user information and business logic. In addition to
this, it allows you to store complex data structures with minimal effort.
Databases are used by businesses to collect and store customer information, financial records, and
inventory data. They're also used in research projects to store information about experiments or tests.
For example, if you were conducting a survey on the habits of people who eat cereal for breakfast, you
might use a database to keep track of your results.
Databases are also used by government agencies to store public records like birth certificates and
marriage licenses. Databases are also used by medical researchers who need to record the medical
history of patients in order to determine how effective certain treatments may be for different diseases
or conditions.
Web applications are becoming more and more popular because they allow users to access information
from different devices at the same time. A web application database offers benefits such as:
Security
A web application database provides security features such as encryption and password protection. If
a user's password is lost or compromised, encryption helps prevent someone else from accessing the
information stored in the database.
Accessibility
Users can access their data from any internet-enabled device, which includes smartphones and tablets
as well as laptops and desktops. This means that users do not have to worry about losing their valuable
data because it is stored on another device.
Scalability
Web applications are usually accessed by many users simultaneously, unlike traditional desktop
applications that are accessed by one person at a time, so web apps need to be able to handle more
simultaneous requests than their desktop counterparts. Web application databases can use a distributed
architecture (multiple servers) to scale up quickly when demand increases, so they can handle large
numbers of simultaneous requests without slowing down or crashing.
Ease of maintenance for IT staff
Because web application databases use a distributed architecture, problems can be isolated and fixed
quickly, which reduces downtime for the end user and reduces costs for the IT staff responsible for
maintaining the system. Also, database automation tools can make database tasks easier and safer.
Relational
A database is a large collection of structured data, which can be accessed to find specific information.
Relational databases are famous for their structure and have been used by programmers for years.
MySQL (Relational)
MySQL is a relational database management system (RDBMS) based on SQL. It is a popular database
server and a multi-user, multi-threaded SQL database. The name "MySQL" combines "My", the name
of co-founder Michael Widenius's daughter, with "SQL". It is written in the C and C++ programming
languages. The server has been licensed under GPLv2 since it was open-sourced, and it is also
available from Oracle under a commercial license.
The MySQL database is often used for data storage, especially in web applications, and it is also
widely used for creating and maintaining relational database tables. MySQL was originally developed
by a Swedish company called MySQL AB, which was bought by Sun Microsystems in 2008; since
Oracle's acquisition of Sun in 2010, the project has been owned and managed by Oracle Corporation.
It has become one of the most popular open-source database systems in the world, used in web and
mobile applications by corporations large and small and across all industries.
PostgreSQL (Relational)
PostgreSQL is an object-relational database management system that supports SQL-based queries,
similar to those used by other RDBMS systems such as MySQL or Oracle Database. PostgreSQL is
developed and maintained by the PostgreSQL Global Development Group, which is made up of
several companies and individuals who have contributed code to the project over time.
PostgreSQL's developers do not require contributors to sign a Contributor License Agreement (CLA).
The software is distributed under the PostgreSQL License, a liberal open-source license similar to the
BSD and MIT licenses, which allows anyone to use it for any purpose without paying royalties or fees,
provided its copyright notice is preserved.
MongoDB (Non-Relational)
MongoDB's development began in 2007, when its creators at the company 10gen set out to build a
platform-as-a-service product for web applications and found that existing commercial databases did
not meet their requirements. They developed MongoDB as a component of that planned platform
before 10gen shifted its focus to offering the database itself as a product. In 2009, the company moved
MongoDB to an open-source development model with commercial support, and in 2013, 10gen
changed its name to MongoDB Inc.
Cassandra (Non-Relational)
Cassandra is an open-source database management system that runs on many servers, making it well-
suited for handling large amounts of data. It offers fast performance and can scale up to a petabyte of
data across multiple servers, making it useful for applications with high write-throughput
requirements.
Cassandra is built on the principles of Amazon's Dynamo with the goal of addressing some of its
problems. The technology was developed at Facebook, which released it as open source in 2008; it
entered the Apache Incubator in 2009 and graduated to become an Apache Top-Level Project (TLP)
in 2010.
Cassandra's architecture is based on Dynamo, but differs from it significantly in its design details,
especially regarding consistency guarantees and failure detection mechanisms. In particular, Cassandra
does not provide strong consistency; instead, it aims to provide high availability by making it easy to
deploy multiple copies of the data across many hosts while tolerating failures at any one host. This
makes Cassandra a popular choice for internet startups that must scale quickly and cheaply.
Cassandra began as a key-value-style store, but it has a flexible wide-column data model, so you can
use it to store virtually any kind of data. You can also use Cassandra for full-text search (through
integrations), or even for storing graph data (although there are better options for graph storage than
Cassandra).
Neo4j (Non-Relational)
Neo4j is an open-source graph database management system that stores data in a native graph
format. It is designed to store data and query it very quickly, making it ideal for applications that
involve complex relationships between entities. It uses the native graph data model to provide ACID
transactions, high availability, and indexing. It is used by many companies to power their critical
applications, including eBay and Walmart.
Unlike relational databases, Neo4j doesn't enforce a schema on your data. This makes it easier to build
applications that model real-world problems such as social networks or product recommendations.
You can create multiple nodes for the same entity without duplicating data or having to use foreign
keys. In addition, Neo4j allows you to add properties to existing nodes without having to create a new
table first. These features make Neo4j much more agile than traditional relational databases when
modeling complex relationships between entities with many attributes and relationships between them.
MariaDB (Relational)
MariaDB is a fork of the MySQL relational database management system intended to remain free
under the GNU GPL. MariaDB was forked in 2009 by some of the original developers of MySQL
when Oracle announced its acquisition of Sun Microsystems (MySQL's owner at the time), out of
concern for the future of the community-developed version of MySQL alongside Oracle's paid
enterprise products.
The original developers of MySQL created MariaDB to provide a better development environment and
more robust performance. MariaDB strives to be compatible with MySQL and includes most of its
storage engines. However, not all MySQL features are supported in MariaDB Server, so it is
recommended that you check for compatibility before relying on any feature that may be affected by
a difference or limitation in MariaDB Server.
MSSQL (Relational)
MSSQL databases are the core of Microsoft SQL Server. It is a relational database management
system (RDBMS), a special type of database software that is used to create, store and manipulate data
in an organized manner.
MSSQL can be used to build enterprise-level business solutions and applications. Regardless of the
platform or device your users are using, you can use MSSQL to create a centralized data store with a
single version of the truth. You can also use it to create a single source of truth for your data analytics
and reporting technologies, such as Power BI and Tableau.
Conclusion
The database is an integral part of any Web application or website. Whether it is used for storing data
in an easy-to-access manner or for maintenance, the database is going to play a role in the success of
your project and you can't overlook it. For those who are simply going to be accessing data, the
strength of the database will not matter much as long as it has all the functionality they need.
However, those who plan on using it or maintaining it should really explore why one database type
may work better than another. If a web app is going to run fast and efficiently with minimal downtime,
every consideration needs to be made so that bottlenecks do not occur. The success of your project
may depend on your choice of database.
Popular back-end web development languages for working with databases include:
Python: the most popular open-source, back-end web development language in 2023.
PHP: an open-source scripting language.
Java: an object-oriented, platform-independent, and secure programming language.
C#
Ruby
Swift
Kotlin