Asm Final Bps Phamletruong Bh01358

ASSIGNMENT FINAL REPORT

Qualification BTEC Level 5 HND Diploma in Computing

Unit number and title Unit 5: Business Process Support

Submission date 09/08/2024 Date Received 1st submission

Re-submission Date Date Received 2nd submission

Student Name PHAM LE TRUONG Student ID BH01358

Class SE06304 Assessor name VU ANH TU

Plagiarism
Plagiarism is a particular form of cheating. Plagiarism must be avoided at all costs and students who break the rules, however innocently, may be
penalised. It is your responsibility to ensure that you understand correct referencing practices. As a university level student, you are expected to use
appropriate references throughout and keep carefully detailed notes of all your sources of materials for material you have used in your work,
including any material downloaded from the Internet. Please consult the relevant unit lecturer or your course tutor if you need any further advice.

Student Declaration
I certify that the assignment submission is entirely my own work and I fully understand the consequences of plagiarism. I declare that the work
submitted for assessment has been carried out without assistance other than that which is acceptable according to the rules of the specification. I
certify I have clearly referenced any sources and any artificial intelligence (AI) tools used in the work. I understand that making a false declaration is
a form of malpractice.
Student’s signature

Grading grid

P1 P2 P3 P4 P5 P6 P7 P8 M1 M2 M3 M4 M5 D1 D2 D3
Summative Feedback: Resubmission Feedback:

Grade: Assessor Signature: Date:


Internal Verifier’s Comments:

Signature & Date:


Contents
I. INTRODUCTION
II. CONTENT
2.1. Discuss how data and information support business processes and the value they have for organizations
2.1.1. What is business data?
2.1.2. What is information in businesses?
2.1.3. Types of data used in business
2.1.4. The value of data and information to the organization
2.2. Discuss how data is generated and the tools used to manipulate it to form meaningful data that supports business operations
2.2.1. Sources that create data within the organization
2.2.2. Methods of generating and using data
2.2.3. Tools and techniques used to analyze and transform data into meaningful information
2.3. Assess the value of data and information to individuals and organizations in relation to real-world business processes
2.4. Discuss the social, legal and ethical implications of using data and information to support business processes
2.4.1. Privacy and data protection concerns
2.4.2. Ethical issues in data collection and use
2.4.3. Legal compliance requirements of using data and information to support business processes
2.5. Describe common threats to data and how they can be mitigated at a personal and organizational level
2.5.1. Internal and external threats to data security
2.5.2. Mitigation Strategies
2.5.3. Business continuity and recovery plan after the model
2.6. Analyse the impact of using data and information to support real-world business processes
2.6.1. Improve operational efficiency and productivity
2.6.2. Challenges and limitations in implementing data-driven strategies
2.6.3. Recommendations to enhance the value of data and information
2.7. Discuss how tools and technologies associated with data science are used to support business processes and inform decisions
2.7.1. How Data Analytics Supports Business Processes and Decision Making
2.7.2. Popular data science tools
2.7.3. The tools help collect and synthesize data
2.7.4. Data cleaning tools and solutions
2.8. Assess the benefits of using data science to solve problems in real-world scenarios
2.8.1. Tool and technology benefits of ABC Manufacturing project
2.8.2. Assess the benefits of using data science to solve problems
2.9. Design a data science solution to support decision-making related to a real-world problem
2.9.1. Problems encountered by ABC Manufacturing when collecting data
2.9.2. Data science solutions to support decision making
2.9.3. Apply data science tools to solve problems encountered when collecting data for ABC Manufacturing
2.9.4. Image of the overall architectural design
2.10. Implement a data science solution to support decision making related to a real-world problem
2.10.1. What is data collection?
2.10.2. ABC Manufacturing’s data collection method
2.10.3. Data cleaning and preprocessing
2.10.4. Apply data science solutions to support decision making
2.11. Make justified recommendations that support decision making related to a real-world problem
2.11.1. Most common data science techniques
2.11.2. Use data science techniques to make recommendations
III. EVALUATE
IV. CONCLUSION
V. REFERENCES
Table of Figures
Figure 1: business data
Figure 2: Excel
Figure 3: Power BI
Figure 4: data analysis techniques
Figure 5: Power BI Dash Board (1)
Figure 6: Power BI Dash board (2)
Figure 7: Pie chart
Figure 8: Calculate the top N and bottom N stores by revenue
Figure 9: Show the state locations on a map and visualize the profit performance using bubble size
Figure 10: Calculate the profit of each product
Figure 11: area chart
Figure 12: data security, privacy
Figure 13: GDPR
Figure 14: CCPA
Figure 15: External Threats
Figure 16: data driven decision making
Figure 17: Python
Figure 18: SQL
Figure 19: Power BI
Figure 20: Pandas
Figure 21: Jupyter
Figure 22: example of an overall system design architecture
Figure 23: data collection
Figure 24: python code (1)
Figure 25: data
Figure 26: python code (2)
Figure 27: result
Figure 28: result duplicate rows
Figure 29: code python (3)
Figure 30: code and result change data type
Figure 31: code and result replace data
Figure 32: code and result merge data
Figure 33: code and result create new columns
Figure 34: result line chart
Figure 35: code python show total sales by month
Figure 36: code python shows sale for the top 5 cities
Figure 37: result bar chart
Figure 38: code and bar chart of top 5 STIDs with highest sales
Figure 39: code and bar chart of top 5 STIDs with lowest sales
I. INTRODUCTION

In today’s highly competitive manufacturing landscape, data and information have become the lifeblood
of organizations looking to optimize operations, drive innovation, and stay ahead. ABC Manufacturing, a
leading manufacturer of high-quality industrial components, recognizes the immense potential of data
science to transform its business processes and enhance its competitive advantage.

As a modern manufacturing facility, ABC generates large amounts of data from a variety of sources,
including production line sensors, inventory management systems, customer relationship databases, and
financial records. By leveraging the power of data science, the organization aims to extract meaningful
insights from this data to support critical decision-making, improve operational efficiency, and deliver
superior value to customers.

This report outlines the ways data and information can enhance core business processes, such as supply
chain optimization, inventory management, and customer relationship management. It also explores the
tools and techniques used to collect, store, and analyze data, enabling ABC to make data-driven decisions
that drive continuous improvement. As ABC Manufacturing embarks on its data science journey, it must
prioritize robust data privacy and security measures to protect the sensitive information of its employees,
customers, and stakeholders.

ABC Manufacturing can strategically integrate data science into its operations, opening up new
opportunities for growth, efficiency, and innovation.
II. CONTENT
2.1. Discuss how data and information support business processes and the value they have for
organizations.
2.1.1. What is business data?

Business data is information collected, stored and processed by businesses to support business decision making. It
can include many different types of information, including:

• Transaction data: Sales revenue, costs, number of customers, etc.

• Customer data: Personal information, preferences, shopping behavior, etc.

• Market data: Market trends, competitors, SWOT analysis, etc.

• Operational data: Operational performance, process efficiency, etc.

• Employee data: Salary, productivity, review data, etc.

Figure 1: business data

Many platforms, including point-of-sale (POS) systems, websites, social media, IoT devices, etc., can be used to
collect business data. Once collected, the data is stored in a database and analyzed using data analysis tools.

Organizations of any size can benefit greatly from having access to business data. By using business data
effectively, businesses can boost profitability, gain a competitive edge, and improve performance.
2.1.2. What is information in businesses?

Data, statistics, reports, events and any other type of information that can help businesses better
understand their industry, customers, competitors and internal processes are called business information.

This information helps companies make informed decisions and supports the following activities:

• Strategic planning: Information about the market, competitors, and customer trends helps
businesses build appropriate business strategies and development directions.

• Decision making: Information about internal operations, production efficiency, financial situation
helps business leaders make wise decisions in investment, production, marketing, etc.

• Improve operational efficiency: Information about production processes and customer services
helps businesses identify weaknesses and strengths, thereby taking measures to improve
operational efficiency.

• Enhance competitive advantage: Information about the market and competitors helps businesses
develop products and services in accordance with market needs, creating a competitive advantage
over competitors.

Information is crucial in assisting firms in gaining a competitive edge and making wise decisions. To get the
most out of information in company operations, companies must create an efficient information
management system.

2.1.3. Types of data used in business

The types of data used in business are diverse, including:

• Transaction data: Sales revenue, costs, number of customers, product price, conversion rate,
average order value, profit

• Customer data: Personal information (full name, age, gender, address, email, phone number),
preferences and shopping behavior, purchase history, customer feedback, customer satisfaction
level

• Market data: Market trends, market size, market share, competitors, SWOT analysis, market
price, customer demand

• Operational data: Operational efficiency, process efficiency, labor productivity, job completion
time, error rate, resource usage

• Employee data: Salary, productivity, review data, engagement, turnover, skills and experience

• Data from other sources: Social networks, websites, IoT devices, sensor systems, weather data,
economic data

Businesses may use different kinds of data for different objectives. For instance, market data may be
used to find new business opportunities, transaction data can be used to track sales performance, and
customer data can be used to enhance marketing efforts.

2.1.4. The value of data and information to the organization

Informed decision-making: Information and data provide companies the foundation they need to make
judgments based on facts rather than conjecture or gut feeling. Organizations may benefit from increased
performance, lower risk, and a competitive edge in this way.

Boost operational efficiency: Information and data may be utilized to pinpoint areas where a business can
be more productive, save expenses, or reduce waste.

Development of new products and services: Information on consumer demands, industry trends, and
emerging technology may be utilized to create new offerings that cater to consumer demands.

Boost customer happiness: By using consumer data to better understand their requirements and desires,
businesses may offer better customer service and raise customer satisfaction.

Risk management: Information concerning outside variables that might have an impact on the company,
such as market trends, government regulations, and economic conditions, can be utilized to assess risk.

Improved compliance: Information may be utilized to make sure that the company complies with relevant
legal requirements.

Enhance competitive advantage: Organizations that use data and information effectively can gain a
competitive advantage over competitors that do not use data. This is because these organizations can
make more informed decisions, develop new products and services faster, and respond to customer needs
better.

2.2. Discuss how data is generated and the tools used to manipulate it to form meaningful data
that supports business operations
2.2.1. Sources that create data within the organization

An organization's data comes from a variety of sources, including everyday operations and exchanges with
the outside world. Here are a few particular instances:

➢ Internal activities:

Sales activities:

• Sales transaction: The point of sale (POS) or invoicing system documents the product, quantity,
price, mode of payment, customer details, and other relevant information when a customer buys
a good or service.

• Customer service activities: Data regarding requests, comments, complaints, and other customer
information will be created through phone, email, text message, and other interactions between
sales representatives and customers.

Production tasks:

• Production data: Production lines and machinery systems will keep track of information on output,
time spent producing, raw materials used, etc.

• Data on quality control: Activities involving the inspection and assessment of product quality will
produce information on error rates, error causes, etc.

Financial activities:

• Financial transactions: Data on income, expenses, profits, cash flow, and other related topics will
be recorded by the accounting system.

• Payment activities: Data will be created on invoices and documents as a result of payments made
with suppliers, customers, etc.

Human resources activities:

• Employee profile: The human resources management system stores the personal data, educational
background, employment history, pay, and other details of employees.

• Timekeeping information: The timekeeping system keeps track of information on workers'
working hours, absences, and attendance.

➢ Outside activities:

Website and application:

• Interaction data: Information regarding a user's actions, such as the pages they view, the goods they
click on, and how long they spend on a website, is captured as they use a website or application.

• User feedback: Reviews, comments, and feedback on the application or website are also significant
sources of information.

Social network:

• Social media interaction: Information on brand awareness, consumer interest level, and other
social media metrics may be obtained from posts, comments, and shares pertaining to the brand.

• Advertising: Information on how well social media advertising initiatives are working is another
helpful data source.

Business Partners:

• Transaction data: The partner relationship management (PRM) system contains information on the
purchases and sales of products and services made by the company and its partners.

• Collaboration data: Information shared between partners and organizations, as well as statistics on
collaborative initiatives, are significant sources of information.

Government agencies:

• Tax reporting: Revenue, profit, and tax information is reported to the appropriate tax authorities.

• Statistics: Market and demographic data supplied by statistics agencies can also be used for
analysis.

Furthermore, a mix of internal and external sources may also create data. For instance, to produce a more
thorough study of consumer purchasing behavior, sales performance data and customer demographic
data can be merged.
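
The combination of sales data with customer demographic data described above can be sketched in a few lines of Python. This is a minimal illustration only: the records, the field names (`customer_id`, `amount`, `age_group`) and the function `sales_by_age_group` are hypothetical and are not taken from ABC Manufacturing's data.

```python
from collections import defaultdict

# Hypothetical internal sales records (assumed field names).
sales = [
    {"customer_id": 1, "amount": 120.0},
    {"customer_id": 2, "amount": 80.0},
    {"customer_id": 1, "amount": 50.0},
    {"customer_id": 3, "amount": 200.0},
]

# Hypothetical demographic data keyed by customer ID.
customers = {
    1: {"age_group": "18-30"},
    2: {"age_group": "31-45"},
    3: {"age_group": "18-30"},
}


def sales_by_age_group(sales, customers):
    """Join each sale to its customer's profile and total the amounts per age group."""
    totals = defaultdict(float)
    for sale in sales:
        profile = customers.get(sale["customer_id"])
        if profile is None:
            continue  # skip sales that have no matching customer record
        totals[profile["age_group"]] += sale["amount"]
    return dict(totals)


result = sales_by_age_group(sales, customers)
print(result)
```

Joining on a shared key such as a customer ID is what allows the two sources to be analyzed together; in practice the same step is usually done with a database join or a spreadsheet lookup.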

2.2.2. Methods of generating and using data

Generating data

Data type | How to create
Transaction data | Record sales information (products, prices, etc.); use a POS system; connect to online sales channels
Customer data | Collect personal information when customers register; analyze shopping behavior; use a CRM program
Marketing data | Monitor campaign effectiveness; use marketing analysis tools; survey customers
Social network & web data | Monitor interactions on social networks; analyze website traffic; listen to online conversations
Analytical data | Use data analysis tools; analyze predictive data; create machine learning models

In addition, businesses can also use external data sources such as market data, industry data, economic
data, etc. to supplement internal data.

Effective use of data can help businesses make more informed decisions, improve operational efficiency,
enhance competitive advantage and achieve business goals.

Using data

• Decision making:

Evaluate market and competition data: Assist companies in making informed strategic decisions about
their offerings, pricing, marketing, and other areas by providing a deeper understanding of the market,
consumer demands, and competitive landscape.

Operational data analysis: Assists companies in assessing the effectiveness of divisions, procedures, goods,
etc.; it does this by pointing out areas of strength and weakness and providing suggestions for
improvement.

Customer data analysis: Helps companies better understand the requirements, tastes, and behavior of
their customers. This helps them make more informed marketing decisions, boost sales, and increase
marketing efficiency.

• Marketing and interaction with customers:

Consumer segmentation: By using data, organizations can group customers based on shared demographics,
buying habits, hobbies, etc., and create marketing strategies that are tailored to the specific needs of
each consumer group.

Customize consumer experience: By suggesting goods and services based on the interests and requirements
of individual customers, data assists companies in customizing the customer experience.

Engage customers: Information enables companies to communicate with clients more efficiently via text,
social media, email, and other methods.
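
As an illustration of the consumer segmentation idea above, the following sketch groups customers by their total spending. The customer IDs, the spending figures and the two thresholds (`high` and `low`) are assumptions chosen for the example, not values from the report's data set; real segmentation would normally combine several attributes.

```python
def segment_customers(spend_by_customer, high=1000.0, low=200.0):
    """Assign each customer to a spending segment using two assumed thresholds."""
    segments = {"high": [], "medium": [], "low": []}
    for customer, spend in sorted(spend_by_customer.items()):
        if spend >= high:
            segments["high"].append(customer)
        elif spend >= low:
            segments["medium"].append(customer)
        else:
            segments["low"].append(customer)
    return segments


# Hypothetical total spend per customer.
spend = {"C001": 1500.0, "C002": 150.0, "C003": 640.0, "C004": 1000.0}
segments = segment_customers(spend)
print(segments)
```

Each resulting group can then receive its own marketing message, which is the purpose of segmentation described in this section.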

• Risk management:

Determine and analyze risks: Information assists companies in determining possible hazards that may
impact operations, as well as the degree and probability of such risks materializing.

Risk prevention planning: Companies may minimize the harm caused by hazards by developing efficient
risk prevention plans based on data from risk assessments.

Risk management and monitoring: Data enables companies to keep tabs on potential threats and to make
required adjustments to their risk prevention strategies.

• Monitor and improve performance:

Track performance: Using key performance indicators (KPIs), data enables firms to monitor the
effectiveness of various departments, processes, products, and other areas.

Identify strengths and weaknesses: Data assists companies in determining where operations are weak
and where they can be improved.

Analyze the success of improvement projects: Data enables companies to assess the success of
improvement initiatives and make required modifications.

2.2.3. Tools and techniques used to analyze and transform data into meaningful information

Tools:

• Spreadsheet software: Microsoft Excel, Google Sheets

• Data visualization tools: Tableau, Power BI, Qlik Sense

Techniques:

• Descriptive analysis: Use statistics like mean, median, and standard deviation to summarize data.

• Exploratory analysis: Analyze data for patterns and trends.

• Predictive analysis: Make future value predictions using historical and current data.

• Segment analysis: Sort the data into more manageable groupings based on shared attributes.

• Model analysis: Develop mathematical representations of the relationships between the
variables in the data.
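
To make the descriptive analysis technique listed above concrete, the short sketch below computes the mean, median and sample standard deviation with Python's standard `statistics` module. The `monthly_sales` values are made-up figures used only for illustration.

```python
import statistics

# Hypothetical monthly sales figures (units sold).
monthly_sales = [120, 135, 150, 110, 160, 145]

summary = {
    "mean": statistics.mean(monthly_sales),      # arithmetic average
    "median": statistics.median(monthly_sales),  # middle value of the sorted data
    "stdev": statistics.stdev(monthly_sales),    # sample standard deviation
}

for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

The same three measures are available in Excel through the AVERAGE, MEDIAN and STDEV.S functions.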

There are many tools and techniques used to analyze and transform data into meaningful information to
support business operations. In this report, I will discuss the tools and techniques related to Excel and
Power BI.

➢ Advantages and disadvantages of Excel and Power BI

Excel

Figure 2: Excel
Advantages | Disadvantages
Supports many popular data formats (CSV, XLSX, TXT, etc.). Simple connection to basic data sources (spreadsheets, SQL databases). Easy to use and operate. | Limited connectivity to complex data sources (APIs, cloud systems). Difficulty handling large amounts of data.
Provides basic data analysis tools (sorting, filtering, calculating, creating charts). Suitable for simple, individual data analysis. | Limited ability to analyze complex data. Requires expertise in VBA or macro programming to perform advanced analysis.
Provides a variety of charts and graphs for data visualization. Easily customize and create basic charts. | Limited customization and advanced charting capabilities. Interactive charts and graphs are not as rich as in Power BI.
Supports basic collaboration through the "Share" feature. Suitable for small workgroups. | Limited data sharing and reporting capabilities. Difficulty in managing access rights and securing data when sharing with many people.

Power BI

Figure 3: Power BI
Advantages | Disadvantages
Supports a variety of data formats and connects to more data sources than Excel. Connects directly to large data sources (cloud database, ERP/CRM system). Provides the ability to efficiently query and process data from multiple sources. | The interface is more complicated than Excel, requiring time to get used to. The free version has limited features.
Provides more in-depth data analysis features than Excel (model analysis, machine learning, predictive analysis). Easy to use with an intuitive drag and drop interface, no need for in-depth programming knowledge. Provides many automation features to save users time and effort. | May be difficult for new users due to complex interface and features.
Provides a variety of interactive charts and graphs that are more intuitive than Excel. Powerful, easy advanced charting and customization capabilities. Provides professional dashboard and reporting features to share analytical information. | Creating complex charts can require more work than Excel.
Supports effective collaboration through the Power BI cloud platform. Easily share data, reports, and dashboards with your team. Provides access management and data security features. | The free version does not support full collaboration. An internet connection is required to use collaboration features.

 Data Analysis Techniques in Excel and Power BI


Numerous effective methods for evaluating data and turning it into insightful information are available in
both Excel and Power BI. However, each tool is appropriate for a certain approach and set of circumstances
because of its unique characteristics and capabilities.

Both Excel and Power BI support the following basic data analysis techniques:

• Sort and Filter: To highlight crucial information, arrange data in the right order and filter it according
to predetermined standards.

• Data Summary: To summarize data and form preliminary conclusions, compute fundamental
statistical variables like mean, median, total, etc.

• Formulae and Functions: To carry out intricate computations and examine data from several
angles, use built-in or user-defined formulae and functions.

• Charts & Graphs: To quickly see trends, patterns, and connections in data, create visual charts and
graphs.
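
The basic techniques listed above (sort, filter, summary) can also be sketched outside Excel or Power BI. The following is a minimal illustration in Python using only the standard library; the records, field names, and threshold are hypothetical and are not taken from the ABC Manufacturing data.

```python
from statistics import mean, median

# Hypothetical sales records; field names and values are illustrative only.
sales = [
    {"store": "ST01", "revenue": 1200.0},
    {"store": "ST02", "revenue": 450.0},
    {"store": "ST03", "revenue": 980.0},
    {"store": "ST04", "revenue": 300.0},
]

# Sort: highest revenue first.
sorted_sales = sorted(sales, key=lambda row: row["revenue"], reverse=True)

# Filter: keep only rows at or above an assumed threshold.
high_revenue = [row for row in sales if row["revenue"] >= 500.0]

# Data summary: total, mean, and median revenue.
revenues = [row["revenue"] for row in sales]
summary = {
    "total": sum(revenues),
    "mean": mean(revenues),
    "median": median(revenues),
}
print(summary)
```

The same three steps correspond to the Sort, Filter, and summary functions that both tools expose through their interfaces.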

Additionally, Power BI offers more advanced data analysis techniques than Excel, including:

• Model Analysis: Construct statistical models to categorize data, forecast future values, or explain
correlations between variables.

• Machine Learning: To automatically learn from data, forecast trends, or find hidden patterns, use
machine learning algorithms.

• Predictive Analytics: Predict future values using previous data and influential factors.

• Segment Analysis: To examine each group in depth, break the data up into smaller groups based
on shared traits.

• Interactive Visual Analysis: Make interactive graphs and charts that let people quickly and easily
examine data graphically.
Figure 4: data analysis techniques

2.3. Assess the value of data and information to individuals and organizations in relation to real-
world business processes

In this report, I will use Power BI to create a dashboard to analyze ABC Manufacturing business data.

GitHub report link: https://github.com/okean8901/REPORT_ASM1_1st_BPS_PHAMLETRUONG_BH01358

Figure 5: Power BI Dash Board (1)


Figure 6: Power BI Dash board(2)

Filter stores, products, states, and dates in Power BI

Filter stores:

• Look for fields related to "Store" or "Store ID" on the dashboard.

• Add a slicer (filter) to this field.

• In the slicer, you will see a dropdown menu with a list of stores. Select one or more stores to filter
data.

Filter products:

• Look for fields related to "Product" or "Product ID" on the dashboard.

• Add a slicer for this field.

• In the slicer, you will have a dropdown menu with a list of products. Select one or more products
to filter data.

Filter state:
• Look for fields related to "State" or "State Name" on the dashboard.

• Add a slicer for this field.

• In the slicer, you will have a dropdown menu with a list of states. Select one or more states to filter
the data.

Filter dates:

• Find filters related to "Day", "Month" or "Year" on the dashboard.

• A specific time period can be selected using date filters.

• For example: Choose a period from January to June 2024.
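
To make the effect of the slicers concrete, the sketch below imitates them in Python: each slicer that is set narrows the rows, and an unset slicer leaves the data unfiltered. This only illustrates the filtering logic, not how Power BI implements slicers. The function name `apply_slicers`, the column names, the state names, and the revenue values are all hypothetical.

```python
from datetime import date

# Hypothetical rows; the columns mirror the slicer fields described above.
rows = [
    {"store_id": "ST12", "product": "Wireless Router", "state": "Texas",
     "day": date(2024, 2, 10), "revenue": 500.0},
    {"store_id": "ST26", "product": "Compact Digital Camera", "state": "Ohio",
     "day": date(2024, 7, 1), "revenue": 800.0},
    {"store_id": "ST12", "product": "USB-C Docking Station", "state": "Texas",
     "day": date(2024, 5, 3), "revenue": 300.0},
]


def apply_slicers(rows, stores=None, products=None, states=None, start=None, end=None):
    """Return the rows that match every slicer that is set (None means no filter)."""
    result = []
    for row in rows:
        if stores is not None and row["store_id"] not in stores:
            continue
        if products is not None and row["product"] not in products:
            continue
        if states is not None and row["state"] not in states:
            continue
        if start is not None and row["day"] < start:
            continue
        if end is not None and row["day"] > end:
            continue
        result.append(row)
    return result


# Store ST12, January to June 2024.
selected = apply_slicers(rows, stores={"ST12"},
                         start=date(2024, 1, 1), end=date(2024, 6, 30))
print(sum(row["revenue"] for row in selected))
```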

The dashboard displays sales data for ABC Manufacturing, showing key metrics such as Revenue, Profit, and Profit
Percentage based on the selected slicers.

The dashboard includes the following visualizations:

• Sum of Revenue: $14.04M

• Sum of Profit: $2.82M

• Profit Percentage: 20.11%

• Sum of Revenue and Sum of Cost by Day: A chart showing the trends of revenue and cost over time.

• Sum of Profit by StateName and StateName: A map visualization showing the profit breakdown by
state.

• Sum of Revenue by RegionName: A pie chart showing the revenue breakdown by region.

• Sum of Profit by RegionName: A pie chart showing the profit breakdown by region.

• Sum of Revenue by StoreID: A bar chart showing the revenue breakdown by store.

The dashboard allows the sales manager to analyze the company's performance across different
dimensions like region, state, and store by using the available slicers.

• Sum of UnitCost: $7.64K

• Sum of UnitPrice: $13.60

• Sum of Revenue: $14.00

• Sum of Quantity by ProductName: A pie chart showing the breakdown of product quantities, with
the top products being USB-C Docking Station, Compact Digital Camera, and Wireless Router.

• Sum of Income by CustomerName: A bar chart showing the income contribution of different
customers, with the top customers being James Rodriguez, Elijah Cooper, and Ethan Foster.

• Sum of Profit by StoreID: A bar chart showing the profit breakdown by store, with the top
performing stores being ST12, ST26, and ST62.

• Sum of Profit by StoreID: Another bar chart showing the profit breakdown by a different set of
stores, including ST42, ST84, and ST74.

Display the revenue by region using a pie chart:

• Identify the data source that contains the revenue information broken down by region. In this
dashboard, the "Sum of Revenue by RegionName" visualization shows this data.

• Configure the pie chart to display the "RegionName" dimension on the slice labels and the "Sum of
Revenue" measure as the value to be visualized.

• Adjust the chart formatting and layout as desired, such as adding labels, changing colors, or resizing
the chart.

• Place the new pie chart visualization on the dashboard alongside the other relevant metrics and
analyses.

This way, I can display the revenue breakdown by region in an intuitive pie chart, which gives insight
into how revenue is distributed across different geographical regions.

Figure 7: Pie chart

Calculate the top N and bottom N stores by revenue:

Figure 8: Calculate the top N and bottom N stores by revenue

• Identify the data source that contains the revenue information broken down by store. In this
dashboard, the "Sum of Revenue by StoreID" visualization shows this data.

• Create a new visualization, such as a bar chart or a table, to display the store-level revenue data.

• Sort the stores in descending order by the "Sum of Revenue" measure to get the top performers.

• To get the top N stores, I can filter or select the first N stores from the sorted revenue data.

• To get the bottom N stores, I can sort the stores in ascending order by the "Sum of Revenue" measure and
select the first N stores.
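
The top N and bottom N logic described above can be sketched in Python as follows. This is only an illustration of the ranking step, not the Power BI implementation. The store IDs appear in the dashboard, but the revenue figures are invented for the example, and `top_n` and `bottom_n` are hypothetical helper names.

```python
import heapq

# Hypothetical revenue totals per store (values are illustrative only).
revenue_by_store = {
    "ST12": 410.0,
    "ST26": 385.0,
    "ST62": 360.0,
    "ST42": 120.0,
    "ST84": 95.0,
    "ST74": 80.0,
}


def top_n(totals, n):
    """Return the n (store, revenue) pairs with the highest revenue, best first."""
    return heapq.nlargest(n, totals.items(), key=lambda item: item[1])


def bottom_n(totals, n):
    """Return the n (store, revenue) pairs with the lowest revenue, worst first."""
    return heapq.nsmallest(n, totals.items(), key=lambda item: item[1])


print(top_n(revenue_by_store, 3))
print(bottom_n(revenue_by_store, 2))
```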
Show the state locations on a map and visualize the profit performance using bubble size

Figure 9: Show the state locations on a map and visualize the profit performance using bubble size

• Identify the data source that contains the state-level information (such as latitude and longitude
coordinates for the state centroids) as well as the profit data for each state.

• Create a new map visualization and connect it to the appropriate data source.
• Configure the map visualization to display the state locations using the latitude and
longitude fields for the state centroids.
• Set the bubble size to be proportional to the "Sum of Profit" measure for each state. This
will make the bubble size larger for states with higher profit and smaller for states with lower profit.
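
As a rough illustration of "bubble size proportional to profit", the sketch below scales each state's profit so that the most profitable state gets the largest bubble. It is an assumption about how proportional scaling can be done, not a description of Power BI's internal sizing. The function name `bubble_sizes`, the `max_size` parameter, and the sample figures are hypothetical.

```python
def bubble_sizes(profit_by_state, max_size=40.0):
    """Scale profits to bubble sizes; the largest profit maps to max_size.

    Negative profits are drawn with size 0 in this sketch.
    """
    if not profit_by_state:
        return {}
    largest = max(profit_by_state.values())
    if largest <= 0:
        return {state: 0.0 for state in profit_by_state}
    return {
        state: max(profit, 0.0) / largest * max_size
        for state, profit in profit_by_state.items()
    }


# Hypothetical profit figures per state.
print(bubble_sizes({"Texas": 100.0, "Ohio": 50.0, "Utah": -10.0}))
```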

Calculate the profit of each product:

Figure 10: Calculate the profit of each product


Determine required data:

• There is information about the revenue of each product.

• There is information about the cost of each product.

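Create a measure for each product's profit:

• Profit = SUM(Revenue[Revenue]) - SUM(Revenue[Cost])

A measure needs an aggregation such as SUM. The row-level form Revenue[Revenue] - Revenue[Cost] works only
as a calculated column.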
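
The profit calculation (revenue minus cost, grouped by product) can be checked outside Power BI with a short Python sketch. The rows and the helper names `profit_by_product` and `profit_percentage` are hypothetical. The percentage is computed as profit divided by revenue, which is assumed to be how the dashboard's Profit Percentage is defined.

```python
from collections import defaultdict

# Hypothetical sales rows; revenue and cost values are illustrative only.
sales_rows = [
    {"product": "Wireless Router", "revenue": 200.0, "cost": 150.0},
    {"product": "Compact Digital Camera", "revenue": 500.0, "cost": 400.0},
    {"product": "Wireless Router", "revenue": 300.0, "cost": 250.0},
]


def profit_by_product(rows):
    """Sum (revenue - cost) for each product."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["product"]] += row["revenue"] - row["cost"]
    return dict(totals)


def profit_percentage(rows):
    """Total profit as a percentage of total revenue (0.0 if there is no revenue)."""
    revenue = sum(row["revenue"] for row in rows)
    profit = sum(row["revenue"] - row["cost"] for row in rows)
    return profit * 100 / revenue if revenue else 0.0


print(profit_by_product(sales_rows))
print(profit_percentage(sales_rows))
```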

Use an area chart to track revenue and cost:

Prepare the data:

• Ensure that you have data fields for revenue and cost.

• If the data is collected over time (e.g., daily, monthly, quarterly), make sure you have a time field
to use as the x-axis.

Create the area chart:

• In Power BI, create a new visualization and select "Area chart" as the chart type.

• Drag the time field onto the X-axis.

• Drag the revenue field onto the Y-axis.

• Drag the cost field onto the Y-axis as well, and select "Stacked" to display it on the same chart.

Figure 11: area chart


Details of the Power BI dashboard:

Product Manage Section:

• Product ID: Allows the user to filter the data by different product IDs.

• Category: Allows the user to filter the data by different product categories.

• Day: Allows the user to filter the data by a specific date.

• Sum of Quantity by ProductName: Displays a pie chart showing the breakdown of total quantity by
product name.

• Sum of Income by CustomerName: Displays a bar chart showing the sum of income by customer
name.

• Sum of Profit by StoreID: Displays a bar chart showing the sum of profit by store ID.

Sales Manager ABC Manufacturing Section:

• Region Name: Allows the user to filter the data by different regions.

• Store ID: Allows the user to filter the data by different store IDs.

• State Name: Allows the user to filter the data by different states.

• Date: Allows the user to filter the data by a specific date range.

• Sum of Revenue: Displays the total revenue.

• Sum of Profit: Displays the total profit.

• % Profit: Displays the percentage of profit.

• Sum of Revenue and Sum of Cost by Day: Displays a line chart showing the sum of revenue and cost
over time.

• Sum of Profit by StateName and StateName: Displays a map visualization showing the sum of profit
by state.
• Sum of Revenue by RegionName: Displays a pie chart showing the sum of revenue by region.

• Sum of Revenue by StoreID: Displays a bar chart showing the sum of revenue by store ID.

This Power BI dashboard provides a comprehensive view of the sales performance and product
management for ABC Manufacturing, allowing the sales manager to analyze and make data-driven
decisions.

2.4. Discuss the social, legal, and ethical implications of using data and information to support
business processes
2.4.1. Privacy and data protection concerns

As technology advances, concerns about privacy and data protection become more pressing.

Collection and use of personal data

Companies and organizations collect large amounts of personal data from users through applications,
websites, and online services. These data may include:

• Personal information (name, address, phone number)

• Financial information (credit card numbers, bank transactions)

• Habits and preferences (browsing history, shopping data)

Control and transparency

• Lack of clear information

Many companies do not provide enough information about how they collect, use and share personal data.

Users often have to read long and complicated terms of service without clearly understanding their rights.

• Limited control
Users have little control over the collection and use of their data. Often they can only accept or reject the
entire terms with no other options.

Security threats

• Cyber attacks: Cyber criminals can attack systems to steal personal data.

• Data leak: Data can be leaked due to employee mistakes or vulnerabilities in security systems.

Consequences of a data breach

• Identity theft: Stolen personal data can be used for fraud or other illegal acts.

• Financial losses: Security breaches can lead to financial losses for both individuals and
organizations.

Regulations and laws

Many countries have enacted regulations to protect personal data, such as:

• GDPR (General Data Protection Regulation): The European Union's regulation that gives users rights
over how their personal data is processed.

• CCPA (California Consumer Privacy Act): California consumer privacy protection law, requiring
companies to be transparent about collecting and sharing personal data.

Figure 12: data security, privacy


Data protection and privacy are complicated topics that are always evolving. To reduce risks and preserve
privacy, individuals and organizations must be watchful and proactive in protecting personal data.

2.4.2. Ethical issues in data collection and use

Consent and transparency

 Informed consent

Explicit Consent: Users must be asked to give their explicit and informed consent to the collection and use
of their data. This includes providing easy-to-understand terms and privacy policies that do not use
complicated or misleading terminology.

Implied consent: Many companies rely on tacit consent, meaning users are deemed to have agreed to the
terms simply by using the service. This does not guarantee that the user actually understands or accepts
such terms.

 Transparency

Open and clear: Organizations need to be open and clear about how they collect, use and share personal
data. Users should clearly know who will have access to their data and for what purpose.

Report a breach: When a data breach occurs, organizations need to promptly and fully notify affected
users.

Control and ownership of data

 Control

Right of access and correction: Users must be able to view their personal data and make any required
corrections or updates.

Right to data deletion: Users need to be able to ask for the removal of their personal information from the
systems used by the company.
 Ownership

Personal data as property: There is disagreement about whether users should be able to sell or rent their
personal data and if it should be regarded as their property.

Economic advantages: When users' data is utilized for commercial reasons, they should be able to reap
some financial benefits from it.

Using data for bad purposes

 Discrimination and bias

Digital discrimination: Organizations can use personal data to discriminate against individuals based on
their gender, race, religion, or other attributes. This may occur in domains including hiring,
credit, and insurance.

Algorithmic bias: Algorithms trained on unbalanced data sets for machine learning and artificial
intelligence may inadvertently perpetuate preexisting societal prejudices.

 Surveillance and social control

Mass Surveillance: Extensive data collecting can result in mass surveillance, allowing authorities to keep
tabs on and regulate public behavior.

Behavioral control: Users' freedom and autonomy may be curtailed by the use of personal data to forecast
and manage their behavior.

Protect the rights of disadvantaged groups

• Children: Children's personal data requires special protection, as they may not have a clear
understanding of privacy rights and may not be able to make informed decisions about sharing
their data.
• Elderly people and people with disabilities: Special safeguards are needed to ensure that the
personal data of these groups is not misused or used without full consent.

• Unbalanced power: Large technology companies often have superior power over individual users,
resulting in users having less control and protection over their data.

2.4.3. Legal compliance requirements of using data and information to support business processes

Using data and information to support business processes requires compliance with a variety of legal
requirements that protect individual privacy and ensure data security. Here are some important legal
compliance requirements:

 GDPR (General Data Protection Regulation)

Scope of application: Applies to all organizations that process personal data of individuals in the EU,
regardless of the organization's location.

Rights of individuals: Include the rights to access, rectify, erase, and port their personal data.

Security requirements: Data must be protected by appropriate technical and organizational measures,
including encryption and access controls.

Report a breach: Organizations must notify the supervisory authority within 72 hours of discovering a
personal data breach.
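
The 72-hour notification window lends itself to a simple deadline calculation. The sketch below is illustrative only: the function names and the example timestamp are hypothetical, and it does not cover the other conditions the regulation attaches to breach reporting.

```python
from datetime import datetime, timedelta, timezone

# The 72-hour window stated above.
NOTIFICATION_WINDOW = timedelta(hours=72)


def notification_deadline(discovered_at):
    """Return the latest time by which the supervisory authority should be notified."""
    return discovered_at + NOTIFICATION_WINDOW


def is_overdue(discovered_at, now):
    """Return True if the notification window has already passed."""
    return now > notification_deadline(discovered_at)


# Hypothetical discovery time, in UTC.
discovered = datetime(2024, 8, 1, 9, 0, tzinfo=timezone.utc)
print(notification_deadline(discovered))
```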

Figure 13: GDPR


 CCPA (California Consumer Privacy Act)

Scope of application: Applies to businesses that collect personal data of California residents.

Consumer rights: Include the right to know, the right to delete, the right to opt out of the sale of personal
data, and the right not to be discriminated against for exercising these rights.

Transparency: Businesses must provide clear information about the collection and sharing of personal
data.

Figure 14: CCPA

Security and privacy policy

 Privacy policy
Transparency: Organizations need to provide clear and easy-to-understand privacy policies that explain
how and why personal data is collected, used, and shared.

Regular updates: Policies should be updated regularly to reflect changes in data collection and processing.

 Manage consent

Explicit consent: Users must be informed clearly and in detail before consenting to the collection and use
of their data.

Right to withdraw consent: Users must have the right to withdraw consent at any time and this process
must be simple and easy to follow.

Information security

 Security measures

Data encryption: To prevent unwanted access, personal information must be encrypted while it is in transit
and at rest.

Access control: Strict authentication procedures must be followed and only authorized individuals should
have access to personal data.

Employee education: Workers need to be instructed on data protection policy compliance and security
procedures.

 Supervision and inspection

Testing on a regular basis: To find and address vulnerabilities, security systems and procedures need to be
tested on a regular basis.
Constant observation: To identify and quickly address security problems, make use of monitoring tools
and systems.

2.5. Describe common threats to data and how they can be mitigated at a personal and
organizational level
2.5.1. Internal and external threats to data security

For enterprises of all sizes, data security is a top priority as it entails safeguarding information from both
internal and external threats.

Internal Threats

Internal threats originate from within the organization and can be either intentional or unintentional. They
are often harder to detect and mitigate because they involve trusted individuals with legitimate access to
the organization’s data and systems.

 Insider Threats

Malicious insiders: Employees or contractors who intentionally misuse their access to steal, leak, or destroy
data. This threat can arise from disgruntled employees, corporate espionage, or financial incentives.

Negligent insiders: Employees who inadvertently cause data breaches through negligence, such as clicking on
phishing links, losing devices, or mishandling sensitive information.

 Privileged Access Abuse

Employees with more access rights than necessary for their job roles can accidentally or intentionally
misuse this access. Lack of proper monitoring and auditing of privileged accounts can lead to undetected
misuse of sensitive data.

 Unsecured Devices
• BYOD (Bring Your Own Device): Personal devices used for work purposes can be less secure than
company-owned devices, leading to potential data breaches.

• Lost or Stolen Devices: Laptops, smartphones, or USB drives containing sensitive data can be lost
or stolen, exposing the data to unauthorized access.

 Weak Security Practices

• Poor Password Management: Use of weak, reused, or default passwords can make it easier for
attackers to gain unauthorized access.

• Lack of Security Training: Employees who are not trained in security best practices are more likely
to fall victim to social engineering attacks and other threats.

External Threats

External threats originate from outside the organization and are typically carried out by individuals or
groups with no legitimate access to the organization’s data and systems.

Figure 15: External Threats


 Cyber Attacks

Phishing: Attackers use deceptive emails or messages to trick employees into revealing sensitive
information or downloading malware.

Ransomware: Malware that encrypts data and demands a ransom payment for the decryption key. This
can cripple an organization’s operations and lead to data loss.

DDoS (Distributed Denial of Service) Attacks: Overwhelming an organization’s network with traffic, causing
disruptions and potentially diverting attention from other malicious activities.

 Hacking and Exploitation

SQL Injection: Attackers exploit vulnerabilities in web applications to gain access to the database and
retrieve, modify, or delete data.

Zero-Day Exploits: Attacks that exploit previously unknown vulnerabilities in software or hardware before
patches or fixes are available.

Credential Stuffing: Using automated tools to try stolen username and password combinations across
multiple sites, exploiting the reuse of credentials.
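
SQL injection is easiest to understand with a small demonstration. The sketch below uses Python's built-in sqlite3 module with an in-memory database. The table, user names, and input string are hypothetical. It shows how concatenating user input into a query changes the query's meaning, while a parameterized query treats the same input as plain data.

```python
import sqlite3

# In-memory database with a hypothetical users table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, role TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)",
                 [("alice", "admin"), ("bob", "staff")])

# Input crafted by an attacker to make the WHERE clause always true.
malicious_input = "nobody' OR '1'='1"

# Unsafe: the input is pasted into the SQL text and becomes part of the query.
unsafe_sql = "SELECT name FROM users WHERE name = '" + malicious_input + "'"
unsafe_rows = conn.execute(unsafe_sql).fetchall()

# Safe: the ? placeholder sends the input as a value, never as SQL.
safe_rows = conn.execute(
    "SELECT name FROM users WHERE name = ?", (malicious_input,)
).fetchall()

print(len(unsafe_rows), len(safe_rows))
conn.close()
```

The unsafe query returns every user, while the parameterized query returns nothing because no user has that literal name.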

 Third-Party Risks

Vendors and Contractors: Third parties with access to an organization’s systems and data can pose a
security risk if they do not have robust security measures in place.

Supply Chain Attacks: Compromising a third-party supplier to infiltrate the target organization, as seen in
notable incidents like the SolarWinds hack.

 Physical Security Breaches


Unauthorized Access: Intruders gaining physical access to the organization’s premises can steal or damage
hardware containing sensitive data.

Dumpster Diving: Attackers searching through discarded materials to find sensitive information that has
not been properly destroyed.

2.5.2. Mitigation Strategies

To address these threats, organizations should implement comprehensive security measures, including:

• Access Controls: Enforcing least privilege access and regularly reviewing access permissions.

• Employee Training: Educating employees about security best practices and how to recognize and
respond to threats.

• Incident Response Plan: Developing and regularly testing a plan to respond to data breaches and
other security incidents.

• Data Encryption: Encrypting sensitive data both at rest and in transit to protect it from
unauthorized access.

• Regular Audits and Monitoring: Continuously monitoring systems for suspicious activity and
regularly auditing security practices.

• Strong Authentication: Implementing multi-factor authentication (MFA) to enhance login security.

• Patch Management: Keeping software and systems up to date with the latest security patches to
protect against known vulnerabilities.

2.5.3. Business continuity and recovery plan after the model


Disaster Recovery Plans (DRPs) and Business Continuity Plans (BCPs) are two crucial elements that help
companies prepare for emergencies, respond swiftly to disasters, and maintain business
continuity. The steps for creating and implementing a DRP and BCP are listed in detail below:

• Risk and Impact Assessment: Identify threats and assess their probability and impact. Conduct a Business
Impact Analysis (BIA) to identify critical processes and assess the impact if they are disrupted.

• Develop a Business Continuity Plan (BCP): Ensure there are backup systems and alternative
locations. Create a list of emergency contacts and alternative means of communication. Train staff
and maintain core services.

• Develop a Disaster Recovery Plan (DRP): Recover data and systems. Ensure quick data backup and
recovery. Make a detailed plan of the steps to be taken and assign responsibilities.

• Execute and Check the Plan: Coordinate across departments and document the plan. Conduct tests
and assessments to improve the plan.

• Training and Awareness: Organize training programs and practical training. Implement awareness
campaigns and provide continuous information.

• Maintain and Update the Plan: Review and update the plan at least once a year or when there are
major changes. Ensure all relevant documents are continuously updated.

• Interaction with Stakeholders: Communicate with customers and suppliers. Establish relationships
with authorities and comply with regulations.

• Use of Assistive Technology: Use of communication and project management software. Use cloud
backup solutions and recovery software.

• Evaluate and Continuously Improve: Analyze incidents to learn and improve. Collect feedback and
adopt new technology.
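
The risk and impact assessment step in the list above is often summarized as a score of probability multiplied by impact. The sketch below shows that common convention. The threats, the 1 to 5 scales, and the function name `rank_risks` are assumptions made for illustration and are not taken from a specific standard.

```python
# Hypothetical threats, each scored from 1 (low) to 5 (high).
risks = [
    {"threat": "Ransomware", "probability": 3, "impact": 5},
    {"threat": "Power outage", "probability": 2, "impact": 3},
    {"threat": "Lost laptop", "probability": 4, "impact": 2},
]


def rank_risks(risks):
    """Add a score (probability x impact) to each risk and sort highest first."""
    scored = [dict(risk, score=risk["probability"] * risk["impact"]) for risk in risks]
    return sorted(scored, key=lambda risk: risk["score"], reverse=True)


for risk in rank_risks(risks):
    print(risk["threat"], risk["score"])
```

Ranking the threats this way helps decide which scenarios the BCP and DRP should address first.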

2.6. Analyse the impact of using data and information to support real-world business
processes
2.6.1. Improve operational efficiency and productivity

Businesses may gain a lot by efficiently using data and information, including:

• Data-driven decision-making: Managers may make better decisions by using data, which offers a
more precise and comprehensive understanding of corporate processes.

• Process Optimization: By identifying areas of weakness and potential for improvement in corporate
processes, data analytics may assist to streamline procedures and cut down on waste.

• Increased productivity: Automating processes with data cuts down on the time and effort needed
for repetitive tasks.

• Prediction and prevention: Businesses may foresee trends, identify hazards, and plan ahead for
preventative actions with the use of historical data and prediction models.

2.6.2. Challenges and limitations in implementing data-driven strategies

Adopting data-driven tactics has several advantages, but there are drawbacks as well.

• Data quality: Bad judgments might result from missing or inaccurate data. Keeping data quality up
to par is a big task.

• Security and privacy: To reduce the chance of sensitive information being revealed, data collecting
and usage procedures must adhere to security and privacy laws.

• Analytical skills: Companies need personnel and tools that can properly analyze data. A lack
of appropriate knowledge and resources might lower the usefulness of data.

• Employee resistance to change: Workers may be reluctant to implement data-driven procedures,
particularly if the advantages of the changes are unclear to them.

• Cost: Putting infrastructure, technology, and training for human resources into place to execute
data plans may be costly.
2.6.3. Recommendations to enhance the value of data and information

Businesses should take into account the following suggestions to increase the value of data and
information:

• Create a well-defined data strategy by deciding on its objectives, parameters, and plan of action.
This will guarantee that every action is focused on generating actual value.

• Boost the quality of the data: Create procedures for data quality control that cover data collection,
storage, and analysis to guarantee that the data is always correct, comprehensive, and up to date.

• Invest in engineering and technology: To maximize the use of data, employ automation
technologies, robust data management systems, and contemporary analytics tools.

• Employee education: Employees should get continual training on data analytics techniques,
emerging technology, and the value of data to businesses.

• Data security: Create security guidelines and follow them to keep data safe from harm and to abide
by privacy laws.

• Promote a culture of data: Establish a company culture that encourages the use of data in all
business decisions and where all employees recognize the importance of data.

2.7. Discuss how tools and technologies associated with data science are used to support business
processes and inform decisions
2.7.1. How Data Analytics Supports Business Processes and Decision Making

Figure 16: data driven decision making


The act of gathering, arranging, evaluating, interpreting, and disseminating data in order to identify
significant trends is known as data analytics. It is an effective instrument that may greatly improve
decision-making and corporate operations.

Optimizing Operations

• Finding inefficiencies: Organizations can identify areas where procedures are labor-intensive,
expensive, or prone to mistakes by examining operational data.
• Optimal resource distribution: Information may be used to ascertain how best to distribute
resources like personnel, supplies, and machinery.
• Improving supply chain management: Lead times can be shortened, inventory levels can be
optimized, and overall efficiency can be increased by analyzing supply chain data.

Improving Marketing Strategies

• Comprehending consumer behavior: By examining consumer information, marketing strategies
may be customized by identifying preferences, purchasing patterns, and demographics.
• Campaign effectiveness measurement: Data analytics aids in evaluating the results of marketing
initiatives, enabling modifications to optimize return on investment.
• Client segmentation: Organizations may create tailored marketing plans for various client
categories by recognizing customer segments.

Predicting Trends and Market Changes

• Market forecasting: Future demand and market circumstances may be predicted with the use of
historical data analysis and industry trends.
• Finding new opportunities: Data might reveal unmet client wants or developing market niches.
• Risk assessment: Organizations can recognize possible hazards and create mitigation plans by
examining data.
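
Market forecasting from historical data can be as simple as a moving average. The sketch below is a deliberately basic example of that one technique, not a recommendation of a forecasting method. The function name `moving_average_forecast` and the demand figures are hypothetical.

```python
def moving_average_forecast(history, window=3):
    """Forecast the next value as the mean of the last `window` observations."""
    if window <= 0 or len(history) < window:
        raise ValueError("history must contain at least `window` observations")
    recent = history[-window:]
    return sum(recent) / window


# Hypothetical monthly demand figures.
monthly_demand = [100.0, 120.0, 110.0, 130.0]
print(moving_average_forecast(monthly_demand, window=3))
```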

Additional Benefits of Data Analytics


• Financial performance: By examining financial data, one may find areas for cost reduction,
enhance budgeting, and maximize pricing tactics.
• Human resources: Workforce planning, employee performance reviews, and talent acquisition
may all benefit from data analytics.
• Product development: Innovation and product development can be influenced by the analysis of
market trends and consumer input.

To put it simply, data analytics gives companies the capacity to make decisions based on facts, which
boosts productivity, profitability, and market share.

2.7.2. Popular data science tools

Python programming language

Figure 17: Python

Python is a readable, general-purpose programming language created by Guido van Rossum and first
released in 1991. Python source code is easier to understand and maintain because the language follows
design principles such as "Beautiful is better than ugly" and "Explicit is better than
implicit."
Python’s main strengths include:

• Simple and Readable: Python’s syntax is simple and easy to understand, making it easy for
beginners to quickly pick up and learn the language.
• Versatile: Python can be used to perform a wide variety of tasks such as web programming, data
science, artificial intelligence, automation, and more.
• Rich Libraries: Python has a very rich library ecosystem, providing ready-made tools and modules
for almost every programming need.
• Open Source and Free: Python is a free and open source language that is widely used in the
programming community.
• Good Performance: Although Python is not the fastest programming language, it still has good
performance for many common applications.

SQL (Structured Query Language)

Figure 18: SQL
SQL is the language used to manage and manipulate data in relational database management systems (DBMS).
Developed by IBM in the 1970s, SQL has grown to be one of the most widely used languages
in the data processing industry.
Some main features of SQL:

• Structured language: SQL uses clear syntax and structured statements to perform operations with
data, which makes the code easy to read and maintain.
• Query and manipulate data: SQL provides statements such as SELECT, INSERT, UPDATE, DELETE to
query, add, edit, delete data in the database.
• Database definition and management: SQL allows defining the structure of the database, including
tables, columns, constraints, etc.
• Cross-platform: SQL is supported by many popular database management systems such as MySQL,
PostgreSQL, Oracle, Microsoft SQL Server, etc.
• High Performance: SQL is often optimized to handle large amounts of data efficiently.
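
The statements named above (SELECT, INSERT, UPDATE, DELETE) and a table definition can be tried with Python's built-in sqlite3 module, which needs no separate database server. The table, column names, and prices below are hypothetical.

```python
import sqlite3

# In-memory SQLite database; nothing is written to disk.
db = sqlite3.connect(":memory:")

# Database definition: create a table with columns and constraints.
db.execute(
    "CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT NOT NULL, price REAL)"
)

# INSERT: add rows.
db.execute("INSERT INTO products (name, price) VALUES (?, ?)", ("Wireless Router", 59.0))
db.execute("INSERT INTO products (name, price) VALUES (?, ?)", ("Compact Digital Camera", 199.0))

# UPDATE: change an existing row.
db.execute("UPDATE products SET price = ? WHERE name = ?", (49.0, "Wireless Router"))

# DELETE: remove a row.
db.execute("DELETE FROM products WHERE name = ?", ("Compact Digital Camera",))

# SELECT: query what is left.
remaining = db.execute("SELECT name, price FROM products").fetchall()
print(remaining)
db.close()
```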

SQL is an extremely important skill for professionals in data, analytics, programming, databases, etc.
Mastering SQL will help you manage and process data efficiently in many different fields.

Power BI

Figure 19: Power BI

Microsoft created the data analytics and reporting software known as Power BI. Businesses and people
can connect, transform, analyze, and share data in the form of real-time reports, charts, and dashboards
with the aid of this strong and adaptable technology.
Some key features of Power BI:

• Data connectivity: Power BI supports connections to many different data sources such as Excel,
SQL Server, Oracle, Google Analytics, Salesforce, etc.
• Transform and standardize data: Tools in Power BI allow you to easily perform filtering, sorting,
and converting data formats.
• Create reports and dashboards: Power BI provides powerful features for building online reports,
charts and controls.
• Analyze data: Power BI includes tools such as Power Query, Power Pivot and DAX that help you
perform calculations, build data models, and analyze data combined from several sources.
• Sharing and permissions: You can easily share reports and dashboards with other members of your
organization and set up access rights accordingly.
• Extensibility: Power BI can be integrated with other Microsoft applications and services such as
Office 365, Dynamics 365, etc.

Power BI is a very powerful and flexible data analysis tool, widely used in businesses and organizations to
support data-driven decision making. It helps users visualize and analyze data effectively.

Pandas library in Python

Figure 20: Pandas


Pandas is a robust and adaptable Python package for data analysis and processing. It is built on top
of the NumPy library and offers powerful data structures and data processing capabilities.

Some key features of Pandas:

• Series: A one-dimensional labeled array that supports index-based operations and can hold
different types of data.
• DataFrame: A two-dimensional data structure, similar to a table in a database, with labeled
columns and indexed rows.
• Data manipulation: Pandas provides powerful methods and functions for operations such as
reading/writing, filtering, sorting, grouping, and handling missing values in these data structures.
• Analysis and visualization: Pandas has built-in tools for descriptive statistics, plotting, etc.
• Time-based data processing: Pandas has powerful functions and tools for working with
time-related data such as dates and times.
• Integration with other libraries: Pandas can be used in conjunction with other analysis and
visualization libraries such as Matplotlib, Seaborn, Scikit-learn.

Pandas is widely used in many fields such as data analysis, machine learning, finance, data science, etc. It
is an extremely useful and essential tool for analysts, researchers and Python programmers.
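
To make the difference between a Series and a DataFrame concrete, here is a minimal sketch. The product names, prices, and quantities are invented for illustration and are not taken from the ABC Manufacturing data.

```python
import pandas as pd

# A Series is a one-dimensional labeled array (here: price per product).
prices = pd.Series([9.99, 14.50, 3.25], index=["pen", "notebook", "eraser"])

# A DataFrame is a two-dimensional table with labeled columns.
orders = pd.DataFrame(
    {
        "product": ["pen", "notebook", "pen", "eraser"],
        "quantity": [10, 2, 5, 8],
    }
)

# Look up each order's unit price by label, then compute the line total.
orders["unit_price"] = orders["product"].map(prices)
orders["total"] = orders["quantity"] * orders["unit_price"]

# Group the rows by product and sum the totals.
totals = orders.groupby("product")["total"].sum()
print(totals)
```

The Series acts as a lookup table keyed by its index, while the DataFrame holds the records that are filtered, grouped, and aggregated.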

Jupyter Notebook

Figure 21: Jupyter


Jupyter Notebook is a powerful and flexible tool for collecting, processing, analyzing, and presenting data.
Here are some of the key features of Jupyter Notebook:

• Integrated interface: Jupyter Notebook provides an integrated web interface, allowing users to
edit, execute, and view code results in the same environment.
• Multilingual: Jupyter supports many different programming languages such as Python, R, Julia,
Scala, etc. This allows users to use their preferred language to perform data analysis tasks.
• Markdown integration: Jupyter allows users to use Markdown to add annotations, titles, and
explanations to source code, making it more intuitive and readable.
• Graphics integration: Jupyter supports integrated charting tools such as Matplotlib, Plotly, and
Bokeh, allowing users to create high-quality illustrations.
• Sharing and Collaboration: Jupyter Notebooks can be easily shared and collaborated with others
through tools like GitHub, Nbviewer, and Binder.
• Automation Features: Jupyter Notebooks have automation features like auto-completion, hints,
and automatic execution of code cells.
• Integration with Other Tools: Jupyter Notebooks can be connected to other tools like SQL, Spark,
TensorFlow, and many other cloud computing platforms.

With these features, Jupyter Notebook becomes a very useful tool for data scientists, analysts, and
programmers in performing data analysis tasks efficiently.

2.7.3. The tools help collect and synthesize data

Among the tools mentioned, the two that contribute the most to data collection and aggregation are:

SQL (Structured Query Language):

• SQL is the primary tool for retrieving and aggregating data from relational databases.
• It is extremely efficient in processing large amounts of structured data.
• SQL allows for complex operations such as joining multiple tables, aggregating data with GROUP
BY, and filtering data with WHERE.
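
The JOIN, WHERE, and GROUP BY operations mentioned above can be sketched with Python's built-in sqlite3 module and an in-memory database. The customers and orders tables, their columns, and their rows are hypothetical examples, not the real ABC Manufacturing schema.

```python
import sqlite3

# In-memory database with two small, made-up tables.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, city TEXT);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'Seattle'), (2, 'Columbus'), (3, 'Seattle');
    INSERT INTO orders VALUES (1, 1, 120.0), (2, 2, 80.0), (3, 3, 40.0), (4, 1, 15.0);
    """
)

# Join the tables, keep orders of at least 20, and total the amount per city.
query = """
    SELECT c.city, SUM(o.amount) AS total_amount
    FROM orders AS o
    JOIN customers AS c ON c.customer_id = o.customer_id
    WHERE o.amount >= 20
    GROUP BY c.city
    ORDER BY total_amount DESC
"""
rows = conn.execute(query).fetchall()
conn.close()
print(rows)
```

The same query text could be sent to MySQL, PostgreSQL, or SQL Server through their own Python drivers; only the connection step would change.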

Python (with Pandas):


• Python, especially when used with the Pandas library, is a powerful tool for both data collection
and aggregation.
• In terms of data collection, Python can scrape the web, interact with APIs, and read from a variety
of file formats.
• Pandas provides powerful functions for processing, cleaning, and aggregating data, which is
especially effective with structured and semi-structured data.

Both SQL and Python/Pandas have important roles to play, but they are often used for different purposes:

• SQL is often used when working directly with large databases, especially in enterprise
environments.
• Python/Pandas is often preferred for dynamic data analysis, processing data from multiple sources,
and when integrating data collection with more complex processing steps.

The choice between the two tools depends on the specific data source, the scale of the data, and the
requirements of the project. In many cases, both tools are often used in combination to take advantage
of each tool's strengths.

Example:

• Python and Pandas: Used to read CSV files containing daily sales data, then aggregate to calculate
total revenue by product.
• SQL: Query data from an online store's database to get information about customer purchasing
behavior.
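
The Pandas example above (read daily sales from CSV, then total revenue by product) might look like the following sketch. The CSV content is an inline stand-in with assumed column names; a real project would pass a file path to pd.read_csv instead.

```python
import io
import pandas as pd

# Inline stand-in for a daily sales CSV file (hypothetical columns and rows).
csv_text = """date,product,quantity,unit_price
2024-01-01,Phone,2,300
2024-01-01,Charger,5,20
2024-01-02,Phone,1,300
2024-01-02,Case,4,10
"""
sales = pd.read_csv(io.StringIO(csv_text), parse_dates=["date"])

# Revenue per row, then total revenue per product, highest first.
sales["revenue"] = sales["quantity"] * sales["unit_price"]
revenue_by_product = (
    sales.groupby("product")["revenue"].sum().sort_values(ascending=False)
)
print(revenue_by_product)
```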
2.7.4. Data cleaning tools and solutions

After collecting data, the next step is to clean the data to ensure the data is accurate, consistent and
complete before analyzing.

Python: Libraries such as Pandas, NumPy, and Scikit-learn provide functions to handle missing data,
remove outliers, and normalize data.

Issues to address when cleaning data:


• Missing data: Fill in missing values using mean, median, or other methods.
• Incorrect data: Correct incorrect values, such as invalid dates.
• Duplicate data: Remove duplicate records.
• Inconsistent data: Normalize data, such as changing date formats.
• Outliers: Identify and handle outliers.

Examples:

• Clean customer data: Check and correct incorrect contact information, remove duplicate records,
and standardize the date of birth format.
• Clean sales data: Check and correct products with missing values, remove canceled orders, and
recalculate the total order value.
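
As an illustration of the customer-data example, the sketch below removes a duplicate record, turns an invalid date of birth into a missing value, and standardizes e-mail addresses. The column names and records are assumptions made for this example.

```python
import io
import pandas as pd

# Hypothetical customer records: one exact duplicate, one invalid date,
# and one e-mail address with stray whitespace and mixed case.
raw = """customer_id,birth_date,email
1,1990-05-17,a@example.com
2,not a date,b@example.com
2,not a date,b@example.com
3,1985-11-02, C@Example.COM
"""
customers = pd.read_csv(io.StringIO(raw))

# Duplicate data: keep only the first copy of identical rows.
customers = customers.drop_duplicates()

# Incorrect data: values that cannot be parsed as dates become NaT (missing).
customers["birth_date"] = pd.to_datetime(customers["birth_date"], errors="coerce")

# Inconsistent data: trim whitespace and lower-case the e-mail addresses.
customers["email"] = customers["email"].str.strip().str.lower()
print(customers)
```

Marking invalid dates as missing instead of guessing a value keeps the error visible, so it can be corrected from the source system later.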

2.8. Assess the benefits of using data science to solve problems in real-world scenarios
2.8.1. Tool and technology benefits of ABC Manufacturing project

As a data analyst, I will be working with large volumes of data. Python, Pandas, and Jupyter Notebook are
extremely useful tools that help me process and analyze data effectively.

Python: A flexible and powerful programming language, widely used in the field of data science. With
Python, I can:

• Collect data from various sources (CSV, Excel, databases).
• Clean and prepare data.
• Perform statistical calculations and numerical analysis.
• Build predictive models.
• Automate repetitive tasks.

Pandas: A specialized Python library for processing and analyzing data. Pandas provides efficient data
structures such as DataFrame and Series, making data manipulation easy.

Jupyter Notebook: An interactive environment that lets me write Python code, run it, and visualize
the results right in the document. Jupyter Notebook is useful for exploring data, building models,
and generating reports.

Practical examples:

Demand Forecasting:

• Use Pandas to read historical sales data.
• Apply forecasting models (e.g. ARIMA, Prophet) to predict future demand.
• Visualize forecasting results using Jupyter Notebook.

Root Cause Analysis:

• Use techniques such as correlation analysis and regression to find the factors affecting a
particular problem (e.g. causes of delivery delays).

Supply Chain Optimization:

• Build optimization models to determine the shortest shipping route and an efficient inventory
allocation.

Customer Sentiment Analysis:

• Use natural language processing techniques to analyze customer reviews and identify areas for
improvement.

2.8.2. Assess the benefits of using data science to solve problems

The use of data science in ABC Manufacturing's company operations yields noteworthy advantages, as it
facilitates enhanced efficiency, more precise decision-making, and the creation of competitive advantages.

Data-driven decision making


• Replace subjective decisions: Instead of relying on personal experience or subjective judgment,
decisions are made based on evidence from data, ensuring objectivity and accuracy.
• Accurate forecasting: Based on forecasting models, businesses can predict market demand and
consumption trends, helping to plan production and business more effectively.
• Root cause analysis: Accurately identify the root causes of problems, thereby providing effective
and sustainable solutions.

For example, ABC Manufacturing may employ forecasting methods to precisely estimate smartphone
demand throughout the Christmas season, preventing shortages or overstocking by modifying output
instead of relying just on gut instinct.

Improve efficiency and productivity

• Optimize processes: By analyzing data on cycle times and bottlenecks in the production process,
businesses can identify activities that need improvement to increase productivity.
• Minimize waste: Analyzing data on raw material and energy consumption helps identify areas of
waste and propose savings measures.
• Increase equipment productivity: Monitor equipment performance data, detect early signs of
failure for preventive maintenance, and reduce machine downtime.

For example, by examining machine waiting time data, ABC Manufacturing is able to pinpoint the
reasons for delays and suggest ways to enhance production efficiency.

Improve customer understanding and personalization

• Customer segmentation: Divide customers into different groups based on purchasing behavior and
preferences to create appropriate marketing campaigns.
• Customer experience personalization: Provide products and services that are tailored to each
customer's needs.
• Enhance loyalty: Build long-term relationships with customers by understanding their needs and
desires.
For example, by creating customized incentives for every client based on their past purchases, ABC
Manufacturing can boost sales and foster a stronger sense of customer loyalty.

Risk mitigation and fraud detection

• Anomaly detection: Analyze data to detect unusual activities, such as fraud, system errors.
• Risk management: Assess and manage potential risks in the business.
• Data security: Protect customer data and business information from cyber attacks.

For example, by analyzing transaction data, ABC Manufacturing can detect unusual transactions and
prevent fraudulent activities.

Other benefits

• Make faster decisions: Data science helps businesses make faster decisions based on real data.
• Improve product quality: Analyzing data on product defects helps identify causes and improve
product quality.
• Increase competitiveness: Businesses that apply data science often have a competitive advantage
over their competitors.

Data science is a potent instrument that aids companies such as ABC Manufacturing in efficiently resolving
business issues. Businesses may improve customer experience, streamline processes, make well-informed
choices, and achieve sustainable development by utilizing data.

2.9. Design a data science solution to support decision-making related to a real-world problem
2.9.1. Problems encountered by ABC Manufacturing when collecting data

The first and most crucial stage in the data analysis process is data collection. Like many other
companies, however, ABC Manufacturing could run into some issues during this stage. A few typical
issues are:
Data Quality:

• Incomplete or inaccurate data: This may be the result of non-standard data gathering procedures,
data entry mistakes, or sensor malfunctions.
• Inconsistent data: Different data formats, different units of measure, or inconsistent data
standards.
• Missing data: It's possible that some crucial data was not fully gathered or is missing.

Inconsistent Data:

• Different Formats: Data collected from different sources may have different formats (e.g. dates,
units of measure).
• Incompatible Systems: Different systems may use different data standards.

Security and Privacy:

• Data leaks: If sensitive data, such as financial and customer information, is not adequately
safeguarded, it may be exposed.
• Regulatory Compliance: Particularly for global corporations, data collection and usage must abide
by data protection laws.

Human Resources:

• Absence of Skilled Workforce: Personnel with experience in programming, statistics, and data
analysis tools are necessary for efficient data analysis and utilization.
• Training Difficulties: It can be difficult and time-consuming to teach employees how to utilize and
comprehend data analysis technologies.

Cost:

• Initial investment costs: Purchasing infrastructure, analytics software, and data collecting
technologies can be costly.
• Operating and maintenance expenses: It's also necessary to take into account the costs of keeping
these systems updated and maintained.

2.9.2. Data science solutions to support decision making

Data Collection

• Data Sources: Identifying relevant internal and external data sources (databases, APIs, web
scraping, sensors, etc.).
• Internal: Sales data, warehouse data, production data, customer data (CRM), data from ERP
systems.
• External: Market data, economic data, competitor data, social data (sentiment analysis).
• Data Integration: Combining data from multiple sources into a unified format.

Collection Methods:

• Database Query: Use SQL or similar tools to retrieve data from systems.
• API: Integrate with APIs of third-party platforms to retrieve data.
• Web scraping: Collect data from websites.
• IoT devices: Collect data from sensors in the manufacturing process.

Data Cleaning and Preprocessing

• Handling missing data: Fill in missing values using techniques such as the mean or median, or by
using prediction models.
• Outlier treatment: Identify outliers, then remove or adjust them.
• Data normalization: Standardize data so that values can be compared.
• Data encoding: Transform textual data into numerical form so that computers can process it.

Feature Engineering
• Feature selection: Choose the attributes that have the biggest influence on the prediction
outcomes.
• Feature generation: Create new features from existing ones to improve the model's capacity for
prediction.
• Feature transformation: Apply a mathematical transformation (such as log or square) to features.

Model Development

• Algorithm Selection: Select the algorithm appropriate to the type of problem (classification,
regression, clustering...).
• Model Design: Build the structure of the model. For a neural network, this includes the number of
hidden layers, the number of hidden nodes, and the activation function.

Example of Data Science Solution for ABC Manufacturing. Let's consider a specific use case for ABC
Manufacturing: predicting product demand.

Data Collection: Gather sales data, market trends, economic indicators, and social media sentiment.

Data Cleaning: Handle missing values in sales data, address outliers in price fluctuations, and standardize
date formats.

Feature Engineering: Create features like product category, seasonality, promotional activities, and
competitor analysis.

Model Development: Use time series forecasting models (ARIMA, LSTM) to predict demand.

2.9.3. Apply data science tools to solve problems encountered when collecting data for ABC
Manufacturing

Incomplete or inconsistent data:


• Solution: Use imputation techniques such as:
• Mean/median filling: Suitable for numeric data.
• Most frequent filling: Suitable for categorical data.
• Missing value prediction: Use machine learning models to predict missing values based on available
data.

Tool: Pandas
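
A minimal sketch of the first two imputation techniques with Pandas follows: the median for a numeric column and the most frequent value for a categorical column. The column names and values are illustrative assumptions.

```python
import pandas as pd

# Hypothetical records with one missing number and one missing category.
records = pd.DataFrame(
    {
        "units_sold": [10.0, None, 14.0, 12.0],
        "category": ["phone", "phone", None, "tablet"],
    }
)

# Numeric data: fill the gap with the column median.
median_units = records["units_sold"].median()
records["units_sold"] = records["units_sold"].fillna(median_units)

# Categorical data: fill the gap with the most frequent value (the mode).
most_frequent = records["category"].mode().iloc[0]
records["category"] = records["category"].fillna(most_frequent)
print(records)
```

The median is often preferred over the mean when the column contains outliers, because a few extreme values do not shift it.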

Data contains noise and heterogeneity:

Solution:

• Identify and remove outliers: Use statistical methods such as z-score, IQR.
• Remove duplicate observations: Use duplicate checking functions in libraries such as Pandas.
• Standardize data: Bring data to the same scale (e.g. Min-Max scaling, Standardization).
• Encode data: Convert text data to numeric form (e.g. One-hot encoding, Label encoding).

Tools: Pandas
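
The outlier, scaling, and encoding steps listed above can be sketched as follows. The example uses the IQR rule with the conventional 1.5 multiplier, Min-Max scaling, and one-hot encoding; the price and channel columns are made up for illustration.

```python
import pandas as pd

# Hypothetical data: the price 95.0 is far from the other values.
df = pd.DataFrame(
    {
        "price": [10.0, 12.0, 11.0, 13.0, 12.0, 95.0],
        "channel": ["online", "store", "online", "store", "online", "store"],
    }
)

# IQR rule: keep values within 1.5 * IQR of the first and third quartiles.
q1 = df["price"].quantile(0.25)
q3 = df["price"].quantile(0.75)
iqr = q3 - q1
in_range = df["price"].between(q1 - 1.5 * iqr, q3 + 1.5 * iqr)
clean = df[in_range].copy()

# Min-Max scaling: bring the remaining prices onto a 0-to-1 scale.
price_min, price_max = clean["price"].min(), clean["price"].max()
clean["price_scaled"] = (clean["price"] - price_min) / (price_max - price_min)

# One-hot encoding: one indicator column per channel value.
encoded = pd.get_dummies(clean, columns=["channel"])
print(encoded)
```

Outliers are removed before scaling here because a single extreme value would otherwise compress all the other scaled values toward zero.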

Data comes in many different formats:

Solution: Use flexible data processing libraries like Pandas to read and combine different formats (CSV,
Excel, JSON).
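
As a sketch of combining formats, the example below reads one CSV source and one JSON source into DataFrames and stacks them. Both sources are inline stand-ins with assumed columns; in practice they would be file paths, and Excel files would be read with pd.read_excel.

```python
import io
import pandas as pd

# Inline stand-ins for a CSV export and a JSON export (hypothetical data).
csv_source = io.StringIO("order_id,amount\n1,100\n2,250\n")
json_source = io.StringIO(
    '[{"order_id": 3, "amount": 75}, {"order_id": 4, "amount": 120}]'
)

csv_orders = pd.read_csv(csv_source)
json_orders = pd.read_json(json_source)

# Stack both sources into one table with a fresh row index.
combined = pd.concat([csv_orders, json_orders], ignore_index=True)
print(combined)
```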

Supported Data Science Tools

Python: One of the most popular programming languages for data science, with powerful libraries
such as:

• Pandas: Data processing and analysis
• NumPy: Numerical computation
• Matplotlib, Seaborn: Data visualization
• Scikit-learn: Machine learning

R: A programming language specifically for statistics and data analysis.

SQL: Database query language.

Visualization tools: Tableau, Power BI.

Specific example:

Suppose ABC Manufacturing wants to forecast demand for mobile phones in the next 6 months. The steps
will include:

Data gathering: Gather information about previous sales, marketing initiatives, and new items offered by
rival companies.

Data cleaning: Process canceled orders, eliminate negative sales figures, and normalize date and
time values.

Feature creation: Create new features such as market trends, seasonality, and special events.

Model construction: Make use of forecasting models like Prophet and ARIMA.

Model evaluation: Compare the models by forecast accuracy and select the best model.

Model deployment: Integrate the model into the warehouse management system to assist with
production and warehousing decisions.

By incorporating data science tools and techniques into the data collecting and processing procedures,
ABC Manufacturing is able to more effectively address issues related to data quality, which enhances
forecast accuracy and provides greater support for business decision-making.

2.9.4. Image of the overall architectural design

What is an image of the overall architectural design?


An image of the overall architectural design is a visual representation that encapsulates the complete
concept of a building or structure. It conveys the architectural form, massing, and spatial relationships of
the design.

A visual example of an overall system design architecture in which end users can build DSS applications
that solve their specific problems:

Figure 22: example of an overall system design architecture

In essence, the overall architectural design visualization is a visual tool that helps bridge the gap between
imagination and built reality.

Common Components of Enterprise Architecture


An abstract model that represents the technology, data, procedures, and personnel that make up an
organization's whole information system is called an enterprise architecture. Typical elements of an
enterprise architecture include the following:

• Business component: Includes business goals, strategies, business processes, organizational units,
and business partners.
• Data component: Includes data types, data structures, data warehouses, and data management
rules.
• Application component: Includes software applications that support business operations, such as
ERP, CRM, SCM.
• Technology component: Includes hardware, operating systems, networks, and other supporting
technologies.
• People component: Includes user roles, responsibilities, skills, and user groups.

❖ Applying the Overall Architecture in ABC Manufacturing

Applying the overall architecture to ABC Manufacturing will help this organization have a comprehensive
view of the information system, thereby optimizing operations, improving efficiency and supporting
strategic decision making.

Suppose ABC Manufacturing wants to improve the efficiency of supply chain management. The overall
architecture will help identify the current systems related to the supply chain, weaknesses and
opportunities for improvement. From there, ABC Manufacturing can build a new architecture, integrating
these systems into a unified platform, helping to improve the ability to forecast demand, manage
inventory and delivery.

Steps to apply the overall architecture in ABC Manufacturing:


STEP 1: Define the goals: Clearly define ABC Manufacturing's business goals and requirements for the
information system.

STEP 2: Analyze the current situation: Evaluate the whole information system, including components,
procedures, and existing issues.

STEP 3: Goal Architecture Design: Create the best possible enterprise architecture that is future-proof by
incorporating new data, technology, and process elements.

STEP 4: Planning for Implementation: Draft a thorough implementation strategy that addresses risks,
resources, and phases of the new architecture.

STEP 5: Implementation and Assessment: To make sure the new architecture satisfies the specified
objectives, carry out the implementation plan and conduct ongoing assessments.

Specific benefits:

Improved adaptability: ABC Manufacturing can more easily adjust to changes in the market and in
technology thanks to the overall architecture.

Boost operational efficiency: ABC Manufacturing may streamline corporate operations by locating
redundant processes and system gaps.
Cut expenses: By eliminating redundant expenditures and carefully choosing the best solutions, the
holistic architecture helps to reduce the costs associated with IT investments.

Boost service quality: ABC Manufacturing can provide clients better goods and services by making sure
that all of its systems are integrated and consistent.

Encourage the making of strategic decisions: The comprehensive architecture offers a strong basis for
choosing between investment initiatives, the creation of new products, and other company plans.

2.10. Implement a data science solution to support decision making related to a real-world
problem
2.10.1. What is data collection?

Data collection is the process of gathering information from various sources, organizing it into a
pre-designed system, and then enabling a person or organization to assess the data and answer
questions. Data collection is done for analytical, research, managerial, business, or
decision-making purposes in domains including science, society, and business.

Figure 23: data collection


Data collection is the first and most crucial stage in working with data. In data science, the
quality and reliability of the analytical outcomes depend on appropriate data gathering, so
selecting the right collection technique helps ensure the accuracy and consistency of the findings.

2.10.2. ABC Manufacturing’s data collection method

Objectives:

• Sales Analysis: Understand which products are selling well, in which stores, and when.
• Inventory Management: Track inventory, forecast demand.
• Customer Analysis: Understand customer purchasing behavior, customer segments.
• Performance Evaluation: Evaluate the effectiveness of marketing campaigns and promotions.

Data Collection Methods:

• Direct Query: Use SQL commands to extract data from databases.
• Data Export: Export data from systems into files (Excel, CSV) for analysis.
• API: Connect to systems through their APIs to automate the process of retrieving data.
• ETL Tools: Use ETL (Extract, Transform, Load) tools to extract, transform, and load data into a
centralized data warehouse.

When implementing a data analysis project, the steps to be taken are as follows:

First, clearly identify the information to be collected by listing the data fields needed for
analysis. Then, identify the data sources containing this information, such as related systems and
data tables.

Next, design SQL queries or scripts to extract data from the identified sources. After collecting
the data, check its accuracy and consistency, and remove duplicate or missing data.

After the data has been cleaned, integrate data from the different sources into a unified data set.
Finally, use data analysis tools such as Excel, Python, or R to explore the data and extract useful
information from it.

By following these steps, the data analysis project can be carried out systematically and effectively.

2.10.3. Data cleaning and preprocessing


❖ Import packages and load data, display data

Figure 24: python code (1)

import pandas as pd

• This line imports the pandas library and gives it a short name of 'pd'. Pandas is a popular library in
Python for data processing and analysis.

data = pd.read_csv("asm2_data.csv")

• The pd.read_csv() function is used to read a CSV (Comma-Separated Values) file and convert it into
a pandas DataFrame.
• "asm2_data.csv" is the name of the CSV file you want to read.
• The result is saved to the variable data.

data

• This line displays the contents of the DataFrame data.
• In an interactive environment like Jupyter Notebook, this command will display a table with rows
and columns of data.
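
The loading step described above can be tried without the original file by using a small inline stand-in. In the sketch below, the sample rows are invented, and the column names are assumptions based on the columns used later in this section.

```python
import io
import pandas as pd

# Stand-in for "asm2_data.csv"; these rows are made up for illustration.
sample_csv = io.StringIO(
    "ProductID,City,Price Each,Quantity Ordered\n"
    "101,Seattle,11.95,2\n"
    "102,NYC,99.99,1\n"
)

# In the report, the call is pd.read_csv("asm2_data.csv").
data = pd.read_csv(sample_csv)

# head() shows the first rows; shape gives (number of rows, number of columns).
print(data.head())
print(data.shape)
```

Outside a notebook, a bare `data` line prints nothing, so print(data.head()) is the usual way to inspect the result in a script.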

Figure 25: data

❖ Removing rows containing blank data

Figure 26: python code (2)

data.dropna(inplace=True)

• The dropna() function is used to remove rows containing NaN (Not a Number) or null values in a
DataFrame.
• The inplace=True parameter means that the change will be applied directly to the DataFrame data
without reassigning.
• After this command, all rows with at least one null value will be removed from data.

data.isnull().sum()

• data.isnull() creates a new DataFrame of the same size as data, where each element is True if the
corresponding value in data is null, and False otherwise.
• .sum() sums each column of the DataFrame resulting from isnull().
• The final result is a Series containing the number of null values in each column of the DataFrame
data.
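
A small sketch of these two calls on made-up data shows the effect. Counting nulls before and after dropna() confirms that the rows were removed; the columns and values are assumptions.

```python
import pandas as pd

# Hypothetical data with one missing value in each column.
data = pd.DataFrame(
    {
        "ProductID": [101, 102, None, 104],
        "City": ["Seattle", None, "Columbus", "Charlotte"],
    }
)

# Number of null values per column before cleaning.
missing_before = data.isnull().sum()

# Remove every row that has at least one null value, modifying data in place.
data.dropna(inplace=True)

# After dropna(), every column should report zero nulls.
missing_after = data.isnull().sum()
print(missing_before)
print(missing_after)
```

Because dropna() discards the whole row, it is worth checking how many rows remain; if too many are lost, filling the missing values may be the better choice.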

Figure 27: result

❖ Removing duplicate rows

Figure 28: result duplicate rows

data.duplicated()

• The duplicated() method is applied to the DataFrame data.


• It returns a boolean Series, where:
• True for rows that are duplicates of a previous row.
• False for rows that are not duplicates.

data[data.duplicated()]

• This is how to filter the DataFrame data based on the result of data.duplicated().
• It selects all rows for which data.duplicated() returns True.
• The result is a new DataFrame containing only duplicate rows.

duplicates = data[data.duplicated()]

• Assigns the DataFrame containing duplicate rows to the duplicates variable.

duplicates

• This line displays the contents of the duplicates DataFrame.

Figure 29: code python (3)

drop_duplicates() is a pandas DataFrame method used to remove duplicate rows.

inplace = True is a parameter of the drop_duplicates() method:

When inplace = True, the change will be applied directly to the original DataFrame (data).

If inplace = True is not present, the method will return a new copy of the DataFrame without changing the
original DataFrame.
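
The duplicate check and removal described above can be sketched on a small made-up table as follows; the OrderID and City values are assumptions.

```python
import pandas as pd

# Hypothetical data: the third row repeats the second row exactly.
data = pd.DataFrame(
    {
        "OrderID": [1, 2, 2, 3],
        "City": ["Seattle", "Columbus", "Columbus", "Charlotte"],
    }
)

# Rows flagged True by duplicated() are repeats of an earlier row.
duplicates = data[data.duplicated()]
print(duplicates)

# Remove the repeated rows from data itself.
data.drop_duplicates(inplace=True)
print(len(data))
```

By default, both methods compare all columns and keep the first occurrence. The subset and keep parameters change this when only some columns define a duplicate.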

❖ Change data type


Figure 30: code and result change data type

• ['ProductID'] is used to access the column named 'ProductID' in the DataFrame.


• .dtype is a property of Series in pandas, it returns the data type of the column.
• data['ProductID'] accesses the column 'ProductID' in the DataFrame data.
• .astype(int) is a method used to convert the data type of a Series (column) in pandas.
• int is the data type you want to convert the 'ProductID' column to. In this case, it is an integer.
• The entire expression to the right of the equals sign creates a new Series with the converted data
type.
• The assignment (=) will replace the old 'ProductID' column in the DataFrame data with the new
converted column.

The purpose of this line of code is to convert the data type of the 'ProductID' column to integer.
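
A minimal sketch of the conversion, assuming the column was read as floating-point numbers (the values are made up):

```python
import pandas as pd

# Hypothetical column that was loaded as floats (101.0 instead of 101).
data = pd.DataFrame({"ProductID": [101.0, 102.0, 103.0]})

# Data type before the conversion.
dtype_before = data["ProductID"].dtype

# Convert the column to integers and store it back in the DataFrame.
data["ProductID"] = data["ProductID"].astype(int)

# Data type after the conversion.
dtype_after = data["ProductID"].dtype
print(dtype_before, dtype_after)
```

Note that astype(int) raises an error if the column still contains missing values, which is one reason the rows with blank data are removed first.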
❖ Replace data

Figure 31:code and result replace data

• ['City'] is the syntax to access a specific column in the DataFrame. Here, 'City' is the name of the
column you want to view.
• When you run this line of code, it will return a pandas Series containing all the values in the 'City'
column.

data.replace():

• This is a pandas DataFrame method that replaces a value in the DataFrame.

'NYC' and 'NEW YORK CITY':

• 'NYC' is the value to replace.


• 'NEW YORK CITY' is the new value that will replace 'NYC'.

inplace = True:

• This parameter specifies that the change will be applied directly to the original DataFrame.
• If this parameter is not included, the method will return a new copy of the DataFrame without
changing the original DataFrame.
• data: This line will display the entire DataFrame after the replacement has been performed.

The purpose of this code is to normalize the data, ensuring consistency in the representation of city names.
In this case, it replaces the abbreviation 'NYC' with the full name 'NEW YORK CITY'.
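
A sketch of the replacement on a small made-up City column:

```python
import pandas as pd

# Hypothetical data: the same city written in two different ways.
data = pd.DataFrame({"City": ["NYC", "Seattle", "NEW YORK CITY", "NYC"]})

# Replace every cell equal to 'NYC' with the full name, in place.
data.replace("NYC", "NEW YORK CITY", inplace=True)

# Count rows per city to confirm that only one spelling remains.
city_counts = data["City"].value_counts()
print(city_counts)
```

Called this way, replace() matches whole cell values in every column of the DataFrame. Using data["City"].replace(...) instead would limit the change to the City column.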

❖ Merge data

Figure 32: code and result merge data

data['Detail Address']: Creates a new column in the DataFrame named 'Detail Address'.

data['Address']: Accesses the 'Address' column in the DataFrame.

' ': Adds a space between the two strings.


data['City']: Accesses the 'City' column in the DataFrame.

=: Assigns the result of string addition to the 'Detail Address' column.

This line of code combines the 'Address' and 'City' columns in the DataFrame, adds a space between them,
and creates a new column named 'Detail Address' to store the result. This creates a complete address
column by combining information from two separate columns.
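
The concatenation can be sketched like this; the addresses are invented for illustration.

```python
import pandas as pd

# Hypothetical address data.
data = pd.DataFrame(
    {
        "Address": ["12 Pine St", "48 Lake Ave"],
        "City": ["Seattle", "Columbus"],
    }
)

# Join the two text columns row by row, with a space in between.
data["Detail Address"] = data["Address"] + " " + data["City"]
print(data["Detail Address"].tolist())
```

If either column contains a missing value, the combined value for that row is also missing, so this step works best after the blank rows have been removed.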

❖ Create a new column

Figure 33: code and result create new columns

data["Sale"] = data["Price Each"] * data["Quantity Ordered"]

• Create New Column: Create a new column in the DataFrame named "Sale".
• Calculate Value: Calculate the value of each row by multiplying the "Price Each" column by the
"Quantity Ordered" column.
• Assign Value: Assign the result of the multiplication to the newly created "Sale" column.
The "Sale" column will contain the total value of each order.

data.describe()

• Describe Statistics: Calculate and display descriptive statistics of the DataFrame.
• Display Information: Includes the count, mean, standard deviation, minimum, quartiles, and
maximum of the numeric columns in the DataFrame.
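
A sketch of both steps on made-up prices and quantities:

```python
import pandas as pd

# Hypothetical order lines.
data = pd.DataFrame(
    {
        "Price Each": [11.95, 99.99, 2.99],
        "Quantity Ordered": [2, 1, 4],
    }
)

# Row-by-row multiplication gives the value of each order line.
data["Sale"] = data["Price Each"] * data["Quantity Ordered"]

# describe() summarizes every numeric column, including the new one.
summary = data.describe()
print(summary)
```

The rows of the summary are labeled count, mean, std, min, 25%, 50%, 75%, and max, so a single value can be read with, for example, summary.loc["mean", "Sale"].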

2.10.4. Apply data science solutions to support decision making

Line chart showing total sales by month

Figure 35: code python show total sales by month

Figure 34: result line chart


What is the best month to sell and how much revenue is generated?

This chart shows total sales by month for a year. From the chart, we can see that sales vary significantly
from month to month. The month with the highest sales is August, reaching a total of over 37. followed
by September and December. Meanwhile, the months with the lowest sales are June and July.
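
The monthly totals behind such a chart can be computed as in the sketch below. The "Order Date" column name and all values are assumptions, since the actual code is shown only in the figure.

```python
import pandas as pd

# Hypothetical sales records; "Order Date" is an assumed column name.
data = pd.DataFrame(
    {
        "Order Date": ["2024-01-15", "2024-01-20", "2024-02-03", "2024-03-11", "2024-03-29"],
        "Sale": [100.0, 50.0, 80.0, 200.0, 40.0],
    }
)
data["Order Date"] = pd.to_datetime(data["Order Date"])

# Total sales per calendar month (1 = January, ..., 12 = December).
monthly_sales = data.groupby(data["Order Date"].dt.month)["Sale"].sum()

# The month with the highest total and its revenue.
best_month = monthly_sales.idxmax()
best_revenue = monthly_sales.max()
print(best_month, best_revenue)
```

With Matplotlib installed, monthly_sales.plot(kind="line") would draw the totals as a line chart.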

The bar chart shows sales for the top 5 cities

Figure 36: code python shows sale for the top 5 cities

Figure 37: result bar chart


Which city has the best sales revenue?

This chart shows the top 5 cities with the highest sales, sorted in descending order of total sales value.
Some key takeaways from the chart:
• Seattle has the highest total sales of the top 5 cities.
• Fort Worth and Charlotte are second and third in total sales, respectively.
• Columbus and San Antonio are fourth and fifth.
• The sales value of each city is shown on the vertical axis, with Seattle leading at about 43%
of total sales.
This chart provides an overview of sales performance and identifies the cities that generate the most
revenue for the business.
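A ranking like this, together with each city's share of total sales, can be sketched as follows. The rows are hypothetical sample values chosen for easy arithmetic, not the report's data, and the "City" and "Sale" column names are assumed to match the DataFrame used earlier.

```python
import pandas as pd

# Hypothetical sample rows, not the report's dataset.
data = pd.DataFrame({
    "City": ["Seattle", "Fort Worth", "Seattle", "Charlotte",
             "Columbus", "San Antonio", "Tampa"],
    "Sale": [250.0, 220.0, 150.0, 160.0, 110.0, 70.0, 40.0],
})

# Total sales per city, then the five largest in descending order.
city_sales = data.groupby("City")["Sale"].sum()
top5 = city_sales.nlargest(5)

# Share of ALL sales (not only the top 5), in percent.
share_of_total = (top5 / city_sales.sum() * 100).round(1)

print(top5)
print(share_of_total)
```

Stating whether a percentage is relative to all sales or only to the top 5 matters, because the two denominators give different figures for the same city.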

Top 5 store IDs with the highest/lowest sales

Figure 38: Code and bar chart of the top 5 STIDs with the highest sales

This chart shows the top 5 stores with the highest total sales. The x-axis shows the store code, and the
y-axis shows the total sales for each of the top 5 stores.

The chart shows:

• Store ST03 has the highest total sales of the top 5 stores.
• Store ST15 has the second highest total sales.
• Store ST02 has the third highest total sales.
• Store ST05 has the fourth highest total sales.
• Store ST04 has the lowest total sales of the top 5 stores.

This chart allows us to quickly compare the overall sales performance of the top 5 stores in the data set.

Figure 39: Code and bar chart of the top 5 STIDs with the lowest sales

From the chart, we can see that:

• Store ST09 has the lowest total sales, around 1000 units.
• Stores ST10, ST14, ST11 and ST07 follow, each with higher total sales than ST09.
• The chart shows a clear difference in total sales between stores, with a large gap between ST07
and the stores with higher sales.

Overall, the chart provides information about the total sales of the 5 stores with the lowest sales.
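Both rankings can come from one aggregation. The sketch below assumes the store column is named "STID", as the figure captions suggest. The function rank_stores and the sample rows are hypothetical, and n is set to 2 only because the sample is small.

```python
import pandas as pd


def rank_stores(df, n=5):
    """Return the n best and n worst stores by total sales.

    Hypothetical helper; assumes columns "STID" and "Sale".
    """
    store_sales = df.groupby("STID")["Sale"].sum()
    return store_sales.nlargest(n), store_sales.nsmallest(n)


# Hypothetical sample rows, not the report's dataset.
data = pd.DataFrame({
    "STID": ["ST01", "ST02", "ST03", "ST01", "ST04", "ST05", "ST06", "ST03"],
    "Sale": [100.0, 300.0, 500.0, 50.0, 20.0, 80.0, 40.0, 100.0],
})

highest, lowest = rank_stores(data, n=2)
print(highest)  # best stores, largest first
print(lowest)   # weakest stores, smallest first
```

Comparing the two results side by side makes the gap between the strongest and weakest stores explicit, which is the comparison the two charts are meant to support.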

2.11. Make justified recommendations that support decision making related to a real-world
problem
2.11.1. Most common data science techniques

Descriptive Analytics:

• Goal: Summarize historical data to understand the current situation.
• Techniques: Charts, summary tables, descriptive statistics.
• Example: Analyze sales by product, customer, or geographic region.

Diagnostic Analytics:

• Goal: Find out the root cause of a problem.
• Techniques: Correlation analysis, root cause analysis.
• Example: Find out why sales of a product are falling.

Predictive Analytics:

• Goal: Predict future trends based on historical data.
• Techniques: Machine learning, regression modeling, classification.
• Example: Forecast sales for the next quarter.

Prescriptive Analytics:

• Goal: Make recommendations for actions to achieve a goal.
• Techniques: Optimization, simulation modeling.
• Example: Recommend products that should be promoted to increase sales.
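
To make the predictive example concrete, the sketch below fits a straight-line trend to past quarterly sales by ordinary least squares and extends it by one quarter. The figures are hypothetical, and a linear trend is only one simple regression model chosen for illustration. A real forecast would also need to handle seasonality.

```python
def fit_linear_trend(values):
    """Least-squares line through (0, v0), (1, v1), ...; returns (slope, intercept)."""
    n = len(values)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical sales for four past quarters.
quarterly_sales = [100.0, 110.0, 120.0, 130.0]

slope, intercept = fit_linear_trend(quarterly_sales)

# The next quarter has index len(quarterly_sales) on the time axis.
next_quarter_forecast = intercept + slope * len(quarterly_sales)
print(f"Forecast for next quarter: {next_quarter_forecast:.1f}")
```

Descriptive analytics would stop at summarizing the four known quarters. The predictive step is the extrapolation to a quarter that has not happened yet.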
2.11.2. Use data science techniques to make recommendations

Problem: ABC Manufacturing faces challenges in managing its supply chain efficiently and effectively.

Objective: Analyze business processes and evaluate how data and information can enhance operations.

Potential Problem Areas and Data-Driven Solutions

❖ Problem: Inventory Management

Issue: Excessive inventory leading to increased holding costs or stockouts resulting in lost sales.

Data-Driven Solution:

• Demand Forecasting: To reliably estimate product demand, use time series forecasting models such
as ARIMA and Prophet.
• Inventory Optimization: To determine the ideal inventory levels, use inventory management
techniques such as economic order quantity (EOQ) and ABC analysis.
• Anomaly Detection: Use statistical techniques to spot unusual demand spikes or sales trends and
adjust inventory as necessary.
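
Two of these techniques fit in a few lines. The sketch below uses the classic EOQ formula, EOQ = sqrt(2DS / H), where D is annual demand, S is the cost of placing one order, and H is the cost of holding one unit for a year. For anomaly detection it uses a simple z-score rule, which is one statistical option among many. All numbers and the threshold of 2.0 are hypothetical assumptions.

```python
import math
import statistics


def economic_order_quantity(annual_demand, order_cost, holding_cost):
    """Classic EOQ: the order size that balances ordering and holding costs."""
    return math.sqrt(2 * annual_demand * order_cost / holding_cost)


def flag_demand_anomalies(demand, threshold=2.0):
    """Return the positions whose z-score exceeds the threshold."""
    mean = statistics.mean(demand)
    stdev = statistics.stdev(demand)  # sample standard deviation
    if stdev == 0:
        return []
    return [i for i, d in enumerate(demand) if abs(d - mean) / stdev > threshold]


# Hypothetical inputs: 1200 units/year, 100 per order, 6 per unit per year.
eoq = economic_order_quantity(annual_demand=1200, order_cost=100, holding_cost=6)
print(f"Order about {eoq:.0f} units at a time")

# Hypothetical weekly demand with one spike in week 6 (zero-based).
weekly_demand = [100, 102, 98, 101, 99, 100, 180, 100]
print("Unusual weeks:", flag_demand_anomalies(weekly_demand))
```

A spike flagged this way is a prompt to investigate, for example a promotion or a data entry error, before inventory levels are changed.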

❖ Problem: Supply Chain Disruptions

Issue: Unexpected events (e.g., natural disasters, supplier issues) disrupt the supply chain.

Data-Driven Solution:

• Risk Assessment: Identify possible disruptions using data on past incidents, supplier performance,
and external factors.
• Supply Chain Visibility: Use real-time tracking of shipments and inventory levels to handle
problems proactively.
• Scenario Planning: Use simulation modeling to create contingency plans for different disruption
scenarios.
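
Scenario planning by simulation can be sketched with a small Monte Carlo model. Everything in it is a simplifying assumption made for illustration: demand is constant, a normal day's delivery fully refills the safety stock, and a disruption blocks all deliveries for a fixed number of days. The function name and parameter values are hypothetical.

```python
import random


def stockout_probability(safety_stock, daily_demand, disruption_prob,
                         disruption_days, horizon_days=90, runs=5000, seed=42):
    """Estimate the share of simulated periods in which stock runs out."""
    rng = random.Random(seed)  # fixed seed so the estimate is reproducible
    stockouts = 0
    for _ in range(runs):
        inventory = safety_stock
        days_blocked = 0
        for _ in range(horizon_days):
            # A new disruption can only start when none is active.
            if days_blocked == 0 and rng.random() < disruption_prob:
                days_blocked = disruption_days
            if days_blocked > 0:
                # No delivery arrives: demand is served from safety stock.
                days_blocked -= 1
                inventory -= daily_demand
            else:
                # Normal day: the delivery covers demand and refills the buffer.
                inventory = safety_stock
            if inventory < 0:
                stockouts += 1
                break
    return stockouts / runs


# Compare three hypothetical safety stock levels under the same scenario.
for buffer_units in (200, 500, 800):
    p = stockout_probability(buffer_units, daily_demand=100,
                             disruption_prob=0.02, disruption_days=5)
    print(f"Safety stock {buffer_units}: stockout probability {p:.2%}")
```

Running the same scenario with different buffer sizes turns a backup plan into a comparison of measurable risk levels.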

❖ Problem: Production Optimization

Issue: Inefficient production processes leading to increased costs and production delays.

Data-Driven Solution:

• Process Mining: Examine manufacturing processes to find bottlenecks and areas that need
improvement.
• Predictive Maintenance: Use sensor data to anticipate equipment faults and plan maintenance
proactively.
• Quality Control: Use data-driven quality control to cut down on errors and rework.
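
One common form of data-driven quality control is a control chart. The sketch below sets limits at the baseline mean plus or minus three standard deviations and flags new measurements outside them. This is a simplified version: formal individuals charts usually estimate the spread from moving ranges. The measurements are hypothetical.

```python
import statistics


def control_limits(baseline, sigmas=3.0):
    """Return (lower, center, upper) limits from in-control baseline data."""
    center = statistics.mean(baseline)
    spread = statistics.stdev(baseline)  # sample standard deviation
    return center - sigmas * spread, center, center + sigmas * spread


def out_of_control(measurements, lower, upper):
    """Return (position, value) pairs that fall outside the limits."""
    return [(i, m) for i, m in enumerate(measurements) if m < lower or m > upper]


# Hypothetical part widths (mm) from a period known to be stable.
baseline = [10.0, 10.2, 9.8, 10.1, 9.9, 10.0, 10.1, 9.9]
lower, center, upper = control_limits(baseline)

# Hypothetical new measurements to check against the limits.
new_parts = [10.1, 9.7, 10.6, 10.0]
flagged = out_of_control(new_parts, lower, upper)
print(f"Limits: {lower:.3f} to {upper:.3f}; flagged: {flagged}")
```

Flagging a part as soon as it leaves the limits lets the line be stopped before a batch of defective items has to be reworked.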

❖ Problem: Customer Satisfaction

Issue: Low customer satisfaction leading to churn and negative word-of-mouth.

Data-Driven Solution:

• Customer Sentiment Analysis: Analyze customer feedback (reviews, social media) to identify areas
for improvement.
• Customer Segmentation: Identify customer segments with different needs and preferences.
• Customer Lifetime Value (CLTV): Calculate CLTV to prioritize customer retention efforts.
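
CLTV can be estimated in several ways. The sketch below uses one common simplified formula: average order value × orders per year × expected years as a customer × gross margin. The Customer class, the 30% margin, and the three customers are hypothetical values used only to show the ranking step.

```python
from dataclasses import dataclass


@dataclass
class Customer:
    name: str
    average_order_value: float
    orders_per_year: float
    expected_years: float


def customer_lifetime_value(customer, gross_margin=0.3):
    """Simplified CLTV: expected revenue over the relationship times margin."""
    revenue = (customer.average_order_value
               * customer.orders_per_year
               * customer.expected_years)
    return revenue * gross_margin


# Hypothetical customers.
customers = [
    Customer("A", average_order_value=200.0, orders_per_year=10, expected_years=5),
    Customer("B", average_order_value=50.0, orders_per_year=4, expected_years=2),
    Customer("C", average_order_value=500.0, orders_per_year=2, expected_years=3),
]

# Highest-value customers first, to prioritize retention efforts.
ranked = sorted(customers, key=customer_lifetime_value, reverse=True)
for c in ranked:
    print(f"{c.name}: {customer_lifetime_value(c):.2f}")
```

Note that customer C has the largest single orders but ranks below A, because order frequency and relationship length also drive lifetime value.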

Data Science Techniques and Tools:

• Python: For data manipulation, analysis, and model building (libraries like Pandas, NumPy,
Scikit-learn).
• SQL: For data extraction and management from databases.
• Jupyter Notebook: For interactive data exploration and visualization.
• Visualization Tools: Tableau, Power BI for creating interactive dashboards.
• Machine Learning Algorithms: Regression, classification, clustering, time series analysis.

❖ Recommendations:

Data-Driven Culture: Foster a data-driven culture within the organization to encourage data-informed
decision making.

Data Quality: Ensure data accuracy, consistency, and completeness for reliable analysis.

Collaboration: Work closely across departments to collect relevant information and put solutions into
place.

Continuous Improvement: Assess data-driven projects' performance on a regular basis and make required
modifications.

By implementing these data-driven strategies, ABC Manufacturing can improve customer satisfaction, cut
costs, increase supply chain performance, and gain a competitive edge.

III. EVALUATE

This is a good exercise on how businesses can use data and information to support their key business
processes. The report delves into common types of business data and explains how organizations can
harness them to make strategic decisions, improve operational efficiency, and gain competitive advantage.

One of the exercise's strongest points is its thorough classification of company data types, including
financial, product, customer, and human resource data, and its account of how companies might use them.
The report, for instance, explains in detail how customer data may be used to segment markets, create
efficient marketing plans, and enhance customer satisfaction. Similarly, product data may help companies
create new items, streamline production, and better understand consumer needs.

Using data science techniques can help meet a variety of user needs. These techniques can help users:

• Learn more about their needs, preferences, and behaviors by studying user activity data.

• Get personalized advice, forecasts, and recommendations based on user data analysis.

• Get personalized experiences, goods, and services tailored to their specific needs.

• Use the right data analytics model to apply solutions to specific requirements or challenges.

Data science techniques can also help meet a variety of business needs, including:

• Analyze and improve company processes to increase operational efficiency.

• Forecast consumer needs, market trends, and company risks to make informed strategic decisions.

• Create and customize goods and services using data analysis of consumer needs and behaviors.

• Enhance sales and marketing strategies by analyzing campaign performance data.

• Use operational data analytics to optimize supply chain management and resource allocation.

In addition, the report also highlights the importance of safe and secure data management, especially
when dealing with sensitive customer, employee and stakeholder information. This is important in the
context of increasing cyber security threats and increasingly stringent data protection requirements for
businesses.

Overall, this is a good exercise that demonstrates a deep understanding of the role of data and information
in modern business operations. The report provides a comprehensive view of how businesses can exploit
data to gain a sustainable competitive advantage.

IV. CONCLUSION

This assignment provides an insight into the critical role that data and information play in supporting ABC
Manufacturing’s key business processes. By analyzing different types of business data and how they
support decision making, strategic planning, and operational performance improvement, the exercise
clearly demonstrates the value of applying data science in a manufacturing organization.

ABC Manufacturing can optimize processes, boost operational efficiency, and make better decisions by
utilizing data sources including production, sales, financial, and human resource data. Analyzing
production data, for instance, may assist in locating bottlenecks, streamlining workflows, and cutting
waste. In addition to offering useful information about markets, clients, and company performance, sales
and financial data may also assist the firm in developing a strategic plan and selecting the right
investments.

The assignment also emphasizes how crucial it is to have a strong information management system that
protects user privacy and security. This is essential if ABC Manufacturing is to maximize the benefits of
data use while reducing risks such as data leaks, cyberattacks, and misuse.

In conclusion, this assignment has shown how incorporating data science into business operations can give
ABC Manufacturing many opportunities to strengthen its competitive advantage, increase productivity,
and support ongoing organizational growth. With commitment and sound execution, ABC Manufacturing has
the potential to become a model for efficient data use in the manufacturing sector, improving its
competitiveness and achieving its objectives in an increasingly competitive landscape.

