Big Data and Data Analysis: Offurum Paschal I Kunoch Education and Training College, Owerri


Big Data and Data Analysis
Offurum Paschal I
Kunoch Education and Training College, Owerri
(Former NIIT Owerri)
July, 2021
At Kunoch College, we offer a wide variety of training programs.

National Innovative Diploma (OND through JAMB)


1. Computer Software Engineering
2. Networking & System Security
3. Multimedia Technology
4. Computer Hardware Engineering Technology

Plus

Other professional courses, such as:


Web Development
Software Application Development
Data Analysis
Programming languages: Python, R, Java, etc.
Oracle
CCNA
Content
1. What is big data
2. Characteristics of big data
3. What is big data analysis
4. Types of big data analysis
5. Processes of big data analysis
6. Big data analytics uses
7. Skills required to be a big data specialist
Test your knowledge

1) What is the fundamental aim of big data analysis?
2) What tools are used for each process of big data analysis?
3) Which of the skills required to become a big data specialist do you
possess?
The fundamental aim of big data analysis is to help businesses
realize the true potential of big data in positively influencing business
decisions.
What is big data
• Big Data refers to significant volumes of data that cannot be
processed effectively with the traditional applications that are
currently used.
Characteristics of big data
• To recognize big data, we need to look beyond volume, variety, and
velocity.

• The 10 Vs that explain the characteristics of big data:


#1: Volume
• Volume is probably the best-known characteristic of big data.
#2: Velocity
• Velocity refers to the speed at which data is being generated,
produced, created, or refreshed.
• Facebook claims 600 terabytes of incoming data per day.
• Google alone processes on average more than "40,000 search queries
every second," which roughly translates to more than 3.5 billion
searches per day.
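The daily figure follows from simple arithmetic on the per-second rate. A minimal sketch of that conversion (the variable names are illustrative only):

```python
# Convert the quoted rate of 40,000 searches per second into a daily total.
SECONDS_PER_DAY = 24 * 60 * 60   # 86,400 seconds in a day
queries_per_second = 40_000      # the lower bound quoted above

queries_per_day = queries_per_second * SECONDS_PER_DAY
print(f"{queries_per_day:,} searches per day")
```

At exactly 40,000 searches per second the total is about 3.46 billion per day, so the 3.5 billion figure corresponds to a rate slightly above 40,000.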
#3: Variety
• Variety refers to the structured, semi-structured, and unstructured data
that is gathered from multiple sources.
• In the past, data was collected from spreadsheets and databases. Today, data
comes in an array of forms such as emails, PDFs, images, audio and video
files, social media updates, and other text formats. There are also log files, click data,
machine and sensor data, etc.
#4: Variability
• Data variability, also known as spread or dispersion, refers to how
spread out a set of data is.

• In big data, variability can also refer to the inconsistent speed at which
data is loaded into your database.
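Spread in the statistical sense can be measured with the range, variance, and standard deviation. The sketch below compares two made-up samples that share the same mean; the data and the `describe_spread` helper are assumptions for illustration.

```python
import statistics

# Two hypothetical samples with the same mean but very different spread.
steady = [48, 50, 52, 49, 51]
erratic = [10, 90, 35, 75, 40]

def describe_spread(values):
    """Return the range, variance and standard deviation of a sample."""
    return {
        "range": max(values) - min(values),
        "variance": statistics.pvariance(values),
        "std_dev": statistics.pstdev(values),
    }

print(statistics.mean(steady), describe_spread(steady))
print(statistics.mean(erratic), describe_spread(erratic))
```

Both samples average 50, but the second one is far more spread out, which is what the variability characteristic describes.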
#5: Veracity
• Data veracity, in general, is how accurate or truthful a data set
may be.
• In the context of big data, it is not just the quality of the data itself
but also how trustworthy the data source, type, and processing are.

#6: Validity
• Similar to veracity, validity refers to how accurate and correct the data
is for its intended use.
• It is used to describe whether data satisfies user-defined
conditions or falls within a user-defined range.
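A validity check of this kind can be sketched as a set of user-defined rules applied to each record. The field names, the rules, and the `is_valid` helper below are hypothetical and only illustrate the idea.

```python
def is_valid(record, rules):
    """Return True if every field satisfies its user-defined rule."""
    return all(check(record.get(field)) for field, check in rules.items())

# Hypothetical rules: age must fall in a range, email must contain "@".
rules = {
    "age": lambda v: isinstance(v, int) and 0 <= v <= 120,
    "email": lambda v: isinstance(v, str) and "@" in v,
}

records = [
    {"age": 34, "email": "ada@example.com"},
    {"age": -5, "email": "bob@example.com"},   # age outside the range
    {"age": 51, "email": "not-an-email"},      # email fails the condition
]

valid = [r for r in records if is_valid(r, rules)]
invalid = [r for r in records if not is_valid(r, rules)]
print(len(valid), "valid,", len(invalid), "invalid")
```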
#7: Vulnerability
• Big data brings new security concerns. After all, a data breach with big
data is a big breach.
• As the volume of personal data being collected grows, people
increasingly feel that commercial websites use it to pry into their
behavior in order to sell them things.
#8: Volatility
• How old does your data need to be before it is considered irrelevant,
historic, or not useful any longer? How long does data need to be
kept for?
• Before big data, organizations tended to store data indefinitely.
• In this world of real-time data, you need to determine at what point
data is no longer relevant to the current analysis.
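One simple way to act on volatility is a retention window: records older than a chosen age are excluded from the current analysis. The sketch below assumes a seven-day window and made-up records; the right window depends on the analysis.

```python
from datetime import datetime, timedelta

def still_relevant(records, now, max_age):
    """Keep only records whose timestamp falls inside the retention window."""
    cutoff = now - max_age
    return [r for r in records if r["timestamp"] >= cutoff]

now = datetime(2021, 7, 1, 12, 0)
records = [
    {"id": 1, "timestamp": datetime(2021, 7, 1, 11, 30)},
    {"id": 2, "timestamp": datetime(2021, 6, 30, 9, 0)},
    {"id": 3, "timestamp": datetime(2021, 5, 1, 8, 0)},   # two months old
]

recent = still_relevant(records, now, max_age=timedelta(days=7))
print([r["id"] for r in recent])
```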
#9: Visualization
• Current big data visualization tools face technical challenges due to
limitations of in-memory technology and poor scalability, functionality, and
response time.
#10: Value
The other characteristics of big data are meaningless if you don't derive
business value from the data. Substantial value can be found in big data,
including:

• understanding your customers better
• targeting them accordingly
• optimizing processes, and
• improving machine or business performance.
What is Big Data Analysis
• Big data analysis examines large amounts of raw data to uncover
hidden patterns, correlations and other insights, in order to reach
certain conclusions.
• It enables industries and data analytics companies to make more
informed decisions.
• With today’s technology, it’s possible to analyze large amounts of data
and get answers from it almost immediately.
Types of Data Analysis
Four types of data analytics build on each other to bring increasing value
to an organization.
• Descriptive analytics examines what happened in the past
• Diagnostic analytics considers why something happened by comparing
descriptive data sets to identify dependencies and patterns.
• Predictive analytics seeks to determine likely outcomes by detecting
tendencies in descriptive and diagnostic analyses.
• Prescriptive analytics attempts to identify what business action to take.
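The four types can be illustrated on one tiny data set. This is a deliberately simplified sketch: the sales figures, the promotion flags, and the decision rule are all made up, and real analyses use far richer models.

```python
from statistics import mean

# Hypothetical monthly sales, with a flag for months that had a promotion.
sales = [100, 120, 110, 150, 140, 170]
promo = [False, True, False, True, False, True]

# Descriptive: what happened?
average_sales = mean(sales)

# Diagnostic: why? Compare months with and without a promotion.
promo_avg = mean(s for s, p in zip(sales, promo) if p)
no_promo_avg = mean(s for s, p in zip(sales, promo) if not p)

# Predictive: what is likely next? Extend the average month-to-month change.
changes = [b - a for a, b in zip(sales, sales[1:])]
forecast = sales[-1] + mean(changes)

# Prescriptive: what should we do? A simple illustrative rule.
action = "run a promotion" if promo_avg > no_promo_avg else "hold budget"

print(average_sales, promo_avg, no_promo_avg, forecast, action)
```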
The process of big data analysis
1. Collect
2. Process
3. Clean
4. Analyze
1. Data is collected

Common sources of data:

• internet clickstream data
• web server logs
• cloud and mobile applications
• social media content
• text from customer emails and survey responses
• mobile phone records, and
• machine data captured by sensors connected to the internet of things (IoT).
2. Data is processed.
• After data is collected and stored in a data warehouse or data lake,
data professionals must organize, configure and partition the data
properly for analytical queries.
• Thorough data processing makes for higher performance from
analytical queries.
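Partitioning can be pictured as grouping records by a key, so that a query touches only the partition it needs. The sketch below does this in memory with a hypothetical `partition_by` helper and made-up click events; data warehouses and data lakes apply the same idea at the storage level.

```python
from collections import defaultdict

def partition_by(records, key):
    """Group records into partitions keyed by the value of one field."""
    partitions = defaultdict(list)
    for record in records:
        partitions[record[key]].append(record)
    return dict(partitions)

events = [
    {"date": "2021-07-01", "user": "a", "clicks": 3},
    {"date": "2021-07-01", "user": "b", "clicks": 5},
    {"date": "2021-07-02", "user": "a", "clicks": 2},
]

by_date = partition_by(events, "date")
# A query for one day now reads a single partition instead of every record.
clicks_on_july_1 = sum(e["clicks"] for e in by_date["2021-07-01"])
print(clicks_on_july_1)
```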
3. Data is cleansed for quality.

• Data professionals scrub the data using scripting tools or enterprise
software.
• They look for any errors or inconsistencies, such as duplications or
formatting mistakes, and organize and tidy up the data.
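A small scripted cleaning step might look like the following. It is a sketch under assumptions: the `name` and `email` fields are hypothetical, and two rows count as duplicates when their normalized email addresses match.

```python
def clean(rows):
    """Normalize formatting, then drop rows that duplicate an earlier one."""
    seen = set()
    cleaned = []
    for row in rows:
        # Fix formatting: collapse extra spaces, standardize letter case.
        name = " ".join(row["name"].split()).title()
        email = row["email"].strip().lower()
        # Remove duplicates: skip any email already seen.
        if email in seen:
            continue
        seen.add(email)
        cleaned.append({"name": name, "email": email})
    return cleaned

raw = [
    {"name": "  ada   lovelace ", "email": "Ada@Example.com"},
    {"name": "Ada Lovelace", "email": "ada@example.com "},
    {"name": "grace hopper", "email": "grace@example.com"},
]

cleaned = clean(raw)
print(cleaned)
```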
Data is analyzed: Data Mining

Data mining is the process of finding patterns and relationships in large
amounts of data. It’s an advanced data analysis technique, combining
machine learning and AI to extract useful information, which helps
businesses learn more about customers’ needs, increase revenues,
reduce costs, improve customer relationships, and more.
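One classic pattern-finding task is spotting items that are frequently bought together. The sketch below counts item pairs across made-up shopping baskets; the `frequent_pairs` helper and the support threshold are assumptions for illustration, not a full data mining algorithm.

```python
from collections import Counter
from itertools import combinations

# Hypothetical shopping baskets.
baskets = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"butter", "eggs"},
    {"bread", "milk", "eggs"},
]

def frequent_pairs(baskets, min_support):
    """Return item pairs that appear together in at least min_support baskets."""
    counts = Counter()
    for basket in baskets:
        # Sort so each pair is always counted under the same key.
        counts.update(combinations(sorted(basket), 2))
    return {pair: n for pair, n in counts.items() if n >= min_support}

pairs = frequent_pairs(baskets, min_support=3)
print(pairs)
```

Here bread and milk appear together in three of the four baskets, a relationship a retailer could use when placing or promoting products.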
Data is analyzed: Forecasting
Models are built to forecast customer behavior and other future
developments.

Marketing departments can use forecasting models to identify emerging customer bases.
Financial and insurance companies can build risk-assessment and fraud outlooks to
safeguard their profitability.
Manufacturing and retail firms can use them to predict fluctuations in demand or how specific
process changes might affect their supply chains.
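A very simple demand forecast is a moving average: the next value is estimated as the mean of the most recent observations. The sketch below uses made-up weekly demand and a window of three weeks; production forecasting models are usually more sophisticated.

```python
def moving_average_forecast(history, window):
    """Forecast the next value as the mean of the last `window` values."""
    if window <= 0 or window > len(history):
        raise ValueError("window must be between 1 and len(history)")
    recent = history[-window:]
    return sum(recent) / window

weekly_demand = [120, 135, 128, 140, 150, 145]   # hypothetical units sold
next_week = moving_average_forecast(weekly_demand, window=3)
print(next_week)
```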
Data is analyzed: Machine learning
Machine learning uses algorithms to analyze large data sets.
Machine learning is a branch of artificial intelligence (AI) and computer science which
focuses on the use of data and algorithms to imitate the way that humans learn, gradually
improving in accuracy.

Its results then drive decision making within applications and businesses, ideally improving
key growth metrics.
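The idea of gradually improving accuracy can be shown with gradient descent fitting a one-parameter model y = w * x. The data, learning rate, and number of passes below are illustrative assumptions.

```python
def train(xs, ys, learning_rate=0.01, epochs=200):
    """Fit y = w * x by gradient descent on the mean squared error."""
    w = 0.0
    errors = []
    for _ in range(epochs):
        # Gradient of the mean squared error with respect to w.
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
        w -= learning_rate * grad
        # Record the error after this update to watch it shrink.
        errors.append(sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs))
    return w, errors

xs = [1, 2, 3, 4, 5]
ys = [2, 4, 6, 8, 10]   # generated by the rule y = 2x

w, errors = train(xs, ys)
print(round(w, 4), errors[0], errors[-1])
```

The error after the last pass is much smaller than after the first, and the learned `w` approaches 2, the rule that generated the data.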
Data is analyzed: Deep Learning
Deep learning is a more advanced offshoot of machine learning.
Deep learning can be considered a subset of machine learning. It is a field
based on systems that learn and improve on their own from data. While
classical machine learning typically uses simpler models, deep learning works with artificial neural
networks with many layers, which are designed to imitate how humans think and learn.
For example, in image processing, lower layers may identify edges, while higher layers may
identify the concepts relevant to a human such as digits or letters or faces.
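How stacked layers build on each other can be sketched with a tiny two-layer network that computes XOR. The weights here are hand-picked for illustration rather than learned, and the helper names are assumptions; real deep learning models learn their weights from data and have many more layers.

```python
def relu(v):
    """Rectified linear unit: pass positive values, clip negatives to zero."""
    return max(0.0, v)

def dense(inputs, weights, biases, activation):
    """One fully connected layer: activation(w . x + b) for each unit."""
    return [
        activation(sum(w * x for w, x in zip(unit_w, inputs)) + b)
        for unit_w, b in zip(weights, biases)
    ]

def xor_network(x1, x2):
    # Lower layer: two hidden units that detect simple features of the input.
    hidden = dense([x1, x2], [[1, 1], [1, 1]], [0, -1], relu)
    # Higher layer: combines the hidden features into the final answer.
    (output,) = dense(hidden, [[1, -2]], [0], lambda v: v)
    return output

results = {(a, b): xor_network(a, b) for a in (0, 1) for b in (0, 1)}
print(results)
```

No single layer of this kind can compute XOR on its own; the second layer succeeds only because it works on the features produced by the first.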
Data is analyzed: Artificial Intelligence (AI)
Artificial intelligence (AI) refers to the simulation of human intelligence in machines that are
programmed to think like humans and mimic their actions.
The term may also be applied to any machine that exhibits traits associated with a human mind
such as learning and problem-solving.
Data is analyzed: Text mining
Text mining (also referred to as text analytics) is an artificial intelligence (AI) technology that uses
natural language processing (NLP) to transform the free (unstructured) text in documents and
databases into normalized, structured data suitable for analysis or to drive machine learning (ML)
algorithms.
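The step from free text to structured data can be sketched as turning each document into a table of term counts. This is a simplified stand-in for a real NLP pipeline: the stop-word list, the tokenizing pattern, and the sample reviews are assumptions.

```python
import re
from collections import Counter

# A tiny, hypothetical stop-word list; real lists are much longer.
STOP_WORDS = {"the", "a", "is", "and", "was", "but", "to"}

def to_term_counts(text):
    """Lowercase the text, split it into words, and count the useful ones."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(t for t in tokens if t not in STOP_WORDS)

reviews = [
    "The delivery was fast and the packaging was great.",
    "Great product, but the delivery is slow.",
]

# One row of structured counts per unstructured document.
table = [to_term_counts(r) for r in reviews]
for row in table:
    print(dict(row))
```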
Data is analyzed: Business Intelligence
Business intelligence (BI) comprises the strategies and technologies used by enterprises for
the data analysis of business information.
Business intelligence (BI) leverages software and services to transform data into actionable
insights that inform an organization’s strategic and tactical business decisions. These
findings are presented in reports, summaries, dashboards, graphs, charts and maps to
provide users with detailed intelligence about the state of the business.
Data is analyzed: Data Visualization
Data visualization is the graphical representation of information and data.
By using visual elements like charts, graphs, and maps, data visualization tools provide an accessible
way to analyze massive amounts of information to see and understand trends, outliers, and
patterns in data and make data-driven decisions.
A good visualization tells a story, removing the noise from data and highlighting the useful
information.
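Even a plain-text bar chart shows how a visual form makes differences easier to see than a column of numbers. The sketch below uses made-up regional sales and a hypothetical `bar_chart` helper; dedicated visualization tools produce far richer graphics.

```python
def bar_chart(data, width=30):
    """Render a horizontal bar chart as text, scaled to the largest value."""
    largest = max(data.values())
    lines = []
    for label, value in data.items():
        bar = "#" * round(value / largest * width)
        lines.append(f"{label:<8}{bar} {value}")
    return "\n".join(lines)

sales_by_region = {"North": 120, "South": 45, "East": 90, "West": 60}
chart = bar_chart(sales_by_region)
print(chart)
```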
Big Data analysis uses

• Customer acquisition and retention


• Price optimization
• Supply chain and channel analytics
• Risk management
• Improved decision-making
• Targeted ads
• Product development
Skills required to be a big data specialist
1. Leadership skills
2. Technical skills
3. Analytical skills
4. Creativity
5. Mathematics and statistical skills
6. Business skills
7. Programming skills
8. Data wrangling skills
9. Communication and data visualization skills
10. Hadoop platform
11. Database skills
12. Unstructured data skills
Thank you
• Call only: 08033126347
• WhatsApp only: 08030432729
• Email: [email protected]
• https://www.simplilearn.com/data-science-vs-big-data-vs-data-analytics-article
• https://searchbusinessanalytics.techtarget.com/definition/big-data-analytics
• https://www.sas.com/en_us/insights/analytics/big-data-analytics.html
• https://tdwi.org/articles/2017/02/08/10-vs-of-big-data.aspx
• https://www.tableau.com/learn/articles/data-visualization
• https://www.northeastern.edu/graduate/blog/what-does-a-data-analyst-do/
