Understanding Big Data
Unit: 1
• Course Objective
• Course Outcome
• CO and PO Mapping
• What is big data?
• Why big data?
• Convergence of key trends
• Unstructured data
• Industry examples of big data
• Web analytics
• Big data and marketing
• Fraud and big data
• Risk and big data
• Big data usually includes data sets with sizes beyond the ability of
commonly used software tools to capture, curate, manage, and
process the data within a tolerable elapsed time.
• Big data is high-volume, high-velocity and high-variety information
assets that demand cost-effective, innovative forms of information
processing for enhanced insight and decision-making.
• Big data is often boiled down to a few varieties including social
data, machine data, and transactional data.
1. High volume
2. High velocity
3. High variety
• The health care industry is now awash in data, from biological data
such as gene expression and single nucleotide polymorphisms (SNPs)
to next-generation gene sequence data.
• The health care system is facing severe economic, effectiveness, and
quality challenges.
• The success of this new business model will be dependent on having
access to data created across the entire health care ecosystem.
“Disruptive Analytics”
• The changing health care landscape is an excellent example of
where data science and disruptive analytics can have an immediate
beneficial impact.
• Let’s introduce one of the health care analytics experts we
interviewed, James Golden.
Beard explained that big data is now changing the way advertisers
address three related needs:
1. How much do I need to spend?
2. How do I allocate that spend across all the marketing
communication touch points?
3. How do I optimize my advertising effectiveness against my brand
equity and ROI in real time?
• Apache Hadoop is one technology that has been the darling of Big
Data talk.
• Hadoop is an open-source platform for storage and processing of
diverse data types that enables data-driven enterprises to rapidly
derive the complete value from all their data.
• The scale and variety of data have permanently overwhelmed the
ability to cost-effectively extract value using traditional platforms.
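One way to picture the kind of processing Hadoop is known for is the MapReduce model: a map step emits key-value pairs, the framework groups them by key, and a reduce step combines each group. The sketch below imitates those three steps in plain Python on a single machine. It is an illustration only, not Hadoop's API; the function names and the sample documents are assumptions made for this example.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle_phase(pairs):
    """Group emitted values by key, as a framework would between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Sum the counts collected for each word."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(documents):
    """Run map, shuffle, and reduce in sequence over all documents."""
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    return reduce_phase(shuffle_phase(mapped))


# Hypothetical sample input; on a real cluster the documents would be
# spread across many nodes and mapped in parallel.
documents = ["big data needs big storage", "data drives decisions"]
counts = word_count(documents)
print(counts["big"], counts["data"])
```

Because each document is mapped independently, the map step can run on many machines at once, which is what makes the model suit very large data sets.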
We continue our conversation with Mehta later in the book. For the
moment, let’s boil his observations down to three main points:
1. The technology stack has changed. New proprietary technologies
and open-source inventions enable different approaches that make it
easier and more affordable to store, manage, and analyze data.
2. Hardware and storage are affordable and continue to get cheaper,
which enables massively parallel processing.
3. The variety of data is on the rise, and so is the ability to handle
unstructured data.
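As a small illustration of handling loosely structured data, the sketch below pulls named fields out of free-text log lines with a regular expression and sets aside lines that do not fit. The log format, the field names, and the sample lines are all hypothetical, invented for this example; real machine data would need a pattern matched to its own layout.

```python
import re

# Hypothetical log format, assumed for illustration:
# "<date> <time> <LEVEL> free-text message"
LINE_PATTERN = re.compile(
    r"(?P<date>\d{4}-\d{2}-\d{2}) "
    r"(?P<time>\d{2}:\d{2}:\d{2}) "
    r"(?P<level>[A-Z]+) "
    r"(?P<message>.*)"
)


def parse_line(line):
    """Return a dict of named fields, or None when the line does not match."""
    match = LINE_PATTERN.match(line.strip())
    return match.groupdict() if match else None


# Made-up sample lines; the last one has no recognizable structure.
raw_lines = [
    "2021-11-08 10:15:02 ERROR payment declined for card ending 4242",
    "2021-11-08 10:15:07 INFO user logged in from mobile app",
    "free-form note with no recognizable structure",
]

records = [parse_line(line) for line in raw_lines]
structured = [r for r in records if r is not None]
unparsed = [line for line, r in zip(raw_lines, records) if r is None]

print(len(structured), "structured,", len(unparsed), "unparsed")
```

Keeping the unparsed lines instead of discarding them reflects the point above: the data that does not fit a fixed schema is still kept for later analysis.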
Cetas Approach:
• Cetas is a stealth-mode startup focused on providing an entire
analytics stack in the cloud (or on-premises, if a customer prefers).
Microsoft:
• Microsoft was late to the Hadoop game, but has been making up for
lost time since October. Microsoft is trying ‘to provide a service that
is very easy to consume for customers of any size,’ which means an
intuitive interface and methods for analyzing data.
• We see this trend as the move from intra- to inter- and trans-firewall
analytics.
• Yesterday companies were doing functional silo-based analytics.
Today they are doing intra-firewall analytics with data within the
firewall.
• Tomorrow they will be collaborating on insights with other
companies to do inter-firewall analytics as well as leveraging the
public domain spaces to do trans-firewall analytics.
• https://www.sanfoundry.com/bigdata-questions-answers/
• https://www.tutorialspoint.com/hadoop/hadoop_big_data_overview.htm
• https://www.tutorialspoint.com/big_data_analytics/index.htm
• https://www.tutorialspoint.com/big_data_tutorials.htm
Q:1 Explain how you can understand the concept of Big Data.
Q:2 Explain what big data is and why it is needed.
Q:3 Explain the industry examples of big data in detail.
Q:4 Explain the difference between structured data and unstructured
data.
Q:5 Explain the term Big Data and Marketing.
Q:6 Explain the term Big Data and Healthcare Industry.
Q:7 Explain the process of Big Data and Advertisement.
Q:8 Explain the concept of cloud and big data.
Q:9 Discuss the concept of Hadoop and open source technologies.
1. Michael Minelli, Michelle Chambers, and Ambiga Dhiraj, "Big Data, Big Analytics:
Emerging Business Intelligence and Analytic Trends for Today's Businesses", Wiley,
2013.
2. P. J. Sadalage and M. Fowler, "NoSQL Distilled: A Brief Guide to the Emerging
World of Polyglot Persistence", Addison-Wesley Professional, 2012.
3. Tom White, "Hadoop: The Definitive Guide", Third Edition, O'Reilly, 2012.
4. Eric Sammer, "Hadoop Operations", O'Reilly, 2012.
5. E. Capriolo, D. Wampler, and J. Rutherglen, "Programming Hive", O'Reilly, 2012.
6. Lars George, "HBase: The Definitive Guide", O'Reilly, 2011.
7. Eben Hewitt, "Cassandra: The Definitive Guide", O'Reilly, 2010.
8. Alan Gates, "Programming Pig", O'Reilly, 2011.
Thank You
08/11/2021 Hirdesh Sharma RCA E45 Big Data Unit: 1