Big Data: Understanding Big Data: January 2016
Kevin Taylor-Sakyi
Engineering & Applied Science
Aston University
Birmingham, England
[email protected]
Abstract—Steve Jobs, one of the greatest visionaries of our time, was quoted in 1998 saying "a lot of times, people don't know what they want until you show it to them" [38], indicating that he advocated products be developed based on human intuition rather than research. With the advancements of mobile devices, social networks, and the Internet of Things (IoT), enormous amounts of complex data, both structured and unstructured, are being captured in the hope of allowing organizations to make better business decisions, as data is now vital to an organization's success. These enormous amounts of data are referred to as Big Data, which enables a competitive advantage over rivals when processed and analyzed appropriately. However, Big Data analytics has a few concerns, including management of the data life-cycle, privacy and security, and data representation. This paper reviews the fundamental concept of Big Data, the data storage domain, and the MapReduce programming paradigm used in processing these large data sets; it focuses on two case studies showing the effectiveness of Big Data analytics and presents how it could be of greater good in the future if handled appropriately.

Keywords—Big Data; Big Data Analytics; Big Data Inconsistencies; Data Storage; MapReduce; Knowledge-Space

I. INTRODUCTION

Frequently referred to as the information age, the economic industry in the 21st century is highly dependent on data. How much data must be processed to be of meaningful use? A study conducted by IDC states that only 0.5% of globally generated data is analyzed [3]. In a world where "every 2 days we create as much information as we did from the beginning of time up to 2003" [1], there is a need to bridge the data analyzed with current trends to better business models; systematic processing and analysis of big data is the underlying factor.

The purpose of this report is to reflect knowledge and understanding of the intriguing field of big data, acquired through various papers from well-known fellows in the computing field as well as informational directories on the web. It aims to focus on the importance of understanding big data, envisioning the transformation from traditional analytics into big data analytics, data storage, and the future implications these will have on business processes in the years to come.

By 2020 there will be approximately 20-100 billion connected devices [2], leading to more data collection and thus illustrating a necessity for applying big data analytics. This brings forth the necessity of this survey report: understanding Big Data.

II. BIG DATA

The phenomenon of big data analytics is continually growing as organizations remodel their operational processes to rely on live data, with the hope of driving effective marketing techniques, improving customer engagement, and potentially providing new products and services [4][13]. The questions to be raised are:

1) What is Big Data? (Section II)
2) Why is the transformation from traditional analytics to Big Data analytics necessary? (Section II)
3) How to meet the demand for computing resources? (Section II)
4) What implications does Big Data have on the evolution of data storage? (Section III)
5) What are the inconsistencies of Big Data? (Section IV)
6) How is Big Data mapped into the knowledge space? (Section V)

A. What is Big Data?

Big Data refers to large sets of complex data, both structured and unstructured, which traditional processing techniques and/or algorithms are unable to operate on. It aims to reveal hidden patterns and has led to an evolution from a model-driven science paradigm into a data-driven science paradigm. According to a study by Boyd & Crawford [5], it "rests on the interplay of:

(a) Technology: maximizing computation power and algorithmic accuracy to gather, analyze, link, and compare large data sets.

(b) Analysis: drawing on large data sets to identify patterns in order to make economic, social, technical, and legal claims.

(c) Mythology: the widespread belief that large data sets offer a higher form of intelligence and knowledge that can generate insights that were previously impossible, with the aura of truth, objectivity, and accuracy."

IBM scientists mention that Big Data has four dimensions: Volume, Velocity, Variety, and Veracity [7]. Gartner agrees with IBM by stating that Big Data consists of "high-volume, velocity and variety information assets that demand cost-effective, innovative forms of information processing for enhanced insight and decision making" [8].

Descriptions of the four dimensions are detailed below [39]:

• Volume – Existing data is already in petabytes, which is problematic; it is predicted to increase to zettabytes (ZB) in the next few years [39], due primarily to the increased use of mobile devices and social networks.
• Velocity – Refers to both the rate at which data is captured and the rate of data flow. Increased dependence on live data causes challenges for traditional analytics, as the data is too large and continuously in motion.
• Variety – As collected data is not of a specific category or from a single source, there are numerous raw data formats, obtained from the web, texts, sensors, e-mails, etc., which may be structured or unstructured. This variety causes traditional analytical methods to fail in managing big data.
• Veracity – Ambiguity within data is the primary focus of this dimension, typically arising from noise and abnormalities within the data.

Equipping an enterprise with a Big Data driven e-commerce architecture aids in gaining extensive "insight into customer behaviour, industry trends, more accurate decisions to improve just about every aspect of the business, from marketing and advertising, to merchandising, operations, and even customer retention" [9]. However, the enormous data obtained may challenge the four V's mentioned earlier. The following examples give insight into how Big Data brings value to organizations.

United Parcel Service (UPS) grasped the importance of big data analytics early, leading in 2009 to the installation of sensors on over 46,000 of its vehicles; the idea behind this was to attain "the truck's speed and location, the number of times it's placed in reverse and whether the driver's seat belt is buckled. Much of the information is uploaded at the end of the day to a UPS data center and analyzed overnight" [10]. By analyzing the fuel-efficiency sensors and the GPS data, UPS was able to reduce its fuel consumption by 8.4 million gallons and cut 85 million miles off its routes [10].

Andrew Pole, a data analyst for American retailer Target Corporation, developed a pregnancy-prediction model; as indicated by the name, customers are assigned a "pregnancy prediction" [11] score. Furthermore, Target was able to gain insight into how 'pregnant' a woman is. Pole initially used Target's baby-shower registry to gain insight into women's shopping habits, discerning how those habits changed as shoppers approached their due date; data which "women on the registry had willingly disclosed" [11]. Following the initial data collection phase, Pole and his team ran a series of tests, analyzed the data sets, and effectively arrived at patterns that could be of use to the corporation.

Within the model, each customer is assigned a unique number, internally classified as a Guest ID number. Linked with this are a shopper's purchased products, methods of previous payments (coupons, cash, credit cards, etc.), and virtual interactions (clicking links in sent e-mails, customer service chat, etc.). Compiling these data sets along with demographic data that is available for purchase from information service providers such as Experian [12] and the like, Target is able to channel its marketing strategy effectively - in this case, grasping a pregnant shopper's attention by sending a coupon via e-mail or post, whichever can be distinguished as effective by analyzing previous methods. Additionally, by obtaining shoppers' demographic data, Target is able to trigger a shopper's habits by including other products that may not commonly be purchased at Target, i.e. milk, food, toys, etc. This contributed to a valuable competitive advantage over rivals during the early years, when the model was unrevealed to the public.

Big Data should not be looked at merely as a new ideology but rather as a new environment, one that requires "new understanding of data collection, new vision for IT specialist skills, new approaches to security issues, and new understanding of efficiency in any spheres" [27]. This environment, when analyzed and processed properly, enhances business opportunities; however, the risks involved should be taken into account when collecting, storing, and processing these large data sets.

B. Big Data analytics transformation

In assessing the grounds on which several organizations are gravitating towards Big Data analytics, a concrete understanding of traditional analytics is necessary. Traditional analytical methods involve structured data sets that are periodically queried for specific purposes [6]. A common data model used to manage and process commercial applications is the relational model; Relational Database Management Systems (RDBMS) provide "user-friendly query languages" and a simplicity that other network or hierarchical models are not able to deliver. Within this system are tables, each with a unique name, where related data is stored in rows and columns [28].

These streams of data are obtained through integrated databases and provide leverage on the intended use of those data sets. They do not provide much advantage for the purpose of creating newer products and/or services as Big Data does - leading to the transformation to Big Data analytics.

Frequent usage of mobile devices, Web 2.0, and growth in the Internet of Things are among the reasons organizations are looking to transform their analytical processes. Organizations are attracted to big data analytics as it provides a means of obtaining real-time data suitable for enhancing business operations. Along with providing parallel and distributed processing architectures for data processing, big data analytics also enables the following services: "after-sales service, searching for missing people, smart traffic control system, customer behavior analytics, and crisis management system" [34].
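As a concrete contrast, the traditional relational setup described in Section II.B - named tables of rows and columns, periodically queried with a user-friendly query language - can be sketched with Python's built-in sqlite3 module. The table, columns, and rows below are invented for illustration, loosely echoing Target's internal Guest ID numbering; they are not from any of the cited systems.

```python
import sqlite3

# Related data lives in a named table of rows and columns (the
# relational model); amounts are stored in cents so sums stay exact.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE purchases (guest_id INTEGER, product TEXT, amount_cents INTEGER)"
)
conn.executemany(
    "INSERT INTO purchases VALUES (?, ?, ?)",
    [(1, "milk", 150), (1, "toys", 999), (2, "milk", 150)],
)

# A periodic, purpose-specific query over structured data -- the kind
# of workload traditional analytics was built for.
total = conn.execute(
    "SELECT SUM(amount_cents) FROM purchases WHERE guest_id = ?", (1,)
).fetchone()[0]
print(total)  # 1149
```

Big data analytics departs from this model precisely because such fixed schemas and periodic queries struggle with unstructured, continuously flowing data.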
Research conducted by Kyounghyun Park and his colleagues of the Big Data SW Research Department in South Korea developed a platform with the purpose of establishing Big Data as a Service. In facilitating the services mentioned above, traditional methods require various distinctive systems to perform tasks such as "data collection, data pre-processing, information extraction, and visualization" [34]. On the contrary, this web-based platform provides developers and data scientists an environment to "develop cloud services more efficiently", view shared data, and "supports collaborative environments to users so that they can reuse other's data and algorithms and concentrate on their own work such as developing algorithms or services" [34]. Unlike traditional methods, where databases are accessible by all persons, this platform and similar big data analytics platforms support restricted access to different datasets. Big data has likewise been tied to gains in "productivity, quality and flexibility" [4]; the power of big data is its ability to bring forth much more intelligent measures of formulating decisions.

C. MapReduce

The advancements in technology within the last few decades have caused an explosion in data set sizes [29]; though there is now more to work with, the speed at which these volumes of data are growing exceeds the computing resources available. MapReduce, a programming paradigm utilized for "processing large data sets in distributed environments", is looked upon as an approach to handle the increased demand for computing resources [29].

There are two fundamental functions within the paradigm, the Map and the Reduce function. The Map function executes sorting and filtering, effectively converting one data set into another, and the Reduce function takes the output of the Map function as its input, then completes grouping and aggregation operations [29] to combine those data sets into smaller sets of tuples [30]. Figures 1 & 2 represent the computational formulas to perform MapReduce.
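The Map and Reduce phases just described can be sketched in miniature with the canonical word-count example. This single-process Python sketch only illustrates the data flow; real frameworks such as Hadoop distribute the same phases across many machines, and the function names here are illustrative rather than taken from any specific library.

```python
from collections import defaultdict

def map_phase(documents):
    """Map: sort/filter the input, emitting an intermediate (key, value) pair per word."""
    for doc in documents:
        for word in doc.split():
            yield (word.lower(), 1)

def shuffle_phase(pairs):
    """Group intermediate values by key (handled by the framework in real systems)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Reduce: aggregate each key's values into a smaller set of tuples."""
    return {key: sum(values) for key, values in groups.items()}

docs = ["big data needs big tools", "data tools"]
counts = reduce_phase(shuffle_phase(map_phase(docs)))
print(counts)  # {'big': 2, 'data': 2, 'needs': 1, 'tools': 2}
```

Because each Map call and each Reduce call touches only its own slice of the data, both phases can run in parallel across a cluster, which is how the paradigm meets the demand for computing resources noted above.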
III. DATA STORAGE

Rather than executing various experiments individually in the hope of obtaining information, researchers grasped the opportunity to establish a method of obtaining more concrete data from various locations. In the early years of the Human Genome Project (HGP) (1993), researchers realized it was imperative to "assess the contribution of informatics to the ultimate success of the project" [24]. This led to the implementation of three different databases used throughout the HGP [24], prompting a linkage between the primary databases. As an initial proposal, the following was recommended: "a link between the sequence and mapping database can be made by using the common sequence identity between the sequence record and sequences stored in the mapping database" [24]. Though a reasonable proposal, a key requirement for the method to be effective was constant, informative, and non-overlapping classification. With the possibility of triggering chaos, protocols were established, i.e. 'establishing naming conventions' as several genes were cloned [24] - complementing Zhuge's viewpoint that problems have to be transparent in order to be comprehensible.

In all, due to the magnitude of data stored, processed, and analyzed for the HGP, about $2.7 billion of U.S. taxpayers' money was committed in FY 1991 to complete the project - roughly $4.74 billion in FY 2015 terms. Sequencing a human genome is now possible at an exceedingly low price of $1,000 in a matter of days - outpacing Moore's Law.

The evolution of the cloud has enabled this extreme change in cost. Infrastructure as a Service (IaaS) offerings such as Amazon's S3 storage system and Elastic Compute Cloud (EC2) provide scalable services to store any amount of data and target organizations looking to utilize idle assets at an hourly rate. These infrastructures were originally developed to support Amazon's retail business, yet by altering its strategy Amazon is now the highest-profile IaaS provider. Amazon Web Services currently holds about 27% of all cloud IaaS, with data centres in 190 countries [25]; because of this innovative concept of shifting from physical web servers to a cloud architecture in 2006, other services catered to specific data storage and management needs have emerged [23].

Several platforms are now available to aid with the data accessibility and storage issues raised within the HGP. One of these platforms, DNAnexus, "provides a global network for sharing and management of genomic data and tools to accelerate genomic medicine" [26]. DNAnexus's web interface offers easy automatic uploads to a scalable data infrastructure and "analysis & visualization tools for DNA sequencing" [25]. This platform allows researchers and organizations to bypass the complication of dealing with:

• "Storage infrastructure must accommodate information persistently and reliably"
• "A scalable access interface to query and analyze a vast quantity of data"

IV. BIG DATA INCONSISTENCIES

With the emergence of the Information Era, the computational ability to capture data has increased considerably; sources of these data streams come from all sectors: automobile, medical, IT, and more. The majority of data used in Big Data analytics comes in unstructured formats, primarily obtained from the Internet (cloud computing) and social media. According to industry consultants at Capgemini, organizations looking to optimize their big data processes need to refer to tweets, text messages, blogs, and the like to gauge responses to specific products or services that may aid in discovering new trends [33]. Web 2.0 and the Internet of Things are among the main sources by which data capture is enabled. Management of the data life cycle, data representation, and data privacy & security are among the pressing issues within big data analytics.

A. Management of Data life-cycle

Pervasive sensing through tablets, smart phones, and sensor-based Internet devices contributes to the exceptional rates and scale at which data is being obtained, surpassing the current storage technologies available. The usefulness of big data analytics is to reveal value from fresh data; however, the colossal amounts of data being captured pose an issue for discovering 'useful' information at specific times.

To prevent or limit the amount of 'value loss' within these systems, governing principles should be established, outlining and determining when and which datasets should be either archived for further usage or discarded [35].

B. Data Privacy & Security

Privacy and security are considered to be the most important issues within big data analytics [36]. As organizations look to utilize data to provide personalized information for the strategy of their operations, legal ramifications and context-based concerns run high.

In the field of medical research, scientists share wide streams of data to collaborate in finding solutions for creating new medications or diagnosing new diseases. It's believed that by 2020 there will be approximately 12 ZBs of data in the cloud from Electronic Healthcare Records, Electronic Medical Records, Personal Health Records,
Mobilized Health Records, and mobile monitors alone [37]. The growth of these datasets is in line with their increasing importance. Conserving data privacy is an increasing issue because of the vast amount of data captured; though there are some guidelines and laws to keep these various data sources secure, there are flaws within them [36]. With different countries having different policies, a new technique taking a uniform global perspective must be established to ensure that all data in the cloud are constrained equally among the different locations from which they are accessed (i.e., a scientist accessing data in England from a source in China should have the same privacy obligations as in China).

The plethora of context-based information (i.e. linking a person's social media data with other attained data to distinguish new information) used for analytics creates complexity in defining which sets of data are to be classified as sensitive, as all sets have different meanings in different contexts. As a result, enforcing privacy on such data sets becomes difficult.

C. Data Representation

Data representation is the means of representing information stored on a computer from different types of data: text, numerical, sound, graphical (video & images), etc. In analytics, not only do the datasets come in different types, their "semantics, organization, granularity, and means of accessibility" are also diverse [35]. Zhuge mentions that it is difficult for computers to determine the correctness of a representation, referring to the concepts of calculating meaning and explaining the results obtained [4].

H. Hu and his colleagues, fellows of the IEEE, proposed the following as a measure to avoid representation issues:

• Presentation of data must be designed not merely to display singular data items but rather to reflect the "structure, hierarchy, and diversity of the data, and an integration technique should be designed to enable efficient operations across different datasets" [35]

V. MAPPING INTO KNOWLEDGE-SPACE

After data is captured, processed, and analyzed, how is it mapped into other dimensions? Dimensions within the big data environment are methods for observing, classifying, and understanding a space [4]. Knowledge space acquires the structure of a semantic network, where vertices represent knowledge modules and relations represent the links between two knowledge modules [40].

A multi-dimensional resource space model was proposed to manage knowledge from multiple dimensions. It provides a way to divide big data into a multi-dimensional resource space through multi-dimensional classifications [41][42].

To bridge the gap between machines and humans in order to map data into knowledge space appropriately, a cyber-space infrastructure must be formulated. The United States defines this infrastructure as "the converged information technologies including the Internet, hardware and software that support a technical platform where users can use globally distributed digital resources" [4]. This infrastructure provides cross-connections in the space, interlocking different modules within the different dimensions.

Representation of knowledge from big data analytics requires multiple links throughout various spaces (i.e. physical, social, knowledge, etc.) to link not only different symbols but also differing individuals [41]. Vannevar Bush's conception of the theoretical Memex machine awakened further study and research into interlinking various spaces through the cyber-space [41].

Within the cyber-space, continuous tests must be conducted to ensure that modules can be derived from other modules, that there are no conflicts between modules, and that there are no partial modules, as these could lead to ambiguous results.

A number of software platforms have been developed to accommodate the issues brought forth by establishing a well-formulated cognitive cyber-infrastructure. Globus, for example, has been developed to accommodate the integration of "higher-level services that enable applications to adapt to heterogeneous and dynamically changing meta-computing environments" [4].

Knowledge-space has strong impacts on scientific conceptions and may deliver tools for identifying strong points. The value of statements can be tested as a result of the knowledge-space [40]. Intelligent manufacturing is a result of such a space: humans map the representation of the external world into their minds (i.e. social and economic factors that are not computable), converge it onto the virtual space, and allow the different modules to be interlocked, displaying weak points and strong points, bringing forth the 4th industrial revolution.

VI. CONCLUSION

The concept of Big Data analytics is continually growing. Its environment demonstrates great opportunities for organizations within various sectors to gain a competitive advantage, as shown in the examples mentioned earlier. The future of medical science is changing dramatically due to this concept; scientists are able to access data rapidly on a global scale via the cloud, and these analytics contribute to the development of predictive analytic tools (i.e. facilitating predictive results at early stages). However, as mentioned in Section IV, there are inconsistencies and challenges within Big Data: sufficient encryption algorithms to conceal raw data or analysis, the reliability and integrity of Big Data, data storage issues, and flaws within the MapReduce paradigm. This paper shed light on conceptual ideologies about big data analytics and displayed, through a few scenarios, how it is beneficial for organizations within various sectors if analytics are conducted correctly. Further areas of research for grasping a wider understanding of Big Data are data processing and data transfer techniques.

REFERENCES

[1] A Day in Big Data. 2014. BIG DATA for smarter customer experiences. [ONLINE] Available at: http://adayinbigdata.com. [Accessed 03 November 15].
[2] Government Office for Science. 2014. The Internet of Things: making the most of the Second Digital Revolution. [ONLINE] Available at: https://www.gov.uk/government/uploads/system/uploads/attachment_data/file/409774/14-1230-internet-of-things-review.pdf. [Accessed 17 November 15].
[3] Theo Priestley. 2015. The 3 Elements The Internet Of Things Needs To Fulfil Real Value. [ONLINE] Available at: http://www.forbes.com/sites/theopriestley/2015/07/16/the-3-elements-the-internet-of-things-needs-to-fulfil-real-value/. [Accessed 01 December 15].
[4] H. Zhuge, Mapping Big Data into Knowledge Space with Cognitive Cyber-Infrastructure, arXiv:1507.06500, 24 July 2015.
[5] Dana Boyd. 2012. Critical Questions for Big Data. [ONLINE] Available at: http://www.tandfonline.com/doi/pdf/10.1080/1369118X.2012.678878. [Accessed 30 November 15].
[6] Doug Laney. 2001. Application Delivery Strategies. [ONLINE] Available at: http://blogs.gartner.com/doug-laney/files/2012/01/ad949-3D-Data-Management-Controlling-Data-Volume-Velocity-and-Variety.pdf. [Accessed 17 December 15].
[7] IBM Big Data & Analytics Hub. 2014. The Four V's of Big Data. [ONLINE] Available at: http://www.ibmbigdatahub.com/infographic/four-vs-big-data. [Accessed 02 December 15].
[8] Gartner. 2015. Technology Research. [ONLINE] Available at: http://www.gartner.com/technology/home.jsp. [Accessed 30 November 15].
[9] The Big Data Landscape. 2014. Why Big Data Is A Must In Ecommerce. [ONLINE] Available at: http://www.bigdatalandscape.com/news/why-big-data-is-a-must-in-ecommerce. [Accessed 09 December 15].
[10] Rosenbush, S, 2013. Big Data in Business. How Big Data Is Changing the Whole Equation for Business, 1, 3.
[11] Charles Duhigg. 2012. How Companies Learn Your Secrets. [ONLINE] Available at: http://www.nytimes.com/2012/02/19/magazine/shopping-habits.html?pagewanted=all&_r=1. [Accessed 10 December 15].
[12] Experian. 2015. A single classification for all marketing channels. [ONLINE] Available at: http://www.experian.co.uk/marketing-services/products/mosaic/mosaic-in-detail.html. [Accessed 15 December 15].
[13] SAS. 2015. Three big benefits of big data analytics. [ONLINE] Available at: http://www.sas.com/en_sa/news/sascom/2014q3/Big-data-davenport.html. [Accessed 16 December 15].
[14] Big Data - A Visual History. 2015. Big Data and the History of Information Storage. [ONLINE] Available at: http://www.winshuttle.com/big-data-timeline/. [Accessed 27 November 15].
[15] Alexander Iadarola. 2015. Lars TCF Holdhus. [ONLINE] Available at: http://dismagazine.com/discussion/73314/tcf-data-awareness/. [Accessed 08 December 15].
[16] The Storage Engine. 2015. 1951: Tape unit developed for data storage. [ONLINE] Available at: http://www.computerhistory.org/storageengine/tape-unit-developed-for-data-storage. [Accessed 21 December 15].
[17] IBM100. 2015. The IBM Punched Card. [ONLINE] Available at: http://www-03.ibm.com/ibm/history/ibm100/us/en/icons/punchcard/. [Accessed 19 December 15].
[18] Gary Brown. 2015. How Floppy Disk Drives Work. [ONLINE] Available at: http://computer.howstuffworks.com/floppy-disk-drive1.htm. [Accessed 20 December 15].
[19] Philips Research. 2015. History of the CD - The CD family. [ONLINE] Available at: http://www.research.philips.com/technologies/projects/cd/cd-family.html. [Accessed 21 December 15].
[20] Backupify. 2015. Bit & Bytes: A History of Data Storage. [ONLINE] Available at: https://www.backupify.com/history-of-data-storage/. [Accessed 22 December 15].
[21] Derek J. de Solla Price. 1986. Little Science, Big Science ... and Beyond. [ONLINE] Available at: http://www.garfield.library.upenn.edu/lilscibi.html. [Accessed 24 December 15].
[22] Simon Robinson. 2012. The Storage and Transfer Challenges of Big Data. [ONLINE] Available at: http://sloanreview.mit.edu/article/the-storage-and-transfer-challenges-of-big-data/. [Accessed 29 December 15].
[23] Forbes. 2012. How Cloud and Big Data are Impacting the Human Genome - Touching 7 Billion Lives. [ONLINE] Available at: http://www.forbes.com/sites/sap/2012/04/16/how-cloud-and-big-data-are-impacting-the-human-genome-touching-7-billion-lives/. [Accessed 19 December 15].
[24] Cuticchia, A, 1993. Perspective: Managing All Those Bytes: The Human Genome Project. 262, 47-48.
[25] Rehangrabin. 2015. 7 Facts About Amazon Web Services [Infographic]. [ONLINE] Available at: http://www.ruhanirabin.com/facts-about-amazon-web-services/. [Accessed 18 December 15].
[26] DNAnexus. 2015. A Global Network for Genomic Medicine. [ONLINE] Available at: https://www.dnanexus.com/company. [Accessed 21 December 15].
[27] Smorodin, G, 2015. Big Data-driven world needs Big Data-driven ideology. Big Data as the Big Game Changer, 1, 1.
[28] Bhardwaj, V, 2015. Big Data Analysis: Issues and Challenges. 1, 1-3.
[29] Grolinger, K, 2014. Challenges for MapReduce in Big Data. 2014 IEEE 10th World Congress on Services, 1, 182-183.
[30] IBM. 2015. What is MapReduce. [ONLINE] Available at: https://www-01.ibm.com/software/data/infosphere/hadoop/mapreduce/. [Accessed 29 December 15].
[31] A. Thusoo, J. S. Sarma, N. Jain, Z. Shao, P. Chakka, S. Anthony, H. Liu, P. Wyckoff and R. Murthy, "Hive: A warehousing solution over a map-reduce framework," Proc. of the VLDB Endowment, 2(2), pp. 1626-1629, 2009.
[32] K. Grolinger, W. A. Higashino, A. Tiwari and M. A. Capretz, "Data management in cloud environments: NoSQL and NewSQL data stores," Journal of Cloud Computing: Advances, Systems and Applications, 2, 2013.
[33] Capgemini. 2015. Capturing Big Data: Are you ready? (part 1 of 2). [ONLINE] Available at: https://www.capgemini.com/blog/insights-data-blog/2015/04/capturing-big-data-are-you-ready-part-1-of-2. [Accessed 28 December 15].
[34] Park, K, 2015. Web-based Collaborative Big Data Analytics on Big Data as a Service Platform. 1, 564-566.
[35] Hu, H, 2014. Toward Scalable Systems for Big Data Analytics: A Technology Tutorial. 1, 658-659, 665.
[36] Shrivastva, K, 2014. Big Data Privacy Based On Differential Privacy a Hope for Big Data. 2014 Sixth International Conference on Computational Intelligence and Communication Networks, 1, 777-778.
[37] Asri, H, 2015. Big Data in healthcare: Challenges and Opportunities. 1, 1-5.
[38] Businessweek. 1998. Steve Jobs: 'There's Sanity Returning'. [ONLINE] Available at: http://www.businessweek.com/1998/21/b3579165.htm. [Accessed 12 December 15].
[39] Katal, A, 2013. Big Data: Issues, Challenges, Tools and Good Practices. 1, 404.
[40] Peter Jaenecke. 2001. On the Structure of a Global Knowledge Space. [ONLINE] Available at: http://www.iskoiberico.org/wp-content/uploads/2014/09/25_Jaenecke.pdf. [Accessed 28 December 15].
[41] Zhuge, H, 2012. The Knowledge Grid: Toward Cyber-physical Society. 2nd ed. Danvers, MA: World Scientific Publishing Co.
[42] Zhuge, H, 2008. The Web Resource Space Model. 1st ed. New York: Springer.