This Aberdeen report reviews what top performing organizations do to maximize the value that they gain from their big data projects through using analytics to create business insight.
This document is the result of primary research performed by Aberdeen Group. Aberdeen Group's methodologies provide for objective fact-based research and represent the best analysis available at the time of publication. Unless otherwise noted, the entire contents of this publication are copyrighted by Aberdeen Group, Inc. and may not be reproduced, distributed, archived, or transmitted in any form or by any means without prior written consent by Aberdeen Group, Inc.
September 2012

Go Big or Go Home? Maximizing the Value of Analytics and Big Data

As prior Aberdeen research has shown (Big Data, Big Moves), managing the growing volume, speed and complexity of data is challenge enough. But once you have corralled your data, how do you get value from it? This Analyst Insight is based on two sets of data: first, data on the use and management of big data gathered during January 2012, and second, data on the use of Business Intelligence (BI) collected in April 2012 (see sidebar). It shows what top performing organizations do to maximize the value that they gain from their big data projects through using analytics to create business insight. The research shows that organizations working with big data are pursuing self-service analytics, while bringing advanced data management technologies to bear. In addition, organizations that are following best practices for big data analytics are able to meet business demands for data access 32% more often than those that do not.

Big Data without Compromise

Simply managing big data can be a challenge in itself. Applying analytics to big data to create business value can be a daunting step beyond that. However, successful big data organizations are not only harnessing this information, but are managing and analyzing it in a way that allows them to deliver timely information to business managers (Figure 1).

Figure 1: Definition of Big Data Leaders
Source: Aberdeen Group, January 2012
Percentage of respondents, n=99
Yearly change in amount of accessible data: Leaders 26%, Followers 12%
Percent of data accessed for analysis: Leaders 34%, Followers 15%
Ability to meet demand for data access: Leaders 82%, Followers 62%

Analyst Insight
Aberdeen's Insights provide the analyst's perspective on the research as drawn from an aggregated view of research surveys, interviews, and data analysis.

Maturity Class Definitions
Organizations responding to the January 2012 big data survey were categorized into one of two maturity classes based on the following framework:
Leaders were the top 50% of all companies, based on their performance in the three metrics presented in Figure 1: ability to meet demand for data access, percent of data accessed for analysis, and yearly change in amount of accessible data.
Followers were the bottom 50% of all companies, based on their performance against the same criteria.
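The "32% more often" comparison can be reproduced from the Figure 1 values for ability to meet demand for data access (82% for Leaders, 62% for Followers). The short sketch below shows that arithmetic; the function name `relative_lift` is a hypothetical helper introduced here for illustration and is not part of the Aberdeen methodology.

```python
def relative_lift(leader_rate: float, follower_rate: float) -> float:
    """Return how much higher the leader rate is, as a fraction of the follower rate."""
    if follower_rate <= 0:
        raise ValueError("follower_rate must be positive")
    return (leader_rate - follower_rate) / follower_rate


# Figure 1: ability to meet demand for data access.
leaders = 0.82
followers = 0.62

lift = relative_lift(leaders, followers)
print(f"Leaders meet demand {lift:.0%} more often than Followers")
```

The result is a relative difference (20 percentage points measured against the Followers' 62%), not a difference in percentage points.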
2012 Aberdeen Group. Telephone: 617 854 5200 www.aberdeen.com Fax: 617 723 7897

One measure, more than any other, indicates the success of an analytics project: does the solution get the right information to the right people, at the right time, to enable better decision-making? The average big data organization needed information on a business event within one day, but almost half (49%) reported needing actionable intelligence within the hour. Despite the undisputed challenge of managing a large data set and delivering results in such a short time frame, the Leaders are able to meet this goal 82% of the time. Hitting this level of timely information delivery is no mean feat when you have big data to wrestle with. As shown in the following sections, there are a number of things that these companies do in order to be so successful.

Organizational Alignment and Commitment

Organizations that work with big data believe strongly in empowering their analytics users (Figure 2).

Figure 2: Organizational Support for Big Data Projects
Source: Aberdeen Group, January 2012 & April 2012
Percentage of respondents, n=298
Ability to assess data needs from all departments: Big Data 54%, Small Data 27%
Executive support for data management and BI: Big Data 60%, Small Data 39%
IT supports user-driven BI projects: Big Data 69%, Small Data 40%

Big Data Defined
Organizations that took part in Aberdeen's April 2012 survey on agile BI and had at least one analytic database larger than 1 Terabyte (1TB) are considered to be organizations using big data. By this measure, 32 companies were judged to be big data users. Organizations with no analytic databases larger than 1TB are termed small data in this research. The January 2012 survey on big data collected information on 99 organizations with more than 5 total terabytes of active business data, spread across analytic databases, applications, file servers and workstations. The 89 organizations that did not fit these criteria formed the small data category.

Big Data vs. Small Data
What is it that makes managing and leveraging big data so challenging? Big data typically isn't just about the sheer volume of data. Other factors often drive organizations that have previously only tackled smaller data sets to adopt new practices. These factors can include the velocity of data that comes into the organization, as well as the increasing variety of data sources and types. Previous Aberdeen research (Future Integration Needs: Embracing Complex Data) found strong growth in the number of organizations that are planning to integrate unstructured data, such as office documents and social media data. In addition, IT departments often find themselves under pressure to turn raw data into useful information at an ever faster rate.

Organizations with big data are over 70% more likely than other organizations to have BI projects that are driven primarily by the business community, not by the IT group. Partly, this may reflect the fact that big data allows new business problems to be tackled, or entirely new products and services to be brought to market (see case study later). Anecdotally, from discussions that Aberdeen has had with technology users and service providers, big data has fired the imagination about what is possible through the application of analytics. Potentially, that imagination may usher in a subtle shift in how business intelligence projects are conceptualized, defined and executed. Projects aimed at the analysis of big data are not focused on routine reporting, or business as normal. Big data analytics projects have the potential to provide insights that can change the dynamics of competition. As a result, big data projects are more likely to have a greater degree of strategic input from business leaders.

Given the cost and complexity associated with big data initiatives, many organizations are reluctant to take the plunge. Upgrading a handful of servers is easy to understand, but it can be hard to justify the value in buying hundreds or thousands of new machines and the accompanying software. However, there can be incredible opportunity in these large scale data management and BI initiatives. As Figure 2 shows, executive support for data management and BI is more common in organizations with big data (60%) than in other organizations (39%). Without that support, the necessary investments may never be made.

Bringing together multiple data sources from different departments can shed light on business operations and help refine workflows, but only if the project team understands the needs of the different stakeholders. Again, the cost and complexity of many big data projects can make it difficult or impractical to change the parameters after implementation. Developing an understanding of how different departments use business data at the outset of the project can avoid many headaches down the road. Organizations using big data are twice as likely to do this as other companies.

Self-Service Analytics is Foundational

Aberdeen's prior research into agile business intelligence (Agile BI: Three Steps to Analytic Heaven) has highlighted the steady shift to self-service business intelligence that is occurring. Many IT departments are squeezed between two pressures.
First, the volume of data coming into the company is overwhelming. Second, business users are continuously clamoring for new views of data. As a result, IT groups at Best-in-Class organizations are endeavoring to make their business users more self-sufficient in their use of analytics, so that IT can focus more resources on the data problem. Unsurprisingly, that same set of circumstances is even more relevant for organizations that are wrestling with big data (Figure 3).

As noted earlier, big data allows new classes of problems to be tackled. Those problems are not solved by a relatively passive, managed reporting style of business intelligence. They are best solved by hands-on, interactive analysis. Consequently, as Figure 3 shows, organizations utilizing big data are 32% more likely to provide their analytics users with the ability to drill down into detailed data. The style of interactivity provided by drill-down capabilities allows users to find detailed information themselves when they need it. Without drill-down capabilities, those more detailed views of data typically must be requested from the corporate IT department, or a skilled BI analyst. All too often, that type of approach means that business managers are not able to find the information they need in the timeframe required to adequately support their decision-making. Information that arrives too late to inform decision making is no better than information that does not arrive at all.

Conversely, drill-down capabilities can enable a BI user to rapidly navigate from summary data to detail information. This type of interaction can provide the basis for "what-if" analysis. For example, a marketing director can quickly understand the impact of an online marketing campaign and fine-tune marketing messages and channels appropriately.

The difference in roles and responsibilities for a big data analytics solution is further illustrated in Figure 3. Companies that take advantage of big data are 60% more likely than their peers to have business users that are able to tailor reports, dashboards and analytics without help from skilled IT staff. By empowering the users of analytics to configure precisely what data they see and how they see it, the burden of delivering the "analytic last mile" is lifted from corporate IT. Instead, the corporate IT group can use its time for more important tasks associated with big data analytics. These tasks include collecting, cleansing, enriching and integrating the diverse data sources that are required by the business.

Fast Facts
The average big data organization wants to access and analyze over 50% of all the business data it stores.
The average big data organization can currently only access and analyze 22% of its data, less than half of its ultimate goal.
Given that the average big data organization stores over 1.4 Petabytes of data, its goal involves making 700 Terabytes available for analysis.

Figure 3: Self-Service Analytics is More Pervasive with Big Data
Source: Aberdeen Group, April 2012
Percentage of respondents, n=199
Business users can combine their own data with corporate data: Big data 50%, Small data 34%
Business users can tailor reports and dashboards: Big data 56%, Small data 35%
Business users can drill-down to detail: Big data 78%, Small data 59%

In a similar vein, half of the firms using big data also allow analytics end-users to combine their own data with corporately managed data assets. This level of autonomy adds both empowerment and flexibility. Self-service data integration empowers analytics users to respond rapidly when new data is acquired and needs to be incorporated into active analytics. Data collected for Aberdeen's June 2012 research into agile integration (Beyond Agile Analytics: Is Agile Data Integration Next?) found that the average time required for an IT specialist to integrate a new data source for analytics is 28 days. If business managers are able to perform their own integration in certain circumstances, the total time required to perform integration can be drastically reduced. Organizations that enabled analytics users to undertake their own data integration were able to complete integration tasks in an average of just 7 days. And, as with other self-service solutions, allowing business managers to perform simple integration tasks themselves means that highly skilled data integration engineers can be put to use on more complex and challenging data integration problems.

Self-Service Analytics Tools
Ensuring that analytics users are able to be as self-sufficient as possible requires that they be equipped with the right tools. Two end-user tools are used more by organizations with big data than others:
End-user query tools. Used by 84% of companies analyzing big data.
Visual data discovery tools. Used by 56% of companies analyzing big data.
Both classes of tools can facilitate the rapid analysis of unanticipated problems or opportunities. As such, they can excel in dynamic business environments with rapidly changing data sets. This is something that other types of BI solutions, such as managed reporting or static dashboards, struggle to achieve.

Priming the Big Data Engine

Providing the right analytics tools to end users is only part of the big data BI solution. Strong data management capabilities are also necessary to ensure that business managers are able to use these tools to their fullest advantage (Figure 4).

Figure 4: Top Performers See Gains from New Technology
Source: Aberdeen Group, January 2012
Percentage of respondents, n=63
MPP hardware: Leaders 50%, Followers 23%
Columnar databases: Leaders 45%, Followers 22%
Apache Hadoop: Leaders 17%, Followers 3%

Role of Predictive Analytics
Conventional business intelligence (or analytics) is a powerful tool that helps managers to understand relatively simple relationships with relatively small numbers of variables. However, predictive analytics excels at identifying trends and relationships when there are hundreds or even thousands of variables to consider for each entity of interest. For example, predictive analytics is often used in marketing applications to mine purchasing data from millions of buyers to determine the best marketing offers to propose to defined market segments. Consequently, big data and predictive analytics are natural bedfellows. Aberdeen's big data research survey found that 90% of all companies harnessing big data are also using predictive analytics.

Leaders using big data were more likely to invest in new and emerging database technologies than Followers. Half of Leaders invest in Massively Parallel Processing (MPP) hardware. This type of scalable computer architecture leverages many commodity processors, potentially hundreds or thousands, to tackle large scale analytic problems. Likewise, columnar databases organize data differently from more conventional, row-oriented Relational Database Management Systems (RDBMS): the values of each column are stored together, rather than the fields of each row. A columnar design is inherently more suitable for many types of analytic queries, such as aggregations that read only a few columns across a large number of rows. As a result, the columnar approach can deliver dramatic improvements in query performance.
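To make the contrast between row-oriented and columnar storage described above more concrete, the following minimal sketch holds the same small table both ways and sums a single column. The table contents and the names `rows`, `columns`, `total_row_store` and `total_column_store` are illustrative assumptions; they are not taken from the Aberdeen research or from any particular database product.

```python
# Hypothetical sales table, stored record by record as a row-oriented system would.
rows = [
    {"order_id": 1, "region": "East", "amount": 120.0},
    {"order_id": 2, "region": "West", "amount": 75.5},
    {"order_id": 3, "region": "East", "amount": 210.0},
]

# The same table stored column by column: each column is one contiguous list.
columns = {
    "order_id": [1, 2, 3],
    "region": ["East", "West", "East"],
    "amount": [120.0, 75.5, 210.0],
}


def total_row_store(table, field):
    """Sum one field by walking every row record in the table."""
    return sum(row[field] for row in table)


def total_column_store(table, field):
    """Sum one field by reading only that column's list of values."""
    return sum(table[field])


print(total_row_store(rows, "amount"))
print(total_column_store(columns, "amount"))
```

Both functions return the same total. The difference is that the columnar version only needs the values of the one column being aggregated, which is the property that tends to favor columnar layouts for analytic queries over wide tables.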
Apache Hadoop (The Little Elephant in the Big Data World: Hadoop 1.0 Goes Live) is a highly scalable distributed computing framework built around a programming model called MapReduce. With Hadoop, large networks of computers can be established to decompose and distribute large compute problems across the network. Crucially, Hadoop can also work with unstructured data, which is growing dramatically in volume.

Leveraging these emerging technologies can help organizations tackle new classes of analytic problems that could not be addressed previously, either because of low performance or high cost.

Data Quality and Cleansing

Organizations grappling with big data are more likely to take advantage of data cleansing tools and master data management tools (Figure 5).

Figure 5: Ensuring High Quality Data for Analytics
Source: Aberdeen Group, April 2012
Percentage of respondents, n=199
Master data management tools: Big data 38%, Small data 29%
Data cleansing tools: Big data 71%, Small data 46%

In the January 2012 survey on big data, organizations revealed that transactional or structured data was forming the core of their big data initiatives. Over 93% of all companies with more than 5 terabytes of business data rated this data as vital to their big data projects, despite the fact that the overall volume and growth of this information is often a fraction of that of other data sources. Rather, sources involving internet based data, like social media or clickstream data, or unstructured text-based information, are being used to supplement and augment the more traditional transactional business data.

Given how important this structured information is to most big data initiatives, ensuring that this data is clean, accurate and reliable becomes a crucial piece of the puzzle. Master Data Management (MDM) tools can enforce business rules for data formats and end-user access, while data cleansing tools keep data standardized and accurate. For organizations that handle massive amounts of structured information (often on customers, products and suppliers), automating these tasks is becoming increasingly attractive. Automation can not only drastically reduce the time requirements for data management tasks, but also eliminate data entry inconsistencies and other examples of human error. When comparing the Leaders of big data (see sidebar definition on page 2) to the Followers, we see that over 43% of the top performers had automated their data cleansing, while a mere 17% of the Followers had done likewise.

Ensuring Access to New Data
Ensuring that analytics users have timely access to fresh data is critical to success. Two characteristics of the data management infrastructure are notable:
Parallel processing for data integration. Used by 57% of companies analyzing big data, but only 39% of others. Integrating large data volumes into a data store can take many hours. Using scalable integration software that can take full advantage of the available hardware through parallel processing can dramatically cut this time.
Users can access data during re-indexing. Sixty-one percent (61%) of companies analyzing big data have this capability, but only 35% of others do. Re-indexing a data warehouse after data has been updated is necessary to ensure that information can be accessed efficiently. If users are unable to analyze the data while re-indexing occurs, that can frustrate their ability to get timely insights.
Taking advantage of one or both of these capabilities can cut the time required to get fresh data into the hands of business managers.

The simplest form of data integration is batch integration. Extract, Transform and Load (ETL) allows data to be lifted from a source data system (e.g. ERP or CRM), transformed into a form useful for analytics, and then loaded into an analytic database. While batch integration is used by organizations that manage big data, those organizations are more likely than their peers to take advantage of data replication tools and data virtualization tools (Figure 6).

Data replication enables (near) real-time copying and migration of source data to the analytic database. Pre-defined rules determine which data in the source system is copied to the data warehouse. Copying can occur periodically on a schedule, or can be performed in real time as edits and updates are made to the source database. While there is a spectrum of data integration options, batch updates are typically used to perform sizeable updates infrequently, while data replication is more often called upon to perform frequent, small updates.

Figure 6: Data Integration Tools Preferred for Big Data
Source: Aberdeen Group, Month 2012

Data virtualization can be thought of as the integration-free approach to data integration. It is an integration method that aims to avoid moving data to another database at all. Instead, data virtualization maps data in the source systems to the data objects required to support big data analytics. When users request data via dashboards, ad hoc query tools, or visual data discovery tools, the data needed to satisfy those requests is retrieved and transformed on demand.
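The three integration styles discussed in this section (batch ETL, data replication and data virtualization) differ mainly in when data is copied, if at all. The sketch below is a deliberately simplified illustration using in-memory dictionaries; the `source` table, the `transform` rule and the names `batch_etl`, `replicate` and `VirtualView` are hypothetical and do not represent any vendor's tooling.

```python
# Hypothetical source system (for example, a CRM table) keyed by customer id.
source = {
    1: {"name": "acme corp", "revenue": 1000},
    2: {"name": "globex", "revenue": 2500},
}


def transform(record):
    """Shape a source record for analytics (here: tidy the customer name)."""
    return {"name": record["name"].title(), "revenue": record["revenue"]}


def batch_etl(src):
    """Batch ETL: extract every record, transform it, and load a full copy."""
    return {key: transform(rec) for key, rec in src.items()}


def replicate(warehouse, key, record):
    """Replication: copy a single changed record as the change happens."""
    warehouse[key] = transform(record)


class VirtualView:
    """Virtualization: keep no copy; read and transform the source on demand."""

    def __init__(self, src):
        self._src = src

    def get(self, key):
        return transform(self._src[key])


batch_copy = batch_etl(source)   # snapshot produced by a scheduled batch run
replica = batch_etl(source)      # replica starts from the same initial load
view = VirtualView(source)

# A source update arrives after the batch has run.
source[2] = {"name": "globex", "revenue": 3000}
replicate(replica, 2, source[2])

print(batch_copy[2]["revenue"])  # unchanged until the next batch run
print(replica[2]["revenue"])     # updated by the replicated change
print(view.get(2)["revenue"])    # read from the source at query time
```

In this toy setup the batch copy stays stale until the next load, the replica follows each change, and the virtual view never stores data at all. Real systems add concerns this sketch ignores, such as source-system load, latency and failure handling.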
Figure 6 data (percentage of respondents, n=199)
Data virtualization tools: Big data 47%, Small data 36%
Data replication tools: Big data 59%, Small data 34%

Fast Facts
The big data capabilities provided by Leaders are noticed and appreciated by their end-users.
68% of Leaders reported high levels of trust in their data and data systems.
Only 38% of Followers reported similar high levels of trust in their data and data systems.
58% of Leaders reported high levels of satisfaction with the quality and relevance of their business data.
Only 38% of Followers reported similar high levels of satisfaction with the quality of their business data.
Case in Point: Zilliant

Zilliant was formed in 1999 as a Business-to-Business (B2B) pricing technology company to help customers optimize the profitability of their products and services. Data visualization and predictive analytics are key parts of delivering Zilliant's Software-as-a-Service (SaaS) solutions. Zilliant handles hundreds of customer datasets for its SaaS clients, each averaging 60 Gigabytes (GB) of data, with the largest topping 950 GB. This data is growing between 4% and 8% monthly, or 72% every year. In order to keep up with this growth, Zilliant requires predictive analytics and data visualization tools that allow its clients to quickly and dynamically analyze and display their large volumes of data along with the associated guidance Zilliant has produced.

Zilliant's clients are B2B companies that set pricing based on a number of factors, such as business type, service levels, and volume discounts. The range of possible price points can be staggering: a client might have one hundred thousand products and five thousand customers, yielding five hundred million potential price points. These companies need to provide their sales representatives with accurate guidance to determine the pricing that will best grow their bottom line. One of the main challenges Zilliant had to overcome was that its customers often have incomplete data to support these strategies. To fill in its customers' data gaps, Zilliant adopted a predictive analytics solution that allowed those customers to see the invisible and ultimately facilitate data-driven pricing and sales campaigns.

"Providing visualization to our clients' business people and subject-matter experts is a key differentiator for Zilliant's solution," said Eric Hills, Senior Vice President at Zilliant. "The potential that the data holds to inform the business is unlimited. In our experience, we believe that the information is in the eye of the beholder: you need to get the right people the right visualization before you can turn this information into actionable insight." To that end, Zilliant invested heavily in how it presents data to business people so that they can understand the context and make sound business decisions.

A SaaS solution for big data analytics is not without its challenges, however. One of the primary hurdles is the acquisition of data from customers' existing systems into the cloud, notes Beth Weeks, Zilliant's Senior Vice President of Product Engineering and SaaS Operations. Moving forward, Beth indicated that real-time integration should improve the data onboarding process, and allow seamless delivery of information into existing customer systems. Looking back, Beth notes that they have had to build a lot of technology and processes for onboarding the data; "integration in the cloud would have helped us a lot." Indeed, she predicts that when cloud technology matures, there will be big opportunities for Integration as a Service. In the meantime, Zilliant continues to rely on the predictive analytics and data visualization capabilities of its price optimization solution in order to better drive value for its customers.
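The figures quoted in the Zilliant case study can be checked with simple arithmetic. The sketch below multiplies products by customers to obtain the number of potential price points, and compares two readings of the monthly growth range. Treating the 72% annual figure as the 6% midpoint multiplied by twelve months, without compounding, is an assumption made here for illustration; the report does not state how the figure was derived.

```python
# Price points: every product could carry a distinct price for every customer.
products = 100_000
customers = 5_000
price_points = products * customers
print(f"{price_points:,} potential price points")

# Monthly growth range quoted in the case study.
monthly_low, monthly_high = 0.04, 0.08
midpoint = (monthly_low + monthly_high) / 2

# Assumption: the annual figure is the midpoint rate times 12, uncompounded.
simple_annual = midpoint * 12

# For comparison only: the same midpoint rate compounded monthly.
compounded_annual = (1 + midpoint) ** 12 - 1

print(f"simple: {simple_annual:.0%}, compounded: {compounded_annual:.0%}")
```

Under the uncompounded reading the midpoint rate gives the 72% quoted in the text; compounding the same monthly rate would give a noticeably higher annual figure.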
Key Takeaways

Organizations that have already embarked on their big data projects, and those that have yet to do so, should consider the following:

Self-service analytics is the key. With BI across the board, leading organizations are working to make their users more self-sufficient. Overall, companies are struggling with two pressures. First, business managers are continually asking for new information, or different views of existing information. Second, the IT group is struggling to manage the growth in both data volume and variety. The answer, for many organizations, is to empower business managers to create those new views of data themselves. That is especially true where big data is concerned. After all, where big data is involved, it is only natural that the IT organization is more challenged by the data aspects of the analytics problem. Seventy-eight percent (78%) of organizations with big data provide interactive analytics solutions that include drill-down capabilities. Half of those organizations take a further step, allowing business users to integrate their own data with corporately managed data assets. And almost all (90%) leverage predictive analytics technologies.

Advanced database technologies may provide an advantage. While traditional relational databases (RDBMS) and structured data will continue to play an important part in business activities and big data initiatives, companies should explore additional options based on their needs. Columnar databases can reduce query times for large data sets, and when combined with advanced analytic platforms can drastically streamline data analysis. The Hadoop Distributed File System (HDFS) and NoSQL databases can store and manage unstructured or semi-structured data, allowing organizations to glean intelligence from new data sources. The Leaders are blazing new trails by adopting these systems, with 17% adopting Apache Hadoop, compared to only 3% of Followers.

Appropriate integration methods are important. Batch integration using ETL is tried and tested, but it does have its limitations. When batches get too big, or updates are needed too frequently, it becomes impractical. Either the burden on corporate IT becomes too great, or analytics users suffer from a lack of timely access to data. In these situations, data replication may be a better option. Likewise, data virtualization can be a powerful solution since it obviates the need to move data at all. Fifty-nine percent (59%) of organizations with big data use data replication capabilities, while 47% also take advantage of data virtualization.

For more information on this or other research topics, please visit www.aberdeen.com
Related Research

Agile or Fragile? Your Analytics, Your Choice; July 2012
Beyond Agile Analytics: Is Agile Data Integration Next?; June 2012
Managing the TCO of BI: The Path to ROI is Paved with Adoption; May 2012
The State of Master Data Management, 2012: Building the Foundation for a Better Enterprise; May 2012
High Performance Organizations Empower Employees with Real-Time Mobile Analytics; April 2012
Picture this: Self-Service BI through Data Discovery & Visualization; March 2012
The Little Elephant in the Big Data World: Hadoop 1.0 Goes Live; March 2012
SaaS BI: The Compelling Economics of Cloud-based Analytics; February 2012
Operational Intelligence - Part 1: Driving Performance with Tactical Visibility; February 2012
In-memory Computing: Lifting the Burden of Big Data; January 2012
Data Management for BI; January 2012
Agile BI: Complementing Traditional BI to Address the Shrinking Decision-Window; November 2011
Agile BI: Three Steps to Analytic Heaven; April 2011

Authors: David White, Senior Research Analyst, Business Intelligence; Nathaniel Rowe, Research Analyst
For more than two decades, Aberdeen's research has been helping corporations worldwide become Best-in-Class. Having benchmarked the performance of more than 644,000 companies, Aberdeen is uniquely positioned to provide organizations with the facts that matter, the facts that enable companies to get ahead and drive results. That's why our research is relied on by more than 2.5 million readers in over 40 countries, 90% of the Fortune 1,000, and 93% of the Technology 500.

As a Harte-Hanks Company, Aberdeen's research provides insight and analysis to the Harte-Hanks community of local, regional, national and international marketing executives. Combined, we help our customers leverage the power of insight to deliver innovative multichannel marketing programs that drive business-changing results. For additional information, visit Aberdeen at http://www.aberdeen.com or call (617) 854-5200, or to learn more about Harte-Hanks, call (800) 456-9748 or go to http://www.harte-hanks.com.