In the time since I co-wrote Big Data for Dummies (Wiley, 2013), there has been an explosion of interest in big data (and big analytics and big workflow). Although that interest, and the concomitant marketing, have grown rapidly, big data implementations have proceeded at a relatively normal pace. Now that we are closing out 2014, I thought it appropriate to reflect on the state of big data and offer some predictions for its evolution.
Two facts substantiated by the current adoption rate are the following:
- Big data is not a single new technology but a combination of existing and new technologies
- The overarching purpose of most big data implementations is to provide actionable insights
In practice, big data implementations look to provide the opportunity to manage huge volumes of disparate data at the right speed and in the right timeframe to allow real-time analysis and reaction. What about the future? What lies ahead for big data?
Prediction 1: Retailers (and marketers) will benefit greatly from investments in big data analysis.
Social media has already had a profound and lasting effect on big data. Arguably, the largest big data implementations in existence are directly or indirectly attributable to social sites (Facebook, Twitter) or social behaviors on “traditional” sites (eBay, Amazon). It seems that people can’t resist sharing (or in many cases over-sharing) the details of their daily lives. The abundance of data resulting from these behaviors, coupled with the simultaneous ubiquity of mobile devices, is giving marketers and retailers an unprecedented glimpse into the daily lives of consumers. In the not-too-distant future, consumers will be individually targeted with unique offers for “stuff.” Although this observation may not seem too “predictive,” it’s the how and when that will be different. Big data analysis of consumer sharing will allow retailers to deliver offers on the products/services you are most likely to buy, at the precise moment when you have proven you are susceptible to buying, to the end point/device you are likely to be using at that moment.
Prediction 2: Data-ingestion rates will increase dramatically.
As more mobile and sensor sources of data emerge, so too will the requirements for rapid ingestion of huge amounts of diverse data types. Storage and processing are relatively inexpensive. Wired and wireless global networks are getting faster, but they will struggle to provide the capacity needed to “feed” some of the planned big data implementations under development. For example, I know of a vendor that has developed an ingestion technology capable of bringing a petabyte of data on board in 24 hours using a cluster of eight multicore servers. The data is processed while it is being ingested and is ready to use, but for what? Just because we can ingest that amount of data, should we do so? The answer will depend on the use case, of course. This is just the beginning, and the potential to ingest data at this scale demands that we understand data life cycles better than we did in the past.
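To put the petabyte-in-24-hours figure in perspective, here is a back-of-the-envelope sketch of the sustained throughput it implies. It assumes a decimal petabyte (10^15 bytes) and an even split of the load across the eight servers. Neither detail comes from the vendor, and the function name is mine, purely for illustration.

```python
# Rough throughput implied by ingesting one petabyte in 24 hours.
# Assumptions (not from the vendor): 1 PB = 10**15 bytes, and the
# load is spread evenly across all servers in the cluster.

PETABYTE_BYTES = 10**15
SECONDS_PER_DAY = 24 * 60 * 60
SERVERS = 8


def ingest_rates(total_bytes, seconds, servers):
    """Return (aggregate, per-server) sustained rates in bytes per second."""
    aggregate = total_bytes / seconds
    per_server = aggregate / servers
    return aggregate, per_server


aggregate, per_server = ingest_rates(PETABYTE_BYTES, SECONDS_PER_DAY, SERVERS)

print(f"Aggregate:  {aggregate / 1e9:.2f} GB/s ({aggregate * 8 / 1e9:.1f} Gbit/s)")
print(f"Per server: {per_server / 1e9:.2f} GB/s")
```

Under those assumptions the cluster has to sustain roughly 11.6 GB/s (about 92.6 Gbit/s) around the clock, or about 1.45 GB/s per server. That is why I expect networks, more than storage or processing, to be the constraint.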
Prediction 3: Data as a service (DaaS) will emerge as a viable alternative to local persistence.
This one is a slight derivative of prediction 2. Although early adopters will look to ingest and process data to gain first-mover advantages, others may choose to wait until more DaaS offerings come to market. Some organizations will want to collect and hold lots of different data. Others might have the need to do so, but not the resources or capacity. Enter data as a service. There is certainly an existing need for access to wider varieties and sources of data. If the needs are near term, DaaS may not be an option. Over the next few months, lots of companies and organizations will be looking to create secure, easily available data in several vertical markets including but not limited to medical/health care, financial, environmental, logistics and social behaviors.
Prediction 4: Collateral tools and technologies will determine adoption velocities.
The requirements to process ever-increasing volumes and types of data will stress other areas of the technology environment/infrastructure.
- Without faster and better analysis and visualization tools, comprehending relationships, patterns and the like will be nearly impossible
- Since many big data calculations require process decomposition, distribution, marshaling, recovery and auditing, existing tools from the Apache Foundation and other sources fall far short of current needs
- Extract, transform and load (ETL) technologies must evolve to perform their services across a broader set of inputs and outputs, all while providing a dynamic capability for extending to yet unknown forms
There is a prevailing hypothesis that ETLs as they currently exist will quickly fall into disuse because they are based on “legacy” thinking. There is another market opportunity here. Perhaps we use big data techniques to ingest, transform and stage big data?
One thing is certain: the big data landscape is shifting constantly and unpredictably. I have always subscribed to the axiom “design for change, don’t change the design.” With big data, this precept is more important than ever.
Will my predictions hold true? Only time will tell. To paraphrase the comedian Dennis Miller: “Those are just my predictions. I could be wrong.”
About the Author
Al Nugent is co-author of Big Data for Dummies. He is an experienced technology leader and industry veteran of more than three decades and is currently a Principal Consultant with Hurwitz and Associates. Most recently, he was the Chief Executive and Chief Technology Officer at Mzinga, Inc., a leader in the development and delivery of cloud-based solutions for big data, real-time analytics, social intelligence and community management. Before Mzinga, Al was executive vice president and Chief Technology Officer at CA, Inc., where he was responsible for setting the company’s strategic technology direction. He has also served as CTO for Novell, CTO for Xerox, and CIO for American Re. Al is the independent member of the Board of Directors of Adaptive Computing in Provo, UT; chairman of the advisory board of SpaceCurve in Seattle, WA; and a member of the advisory boards of N-of-one in Waltham, MA, and Marlborough Street Partners in Boston, MA.