Big data is like teenage sex: everyone talks about it, nobody really knows how to do it, everyone thinks everyone else is doing it, so everyone claims they are doing it…
To help those of us innocent to the thrills of big data, we’ve asked Harry Blount, founder of DISCERN, to share his views in this guest article. Harry Blount was a member of the Committee on Forecasting Disruptive Technologies sponsored by the National Academy of Sciences, and since 2009 he has been running an investment research firm dedicated to harnessing big data insights.
As a technologist, Blount has focused not only on gathering big data sources but also on developing analytic tools to help process them. By comparison, Majestic Research, now part of ITG Research, ended up hiring analysts to interpret the data and provide insights. Here is Blount’s perspective, which we have excerpted from a DISCERN white paper, “Big Data for Better Decisions”, available at http://www.aboutdiscern.com/white-paper.
A vision of the big data future
It is 2015. Acme Commercial Drones has launched its private drone fleet following a 2013 congressional mandate ordering the FAA to integrate drones into commercial U.S. air space. Acme has begun conducting daily flyovers of the major U.S. ports, counting the number of cargo ships manifested to carry corn to foreign ports. The average ship is floating 2-3 feet lower in the water than in the prior month, suggesting higher-than-expected corn exports. Another Acme drone, flying over the U.S. corn-growing region, registers a subtle, non-trendline shift toward rust coloration on the crop, suggesting that the late-growth corn season may not be as robust as consensus expectations. Finally, live camera feeds from the newly opened “New Panamax-class” locks at the Panama Canal show that a cargo ship has damaged one of the hydraulic gates, slowing transit to a fraction of capacity.
The purchasing officer of a large soft-drink company uses DISCERN’s Personal Analytics Cloud (PAC) to monitor the global corn transport infrastructure and major corn-growing regions. His PAC generates signals notifying him of these non-trendline events. He immediately executes purchase orders to secure longer-term contracts at current spot prices anticipating that the price of corn is about to spike. His fast action locks in a six-month margin advantage over the competition, allowing the company to report record profits.
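The “non-trendline events” in this scenario are, at bottom, anomaly detections on a stream of measurements. As a minimal illustration of the idea (not DISCERN’s actual methodology — the data, window size, and threshold below are all hypothetical), a reading can be flagged when it deviates from the recent rolling average by more than a few standard deviations:

```python
from statistics import mean, stdev

def non_trendline_events(readings, window=5, k=2.0):
    """Return indices of readings that deviate more than k standard
    deviations from the trailing window's mean."""
    flagged = []
    for i in range(window, len(readings)):
        recent = readings[i - window:i]          # trailing baseline
        mu, sigma = mean(recent), stdev(recent)
        if sigma > 0 and abs(readings[i] - mu) > k * sigma:
            flagged.append(i)                    # non-trendline event
    return flagged

# Illustrative ship-draft measurements (feet): a sudden drop stands out
# against a stable baseline and triggers a signal.
drafts = [30.1, 30.0, 30.2, 29.9, 30.1, 30.0, 27.2]
print(non_trendline_events(drafts))  # -> [6]
```

Real systems would use far more sophisticated models, but the principle is the same: a persistent monitor compares each new observation against its own recent baseline and surfaces only the departures.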
What is Big Data?
The phrase “Big Data” refers to massive bundles of datasets, allowing for powerful aggregation and analysis at unprecedented speeds. Some 2.5 quintillion (2.5×10^18) bytes of data are created every day. Mobile devices, RFID readers, wireless sensor networks, video cameras, medical organizations, government entities – to name just a few – are collecting an ever-growing torrent of data.
It is our belief that most decision-making tools, analytics and processes commonly applied to business have not kept pace with the explosive growth of data or the visualization capabilities of big data aggregation engines. Just as there is a growing digital divide between those with access to the internet and those without, we believe there is an equally important divide between organizations with big data strategies in place and those without.
One of the more popular definitions of big data is encompassed by the 4Vs – Volume, Velocity, Variety and Veracity. These sound more like bullet points in a brochure advertising high-end sports cars than a comprehensive description of what your big data package should bring to the table. While the 4Vs may help a vendor sell data feeds and speeds, nowhere does this definition speak to the customer’s need for more insights, earlier and more often.
Evaluating Big Data
The real question is – “How can you get more insights, more often, from big data?” In our opinion, the most critical aspect organizations need to assess when selecting a big data vendor is the vendor’s ability to convert noise into signals. Said differently, most big data technology vendors will be able to bring you “the world of data” via a big data aggregation platform, but only a few will have a process capable of delivering all the relevant data (e.g. structured, unstructured, internal, external) in deep context, in a way that is personalized and persistent for your requirements.
In analyzing a big data vendor, it would be wise to ask a few questions:
1) Can the vendor deliver a solution without selling you hardware and without expensive and extensive IT configuration?
2) Does the vendor aggregate data of any type – internal and external, structured and unstructured, public and commercial?
3) What is the data model for organizing the data?
4) What is the process for providing data in its original and curated context?
5) Can the vendor deliver the data in accordance with the context and unique needs of individual users within the organization?
6) Is the data accessible from any browser-enabled device?
7) Does the solution fit into your existing workflow?
During my involvement with the National Academy of Sciences study “Persistent Forecasting of Disruptive Technologies”, I came to strongly believe that in most cases, all of the information required to make an early and informed decision was available in the public domain or from inexpensive commercial sources. However, there simply was not a tool or service optimized to persistently scan and process the data to surface these insights to decision-makers.
We believe human interaction will always be critical to the decision-making process. However, too much time is spent searching, aggregating, cleansing and processing data, and not enough time is spent on value-added activities such as analysis and optimization.
We believe the emergence of big data tools such as DISCERN’s personal analytics clouds (PACs) and other persistent, personalized decision-making frameworks have begun to gain traction because they are able to address the shortcomings of traditional tools and processes. These platforms combine the power of big data platforms with the increasingly sophisticated capabilities of advertising networks and online shopping vendors.
Harry Blount is the CEO and Founder of DISCERN, Inc., a cloud-based, big data analytics company that delivers signals as a service to decision-makers. The vision for DISCERN was inspired, in part, by the work of Harry and Paul Saffo, DISCERN’s head of Foresight, during their tenures as members of the National Academy of Sciences Committee on Forecasting Disruptive Technologies. Prior to founding DISCERN, Harry spent more than 20 years on Wall Street, including senior roles at Lehman Brothers, Credit Suisse First Boston, Donaldson Lufkin & Jenrette, and CIBC Oppenheimer. Harry has been named an Institutional Investor All-American in both Information Technology Hardware and Internet Infrastructure Services, and The Wall Street Journal has recognized him as an All-Star covering the Computer Hardware sector. In addition, he is Chairman of the Futures Committee for The Tech Museum of Innovation in San Jose, California and is on the Silicon Valley Advisory Board of the Commonwealth Club. Harry graduated from the University of Wisconsin – La Crosse in 1986 with a B.S. in Finance.