Channel: peterevansbi » Business Intelligence

What is “Big Data”?


Why all the hype surrounding “Big Data”?

To understand the hype, you first need to be able to define what the term “Big Data” actually means. To me, the definition is captured by the three V’s.

Volume – Variety – Velocity

According to surveys published late in 2011, over 1.5 trillion gigabytes of data was created and replicated in that year alone (IDC 5th Annual Survey). This is a 100% increase on two years earlier, and the growth is not expected to slow but to continue exponentially, roughly doubling every two years. Not all of this data is useful; however, all of it can be, and is being, collected. If the data can be collected, should we not be providing tools to connect, analyze and visualize the results to improve decision making?
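To make the growth claim concrete, here is a minimal Python sketch of what "doubling every two years" implies. It assumes the 1.5 trillion gigabyte figure quoted above as the 2011 starting point and a fixed two-year doubling period; the function name `projected_volume` is purely illustrative, and the results are simple arithmetic, not forecasts taken from the survey.

```python
def projected_volume(base_volume, base_year, target_year, doubling_period_years=2.0):
    """Project a data volume forward, assuming it doubles every fixed period."""
    elapsed_years = target_year - base_year
    return base_volume * 2 ** (elapsed_years / doubling_period_years)


# Assumption: 1.5 trillion GB created and replicated in 2011 (figure quoted in the text).
BASE_GB = 1.5e12

for year in (2011, 2013, 2015, 2021):
    print(year, f"{projected_volume(BASE_GB, 2011, year):.2e} GB")
```

Under these assumptions, ten years of doubling every two years is five doublings, a 32-fold increase over the starting volume.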


With Twitter alone generating more than 6 terabytes of data per day, it is clear that the sheer volume of data being stored today is exploding. Some enterprises generate terabytes of data every hour of every day of the year, and this leads to the conundrum facing today’s businesses across all industries. As the amount of data available to the enterprise rises, the percentage of that data it can process, understand, and analyze declines. The result is an area of information that is clouded from view, not because we cannot store or retrieve the data but because we cannot process and analyze it quickly enough.


Data in the enterprise is also becoming more complex. It now includes not only traditional relational data but also unstructured, semi-structured, and raw data from web pages, web log files, search indexes, social media forums, and even sensor data from active and passive systems. Much of this information does not lend itself to being stored in traditional systems, so enterprises can struggle both to store it and to perform the analytics required to gain understanding from it. An organization’s success will rely on its ability to draw insights from the various kinds of data available to it, both traditional and non-traditional.


Along with the other two V’s, the velocity of the data being generated has increased over the last thirty years. Programming a computer used to be an inherently slow process of writing the code and punching cards, which were then read by a card reader and entered onto the mainframe. Sometimes a hundred lines of code would take a day or more to be loaded into memory and run. Today’s generation can create a mobile app in minutes, with object orientation and pattern modeling allowing those with no programming skills at all to produce slick data-gathering models. Just as the sheer volume and variety of data we collect and store has changed, so, too, has the velocity at which it is generated and needs to be handled. Today’s enterprises are dealing with petabytes of data instead of terabytes, arriving as a constant flow at a pace that traditional systems cannot keep up with. To provide effective analytics you need to be able to deal with both the volume and variety of data while it is still in motion, not just when it has come to rest.
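As a rough illustration of analyzing data "while it is still in motion", the Python sketch below updates a rolling average as each reading arrives instead of storing everything first and querying it later. The names `rolling_average` and `sensor_feed`, the sample values, and the window size are all hypothetical; a production system would typically rely on a dedicated stream-processing platform rather than a single generator.

```python
from collections import deque


def rolling_average(stream, window_size):
    """Yield the mean of the last `window_size` readings as each one arrives.

    `window_size` is assumed to be at least 1.
    """
    window = deque(maxlen=window_size)
    total = 0.0
    for reading in stream:
        if len(window) == window.maxlen:
            total -= window[0]  # this value is evicted by the append below
        window.append(reading)
        total += reading
        yield total / len(window)


def sensor_feed():
    """Hypothetical stand-in for a live source such as a sensor or a log tail."""
    for value in [10, 12, 11, 50, 13, 12]:
        yield value


for average in rolling_average(sensor_feed(), window_size=3):
    print(round(average, 2))
```

Only the most recent `window_size` readings are held in memory, so the cost per reading stays constant however long the stream runs.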

Quite simply, the hype around “Big Data” exists today because the world is changing. Through applications and devices not thought of twenty years ago, when the Data Warehouse was born, we are able to sense and record more things, and once we have recorded something we naturally want to save it. Through advances in technology, people and devices are collaborating on a level not seen before. I liken it to the very first telephone exchange coming on line, when we moved away from writing letters to talking on the telephone. We have now moved away from analyzing just stored, static data, not because we have to but because everything is now so increasingly interconnected, and we need to if we are to understand the challenges that lie before us.
