Big Data – Size or complexity?

There has been a lot of discussion globally about big data and its implications for the business world. Many companies are trying to build on this opportunity to enhance their revenues and increase shareholder value. Consulting companies have conducted many surveys in the recent past, and it is very clear that the marketing spend of many technology companies has made an impression on the people who hold the budgets, mainly CIOs. A good number of organisations have recognised that they hold a lot of data and that it needs to be put to use to derive insights that will alter the course of their business for the better.

It is imperative that organisations take their data seriously and use it optimally to get better results, but before opening their wallets to spend on these initiatives they need to understand what the data they hold is actually about. To dig a bit deeper, we need to look at the reasons why organisations have been collecting data for so long. In many cases data was collected for one of three reasons: to support internal operational efficiency in functions such as HR and Finance; as a by-product of automation, where data was captured simply because the installed software required it; or to meet statutory requirements. Unless an organisation had a well-laid strategy to collect data for further use, it may be surprised to find that much of the data accumulated over the years is not of much use, or has to be manipulated intelligently before it can be put to use.

CIOs therefore need to ensure that, before embarking on a big data project (as such projects are referred to nowadays), an audit is conducted on the data available for the project and its usefulness for further processing is validated. Two further aspects need to be looked into: whether the available data actually has the attributes needed for further processing, and how far back in time the data remains usable. This gives a better understanding of the true size of the project, so that money is not wasted on migrating and cleansing data that will never be used.
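
To make the idea of such an audit a little more concrete, here is a minimal sketch in Python of the two checks described above: whether each record carries the attributes a project needs, and whether it is recent enough to be useful. Everything specific in it is an assumption made for illustration, not something prescribed here: the attribute names (customer_id, order_date, amount), the three-year age limit, and the function name audit_records are all hypothetical, and a real audit would use whatever attributes and cut-off the project in question calls for.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical requirements, chosen only for illustration.
REQUIRED_ATTRIBUTES = {"customer_id", "order_date", "amount"}
MAX_AGE_DAYS = 3 * 365  # assumed cut-off: roughly three years


@dataclass
class AuditResult:
    total: int = 0
    missing_attributes: int = 0
    too_old: int = 0
    usable: int = 0


def audit_records(records, today, required=REQUIRED_ATTRIBUTES,
                  max_age_days=MAX_AGE_DAYS, date_field="order_date"):
    """Count how many records are usable, incomplete, or too old."""
    result = AuditResult()
    for record in records:
        result.total += 1
        # A record is incomplete if any required attribute is absent or empty.
        if any(record.get(name) in (None, "") for name in required):
            result.missing_attributes += 1
            continue
        # A complete record is still unusable if it is older than the cut-off.
        if (today - record[date_field]).days > max_age_days:
            result.too_old += 1
            continue
        result.usable += 1
    return result


if __name__ == "__main__":
    sample = [
        {"customer_id": 1, "order_date": date(2024, 1, 1), "amount": 10.0},
        {"customer_id": 2, "order_date": date(2015, 1, 1), "amount": 5.0},
        {"customer_id": 3, "order_date": date(2024, 2, 1)},
    ]
    print(audit_records(sample, today=date(2024, 6, 1)))
```

Run against a sample of each data source, the ratio of usable records to total records gives a rough, early indication of how much of the accumulated data is worth migrating and cleansing at all.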