Are you ready for the world of data science?
As complicated as the term data science sounds, it is something all of us practice in our daily lives, either directly or indirectly. We make decisions every day: some based on data, some on gut feeling or intuition, and some on our likes and dislikes. In business, however, many decisions are made solely on the basis of data, and as technology evolves, data-based decision making is going to permeate every part of the enterprise.
It is time for all professionals to start understanding the concept of data science and prepare for the future that is soon going to become our present.
Let us look at some aspects that one needs to understand to prepare for this future:
Data science is all about data. You need to learn to love data and get your hands dirty with it, not so much from a technology point of view as from an experimentation angle.
When working with data, or rather when trying to identify which data makes the most sense to work on further, the following questions usually need to be answered.
- What is the business objective with this data?
- How is the data generated, and where is the right place to collect it from?
- What are the necessary data elements needed to meet the business objective?
- Are these data elements clean? If not:
  - Why are they not clean or standardized?
  - What will it take to standardize them (i.e., what transformations are needed)?
- How are the necessary data elements related to each other (1-to-1, 1-to-many, etc.)?
- What time span of data should we be looking at?
- Is there a need to see snapshots of data and how it changed over time?
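As a rough illustration of the cleanliness and relationship questions above, here is a minimal Python sketch that uses only the standard library. The sample rows, the column names, and the helper functions (`missing_counts`, `distinct_values`, `relationship`) are all hypothetical and invented for this example. Real data would typically come from a spreadsheet or database export, and the checks you need will depend on your business objective.

```python
from collections import Counter, defaultdict

# Hypothetical sample: each row is one order. All names and values are invented.
rows = [
    {"order_id": "1001", "customer": "Acme Corp", "country": "US", "amount": "250.00"},
    {"order_id": "1002", "customer": "Acme Corp", "country": "usa", "amount": "99.50"},
    {"order_id": "1003", "customer": "Birch Ltd", "country": "UK", "amount": ""},
    {"order_id": "1004", "customer": "Cedar Inc", "country": "US", "amount": "40.00"},
]


def missing_counts(rows):
    """Count empty or whitespace-only values per column."""
    counts = Counter()
    for row in rows:
        for column, value in row.items():
            if not value.strip():
                counts[column] += 1
    return dict(counts)


def distinct_values(rows, column):
    """List the distinct raw values in a column, to spot inconsistent spellings."""
    return sorted({row[column] for row in rows})


def relationship(rows, left, right):
    """Classify how values in `left` map to values in `right` within this sample."""
    left_to_right = defaultdict(set)
    right_to_left = defaultdict(set)
    for row in rows:
        left_to_right[row[left]].add(row[right])
        right_to_left[row[right]].add(row[left])
    left_many = any(len(values) > 1 for values in left_to_right.values())
    right_many = any(len(values) > 1 for values in right_to_left.values())
    if left_many and right_many:
        return "many-to-many"
    if left_many:
        return "1-to-many"
    if right_many:
        return "many-to-1"
    return "1-to-1"


print(missing_counts(rows))                         # which columns have gaps
print(distinct_values(rows, "country"))             # are values standardized?
print(relationship(rows, "customer", "order_id"))   # how are elements related?
```

In this made-up sample, the check flags one missing `amount`, shows that `country` mixes "US" and "usa" (a candidate for standardization), and reports that one customer can have many orders. The relationship check only describes the sample at hand, so it is worth confirming the result with the data owners.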
Answering these questions is the time-consuming part of the analysis process, because it involves detailed conversations with the data owners.
Visualization is the fun part, where you can really find out what the data is telling you. Some of the standard visualization techniques are mentioned below.
If your data set is not very big (we suggest always starting with something small that fits in a spreadsheet), you can work with simple tools like Microsoft Excel. There are also specialized visualization tools that are free to use and need very little training.
It is typically during the visualization exercise that one starts understanding the data better in terms of dependencies and gaps.
Visualization should normally lead to identifying and defining the problem. Often, this is where the search for solutions starts. You might then want to bring in a specialist to analyze the data further and come up with probable solutions, using either advanced mathematical or machine learning techniques.
The future is going to be driven by data. The more we embrace technology to improve our lives, the more data we will generate and use, so it’s time to reach out to the future and embrace the science of data.