Sunday, December 29, 2013

Big Data

When we hear the term Big Data, the first thing that comes to mind is data that is big. But what is the definition of big? Do we consider data measured in terabytes to be Big Data, or only something larger than that? Getting this detail right is really important.
Let's say I want to send 100 GB of data over email. Is it possible? The answer is no, because an email client cannot send 100 GB of data, which means it cannot process data of that size. In other words, this data is big enough to be beyond what an email client can handle. Now let's say I want to transfer the same data over Wi-Fi or through file sharing. That would certainly be possible, so this data is not big for a Wi-Fi file transfer. How is it that the same data is big for one but not for the other? This is due to the computational capabilities of each of these applications or services. So we can summarize:
Big Data is data that is beyond the computational capabilities of a given utility, application, or service.
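To make this definition concrete, here is a minimal Python sketch. The channel names and capacity limits are made-up assumptions for illustration only; real limits vary by provider and setup.

```python
# Hypothetical capacity limits in gigabytes (illustrative assumptions only).
CAPACITY_GB = {
    "email": 0.025,                # assumes roughly a 25 MB attachment limit
    "wifi_file_transfer": 1000.0,  # assumes plenty of room on the receiving side
}


def is_big_for(channel: str, size_gb: float) -> bool:
    """Return True when the data is beyond what the given channel can handle."""
    return size_gb > CAPACITY_GB[channel]


print(is_big_for("email", 100))               # True
print(is_big_for("wifi_file_transfer", 100))  # False
```

The same 100 GB is "big" for one channel and not for the other, which is the whole point of the definition.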
Overall, there are two types of data. One is saved in conventional systems such as databases, in a structured format. The other is a variety of data in different forms, such as text, images, videos, blogs, articles, surveys, and posts, which is scattered and not in a structured format. So data is either structured or unstructured.
As of today, of all the data that is available, only 10% is structured and 90% is unstructured. Today's business analytics and predictions are done on the structured data. But to get accurate analytics and predictions, more effort needs to go into reading and processing unstructured data. Processing unstructured data is the biggest challenge because of two important factors: its variety and its volume.
So here we define the three attributes of Big Data:
1. Variety
2. Velocity
3. Volume
A solid example of Big Data is Facebook. Every day Facebook generates 25 TB of data with a huge variety of text, audio, video, images, etc., and at an undefined input velocity. So all three attributes of Big Data are covered by Facebook. Who is the source of this Big Data? Why and when does it get generated?
Big Data is being generated by almost all the smart devices around us, almost every second.
A few of the sources of Big Data are: PCs, smartphones, tablets, social networking sites, smart grids, smart meters, mobile apps, industrial automation devices, finance applications, retail applications, and many more.

Is all this data that is being generated really useful? What business needs can it justify?
Let's discuss some of the important sources.

Smartphones
Smartphones are the biggest lifeline for people across the globe. Today, sales of smartphones have outperformed sales of PCs. People store all their personal information on their phones and keep sharing this data with friends and colleagues. Applications like "WhatsApp" are ubiquitous apps that people use for sharing data. Smartphones are used to book movie tickets, post Facebook updates, tweet on Twitter, and a lot more. This is a lot of data, and a lot of it is personal data that shows a person's patterns, such as likes and dislikes.

Smart Meters
Gone are the days when people learned their electricity usage only at the end of the month and got a shock from the bill. Now, with smart meters, people know their usage almost every hour and can tune it accordingly. If we collect all this data and analyze it, it helps us understand how much electricity a country needs, and when it needs to borrow electricity or could actually lend power to other nations, like Canada selling power to the US or the US selling power to Canada.
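As a rough illustration of that kind of analysis, here is a small Python sketch. The meter readings, the meter names, and the capacity figure are all invented for the example; a real grid would have millions of readings and a far more involved model.

```python
from collections import defaultdict

# Hypothetical hourly readings: (meter_id, hour_of_day, kWh used).
readings = [
    ("meter-a", 18, 2.0),
    ("meter-b", 18, 3.5),
    ("meter-a", 3, 0.5),
    ("meter-b", 3, 0.25),
]


def demand_by_hour(rows):
    """Add up the usage of all meters for each hour of the day."""
    totals = defaultdict(float)
    for _meter_id, hour, kwh in rows:
        totals[hour] += kwh
    return dict(totals)


def surplus_by_hour(demand, capacity_kwh):
    """Positive value: power that could be lent. Negative value: power to borrow."""
    return {hour: capacity_kwh - used for hour, used in demand.items()}


demand = demand_by_hour(readings)
print(surplus_by_hour(demand, capacity_kwh=4.0))  # {18: -1.5, 3: 3.25}
```

With these made-up numbers, the evening hour runs short (borrow) while the night hour has power to spare (lend).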

Social Networking Sites
Social networking sites like Facebook and Twitter allow people to share their personal data, pictures, videos, etc. This is a lot of data about a single person: what activity the person is doing and what their likes and dislikes are. It also gives information about their opinions on certain issues and whether they support a given cause or not. Every day Facebook generates 25 TB of data and Twitter generates 12 TB of data. This is a lot of data. How could we make this data useful? If we gather this data about a person, we could actually monetize it by finding a pattern, predicting what they would be looking for next, and presenting it so that they can quickly purchase it. Let's say we understand that someone likes to watch Hollywood movies; then we offer them the next upcoming Hollywood movie starring Tom Hanks, and this could generate business for us.
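The idea of finding a pattern and offering the next thing can be sketched in a few lines of Python. The activity log and the catalogue below are hypothetical, and counting likes is only the simplest possible stand-in for a real recommendation system.

```python
from collections import Counter

# Hypothetical activity log for one person: genres of the posts they liked.
liked_genres = ["hollywood", "hollywood", "sports", "hollywood", "music"]

# Hypothetical catalogue of upcoming titles, keyed by genre.
upcoming = {
    "hollywood": "Upcoming Hollywood movie",
    "sports": "Upcoming sports documentary",
}


def suggest(likes, catalogue):
    """Suggest an upcoming title from the person's most liked genre, if any."""
    for genre, _count in Counter(likes).most_common():
        if genre in catalogue:
            return catalogue[genre]
    return None


print(suggest(liked_genres, upcoming))  # Upcoming Hollywood movie
```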

Technology Going Forward....
These are just a few examples of where Big Data is being generated. With technology moving at a fast pace, we will have more smart devices around us, generating more data than we can actually think of. We will have an instrumented and interconnected world with this new set of devices. Slowly we will be moving towards the next generation of cities, i.e. Smart Cities, which will have a huge set of interconnected smart devices continuously transmitting and processing data, helping businesses and end users achieve their goals faster and in a better way.

In a Nutshell....
If we could process all of this data for different causes, we could actually achieve a lot:
1. What could be the next natural calamity?
2. Which part of my Airbus is not functioning well?
3. Which devices in my network are consuming too much power?
4. Will the stock market crash?
5. Will sales of smartphones go up next month?

We can do really big things with Big Data; we just need to think big!

