What is big data?
Data that is beyond the storage capacity and beyond the processing
capacity of a conventional system.
How is the data generated (data-generating sources)?
1. Sensors
2. CCTV cameras
3. Social networks (e.g. Facebook)
4. Online shopping
5. Airlines
6. Hospitality data
Data generated every minute
4,166,667 posts on Facebook
347,222 tweets on Twitter
1,736,111 posts on Instagram
300 hours of video uploaded to YouTube
18,327 votes cast on Reddit
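To get a feel for the scale, the per-minute figures above can be extrapolated to a full day. This is only a rough sketch: it assumes the rate stays constant for all 1,440 minutes of a day, and the dictionary labels are made up for illustration.

```python
# Extrapolate the per-minute figures from the notes to one day,
# assuming (for illustration only) a constant rate all day long.
MINUTES_PER_DAY = 24 * 60  # 1,440 minutes

per_minute = {
    "Facebook posts": 4_166_667,
    "Twitter tweets": 347_222,
    "Instagram posts": 1_736_111,
    "YouTube video hours": 300,
    "Reddit votes": 18_327,
}

# Multiply every per-minute count by the number of minutes in a day.
per_day = {name: count * MINUTES_PER_DAY for name, count in per_minute.items()}

for name, count in per_day.items():
    print(f"{name}: {count:,} per day")
```

Even under this simple assumption, the Facebook figure alone comes to about 6 billion posts per day.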
Before Hadoop, computation was processor-bound, that is, limited by the
processing power of the machine running the job.
Big data is defined with the help of the three V’s
1. Volume: The amount of data is increasing rapidly, through kilobytes (KB),
megabytes (MB), gigabytes (GB), terabytes (TB), petabytes (PB), exabytes (EB),
zettabytes (ZB), and yottabytes (YB).
2. Velocity: The speed at which data is generated. 90% of data was generated
in the last 10 years.
3. Variety: Structured, semi-structured, and unstructured data.
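The volume units in the list above each step up by a fixed factor. Below is a minimal sketch of converting a raw byte count into the largest fitting unit. It assumes decimal (SI) steps of 1,000 per unit (binary steps of 1,024 are also in common use), and the function name human_readable is hypothetical.

```python
# Units from the Volume entry, smallest to largest.
UNITS = ["B", "KB", "MB", "GB", "TB", "PB", "EB", "ZB", "YB"]

def human_readable(num_bytes, base=1000):
    """Express a byte count in the largest unit that keeps the value below `base`.

    base=1000 is an assumption (decimal units); pass base=1024 for binary steps.
    """
    value = float(num_bytes)
    for unit in UNITS:
        # Stop at the first unit that fits, or at the largest unit available.
        if value < base or unit == UNITS[-1]:
            return f"{value:.2f} {unit}"
        value /= base

print(human_readable(1500))      # a little over one kilobyte
print(human_readable(10 ** 15))  # one petabyte in decimal units
```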
Hadoop History
2003: Google published a paper describing GFS (Google File System), its
file system for storing data.
2004: Google published a paper describing MapReduce, its software framework
for processing data.
Yahoo worked from the white papers published by Google, and the outcome
was Hadoop.
Who is the inventor of Hadoop?
Doug Cutting is the inventor and founder of Hadoop.
Doug Cutting's young son played with a toy elephant named Hadoop, and that
name was given to the technology.
Hadoop has two components
1. HDFS (Hadoop Distributed File System): stores the huge data called big data
2. MapReduce: processes the data stored in HDFS
Hadoop is an open-source framework overseen by the Apache Software Foundation.
Hadoop processes huge amounts of data with the help of a cluster of commodity
hardware.
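The MapReduce idea can be illustrated with a word count, the usual introductory example. The sketch below is plain Python running in a single process, not the Hadoop API: each string in chunks stands in for a piece of a file stored on a different machine of the cluster, and the function names (map_phase, shuffle_phase, reduce_phase, word_count) are hypothetical.

```python
from collections import defaultdict

def map_phase(chunk):
    """Map step: emit a (word, 1) pair for every word in one chunk of text."""
    return [(word.lower(), 1) for word in chunk.split()]

def shuffle_phase(mapped_pairs):
    """Shuffle step: group all emitted values by their key (the word)."""
    grouped = defaultdict(list)
    for key, value in mapped_pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(grouped):
    """Reduce step: sum the grouped values to get one total per word."""
    return {word: sum(values) for word, values in grouped.items()}

def word_count(chunks):
    """Run map on each chunk independently, then shuffle and reduce the results."""
    mapped = []
    for chunk in chunks:  # on a real cluster, each chunk would be mapped on its own node
        mapped.extend(map_phase(chunk))
    return reduce_phase(shuffle_phase(mapped))

# Two chunks standing in for pieces of one large file.
chunks = ["big data big cluster", "data on a cluster"]
counts = word_count(chunks)
print(counts)
```

The point of the design is that the map step looks at one chunk at a time, so many chunks can be processed in parallel on ordinary machines. Only the grouped intermediate pairs need to be brought together for the reduce step.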