Hadoop is the best-known technology for Big Data. It is used by Yahoo, eBay, LinkedIn and Facebook, and was inspired by Google's publications on MapReduce, the Google File System (GoogleFS) and BigTable. Because Hadoop can be hosted on commodity hardware (typically Intel PCs running Linux, with one or two CPUs and a few terabytes of hard disk, and no RAID), it lets these companies store huge quantities of data (petabytes or more) at very low cost compared to SAN storage systems.

Table of Contents
Introduction
Common Hadoop Terms
Hype surrounding Hadoop
What is Hadoop
Basic Concept
Installation
Alternate method of Downloading and Installing Hadoop
Installing Hadoop on Mac
Fast Start
Bootstrapping
Browsing to the Services
Example program
Map Reduce
Overview
Programming Model
Map
Example
Types
More Examples
Map Reduce Execution
How Map and Reduce operations are actually carried out
Map
Combine
Reduce
HDFS
Common example operations
Listing files
How to run hadoop - map reduce jobs without a cluster?
With cloudera VM.
Trouble Shooting