Friday, November 23, 2012

Hadoop

Hadoop is an open-source project overseen by the Apache Software Foundation.
It was originally based on papers published by Google in 2003 and 2004, describing the Google File System and MapReduce.

Hadoop consists of two core components:
  • The Hadoop Distributed File System (HDFS)
  • MapReduce
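
To make the MapReduce component more concrete, here is a minimal single-process sketch of the programming model in Python, using the usual word-count example. It is not Hadoop's API: the function names (map_phase, shuffle, reduce_phase, word_count) are hypothetical, and on a real cluster the framework would run the map and reduce steps on many machines and read its input from HDFS.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group emitted values by key, as a framework would between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Combine all values seen for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence on an in-memory list of lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


if __name__ == "__main__":
    sample = ["Hadoop stores data in HDFS", "Hadoop processes data with MapReduce"]
    print(word_count(sample))
```

The point of the split is that each map call only sees one line and each reduce call only sees one key, so the work can be spread across machines without the pieces needing to talk to each other.
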
The Hadoop ecosystem includes related projects:
  • Pig, Hive, HBase, Flume, Oozie, Sqoop, etc.
Distributed systems evolved to allow developers to use multiple machines for a single job. Examples include:
  • MPI
  • PVM
  • Condor
Hadoop supports:
  • Partial Failure
  • Data Recoverability
  • Component Recovery
  • Consistency
  • Scalability
