COURSE OVERVIEW
Course Content
Big Data
- The problem space and example applications
- Why don’t traditional approaches scale?
- Requirements
Hadoop Background
- Hadoop History
- The ecosystem and stack: HDFS, MapReduce, Hive, Pig…
- Cluster architecture overview
Development Environment
- Hadoop distribution and basic commands
- Eclipse development
HDFS Introduction
- The HDFS command line and web interfaces
- The HDFS Java API (lab)
MapReduce Introduction
- Key philosophy: move computation, not data
- Core concepts: Mappers, reducers, drivers
- The MapReduce Java API (lab)
Real-World MapReduce
- Optimizing with Combiners and Partitioners (lab)
- More common algorithms: sorting, indexing and searching (lab)
- Testing with MRUnit
Higher-level Tools
- Patterns to abstract “thinking in MapReduce”
- The Cascading library (lab)
- The Hive database (lab)