Hadoop Distributed File System
The Hadoop Distributed File System (HDFS) is a sub-project of the Apache Hadoop project. This Apache Software Foundation project provides a fault-tolerant distributed file system designed to run on commodity hardware.
This file covers the following concepts:
- HDFS Architecture
- HDFS Concepts: Blocks, NameNode, Secondary NameNode, DataNode
- HDFS Federation
- HDFS High Availability
- Basic File System Operations
- Data Flow: Anatomy of a File Read, Anatomy of a File Write, Anatomy of a MapReduce Job Run