HDFS Storage Tier – Archiving to Cloud w/ S3

By default, HDFS does not distinguish between different storage types, which makes it difficult to optimize installations with heterogeneous storage devices. Since Hadoop 2.3 and the integration of HDFS-2832, HDFS supports placing block replicas on storage tiers with different durability and performance requirements.

Typically, applications need to distinguish data by its utilization temperature as HOT, WARM, or COLD, where HOT data is in high demand and accessed frequently, WARM data is requested regularly, and COLD data is rarely accessed. Especially since the rise of Spark, more and more developers demand optimized storage performance for their applications.

For now the following storage policies are supported by HDFS, with the corresponding block placement strategy for each:

  • Lazy_Persist → RAM_DISK: 1, DISK: n-1
  • All_SSD → SSD: n
  • One_SSD → SSD: 1, DISK: n-1
  • Hot (default) → DISK: n
  • Warm → DISK: 1, ARCHIVE: n-1
  • Cold → ARCHIVE: n

Here n is the replication factor, while RAM_DISK, SSD, DISK, and ARCHIVE are the different storage types supported. For details, see the Hadoop documentation on Archival Storage, SSD & Memory.
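Which policies a cluster actually offers can be listed with the hdfs storagepolicies command (the exact output varies by Hadoop version):

  hdfs storagepolicies -listPolicies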

For the NameNode to make block placements that satisfy the policy attached to the data, DataNodes additionally report the storage type of each volume with their heartbeats. On the DataNode, the storage type is configured per data directory under dfs.datanode.data.dir. A sample config could look like this:
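A minimal hdfs-site.xml sketch; the storage type is given as a prefix in square brackets and the local paths are just placeholders:

  <property>
    <name>dfs.datanode.data.dir</name>
    <!-- tag each data dir with its storage type; untagged directories default to DISK -->
    <value>[DISK]/grid/0/hdfs/data,[SSD]/grid/1/hdfs/data</value>
  </property>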

 

Additionally, storage policies need to be enabled via dfs.storage.policy.enabled (true by default).
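Should the property be turned off in your configuration, it can be switched on explicitly:

  <property>
    <name>dfs.storage.policy.enabled</name>
    <value>true</value>
  </property>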

As a simple example, in this post we are going to attach S3 storage to HDFS for cloud archival. With cloud storage an HDFS cluster can easily be extended, while sacrificing performance, which makes it an ideal archival use case.

Mount S3 to Linux

Mounting S3 on a prospective DataNode can be achieved with s3fs-fuse. First a bucket needs to be created, for example by using the AWS Management Console.
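Alternatively the bucket can be created with the AWS CLI; the bucket name and region below are just placeholders:

  aws s3 mb s3://my-hdfs-archive --region eu-central-1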

Note that DataNodes cannot share the exact same remote location; they would either need to be separated by folder or, in the case of S3, by bucket.

On CentOS 7, s3fs-fuse can be installed like this:
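One way to do this (a sketch) is via the EPEL repository; building s3fs-fuse from source is an alternative if the package is not available:

  sudo yum install -y epel-release
  sudo yum install -y s3fs-fuse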

Store the S3 credentials in a password file:
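s3fs-fuse expects a file containing ACCESS_KEY_ID:SECRET_ACCESS_KEY that is readable only by its owner; a sketch for the hdfs user (adjust keys and home directory to your environment):

  echo "ACCESS_KEY_ID:SECRET_ACCESS_KEY" > /home/hdfs/.passwd-s3fs
  chmod 600 /home/hdfs/.passwd-s3fs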

With s3fs-fuse installed, we can mount the bucket into the OS. Here we choose to use the directory /gridc:
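A sketch of the mount, assuming the bucket my-hdfs-archive and the password file from above; bucket name, paths, and group are placeholders:

  mkdir /gridc
  chown hdfs:hadoop /gridc   # group name is an assumption, adjust to your setup
  su - hdfs -c "s3fs my-hdfs-archive /gridc -o passwd_file=/home/hdfs/.passwd-s3fs"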

Execute the s3fs mount as the hdfs user, so the /gridc dir will be owned by hdfs and not root.

If you encounter an exception, running s3fs in the foreground with verbose logging can help to pinpoint the problem, for example:
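  # same mount as above, run in the foreground with verbose s3fs and curl logging
  su - hdfs -c "s3fs my-hdfs-archive /gridc -o passwd_file=/home/hdfs/.passwd-s3fs -f -o dbglevel=info -o curldbg"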

HDFS Storage Tier

With the folder mounted to S3 we can use it to attach an ARCHIVE storage type to our DataNode. Under dfs.datanode.data.dir we will configure the directory /gridc as our archive storage. The configuration could look like this:
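A sketch of the adjusted hdfs-site.xml entry; the DISK directory stands in for whatever data dirs the DataNode already uses:

  <property>
    <name>dfs.datanode.data.dir</name>
    <value>[DISK]/grid/0/hdfs/data,[ARCHIVE]/gridc</value>
  </property>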

 

To demonstrate the functionality we move the data under /apps/hdp into our S3 archive:
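One way to do this, sketched here, is to tag the path with the COLD policy so that all replicas belong on ARCHIVE storage:

  hdfs storagepolicies -setStoragePolicy -path /apps/hdp -policy COLD

Setting the policy alone does not relocate blocks that already exist; for that the HDFS Mover has to be run.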

Similar to the HDFS Balancer, the Mover periodically scans the blocks to make sure that their placement eventually satisfies the required storage policies:
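A sketch of invoking the Mover for just the path in question (without -p it scans the whole namespace):

  hdfs mover -p /apps/hdp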

The result can be viewed in the AWS Console.
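Alternatively it can be checked from the command line; the bucket name is again a placeholder:

  aws s3 ls s3://my-hdfs-archive --recursive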

