HttpFS gateway is the preferred way of accessing the Hadoop filesystem using HTTP clients like curl. Additionally it can be used from from theĀ hadoop fs command line tool ultimately being a replacement for the hftp protocol. HttpFS, unlike HDFS Proxy, has full support for all file operations with additional support for authentication. Given it’s stateless protocol it is ideal to scale out Hadoop filesystem access using HTTP clients.
In this post I would like to show how to install and setup a HttpFS gateway on a secure and kerberized cluster. By providing some troubleshooting topics, this post should also help you, when running into problems while installing the gateway. Continue reading “Installing HttpFS Gateway on a Kerberized Cluster”