Distcp between two HA Clusters

With HDFS High Availability, two nodes can act as the NameNode of the system, but not at the same time. Only one node is the active NameNode at any point in time, while the other node is in standby state. The standby node acts as a slave, preserving enough state to take over immediately when the active node dies. This is what distinguishes it from the older SecondaryNameNode, which was not able to take over immediately.

From a client perspective, the most confusing part is how the active NameNode is discovered, and how HDFS High Availability is configured on the client side. In this post we look at how to distribute data between two clusters running in HA mode, for example with distcp.

Which node is currently the active NameNode can be looked up in ZooKeeper, where the active node is selected through leader election. Alternatively, the active NameNode can be looked up through the Ambari management console.
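For example, the state of each NameNode can be queried with the hdfs haadmin tool, or read from the znode the failover controller maintains in ZooKeeper. A rough sketch, assuming NameNode ids nn101 and nn102 for the nameservice serviceId1 (matching the configuration further below) and a ZooKeeper node reachable at zk1:2181, all of them placeholders:

$ hdfs haadmin -getServiceState nn101
active
$ hdfs haadmin -getServiceState nn102
standby
$ # the elected active NameNode is also recorded under the nameservice znode
$ zkCli.sh -server zk1:2181 get /hadoop-ha/serviceId1/ActiveBreadCrumb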

Obviously the easiest way to use distcp between two HA clusters would be to identify the current active NameNode and run distcp like you would with two clusters without HA:

hadoop distcp hdfs://active1:8020/path hdfs://active2:8020/path

The alternative is to configure the client with both service ids and make it aware of how to identify the active NameNode of each cluster. For this you define a custom configuration that is only used for distcp. Both the hdfs and the hadoop client accept a configuration directory, as the usage output and the sketch below show:

$ hdfs
Usage: hdfs [--config confdir] COMMAND
where COMMAND is one of:
dfs run a filesystem command on the file systems supported in Hadoop.

$ hadoop
Usage: hadoop [--config confdir] COMMAND
       where COMMAND is one of:
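As a sketch of pointing the clients at a custom configuration directory (here /tmp/distcp-conf, an arbitrary placeholder path):

$ # copy the existing client configuration so it can be adapted without touching /etc/hadoop
$ cp -r /etc/hadoop/conf /tmp/distcp-conf
$ # any hdfs or hadoop command will then read the copied configuration
$ hdfs --config /tmp/distcp-conf dfs -ls /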

The hdfs client thus lets you define a directory in which its configuration files are located, so instead of changing the configuration files under /etc/hadoop they can be copied to some other location, as above, and adapted for this purpose. To make the client aware of both service ids, the copied hdfs-site.xml would have to be adapted along the following lines, assuming cluster 1 is configured with serviceId1 and the other with serviceId2 (a distcp run against this configuration is sketched after the listing):

<configuration>
  
  <!-- services -->
  <property>
    <name>dfs.nameservices</name>  
    <value>serviceId1,serviceId2</value>
  </property>
  
  <!-- serviceId2 properties --> 
  <property>   
    <name>dfs.client.failover.proxy.provider.serviceId2</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
  </property> 
  <property> 
    <name>dfs.ha.namenodes.serviceId2</name>    
    <value>nn201,nn202</value> 
  </property>
  <property> 
    <name>dfs.namenode.rpc-address.serviceId2.nn201</name> 
    <value>nn201.pro.net:8020</value>
  </property> 
  <property> 
    <name>dfs.namenode.servicerpc-address.serviceId2.nn201</name> 
    <value>nn201.pro.net:54321</value> 
  </property>
  <property>
    <name>dfs.namenode.http-address.serviceId2.nn201</name> 
    <value>nn201.pro.net:50070</value>
  </property>
  <property> 
    <name>dfs.namenode.https-address.serviceId2.nn201</name> 
    <value>nn201.pro.net:50470</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.serviceId2.nn202</name> 
    <value>nn202.pro.net:8020</value>
  </property>
  <property>
    <name>dfs.namenode.servicerpc-address.serviceId2.nn202</name> 
    <value>nn202.pro.net:54321</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.serviceId2.nn202</name>
    <value>nn202.pro.net:50070</value>
  </property>
  <property> 
    <name>dfs.namenode.https-address.serviceId2.nn202</name>  
    <value>nn202.pro.net:50470</value>
  </property>

  <!-- serviceId1 properties -->
  <property>
    <name>dfs.client.failover.proxy.provider.serviceId1</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
  </property>
  <property>
    <name>dfs.ha.namenodes.serviceId1</name>
    <value>nn101,nn102</value>
  </property>
  <property>   
    <name>dfs.namenode.rpc-address.serviceId1.nn101</name> 
    <value>nn101.poc.net:8020</value>
  </property>
  <property>
    <name>dfs.namenode.servicerpc-address.serviceId1.nn101</name>   
    <value>nn101.poc.net:54321</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.serviceId1.nn101</name> 
    <value>nn101.poc.net:50070</value>
  </property>
  <property> 
    <name>dfs.namenode.https-address.serviceId1.nn101</name> 
    <value>nn101.poc.net:50470</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.serviceId1.nn102</name>
    <value>nn102.poc.net:8020</value>
  </property>
  <property>
    <name>dfs.namenode.servicerpc-address.serviceId1.nn102</name> 
    <value>nn102.poc.net:54321</value>
  </property>
  <property> 
    <name>dfs.namenode.http-address.serviceId1.nn102</name> 
     <value>nn102.poc.net:50070</value>
  </property>
  <property>
    <name>dfs.namenode.https-address.serviceId1.nn102</name> 
    <value>nn102.poc.net:50470</value>
  </property>
  …  
</configuration>
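With both nameservices declared in the copied hdfs-site.xml, distcp can address each cluster by its service id and let the configured failover proxy provider resolve the active NameNode on both sides. A sketch, again with /tmp/distcp-conf and the paths as placeholders:

$ hadoop --config /tmp/distcp-conf distcp hdfs://serviceId1/path hdfs://serviceId2/path

Since distcp runs as a MapReduce job, the copied directory also needs the core-site.xml and the YARN/MapReduce configuration of the cluster the job is submitted from.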

5 thoughts on “Distcp between two HA Clusters”

  1. Hi Kropp,

    Thanks for the blog. I learnt a lot about distcp between two HA clusters. When I tried to implement it, I was not able to initiate the distcp job. The job fails with the following error.

    16/03/14 04:22:17 ERROR tools.DistCp: Exception encountered
    java.io.IOException: Failed to run job : Failed to renew token: Kind: HDFS_DELEGATION_TOKEN, Service: ha-hdfs:cluster2-nameservice, Ident: (HDFS_DELEGATION_TOKEN token 189495 for hdfs)
    at org.apache.hadoop.mapred.YARNRunner.submitJob(YARNRunner.java:301)
    at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:536)
    at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1306)
    at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1303)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:1303)
    at org.apache.hadoop.tools.DistCp.execute(DistCp.java:162)
    at org.apache.hadoop.tools.DistCp.run(DistCp.java:121)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at org.apache.hadoop.tools.DistCp.main(DistCp.java:401)

    We have two clusters cluster1-nameservice and cluster2-nameservice. Both the clusters have HA configured and are secured.

    Thanks
    Guru


    1. I am not sure. Are both clusters using the same REALM/AD domain? Are you executing the copy with the hdfs user? On which cluster do you execute the copy, and with which user credentials?


  2. Hi Kropp

    Thanks so much for a good article about distcp between 2 HA clusters. We have a requirement to move data periodically between 2 CDH 5.X.X clusters which are both secure, HA, and in the same Realm.
    I had some follow-up questions.
    1. Quoting from your post: “Obviously the easiest way to use distcp between two HA clusters would be to identify the current active NameNode and run distcp like you would with two clusters without HA”. If, using a bash script, we know the current active namenodes for the local and remote clusters, would just the single distcp statement suffice (eliminating the need to build the merged config file)?

    2. Once distcp is done using the hdfs:// protocol between the local and remote cluster, how can we secure the network traffic between the clusters while distcp is copying the files?

    Thank you so much again for a very helpful article.

    Regards
    VK

