Connecting Livy to a Secured Kerberized HDP Cluster

Livy.io is a proxy service for Apache Spark that allows an existing remote SparkContext to be reused by different users. By sharing the same context, Livy provides a multi-tenant experience in which users can share RDDs and YARN cluster resources effectively.

In summary, Livy uses an RPC architecture to extend the created SparkContext with an RPC service. Through this extension the existing context can be controlled and shared remotely by other users. On top of this, Livy adds authorization and enhanced session management.

(Figure: Livy architecture)

Analytic applications like Zeppelin can use Livy to offer multi-tenant spark access in a controlled manner.

This post discusses setting up Livy with a secured HDP cluster.

As a long-running service, one of the requirements for connecting Livy to a secured HDP cluster is the existence of a service principal. The keytab of this service principal has to be readable by the livy user, as does the keytab of the hive principal, which is needed for the HiveContext.
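A minimal sketch of creating such a principal and exporting its keytab with MIT Kerberos kadmin; the principal and keytab names match the configuration below, but the realm, paths, group ownership and the exact kadmin invocation are assumptions that depend on your KDC setup:

# create the Livy service principal and export it to a keytab (example realm HDP.CORP)
$ kadmin.local -q "addprinc -randkey livy/node1.hdp@HDP.CORP"
$ kadmin.local -q "xst -k /etc/security/keytabs/livy.service.keytab livy/node1.hdp@HDP.CORP"
# make the keytab readable by the livy user only
$ chown livy:hadoop /etc/security/keytabs/livy.service.keytab
$ chmod 400 /etc/security/keytabs/livy.service.keytab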

Livy requires this service principal to be configured through a couple of parameters, namely livy.server.launch.kerberos.[principal|keytab] and livy.server.auth.kerberos.[principal|keytab]. In addition, livy.server.auth.type needs to be set to kerberos.

livy.impersonation.enabled = true
livy.server.auth.type = kerberos
livy.server.launch.kerberos.principal = livy/node1.hdp@HDP.CORP
livy.server.launch.kerberos.keytab = /etc/security/keytabs/livy.service.keytab
livy.server.auth.kerberos.principal = HTTP/node1.hdp@HDP.CORP
livy.server.auth.kerberos.keytab = /etc/security/keytabs/spnego.service.keytab

Setting livy.server.auth.type to kerberos also enables authentication for the Livy server endpoint itself, so clients need to authenticate as well. To configure Zeppelin with authentication for Livy, for example, you need to set the following in the interpreter settings:

"zeppelin.livy.principal": "zeppelin/node1.hdp@HDP.CORP",
"zeppelin.livy.keytab": "/etc/security/keytabs/zeppelin.service.keytab",

The launch parameters are used during startup of the Livy server, which is started with the following environment settings:

export SPARK_HOME=/usr/hdp/current/spark-client
export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk
export PATH=/usr/lib/jvm/java-1.8.0-openjdk/bin:$PATH
export HADOOP_CONF_DIR=/etc/hadoop/conf
export LIVY_SERVER_JAVA_OPTS="-Xmx2g"

A kinit is not required with Livy 0.3, which is the version used here.

With Livy 0.2 it is required to kinit as the livy user before starting the web service:

$ su livy
$ kinit -kt /etc/security/keytabs/livy.service.keytab livy/node1.hdp@HDP.CORP
$ bin/livy-server start

Authorization

With authentication enabled, setting up authorization will likely be required as well. For this, Livy provides access control settings that determine which users have access to its resources.

livy.server.access_control.enabled = true
livy.server.access_control.users = livy,zeppelin

Further, for services like Zeppelin, impersonation settings are required. In order for the zeppelin user to be able to impersonate other users, it has to be a superuser.

livy.superusers=zeppelin
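As a sketch of what impersonation then looks like, a superuser such as zeppelin can request a session on behalf of another user via the proxyUser field of the session request; the user name alice is a placeholder, and host and port are the same assumptions as above:

# create an interactive session that runs as the impersonated user
$ kinit -kt /etc/security/keytabs/zeppelin.service.keytab zeppelin/node1.hdp@HDP.CORP
$ curl --negotiate -u : -X POST \
    -H 'Content-Type: application/json' \
    -d '{"kind": "spark", "proxyUser": "alice"}' \
    http://node1.hdp:8998/sessions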

HiveContext

If you have issues with the HiveContext being set up and see exceptions similar to this:

INFORMATION: 16/11/05 18:20:35 INFO metastore: Trying to connect to metastore with URI thrift://node1.hdp:9083
INFORMATION: 16/11/05 18:20:35 ERROR TSaslTransport: SASL negotiation failure
INFORMATION: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
INFORMATION:    at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:211)
INFORMATION:    at org.apache.thrift.transport.TSaslClientTransport.handleSaslStartMessage(TSaslClientTransport.java:94)
INFORMATION:    at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:271)
INFORMATION:    at org.apache.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:37)
INFORMATION:    at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:52)
INFORMATION:    at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:49)
INFORMATION:    at java.security.AccessController.doPrivileged(Native Method)
INFORMATION:    at javax.security.auth.Subject.doAs(Subject.java:422)
INFORMATION:    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
INFORMATION:    at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport.open(TUGIAssumingTransport.java:49)
INFORMATION:    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(HiveMetaStoreClient.java:420)
INFORMATION:    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:236)
INFORMATION:    at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.<init>(SessionHiveMetaStoreClient.java:74)
INFORMATION:    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
INFORMATION:    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
INFORMATION:    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
INFORMATION:    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
INFORMATION:    at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1521)
INFORMATION:    at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:86)
INFORMATION:    at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:132)
INFORMATION:    at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:104)
INFORMATION:    at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:3005)
INFORMATION:    at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:3024)
INFORMATION:    at org.apache.hadoop.hive.ql.metadata.Hive.getAllDatabases(Hive.java:1234)
INFORMATION:    at org.apache.hadoop.hive.ql.metadata.Hive.reloadFunctions(Hive.java:174)
INFORMATION:    at org.apache.hadoop.hive.ql.metadata.Hive.<clinit>(Hive.java:166)
INFORMATION:    at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:503)
INFORMATION:    at org.apache.spark.sql.hive.client.ClientWrapper.<init>(ClientWrapper.scala:204)
INFORMATION:    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
INFORMATION:    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
INFORMATION:    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
INFORMATION:    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
INFORMATION:    at org.apache.spark.sql.hive.client.IsolatedClientLoader.createClient(IsolatedClientLoader.scala:249)
INFORMATION:    at org.apache.spark.sql.hive.HiveContext.metadataHive$lzycompute(HiveContext.scala:345)
INFORMATION:    at org.apache.spark.sql.hive.HiveContext.metadataHive(HiveContext.scala:255)
INFORMATION:    at org.apache.spark.sql.hive.HiveContext.setConf(HiveContext.scala:459)
INFORMATION:    at org.apache.spark.sql.hive.HiveContext.defaultOverrides(HiveContext.scala:233)
INFORMATION:    at org.apache.spark.sql.hive.HiveContext.<init>(HiveContext.scala:236)
INFORMATION:    at org.apache.spark.sql.hive.HiveContext.<init>(HiveContext.scala:101)
INFORMATION:    at com.cloudera.livy.repl.SparkInterpreter$$anonfun$start$1.apply(SparkInterpreter.scala:95)
INFORMATION:    at com.cloudera.livy.repl.SparkInterpreter$$anonfun$start$1.apply(SparkInterpreter.scala:82)
INFORMATION:    at com.cloudera.livy.repl.SparkInterpreter.restoreContextClassLoader(SparkInterpreter.scala:305)
INFORMATION:    at com.cloudera.livy.repl.SparkInterpreter.start(SparkInterpreter.scala:82)
INFORMATION:    at com.cloudera.livy.repl.Session$$anonfun$1.apply(Session.scala:59)
INFORMATION:    at com.cloudera.livy.repl.Session$$anonfun$1.apply(Session.scala:57)
INFORMATION:    at scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
INFORMATION:    at scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
INFORMATION:    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
INFORMATION:    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
INFORMATION:    at java.lang.Thread.run(Thread.java:745)
INFORMATION: Caused by: GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)
INFORMATION:    at sun.security.jgss.krb5.Krb5InitCredential.getInstance(Krb5InitCredential.java:147)
INFORMATION:    at sun.security.jgss.krb5.Krb5MechFactory.getCredentialElement(Krb5MechFactory.java:122)
INFORMATION:    at sun.security.jgss.krb5.Krb5MechFactory.getMechanismContext(Krb5MechFactory.java:187)
INFORMATION:    at sun.security.jgss.GSSManagerImpl.getMechanismContext(GSSManagerImpl.java:224)
INFORMATION:    at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:212)
INFORMATION:    at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:179)
INFORMATION:    at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:192)
INFORMATION:    ... 49 more
INFORMATION: 16/11/05 18:20:35 WARN metastore: Failed to connect to the MetaStore Server...
INFORMATION: 16/11/05 18:20:35 INFO metastore: Waiting 1 seconds before next connection attempt.

You can either remove the hive-site.xml under /usr/hdp/current/spark-client/conf or make sure it is the correct one. Just copy it from /etc/hive/conf/:

$ cp /etc/hive/conf/hive-site.xml /usr/hdp/current/spark-client/conf/
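To check whether the metastore connection and the HiveContext come up correctly after the copy, you can run a short statement in an interactive session. A minimal sketch against the REST API, with host, port and session id 0 assumed:

# run a statement in an existing interactive session (id 0) to exercise the HiveContext
$ curl --negotiate -u : -X POST \
    -H 'Content-Type: application/json' \
    -d '{"code": "sqlContext.sql(\"show databases\").show()"}' \
    http://node1.hdp:8998/sessions/0/statements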

It should also be possible to disable the HiveContext completely by setting livy.repl.enableHiveContext to false.

livy.repl.enableHiveContext = false


12 thoughts on “Connecting Livy to a Secured Kerberized HDP Cluster”

  1. How does Livy proxy the user? Per task? Do you know how quotas are assigned to users, like how do you stop one Livy user from using all of the resources available to the Executors?


  2. Thanks for this post. I am trying to set up Zeppelin/Livy/Spark. All of these are running on the same machine. My end goal is to be able to run Zeppelin/Livy/Spark with impersonation. So far, I have successfully configured Zeppelin with Spark. However, I want to use multi-tenancy, and for that I wanted to configure Zeppelin with Livy and Spark.

    For Livy, I provided the following two paths
    export SPARK_HOME=/opt/cloudera/parcels/CDH/lib/spark
    export HADOOP_CONF_DIR=/etc/hadoop/conf

    With the above settings, I can run the following command successfully in Zeppelin:
    %livy.spark
    sc.version

    However, the following command fails:
    %livy.sql
    select * from myDB.table1

    I see the following error:
    <console>:14: error: not found: value sqlContext
    sqlContext.sql("select * from datalake.combination2").show(1000)

    I have not enabled Shiro authentication for Zeppelin yet. My assumption was that Livy would log into Spark using the default user as I provide the Spark home directory. Could you please point out how can I fix the above issue?


    1. Hi, thanks for the post. I can start the Livy server with Kerberos enabled. But when I do a requests.post(host, headers, data) it throws an Authentication required error. Any help would be useful here.


      1. There are two parts to this: either you need SPNEGO auth towards Livy, or Livy's auth to YARN/Hadoop is failing.
        The error logs would be useful to help.
        Second, kinit and use: curl --negotiate -u :
        Search for curl SPNEGO for details.


      2. Hi,
        SPNEGO auth seems to be working fine on the server (I presume).

        I start Livy after setting the following parameters
        livy.impersonation.enabled = true
        livy.server.auth.type = kerberos
        livy.server.launch.kerberos.principal = @.COM
        livy.server.launch.kerberos.keytab = /home/pathtokeytab/.keytab
        livy.server.auth.kerberos.principal = HTTP/@.COM
        livy.server.auth.kerberos.keytab = /home/pathtokeytab/.keytab

        The server logs give me this output:

        17/02/02 23:18:50 INFO StateStore$: Using BlackholeStateStore for recovery.
        17/02/02 23:18:50 INFO BatchSessionManager: Recovered 0 batch sessions. Next session id: 0
        17/02/02 23:18:50 INFO InteractiveSessionManager: Recovered 0 interactive sessions. Next session id: 0
        17/02/02 23:18:50 INFO LivyServer: SPNEGO auth enabled (principal = HTTP/@.COM)
        17/02/02 23:18:51 INFO KerberosAuthenticationHandler: Login using keytab /home/pathtokeytab/.keytab, for principal HTTP/@.COM
        17/02/02 23:18:51 WARN RequestLogHandler: !RequestLog
        17/02/02 23:18:51 INFO WebServer: Starting server on http://:8998

        This is running in the US data center. I believe this means that Livy has started successfully with Kerberos.

        Now from the client machine in Singapore I run the commands below:

        c:\FAST\Python2.7.12>python
        Python 2.7.12 (v2.7.12:d33e0cf91556, Jun 27 2016, 15:24:40) [MSC v.1500 64 bit (AMD64)] on win32
        Type "help", "copyright", "credits" or "license" for more information.
        >>> import requests
        >>> import json
        >>> from requests_kerberos import HTTPKerberosAuth, REQUIRED
        >>> headers = {'Content-Type': 'application/json'}
        >>> krb = HTTPKerberosAuth(mutual_authentication=REQUIRED, sanitize_mutual_error_response=False)
        >>> r = requests.post('http://:8998/sessions', headers=headers, auth=krb)
        >>> r.raise_for_status()
        Traceback (most recent call last):
          File "<stdin>", line 1, in <module>
          File "c:\FAST\Python2.7.12\lib\site-packages\requests-2.11.1-py2.7.egg\requests\models.py", line 883, in raise_for_status
            raise HTTPError(http_error_msg, response=self)
        requests.exceptions.HTTPError: 403 Client Error: GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos credentails) for url: http://:8998/sessions
        >>>


      3. Hi, very interesting. I can’t remember if I ever did this with Python like that.

        Your issue is “pretty straightforward” as you can see in the error message:
        “403 Client Error: GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos credentails) for url: http://:8998/sessions”

        You are having issues obtaining proper Kerberos credentials on your machine. Now this again could be related to multiple aspects around your current setup.

        The most common reason people get this error is improper access rights on the keytab, so that the executing user is not able to read it.

        In your case, where is the KDC Realm? In US? Can you access it from your location? What are your krb5 confs on your local machine? Are you sharing the same Realm?

        Put simply, the authentication on your local machine is not working properly. Try to enable debug logs for Kerberos, and can you try to curl from a machine in the US DC?
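        A minimal sketch of enabling that debug output, assuming a Linux client with MIT Kerberos for the curl test and the Livy start environment from the post for the server side; the host is a placeholder:

        # client side: trace the Kerberos library while curl negotiates
        $ KRB5_TRACE=/dev/stderr curl --negotiate -u : http://<livy-host>:8998/sessions
        # server side: add JVM Kerberos debugging before starting the Livy server
        $ export LIVY_SERVER_JAVA_OPTS="-Xmx2g -Dsun.security.krb5.debug=true"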


      4. Hi,
        The KDC realm is in the US.
        I do have an account in the US domain where the hadoop servers reside, and if i try to do a kinit from my local machine with the full path which incudes the KDC realm of the US domain it does generate the kerberos cache but I cant seem to figure out where the Keytab file is. I cant seem to find the Keytab file that resides on the US domain too qhen i run the kinit there. I have raised this with my engg team. curl command with negotiate fails for me with the 401 authentication required error. Funnily if i run the same command on the browser it goes through, engg thinks its because the browsers authenticate to the US domain as opposed to the curl command which authenticates to the Asia domain in the firm
        But whats really perplexing is if i use the hdfs.ext.kerberos python library and use kerberos auth it authentcates correctly.
        Nevertheless thanks for your original post and your immediate responses to my comments. Ill keep you posted on what engg comes back with


    1. Please note the Livy conf settings:

      livy.impersonation.enabled = true
      livy.server.auth.type = kerberos
      livy.server.launch.kerberos.principal = @domain.COM
      livy.server.launch.kerberos.keytab = /home/rc/.keytab
      livy.server.auth.kerberos.principal = HTTP/@domain.COMM
      livy.server.auth.kerberos.keytab = /home/rc/.keytab


      1. Hi,

        So the Kerberos issue got resolved.
        I was using the wrong SPNEGO principal and keytab. Once I got that regenerated and applied the Kerberos auth name rules correctly to strip my domain name (SGP domain) before the Kerberos auth is called, it worked fine and authenticated successfully.

        The issue now is that if I turn on impersonation, it fails with the error “User not authorised” in Livy.


  3. Hey. Great post about Livy and Kerberos.
    I have a kerberized EMR cluster running on AWS that comes with most settings around Kerberos and Livy pre-configured.

    I created a principal called “dataengineering” and I can programmatically hit Livy if:
    – I ssh to the server
    – kinit dataengineering
    – then call a python script passing the dataengineering principal.

    However, I am struggling to access the Livy UI through the browser. I get an “HTTP ERROR: 403”.

    output from “/var/log/livy/livy-livy-server.out”
    WARN AuthenticationFilter: AuthenticationToken ignored: Unauthorized access

    Livy configuration file (/usr/lib/livy/conf/livy.conf)

    livy.impersonation.enabled = true
    livy.superusers = dataengineering,livy,HTTP
    livy.server.auth.type = kerberos
    livy.server.launch.kerberos.principal = livy/@EC2.INTERNAL
    livy.server.launch.kerberos.keytab = /etc/livy.keytab
    livy.server.auth.kerberos.principal = HTTP/@EC2.INTERNAL
    livy.server.auth.kerberos.keytab = /etc/livy.keytab

    EMR comes with entries for both HTTP and livy on “/etc/livy.keytab”

    I am unsure how the browser handles it, and as per the configuration above plus the logs I believe a principal called “HTTP” is used. I can see this line in the logs: LivyServer: SPNEGO auth enabled (principal = HTTP/@EC2.INTERNAL).

    Have you ever faced this issue?

