Livy.io is a proxy service for Apache Spark that allows an existing remote SparkContext to be reused by different users. By sharing the same context, Livy provides a multi-tenant experience in which users can share RDDs and YARN cluster resources effectively.
In short, Livy uses an RPC architecture to extend the created SparkContext with an RPC service. Through this extension the existing context can be controlled and shared remotely by other users. On top of this, Livy adds authorization together with enhanced session management.
Analytic applications like Zeppelin can use Livy to offer multi-tenant spark access in a controlled manner.
This post discusses setting up Livy with a secured HDP cluster.
As a long-running service, one of the requirements for connecting Livy to a secured HDP cluster is the existence of a service principal. The keytab of this service principal has to be readable by the livy user; the same holds for the hive principal needed for the HiveContext.
Livy requires this service principal to be configured with a couple of different parameters, namely livy.server.launch.kerberos.[principal|keytab] and livy.server.auth.kerberos.[principal|keytab]. In addition, livy.server.auth.type needs to be set to kerberos.
livy.impersonation.enabled = true
livy.server.auth.type = kerberos
livy.server.launch.kerberos.principal = livy/node1.hdp@HDP.CORP
livy.server.launch.kerberos.keytab = /etc/security/keytabs/livy.service.keytab
livy.server.auth.kerberos.principal = HTTP/node1.hdp@HDP.CORP
livy.server.auth.kerberos.keytab = /etc/security/keytabs/spnego.service.keytab
Setting livy.server.auth.type also enables authentication for the Livy server itself. For example, to configure Zeppelin to authenticate against Livy, you need to set the following in the interpreter settings:
"zeppelin.livy.principal": "zeppelin/node1.hdp@HDP.CORP", "zeppelin.livy.keytab": "/etc/security/keytabs/zeppelin.service.keytab",
The following environment settings are used during startup:
export SPARK_HOME=/usr/hdp/current/spark-client
export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk
export PATH=/usr/lib/jvm/java-1.8.0-openjdk/bin:$PATH
export HADOOP_CONF_DIR=/etc/hadoop/conf
export LIVY_SERVER_JAVA_OPTS="-Xmx2g"
A kinit is not required with Livy 0.3, which is the version used here.
With Livy 0.2 it was required to kinit as the livy user before starting the web service:
$ su livy
$ kinit -kt /etc/security/keytabs/livy.service.keytab livy/node1.hdp@HDP.CORP
$ bin/livy-server start
Authorization
With authentication enabled, setting up authorization will likely be required as well. For this, Livy provides access control settings that control which users have access to its resources.
livy.server.access_control.enabled = true
livy.server.access_control.users = livy,zeppelin
Further, for services like Zeppelin, impersonation settings are required. In order for the zeppelin user to be able to impersonate other users, it has to be configured as a superuser:
livy.superusers=zeppelin
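With impersonation enabled and zeppelin registered as a superuser, a client can ask Livy to run a session on behalf of another user by passing proxyUser when creating the session. A minimal sketch of the request body follows; the host name and the impersonated user "alice" are placeholders, not part of this setup:

```python
import json

# Placeholder endpoint; in this setup it would be the Livy host, e.g. node1.hdp:8998.
sessions_url = "http://node1.hdp:8998/sessions"

# Body for POST /sessions: with livy.impersonation.enabled = true and the
# authenticated caller listed in livy.superusers, Livy launches the Spark
# session as the proxyUser (via spark-submit's --proxy-user mechanism).
payload = {
    "kind": "spark",       # interpreter kind: spark, pyspark, sparkr, or sql
    "proxyUser": "alice",  # hypothetical end user to impersonate
}

body = json.dumps(payload)
print(body)
```

The authenticated caller (e.g. zeppelin) still has to pass SPNEGO authentication; proxyUser only changes which identity the Spark job runs under on YARN.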
HiveContext
If you have issues with the HiveContext being set up and see exceptions similar to this:
16/11/05 18:20:35 INFO metastore: Trying to connect to metastore with URI thrift://node1.hdp:9083
16/11/05 18:20:35 ERROR TSaslTransport: SASL negotiation failure
javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
	at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:211)
	at org.apache.thrift.transport.TSaslClientTransport.handleSaslStartMessage(TSaslClientTransport.java:94)
	at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:271)
	at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport.open(TUGIAssumingTransport.java:49)
	at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(HiveMetaStoreClient.java:420)
	...
	at org.apache.spark.sql.hive.client.ClientWrapper.<init>(ClientWrapper.scala:204)
	at org.apache.spark.sql.hive.HiveContext.<init>(HiveContext.scala:101)
	at com.cloudera.livy.repl.SparkInterpreter$$anonfun$start$1.apply(SparkInterpreter.scala:95)
	at com.cloudera.livy.repl.SparkInterpreter.start(SparkInterpreter.scala:82)
	...
Caused by: GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)
	at sun.security.jgss.krb5.Krb5InitCredential.getInstance(Krb5InitCredential.java:147)
	... 49 more
16/11/05 18:20:35 WARN metastore: Failed to connect to the MetaStore Server...
16/11/05 18:20:35 INFO metastore: Waiting 1 seconds before next connection attempt.
you can either remove the hive-site.xml under /usr/hdp/current/spark-client/conf or make sure it is the correct one. Just copy it from /etc/hive/conf/:
$ cp /etc/hive/conf/hive-site.xml /usr/hdp/current/spark-client/conf/
It should also be possible to disable the HiveContext completely by setting livy.repl.enableHiveContext to false.
livy.repl.enableHiveContext = false
How does Livy proxy the user? Per task? Do you know how quotas are assigned to users, like how do you stop one Livy user from using all of the resources available to the Executors?
Hi and thanks for your feedback.
Livy uses SparkSubmit with
--proxy-user NAME
to proxy users.
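For context, the per-session submission Livy issues looks roughly like the fragment below. This is an illustrative sketch, not Livy's literal command line, and "alice" is a hypothetical impersonated user:

```shell
# Illustrative only: how --proxy-user appears in a YARN submission.
$SPARK_HOME/bin/spark-submit \
  --master yarn \
  --proxy-user alice \
  ...
```

YARN-side quotas are then a matter of the queue the proxied user submits to, not something Livy itself enforces.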
Thanks for this post. I am trying to set up Zeppelin/Livy/Spark. All of these are running on the same machine. My end goal is to be able to run Zeppelin/Livy/Spark with impersonation. So far, I have successfully configured Zeppelin with Spark. However, I want to use multi-tenancy, and for that I wanted to configure Zeppelin with Livy and Spark.
For Livy, I provided the following two paths
export SPARK_HOME=/opt/cloudera/parcels/CDH/lib/spark
export HADOOP_CONF_DIR=/etc/hadoop/conf
With the above settings, I can run the following command successfully in Zeppelin:
%livy.spark
sc.version
However, the following command fails:
%livy.sql
select * from myDB.table1
I see the following error:
:14: error: not found: value sqlContext
sqlContext.sql("select * from datalake.combination2").show(1000)
I have not enabled Shiro authentication for Zeppelin yet. My assumption was that Livy would log into Spark using the default user as I provide the Spark home directory. Could you please point out how can I fix the above issue?
Hi, thanks for the post. I can start the Livy server with Kerberos enabled. But when I do a requests.post(host, headers, data) it throws an "Authentication required" error. Any help would be useful here.
There are two parts to this: either you need SPNEGO auth against Livy, or Livy's auth to YARN/Hadoop is failing.
The error logs would be useful to help.
Second, kinit and use: curl --negotiate -u :
Search for curl SPNEGO for details.
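As a concrete example, a SPNEGO-authenticated request against Livy's REST API could look like the fragment below; the host name is illustrative, and a valid TGT from kinit is required first:

```shell
# Obtain a ticket for the calling user first, e.g.:
#   kinit -kt /etc/security/keytabs/zeppelin.service.keytab zeppelin/node1.hdp@HDP.CORP

# "-u :" makes curl take the identity from the Kerberos ticket cache.
curl --negotiate -u : \
     -H 'Content-Type: application/json' \
     -X POST -d '{"kind": "spark"}' \
     http://node1.hdp:8998/sessions
```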
Hi,
SPNEGO auth seems to be working fine on the server (I presume).
I start Livy after setting the following parameters
livy.impersonation.enabled = true
livy.server.auth.type = kerberos
livy.server.launch.kerberos.principal = @.COM
livy.server.launch.kerberos.keytab = /home/pathtokeytab/.keytab
livy.server.auth.kerberos.principal = HTTP/@.COM
livy.server.auth.kerberos.keytab = /home/pathtokeytab/.keytab
The server logs give me this output:
17/02/02 23:18:50 INFO StateStore$: Using BlackholeStateStore for recovery.
17/02/02 23:18:50 INFO BatchSessionManager: Recovered 0 batch sessions. Next session id: 0
17/02/02 23:18:50 INFO InteractiveSessionManager: Recovered 0 interactive sessions. Next session id: 0
17/02/02 23:18:50 INFO LivyServer: SPNEGO auth enabled (principal = HTTP/@.COM)
17/02/02 23:18:51 INFO KerberosAuthenticationHandler: Login using keytab /home/pathtokeytab/.keytab, for principal HTTP/@.COM
17/02/02 23:18:51 WARN RequestLogHandler: !RequestLog
17/02/02 23:18:51 INFO WebServer: Starting server on http://:8998
This is running in the US data center. I believe this means that Livy has started successfully with Kerberos.
Now, from the client machine in Singapore, I run the commands below:
c:\FAST\Python2.7.12>python
Python 2.7.12 (v2.7.12:d33e0cf91556, Jun 27 2016, 15:24:40) [MSC v.1500 64 bit (
AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import requests
>>> import json
>>> from requests_kerberos import HTTPKerberosAuth, REQUIRED
>>> headers = {'Content-Type': 'application/json'}
>>> krb = HTTPKerberosAuth(mutual_authentication=REQUIRED, sanitize_mutual_error_response=False)
>>> r = requests.post('http://:8998/sessions', headers=headers, auth=krb)
>>> r.raise_for_status()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "c:\FAST\Python2.7.12\lib\site-packages\requests-2.11.1-py2.7.egg\requests\models.py", line 883, in raise_for_status
raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 403 Client Error: GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos credentails) for url: http://:8998/sessions
>>>
Hi, very interesting. I can't remember if I ever did this with Python like that.
Your issue is "pretty straightforward", as you can see in the error message:
"403 Client Error: GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos credentails) for url: http://:8998/sessions"
You are having issues obtaining proper Kerberos credentials on your machine. This could be related to multiple aspects of your current setup.
The most common reason for this error is improper access rights on the keytab, so that the executing user is not able to read it.
In your case, where is the KDC Realm? In US? Can you access it from your location? What are your krb5 confs on your local machine? Are you sharing the same Realm?
Put simply, your authentication on your local machine is not working properly. Try enabling debug logs for Kerberos, and can you try to curl from a machine in the US DC?
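To enable those debug logs, both sides can be instrumented. A sketch follows; the JVM flags are the standard JDK Kerberos/SPNEGO debug switches, and KRB5_TRACE is an MIT Kerberos client feature (the realm name is a placeholder):

```shell
# Server side: add the JDK Kerberos/SPNEGO debug flags to the Livy JVM options.
export LIVY_SERVER_JAVA_OPTS="-Xmx2g -Dsun.security.krb5.debug=true -Dsun.security.spnego.debug=true"
echo "$LIVY_SERVER_JAVA_OPTS"

# Client side: trace the MIT Kerberos library calls made by kinit/curl
# (run manually; needs a reachable KDC):
#   KRB5_TRACE=/dev/stdout kinit user@REALM.COM
```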
Hi,
The KDC realm is in the US.
I do have an account in the US domain where the Hadoop servers reside, and if I try to do a kinit from my local machine with the full path, which includes the KDC realm of the US domain, it does generate the Kerberos cache. But I can't seem to figure out where the keytab file is. I can't find the keytab file that resides on the US domain either when I run the kinit there. I have raised this with my engineering team. The curl command with negotiate fails for me with the 401 authentication required error. Funnily, if I run the same command in the browser it goes through; engineering thinks it's because the browser authenticates to the US domain, as opposed to the curl command, which authenticates to the Asia domain in the firm.
But what's really perplexing is that if I use the hdfs.ext.kerberos Python library with Kerberos auth, it authenticates correctly.
Nevertheless, thanks for your original post and your immediate responses to my comments. I'll keep you posted on what engineering comes back with.
Please note the Livy conf settings:
livy.impersonation.enabled = true
livy.server.auth.type = kerberos
livy.server.launch.kerberos.principal = @domain.COM
livy.server.launch.kerberos.keytab = /home/rc/.keytab
livy.server.auth.kerberos.principal = HTTP/@domain.COMM
livy.server.auth.kerberos.keytab = /home/rc/.keytab
Hi,
So the kerberos issue got resolved.
I was using the wrong SPNEGO principal and keytab. Once I had it regenerated and applied the Kerberos auth name rules correctly, to strip my domain name (SGP domain) before Kerberos auth is called, it worked fine and authenticated successfully.
The issue now is that if I turn on impersonation, it fails with the error "User not authorised" in Livy.
Hey. Great post about Livy and kerberos.
I have a kerberized EMR cluster running on AWS that comes with most settings around kerberos and livy pre-configured.
I created a principal called "dataengineering" and I can programmatically hit Livy if:
– I ssh to the server
– kinit dataengineering
– then call a python script passing the dataengineering principal.
However, I am struggling to access the Livy UI through the browser. I get an "HTTP ERROR: 403".
Output from "/var/log/livy/livy-livy-server.out":
WARN AuthenticationFilter: AuthenticationToken ignored: Unauthorized access
Livy configuration file (/usr/lib/livy/conf/livy.conf)
livy.impersonation.enabled = true
livy.superusers = dataengineering,livy,HTTP
livy.server.auth.type = kerberos
livy.server.launch.kerberos.principal = livy/@EC2.INTERNAL
livy.server.launch.kerberos.keytab = /etc/livy.keytab
livy.server.auth.kerberos.principal = HTTP/@EC2.INTERNAL
livy.server.auth.kerberos.keytab = /etc/livy.keytab
EMR comes with entries for both HTTP and livy on “/etc/livy.keytab”
I am unsure how the browser handles it, and as per the configuration above plus the logs, I believe a principal called "HTTP" is used. I can see this line in the logs: LivyServer: SPNEGO auth enabled (principal = HTTP/@EC2.INTERNAL).
Have you ever faced this issue?