Kerberized Hadoop Cluster – A Sandbox Example

The groundwork of any secure system installation is strong authentication: the process of verifying a user's identity by checking one or more known factors. Factors can be:

  1. Shared Knowledge
    A password or the answer to a question. This is the most common factor, and often the only one, used by computer systems today.
  2. Biometric Attributes
    For example, fingerprints or iris patterns.
  3. Items One Possesses
    A smart card or a phone. Aside from shared knowledge, the phone is probably one of the most common factors in use today.

A system that takes more than one factor into account for authentication is known as a multi-factor authentication system. The importance of knowing a user's identity with a given degree of certainty cannot be overstated.

All other components of a secure environment, such as authorization, audit, data protection, and administration, rely heavily on strong authentication. Authorization and auditing only make sense if a user's identity cannot be compromised. Hadoop today has solutions for nearly all aspects of enterprise-grade security, especially with the advent of Apache Argus.

Hadoop Security: 10 Resources To Get You Started

As Hadoop moves into the center of today's enterprise data architecture, security becomes a critical requirement. This is evident from the recent acquisitions by leading Hadoop vendors and from the numerous security-focused projects that have been launched or have gained traction recently.

Here are 10 resources to get you started on the topic:

  1. Hadoop Security Design (2009 White Paper)
  2. Hadoop Security Design? – Just Add Kerberos? Really? (Black Hat 2010)
  3. Hadoop Poses a Big Data Security Risk: 10 Reasons Why
  4. Apache Knox – A gateway for Hadoop clusters
  5. Apache Argus
  6. Project Rhino
  7. Protegrity Big Data Protector
  8. Dataguise for Hadoop
  9. Secure JDBC and ODBC Clients’ Access to HiveServer2
  10. InfoSphere Optim Data Masking

Further Readings

Using Self-Signed Certificates with Java and Maven

Java applications that use JSSE (Java Secure Socket Extension) cannot connect to servers with self-signed or otherwise untrusted certificates by default. Maven, for example, cannot download required dependencies from a Nexus server if that server uses a self-signed certificate or a certificate from an unrecognized certificate authority. An attempt to connect to such a server fails with a ValidatorException:

Error transferring file: sun.security.validator.ValidatorException:
PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException:
unable to find valid certification path to requested target
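
The error means that JSSE could not build a chain from the server certificate to any certificate in its trust store. A common remedy is to import the self-signed certificate into a separate trust store with keytool and have the JVM trust it. The following is a minimal sketch of the programmatic variant, not a definitive implementation. The class name TrustStoreSketch, its method names, and the command-line arguments are hypothetical and chosen only for illustration. It also assumes the trust store file is in a format that the JVM's default keystore type can read.

```java
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.security.KeyStore;
import javax.net.ssl.SSLContext;
import javax.net.ssl.TrustManagerFactory;

// Hypothetical helper: builds an SSLContext that trusts the certificates
// found in a custom trust store instead of the JVM default (cacerts).
final class TrustStoreSketch {

    // Load a keystore file that contains the self-signed certificate.
    // Assumption: the file is readable with the JVM's default keystore type.
    static KeyStore loadTrustStore(Path file, char[] password) throws Exception {
        KeyStore store = KeyStore.getInstance(KeyStore.getDefaultType());
        try (InputStream in = Files.newInputStream(file)) {
            store.load(in, password);
        }
        return store;
    }

    // Create a TLS context whose trust managers are backed by the given store.
    static SSLContext contextFor(KeyStore trustStore) throws Exception {
        TrustManagerFactory tmf =
                TrustManagerFactory.getInstance(TrustManagerFactory.getDefaultAlgorithm());
        tmf.init(trustStore);
        SSLContext context = SSLContext.getInstance("TLS");
        // No client keys (null) and the default SecureRandom (null).
        context.init(null, tmf.getTrustManagers(), null);
        return context;
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 2) {
            System.err.println("usage: TrustStoreSketch <truststore-file> <password>");
            return;
        }
        KeyStore store = loadTrustStore(Paths.get(args[0]), args[1].toCharArray());
        // Make the custom context the default for subsequent JSSE connections.
        SSLContext.setDefault(contextFor(store));
        System.out.println("Entries in trust store: " + store.size());
    }
}
```

For tools that cannot be changed in code, such as Maven, the same trust store can usually be passed to the JVM through the standard system properties javax.net.ssl.trustStore and javax.net.ssl.trustStorePassword, for example via the MAVEN_OPTS environment variable.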