Friday 29 July 2016

Cloudera Manager Health Test Warnings and Solutions!

Cloudera Manager performs multiple health tests at regular intervals to check the health of all Hadoop and related services. If any health test fails, it shows the status of that service as red. To keep services such as NameNode, DataNode, NodeManager, ResourceManager, and ZooKeeper healthy, the best approach is to fix every issue raised by the health tests.

  • How do you open Cloudera Manager in a browser on a Windows laptop or desktop?

   For non-secure access, the Cloudera Manager URL is available on port 7180:
       http://<<ClouderaManagerServer IP>>:7180/cmf/home

   For secure (TLS-enabled) access, the Cloudera Manager URL is available on port 7183:
       https://<<ClouderaManagerServer IP>>:7183/cmf/home
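   As a quick reachability check from a terminal (10.0.0.10 below is a placeholder for your Cloudera Manager host):

       curl -I http://10.0.0.10:7180/cmf/home     # non-secure endpoint (placeholder IP)
       curl -kI https://10.0.0.10:7183/cmf/home   # TLS endpoint; -k skips verification for self-signed certs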

  • How do you check Cloudera Manager health tests?
         Log in to Cloudera Manager and go to the All Health Issues tab.



1) Clock Offset

Description: The host's NTP service could not be located or did not respond to a request for the clock offset.
Solution:
                 - Identify the NTP server IP for your Hadoop cluster.
                 - Log in as the root user.
                 - On each Hadoop cluster node, edit /etc/ntp.conf.
                 - Add the entry "server <NTP Server IP>".
                 - Run "service ntpd restart", then restart the cluster from Cloudera Manager (see the sketch below).
                Note: If the problem still persists, reboot your Hadoop nodes and check the process again.
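                A minimal sketch of these steps on one node, assuming 10.0.0.5 is your NTP server (a placeholder IP):

                 # run as root on every cluster node
                 echo "server 10.0.0.5 iburst" >> /etc/ntp.conf   # point ntpd at your NTP server
                 service ntpd restart                             # restart the NTP daemon
                 ntpq -p                                          # verify the offset against the server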


2) DNS Resolution

Description: Bad health issue.
The hostname and canonical name for this host are not consistent when checked from a Java process.
Fix the hostname and canonical name health check on all hosts.
Solution:
                 - On each Hadoop cluster node, edit /etc/sysconfig/network (e.g. with vi).
                 - Replace HOSTNAME="<hostname>" with HOSTNAME="<FQDN>".
                 - E.g. change HOSTNAME=clouderanamenode to HOSTNAME=clouderanamenode.xyz.com
                 - Reboot your Hadoop nodes and check the process again (a verification sketch follows).
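                After the reboot, a quick way to verify, using the example FQDN from above:

                 hostname -f                            # should print the FQDN, e.g. clouderanamenode.xyz.com
                 getent hosts clouderanamenode.xyz.com  # forward lookup should return this host's IP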



3) Data Directory Status

Description: Bad health issue.
The DataNode has 1 volume failure(s). Critical threshold: any.
This test checks whether the DataNode has volume failures.
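To locate the failing disk, one approach is to grep the DataNode logs on the affected host; the log directory below assumes a typical CDH package install and may differ on yours:

         grep -i "volume failure" /var/log/hadoop-hdfs/*.log*   # log path is an assumption; adjust for your install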
         

Saturday 23 July 2016

Cloudera Manager & Kerberos Security Common Issues!!

If the Generate Credentials command has succeeded but CDH services fail to start!


  • Check the Cloudera Manager Server or Agent logs for any errors associated with keytab generation, or for other information about the problem.
  • Before enabling Kerberos security, make sure all services in Cloudera Manager are green and working, and that all health tests pass. Let Cloudera Manager run all services for a few minutes so they are stable.
  • Check that the encryption types match between your KDC and /etc/krb5.conf on all hosts; beginners commonly run into this issue (see the sketch after this list).
  • If you are using AES-256 encryption, install the JCE policy file on all hosts. Refer to the Cloudera documentation for details.
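A minimal [libdefaults] sketch showing where the encryption types are pinned; the realm EXAMPLE.COM and the enctype list are illustrative and must match what your KDC actually supports:

       [libdefaults]
           default_realm = EXAMPLE.COM
           default_tkt_enctypes = aes256-cts-hmac-sha1-96 aes128-cts-hmac-sha1-96
           default_tgs_enctypes = aes256-cts-hmac-sha1-96 aes128-cts-hmac-sha1-96
           permitted_enctypes = aes256-cts-hmac-sha1-96 aes128-cts-hmac-sha1-96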


If HDFS tests (canary, connection) fail after enabling security!

  • Once you enable Kerberos, restart the Cloudera Management Service so its roles can pick up proper keytabs. Restart it from Cloudera Manager; that way it can run the canary tests.

If the CDH daemon logs show the error: Exception encountered while connecting to the server : javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Fail to create credential. (63) - No service creds)]


  • This error occurs because the ticket message is too large for the default UDP protocol.
  • Force Kerberos to use TCP instead of UDP by adding the following parameter to [libdefaults] in the krb5.conf file on the client(s) where the problem is occurring. For example:

       [libdefaults]
           udp_preference_limit = 1

Note: If you choose to manage krb5.conf through Cloudera Manager, this setting is added to krb5.conf automatically.
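Once the parameter is in place, you can confirm from the affected client that ticket acquisition works (myuser@EXAMPLE.COM is a placeholder principal):

       kinit myuser@EXAMPLE.COM   # request a TGT; oversized messages now travel over TCP
       klist                      # list cached tickets to confirm the TGT is present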


Caused by: GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)

When the hbase shell throws:

        at org.jruby.Main.run(Main.java:208)
        at org.jruby.Main.main(Main.java:188)
Caused by: GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)
        at sun.security.jgss.krb5.Krb5InitCredential.getInstance(Krb5InitCredential.java:147)
        at sun.security.jgss.krb5.Krb5MechFactory.getCredentialElement(Krb5MechFactory.java:121)
        at sun.security.jgss.krb5.Krb5MechFactory.getMechanismContext(Krb5MechFactory.java:187)
        at sun.security.jgss.GSSManagerImpl.getMechanismContext(GSSManagerImpl.java:223)
        at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:212)
        at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:179)
        at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:193)
        ... 242 more


ERROR: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)

1) Check the hbase user's login status in /etc/passwd.
2) Enable login for the hbase user (see the sketch below).
3) Regenerate credentials from Cloudera Manager. This trick worked for me.
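A minimal sketch of steps 1 and 2, run as root (the /sbin/nologin shell in the comment is just a common default; check what your system actually shows):

    getent passwd hbase          # inspect the entry; a shell like /sbin/nologin blocks login
    usermod -s /bin/bash hbase   # give the hbase user a working login shell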


"Permission denied" errors in CDH
Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException): Permission denied: user=myuser, access=EXECUTE, inode="/user":hdfs:supergroup:drwxrwx---

"Permission denied" errors can present in a variety of use cases and from nearly any application that utilizes CDH.

Access to the HDFS filesystem and/or permissions on certain directories are not correctly configured.
The /user directory is owned by "hdfs"; check the execute permission on /user.
If it has been tampered with or changed, that is the reason myuser is not able to execute a MapReduce job against drwxrwx---:

inode="/user":hdfs:supergroup:drwxrwx---

Solution: Log in as the hdfs user and change the permissions on the /user directory:
hadoop fs -chmod -R 755 /user
Now myuser can execute MapReduce jobs.
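On a typical CDH install you can run the fix as the hdfs superuser via sudo and then verify the new mode (the sudo -u hdfs form is an assumption about your setup):

    sudo -u hdfs hadoop fs -chmod -R 755 /user   # relax /user so other users can traverse it
    sudo -u hdfs hadoop fs -ls / | grep user     # expect drwxr-xr-x ... /user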