Validate Java Keystore on BDA

In many projects I need to create a keystore to hold SSL certificates. Most of the time I hardly worry about whether the keystore is valid. My keystores just work, and I can see the content of every certificate with the keytool command. This worked well until recently, when I needed to configure TLS for Cloudera Manager on BDA.
BDA has its own command to enable TLS for Cloudera Manager, Hue and Oozie in a pretty easy way: just run bdacli enable https_cm_hue_oozie. The only drawback of this command is that it uses self-signed certificates, not the users’ own certificates. Although that works well from a security perspective, it is not a good idea in the long run. I need to replace Oracle’s self-signed certificates with the client’s certificates on BDA. Neither Cloudera’s approach nor Oracle’s approach is going to work. Anyway, that is a different topic and I will discuss it in a different blog.

While enabling TLS for Cloudera Manager with the client’s certificates, I ran into various issues. After looking at many of them in detail, I suspected that the root of my problem was an incorrect keystore. Unfortunately, configuring TLS for Cloudera Manager, the agents and the services requires shutting down the CDH cluster, and it takes many steps to reach the stage where I can test the keystore. That is too time consuming for a busy BDA cluster. This blog discusses a fast, easy way, independent of the CDH cluster, to verify whether the content of a keystore is valid. Most importantly, avoid the bridge building mistake shown below.

As my topic is related to BDA, I am going to list both the Cloudera way and the Oracle way of creating a keystore.

Cloudera Way
See Cloudera’s document Step 1: Obtain Encryption Keys and Certificates for Cloudera Manager Server.
I highlight only the key steps and commands below.
1. Generate Keystore for Cloudera Manager Host (Node 3 on BDA)

# keytool -genkeypair -alias cmhost -keyalg RSA -keystore \
/opt/cloudera/security/jks/cmhost-keystore.jks -keysize 2048 -dname \
"CN=cmhost.sec.example.com,OU=Security,O=Example,L=Denver,ST=Colorado,C=US" \
-storepass password -keypass password

2. Generate a CSR for the host.

# keytool -certreq -alias cmhost \
-keystore /opt/cloudera/security/jks/cmhost-keystore.jks \
-file /opt/cloudera/security/x509/cmhost.csr -storepass password \
-keypass password

3. Submit the .csr file created by the -certreq command to a Certificate Authority to obtain a server certificate.
4. Copy the root CA certificate and any intermediate CA certificates to /opt/cloudera/security/CAcerts/.
No /opt/cloudera/security/CAcerts/ directory exists on BDA, and I don’t believe it is necessary.
I actually prefer Oracle’s approach here: just copy the root and intermediate CA certificates to the /opt/cloudera/security/jks directory. But I do like that Cloudera’s approach imports the root CA and intermediate CA certificates into the alternative system JDK truststore, jssecacerts, before importing them into the Java keystore on BDA. This is what Oracle’s approach is missing.

# cp $JAVA_HOME/jre/lib/security/cacerts $JAVA_HOME/jre/lib/security/jssecacerts

# keytool -importcert -alias RootCA -keystore $JAVA_HOME/jre/lib/security/jssecacerts \
-file /opt/cloudera/security/CAcerts/RootCA.cer -storepass changeit

# keytool -importcert -alias SubordinateCA -keystore \
$JAVA_HOME/jre/lib/security/jssecacerts \
-file /opt/cloudera/security/CAcerts/SubordinateCA.cer -storepass changeit

5. Import the root and intermediate certificates into keystore.

# keytool -importcert -trustcacerts -alias RootCA -keystore \
/opt/cloudera/security/jks/cmhost-keystore.jks -file \
/opt/cloudera/security/CAcerts/RootCA.cer -storepass password

# keytool -importcert -trustcacerts -alias SubordinateCA -keystore \
/opt/cloudera/security/jks/cmhost-keystore.jks -file \
/opt/cloudera/security/CAcerts/SubordinateCA.cer -storepass password

6. Import the signed host certificate

# cp certificate-file.cer  /opt/cloudera/security/x509/cmhost.pem

# keytool -importcert -trustcacerts -alias cmhost \
-file /opt/cloudera/security/x509/cmhost.pem \
-keystore /opt/cloudera/security/jks/cmhost-keystore.jks -storepass password

Oracle Way
See Oracle Note How to Use Certificates Signed by a User’s Certificate Authority for Web Consoles and Hadoop Network Encryption Use on the BDA (Doc ID 2187903.1)

1. Create the keystore on all nodes called /opt/cloudera/security/jks/node.jks
This is where I like Oracle’s approach. Cloudera also requires a keystore on every host, but documents it in separate chapters for Cloudera Manager and for the agents. Only when I was done with the configuration did I ask myself why the two were not combined into one single step. Here Oracle’s approach is much simpler and easier.

# dcli -C keytool -validity 720 -keystore /opt/cloudera/security/jks/node.jks \
-alias \$HOSTNAME -genkeypair -keyalg RSA -storepass $PW -keypass $PW \
-dname "CN=\${HOSTNAME},OU=,O=,L=,S=,C="  

# dcli -C ls -l /opt/cloudera/security/jks/node.jks

2. Create CSR for each node.

# dcli -C keytool -keystore /opt/cloudera/security/jks/node.jks -alias \$HOSTNAME \
-certreq -file /root/\$HOSTNAME-cert-file -keypass $PW -storepass $PW 

3. Submit each node-specific CSR to the CA to be signed.
4. Copy each signed certificate to its node as cert_file_signed
cert_file_signed_bdanode01 would be copied to Node 1 as: /opt/cloudera/security/jks/cert_file_signed
cert_file_signed_bdanode02 would be copied to Node 2 as: /opt/cloudera/security/jks/cert_file_signed

cert_file_signed_bdanode0n would be copied to Node n as: /opt/cloudera/security/jks/cert_file_signed
5. Copy the CA public certificate to /opt/cloudera/security/jks/ca.crt

# cp /tmp/staging/ca.crt /opt/cloudera/security/jks/ca.crt  
# dcli -C -f /opt/cloudera/security/jks/ca.crt -d /opt/cloudera/security/jks/ca.crt  
# dcli -C ls -ltr /opt/cloudera/security/jks/ca.crt

6. Import the CA public certificate /opt/cloudera/security/jks/ca.crt into the keystore on each node

# dcli -C keytool -keystore /opt/cloudera/security/jks/node.jks -alias CARoot \
-import -file /opt/cloudera/security/jks/ca.crt -storepass $PW -keypass $PW -noprompt

7. Import the signed certificate for each node on BDA

# dcli -C keytool -keystore /opt/cloudera/security/jks/node.jks -alias \$HOSTNAME \
-import -file /opt/cloudera/security/jks/cert_file_signed -storepass $PW -keypass $PW -noprompt 

So for TLS on BDA, the keystore file is /opt/cloudera/security/jks/node.jks. Another important file is the truststore at /opt/cloudera/security/jks/.truststore. It is built in much the same way as node.jks.

Ok, I have the node.jks file. How do I verify that it is a valid one? Like many people, I used to use the keytool command to check the content of a keystore file. For example,

[root@enkx4bda1node01 ~]# keytool -list -v -keystore /opt/cloudera/security/jks/node.jks
Enter keystore password:  

*****************  WARNING WARNING WARNING  *****************
* The integrity of the information stored in your keystore  *
* has NOT been verified!  In order to verify its integrity, *
* you must provide your keystore password.                  *
*****************  WARNING WARNING WARNING  *****************

Keystore type: JKS
Keystore provider: SUN

Your keystore contains 1 entry

Alias name: enkx4bda1node01.enkitec.local
Creation date: Mar 5, 2016
Entry type: PrivateKeyEntry
Certificate chain length: 1
Certificate[1]:
Owner: CN=enkx4bda1node01.enkitec.local, OU=, O=, L=, ST=, C=
Issuer: CN=enkx4bda1node01.enkitec.local, OU=, O=, L=, ST=, C=
Serial number: 26a1471b
Valid from: Sat Mar 05 02:17:40 CST 2016 until: Fri Feb 23 02:17:40 CST 2018
Certificate fingerprints:
	 MD5:  10B:30:3A:40:CD:94:38:7D:3A:33:1F:DD:49:B7:DF:99
	 SHA1: 98:6F:FC:84:68:BA:BD:25:37:8A:1B:D6:07:6F:FE:14:41:76:5B:09
	 SHA256: L3:43:4C:4C:9B:0E:36:18:DD:F1:10:84:46:9E:77:AA:BB:C7:85:E5:FC:19:4F:29:7F:70:BA:D4:0C:55:AD:F7
	 Signature algorithm name: SHA256withRSA
	 Version: 3

Extensions: 

#1: ObjectId: 2.5.29.14 Criticality=false
SubjectKeyIdentifier [
KeyIdentifier [
0000: GH FD 23 C9 9A A3 28 F9   3D C5 3B 1E E7 97 49 4E  ......(.=.:...IN
0010: 12 69 27 D5                                        .i(.
]
]

*******************************************
*******************************************

It usually works, but with certain limitations. Even if the keystore has all the necessary certificates, it might not be valid if they were not imported in the right order. As I suspected my keystore on BDA might not be a valid one, I looked for other tools beyond keytool. Luckily, I found the blog Installing Trusted Certificates into a Java Keystore by Oracle’s Jim Connors. It’s a very nice blog about various keystore tools. I was really interested in one of the tools he talked about: the ValidateCertChain program in weblogic.jar.

I happened to have built an OEM Cloud Control 13c R2 environment, which includes weblogic.jar. Ok, let me give it a try.

[root@enkx4bdacli02 tmp]# java -cp /u01/app/oracle/oem/wlserver/server/lib/weblogic.jar utils.ValidateCertChain -jks enkx4bda1node03.enkitec.com node.jks
Cert[0]: CN=enkx4bda1node03.enkitec.com,OU=Bigdata,O=Enkitec,L=Irving,ST=TX,C=US
Certificate chain is incomplete, can't confirm the entire chain is valid
Certificate chain appears valid

It did indeed find something: it tells me my certificate chain is incomplete. This gave me the clue to focus only on the steps for building the keystore. After I figured out the issue and fixed the import sequence of the certificates, I reran the command. Here is the result:

[root@enkx4bdacli02 tmp]# java -cp /u01/app/oracle/oem/wlserver/server/lib/weblogic.jar utils.ValidateCertChain -jks enkx4bda1node03.enkitec.com node.jks
Cert[0]: CN=enkx4bda1node03.enkitec.com,OU=Bigdata,O=Enkitec,L=Irving,ST=TX,C=US
Cert[1]: CN=EnkLab Intermediate CA,OU=Bigdata,O=Enkitec,ST=Texas,C=US
Cert[2]: CN=EnkLab ROOT CA,OU=Bigdata,O=Enkitec,L=Irving,ST=TX,C=US
Certificate chain appears valid

Looks much better. It correctly shows one root CA certificate, one intermediate CA certificate, and one host certificate. This was one of my major issues in building the keystore on BDA.
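
When weblogic.jar is not at hand, a rougher check is possible with keytool alone. The following is only a sketch under assumptions, not a step from the Cloudera or Oracle documents: it reads keytool -list -v output and prints each private key alias together with its Certificate chain length, the same field that appears in the listing earlier. The function name chain_lengths is made up for illustration, and $PW is assumed to hold the keystore password as in the dcli commands above. It only counts the certificates attached to the key entry; it does not verify signatures the way ValidateCertChain does.

```shell
# Read `keytool -list -v` output on stdin and print "<alias> <chain length>"
# for every entry that reports a certificate chain (PrivateKeyEntry).
chain_lengths() {
    awk -F': *' '
        /^Alias name:/               { alias = $2 }
        /^Certificate chain length:/ { print alias, $2 }
    '
}

# Usage (keystore path from this post; $PW is an assumption):
#   keytool -list -v -keystore /opt/cloudera/security/jks/node.jks \
#       -storepass "$PW" | chain_lengths
```

A host certificate signed through one intermediate CA should report a length of 3 once the chain is complete. A length of 1 on a CA-signed certificate suggests the CA certificates were not in place when the signed certificate was imported.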

There is another command, openssl s_client, that can validate the keystore, but it is only useful once everything is configured and the service is listening.

# openssl s_client -connect enkx4bda1node03.enkitec.com:7183 -CAfile root.enkitec.com.cert.pem
CONNECTED(00000003)
depth=2 C = US, ST = TX, L = Irving, O = Enkitec, OU = bigdata, CN = Enklab ROOT CA
verify return:1
depth=1 C = US, ST = TX, O = Enkitec, OU = bigdata, CN = Enklab Intermediate CA
verify return:1
depth=0 C = US, ST = TX, L = Irving, O = Enkitec, OU = bigdata, CN = enkx4bda1node03.enkitec.com
verify return:1
---
Certificate chain
 0 s:/C=US/ST=TX/L=Irving/O=Enkitec/OU=Bigdata/CN=enkx4bda1node03.enkitec.com
   i:/C=US/ST=TX/O=Enkitec/OU=Bigdata/CN=Bigdata Intermediate CA
 1 s:/C=US/ST=TX/O=Enkitec/OU=Bigdata/CN=Bigdata Intermediate CA
   i:/C=US/ST=TX/L=Irving/O=Enkitec/OU=Bigdata/CN=Bigdata ROOT CA
 2 s:/C=US/ST=TX/L=Irving/O=Enkitec/OU=Bigdata/CN=Bigdata ROOT CA
   i:/C=US/ST=TX/L=Irving/O=Enkitec/OU=Bigdata/CN=Bigdata ROOT CA
---
Server certificate
-----BEGIN CERTIFICATE-----

MIIDXTCCAkWgAwIBAgIEQn3HnzANBgkqhkiG9w0BAQsFADBfMQkwBwYDVQQGEwAx
CTAHBgNVBAgTADEJMAcGA1UEBxMAMQkwBwYDVQQKEwAxCTAHBgNVBAsTADEmMCQG
A1UEAxMdZW5reDRiZGExbm9kZTAzLmVua2l0ZWMubG9jYWwwHhcNMTYwMzA1MDgx
NzQ1WhcNMTgwMjIzMDgxNzQ1WjBfMQkwBwYDVQQGEwAxCTAHBgNVBAgTADEJMAcG
A1UEBxMAMQkwBwYDVQQKEwAxCTAHBgNVBAsTADEmMCQGA1UEAxMdZW5reDRiZGEx
bm9kZTAzLmVua2l0ZWMubG9jYWwwggEiMA0GCSqGSIb3DQEBAQUAA4IBDwAwggEK
AoIBAQDXcThbyBV4FAm2EJJBhZpg5XLqRcswMm748QUxBzTBj+LeXZJw7wTX3SzJ
Eup6YeJKczDYTjPLpHZ6ruOnhz4WSA/39e+U9MvqNZMnwdwgA7/d++4BA4ZGWs1N
3G/NmYHR1eKJntPFrExz/1XSJpW7xVfAaNsQNUb9HkAEtXN25GOF/H7jQBwxx5Wq
mnIZAgNC7shg6DCusvaURllsOih+XY4kf8HYKLLihXUmbeNauG/ixZyXm3kKu5mN
vfXF48Y4OKMHkYMS5BfZzaRw43+PWIWPbsy2RR+GRypsFMSCa5MHIwL+2tHJHBwC
kwXMB7RlA7yVd57iXPzlCAf1mijjAgMBAAGjITAfMB0GA1UdDgQWBBQ20j1Jr+LG
ejzGFNVNZIHybvIstjANBgkqhkiG9w0BAQsFAAOCAQEArZ6x6qIRxhqJ8Qd20Xkf
T3NsbzEUMBIGA1UECgwLU3RhdG5ldHQgU0YxDjAMBgNVBAsMkFDs1FAjXrt8fo7S
QTVe225bCiTYgIJl7UwOAonKBZLRIhwjbh1TDij1iyNuSrX1kisVkrmtQrsNTpqH
D8m3k1M6XCUU3RV2+I6UY2WhLNvojlCYPXnQHXo5BJPDRuaXQu/OUi2cr5LVzOhC
5NdBjMUDwfsWx5NYtTK5iNvt7CBGZOXF5RgdDhZMywR0qY0pMiBjGoCxvhv9v8Ob
xk/WfbfXfcviUrb5lnqCX8NUG+/fKv09Csx0CBiXXNU+9R5HAlTZG5xptIi22CXZ
Kw==
-----END CERTIFICATE-----
subject=/C=US/ST=TX/L=Irving/O=Enkitec/OU=Bigdata/CN=enkx4bda1node03.enkitec.com
issuer=/C=US/ST=TX/O=Enkitec/OU=Bigdata/CN=Bigdata Intermediate CA
---
No client certificate CA names sent
Server Temp Key: ECDH, secp521r1, 521 bits
---
SSL handshake has read 4430 bytes and written 443 bytes
---
New, TLSv1/SSLv3, Cipher is ECDHE-RSA-AES256-GCM-SHA384
Server public key is 2048 bit
Secure Renegotiation IS supported
Compression: NONE
Expansion: NONE
SSL-Session:
    Protocol  : TLSv1.2
    Cipher    : ECDHE-RSA-AES256-GCM-SHA384
    Session-ID: 39023B1EB131C30355F20CD8F012DCF2FFC95E1A1F9F8D8D2B6954942E9
    Session-ID-ctx: 
    Master-Key: XMB7RlA7yVd57iXPzl5EE73EAAB9B18B04B2718CAf1mijjAgMBAA5126650B5A3GjITAfM8EA269DBFE17A750EBBC5EC
    Key-Arg   : None
    Krb5 Principal: None
    PSK identity: None
    PSK identity hint: None
    Start Time: 9023528453
    Timeout   : 300 (sec)
    Verify return code: 0 (ok)
---
closed
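
The s_client output is long, and the line that matters for chain validation is Verify return code near the end. As a small sketch (the helper name verify_code is hypothetical), the numeric code can be pulled out of the output so that a script can test it. Redirecting stdin from /dev/null lets s_client finish after the handshake instead of waiting for input.

```shell
# Print the numeric "Verify return code" from `openssl s_client` output
# read on stdin (first occurrence only).
verify_code() {
    sed -n 's/^ *Verify return code: *\([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Usage, with the host, port and CA file from the example above:
#   openssl s_client -connect enkx4bda1node03.enkitec.com:7183 \
#       -CAfile root.enkitec.com.cert.pem </dev/null 2>/dev/null | verify_code
```

A result of 0 corresponds to the (ok) shown above; any other value means the chain could not be verified against the given CA file.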

Finding out Keystore and Truststore Passwords on BDA

I am working on a project that involves configuring SSL for Cloudera Manager on BDA. There are several ways to do it: go with Oracle’s bdacli approach or use Cloudera’s approach. For BDA-related work I usually prefer Oracle’s approach, because it also writes some information to the BDA’s own configuration files, which are usually outside the control of Cloudera Manager. Cloudera’s approach definitely works as well. But during a BDA upgrade or patching, if Mammoth cannot find the correct values in the BDA’s configuration files, it might cause unnecessary trouble. For example, if Mammoth thinks certain features are not enabled, it could skip the steps that disable those features before the upgrade. Anyway, that is another, unrelated topic.

Enabling TLS for Cloudera Manager is pretty easy on BDA compared with the many steps in Cloudera Manager’s documentation. On BDA, just run the following command:
bdacli enable https_cm_hue_oozie

The command automatically enables TLS for all major services on CDH, such as Cloudera Manager, Hue and Oozie. Please note: TLS on the Cloudera Manager agents is automatically enabled during BDA installation. Running this command is usually enough for many clients, as they just need to encrypt the traffic when communicating with Cloudera Manager. There is a downside to this approach: BDA uses self-signed certificates during the execution of bdacli enable https_cm_hue_oozie. This kind of self-signed certificate is fine for security, but the browser alerts can sometimes be annoying. Therefore some users might prefer to use their own signed SSL certificates.

I worked on this with Eric from Oracle Support, who recommended an approach that is actually pretty well documented in Doc ID 2187903.1: How to Use Certificates Signed by a User’s Certificate Authority for Web Consoles and Hadoop Network Encryption Use on the BDA. The key to this approach is getting the keystore’s and truststore’s paths and passwords, creating a new keystore and truststore, and then importing the customer’s certificates. Unfortunately, this approach works only on BDA version 4.5 and above. It is not going to work in my current client environment, which runs BDA v4.3. One major issue is that BDA v4.5 and above have the following bdacli commands while BDA v4.3 does not:
bdacli getinfo cluster_https_keystore_password
bdacli getinfo cluster_https_truststore_password

Eric then recommended a potential workaround: query the MySQL database directly with the commands below:

use scm;
select * from CONFIGS where ATTR = 'truststore_password' or ATTR = 'keystore_password'; 

I then used two BDAs in our lab for the verification.
First, I tested on our X4 Starter rack.

[root@enkx4bda1node01 ~]# bdacli getinfo cluster_https_keystore_password
Enter the admin user for CM (press enter for admin): 
Enter the admin password for CM: 
******

[root@enkx4bda1node01 ~]# bdacli getinfo cluster_https_truststore_password
Enter the admin user for CM (press enter for admin): 
Enter the admin password for CM: 

Interestingly, the keystore password shows as ****** while the truststore password is empty. I can understand an empty truststore password, as nothing is configured for the truststore. But the keystore password shouldn’t come back as the masked value ******.

Query MySQL db on the same rack.

[root@enkx4bda1node03 ~]# mysql -u root -p
Enter password: 
mysql> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| activity_monitor   |
| hive               |
| host_monitor       |
| hue                |
| mysql              |
| navigator          |
| navigator_metadata |
| oozie              |
| performance_schema |
| reports_manager    |
| resource_manager   |
| scm                |
| sentry_db          |
| service_monitor    |
| studio             |
+--------------------+
16 rows in set (0.00 sec)

mysql> use scm;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A
Database changed

mysql> select * from CONFIGS where ATTR = 'truststore_password' or ATTR = 'keystore_password'; 
+-----------+---------+-------------------+--------+------------+---------+---------------------+-------------------------+----------------------+---------+
| CONFIG_ID | ROLE_ID | ATTR              | VALUE  | SERVICE_ID | HOST_ID | CONFIG_CONTAINER_ID | OPTIMISTIC_LOCK_VERSION | ROLE_CONFIG_GROUP_ID | CONTEXT |
+-----------+---------+-------------------+--------+------------+---------+---------------------+-------------------------+----------------------+---------+
|         8 |    NULL | keystore_password | ****** |       NULL |    NULL |                   2 |                       2 |                 NULL | NONE    |
+-----------+---------+-------------------+--------+------------+---------+---------------------+-------------------------+----------------------+---------+
1 row in set (0.00 sec)

The MySQL database also stores the password as ******. I remember my colleague mentioning that this BDA has some issues. This could be one of them.

Ok, this rack doesn’t really tell me anything, so I moved to the second BDA, a full rack, and ran the same commands there.

[root@enkbda1node03 ~]# bdacli getinfo cluster_https_keystore_password 
Enter the admin user for CM (press enter for admin): 
Enter the admin password for CM: 
KUSld8yni8PMQcJbltvCnZEr2XG4BgKohAfnW6O02jB3tCP8v1DYlbMO5PqhJCVR

[root@enkbda1node03 ~]# bdacli getinfo cluster_https_truststore_password
Enter the admin user for CM (press enter for admin): 
Enter the admin password for CM: 


[root@enkbda1node03 ~]# mysql -u root -p
Enter password: 
mysql> use scm;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
mysql> select * from CONFIGS where ATTR = 'truststore_password' or ATTR = 'keystore_password'; 
+-----------+---------+---------------------+------------------------------------------------------------------+------------+---------+---------------------+-------------------------+----------------------+---------+
| CONFIG_ID | ROLE_ID | ATTR                | VALUE                                                            | SERVICE_ID | HOST_ID | CONFIG_CONTAINER_ID | OPTIMISTIC_LOCK_VERSION | ROLE_CONFIG_GROUP_ID | CONTEXT |
+-----------+---------+---------------------+------------------------------------------------------------------+------------+---------+---------------------+-------------------------+----------------------+---------+
|         7 |    NULL | keystore_password   | KUSld8yni8PMQcJbltvCnZEr2XG4BgKohAfnW6O02jB3tCP8v1DYlbMO5PqhJCVR |       NULL |    NULL |                   2 |                       0 |                 NULL | NULL    |
|       991 |    NULL | truststore_password | NULL                                                             |       NULL |    NULL |                   2 |                       1 |                 NULL | NONE    |
+-----------+---------+---------------------+------------------------------------------------------------------+------------+---------+---------------------+-------------------------+----------------------+---------+
2 rows in set (0.00 sec)

The MySQL database shows the same value as the result of the bdacli getinfo cluster_https_keystore_password command. This is exactly what I wanted to know. It looks like I can use the MySQL query to get the necessary passwords for my work.
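
For repeated use, the same lookup can be run non-interactively. The sketch below wraps the query from above in a function (the name scm_tls_passwords is made up) and selects only the ATTR and VALUE columns. -N and -B are standard mysql client options that drop the column headers and the table borders. Keep in mind that it prints the passwords in clear text.

```shell
# Print ATTR and VALUE for the keystore/truststore password rows of the
# Cloudera Manager database (scm). Prompts for the MySQL root password.
scm_tls_passwords() {
    mysql -u root -p -N -B -e \
        "SELECT ATTR, VALUE FROM CONFIGS WHERE ATTR IN ('keystore_password', 'truststore_password')" \
        scm
}

# Usage, on the node that hosts the MySQL database (node 3 in this post):
#   scm_tls_passwords
```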

One side note: in case you want to check out those self-signed certificates on BDA, run the following commands. When prompted for the password, just press ENTER.

[root@enkx4bda1node03 ~]# bdacli getinfo cluster_https_keystore_path
Enter the admin user for CM (press enter for admin): 
Enter the admin password for CM: 
/opt/cloudera/security/jks/node.jks

[root@enkx4bda1node03 ~]# keytool -list -v -keystore /opt/cloudera/security/jks/node.jks
Enter keystore password:  

*****************  WARNING WARNING WARNING  *****************
* The integrity of the information stored in your keystore  *
* has NOT been verified!  In order to verify its integrity, *
* you must provide your keystore password.                  *
*****************  WARNING WARNING WARNING  *****************

Keystore type: JKS
Keystore provider: SUN

Your keystore contains 1 entry

Alias name: enkx4bda1node03.enkitec.local
Creation date: Mar 5, 2016
Entry type: PrivateKeyEntry
Certificate chain length: 1
Certificate[1]:
Owner: CN=enkx4bda1node03.enkitec.local, OU=, O=, L=, ST=, C=
Issuer: CN=enkx4bda1node03.enkitec.local, OU=, O=, L=, ST=, C=
Serial number: 427dc79f
Valid from: Sat Mar 05 02:17:45 CST 2016 until: Fri Feb 23 02:17:45 CST 2018
Certificate fingerprints:
	 MD5:  A1:F9:78:EE:D4:C7:C0:D0:65:25:4C:30:09:D8:18:6E
	 SHA1: 8B:E3:7B:5F:76:B1:81:33:35:03:B9:00:97:D0:F7:F9:03:F9:74:C2
	 SHA256: EC:B5:F3:EB:E5:DC:D9:19:DB:2A:D6:3E:71:9C:62:55:10:0A:59:59:E6:98:2C:AD:23:AC:24:48:E4:68:6A:AF
	 Signature algorithm name: SHA256withRSA
	 Version: 3

Extensions: 

#1: ObjectId: 2.5.29.14 Criticality=false
SubjectKeyIdentifier [
KeyIdentifier [
0000: 36 D2 3D 49 AF E2 C6 7A   3C C6 14 D5 4D 64 81 F2  6.=I...z<...Md..
0010: 6E F2 2C B6                                        n.,.
]
]

*******************************************
*******************************************

If you don’t like this kind of default password, you can use the command keytool -storepasswd -keystore /opt/cloudera/security/jks/node.jks to change the password.

Install Cloudera Hadoop Cluster using Cloudera Manager

Three years ago I tried to build a Hadoop cluster using Cloudera Manager. The GUI looked nice, but the installation was a pain and full of issues. I gave up after many failed tries and went with a manual installation. It worked fine, and I have built several clusters since then. After several years working on Oracle Exadata, I went back and retried the Hadoop installation using Cloudera Manager, this time installing a CDH 5 cluster. The installation experience was much better than three years ago. Not surprisingly, the installation still has some issues, and I could easily identify some bugs along the way. But at least I could successfully install a 3-node Hadoop cluster after several tries. The following are my steps during the installation.

First, let me give a little detail about my VM environment. I am using VirtualBox and built three VMs.
vmhost1: This is where the name node, Cloudera Manager and many other roles are located.
vmhost2: Data Node
vmhost3: Data Node

Note: the default replication factor for Hadoop is 3. With only two data nodes, my environment is under-replicated, so I had to adjust the replication factor from 3 to 2 after installation, just to get rid of some annoying alerts.
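
As a sketch of that adjustment (not necessarily the exact steps used here): the default for new files comes from the dfs.replication property in the HDFS configuration, while files that already exist keep their old factor until it is changed with hdfs dfs -setrep. The function name set_replication is hypothetical, and the sketch assumes sudo and the hdfs superuser account are available on the node.

```shell
# Set the replication factor of everything under a path (defaults: 2 and /).
# Runs as the hdfs superuser; a directory is processed recursively.
set_replication() {
    local factor=${1:-2} path=${2:-/}
    sudo -u hdfs hdfs dfs -setrep "$factor" "$path"
}

# Usage:
#   set_replication          # factor 2 for the whole file system
#   set_replication 2 /user  # only one subtree
```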

  • OS: Oracle Linux 6.7, 64-bit
  • CPU: 1 CPU initially for all 3 VMs. Then I realized vmhost1 needs a lot of processing power, as the majority of the installation and configuration happens on node 1. I gave vmhost1 2 CPUs. That still proved not enough, and vmhost1 tended to freeze after installation. After I bumped it up to 4 CPUs, vmhost1 looked fine. 1 CPU is enough for a Data Node host.
  • Memory: Initially I gave 3G to all 3 VMs, then bumped node 1 up to 5G before installation. That still proved not enough. After bumping vmhost1 up to 7G, the VM no longer froze. I can see the memory usage is around 6.2G, so the 7G configuration is a good one. After installation, I reduced each Data Node’s memory to 2G to free some memory. If not many jobs are running, the memory usage on a Data Node is less than 1G. If I am just testing out Hadoop configuration, I can further reduce the memory to 1.5G per Data Node.
  • Network: Although I have 3 network adapters built into each VM, I actually use only two of them. One is configured as Internal Network, which the cluster VMs use to communicate with each other. The other is configured as NAT, just to get an internet connection to download packages from the Cloudera site.
  • Storage: 30G. The actual size after installation is about 10~12G and really depends on how many times the installation fails and is retried. A clean installation uses about 10G of space.

Pre-Steps Before the Installation

Before doing the installation, make sure to configure the following in the VMs:
1. Set the SELinux policy to disabled. Modify the following parameter in the /etc/selinux/config file.
SELINUX=disabled

2. Disable firewall.
chkconfig iptables off

3. Set swappiness to 0 in the /etc/sysctl.conf file. The latest Cloudera CDH releases actually recommend a non-zero value, such as 10. But for my little test, I set it to 0 like many people do.
vm.swappiness=0

4. Disable IPV6 in /etc/sysctl.conf file.
net.ipv6.conf.default.disable_ipv6 = 1
net.ipv6.conf.all.disable_ipv6 = 1

5. Configure passwordless SSH for the root user. This is a common step for Oracle RAC installation, so I do not repeat the steps here.
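
The file edits in steps 1 to 4 can be scripted. The following is a minimal sketch, assuming Oracle Linux 6 with its service, chkconfig and setenforce commands; set_kv and apply_presteps are names made up for this example. set_kv rewrites a KEY=VALUE line if the key is already present and appends it otherwise. Step 5 (passwordless SSH) is left out.

```shell
# Set KEY=VALUE in FILE: replace the existing line for KEY, or append one.
# KEY is used as a regular expression, which is good enough for the
# simple names used here.
set_kv() {
    local file=$1 key=$2 value=$3 tmp
    if grep -q "^[[:space:]]*${key}[[:space:]]*=" "$file"; then
        tmp=$(mktemp)
        sed "s|^[[:space:]]*${key}[[:space:]]*=.*|${key}=${value}|" "$file" > "$tmp"
        cat "$tmp" > "$file"      # keep the original file's owner and mode
        rm -f "$tmp"
    else
        printf '%s=%s\n' "$key" "$value" >> "$file"
    fi
}

# Apply pre-steps 1 to 4 on the local host. Run as root on every VM.
apply_presteps() {
    set_kv /etc/selinux/config SELINUX disabled
    setenforce 0 || true          # permissive until the next reboot
    service iptables stop
    chkconfig iptables off
    set_kv /etc/sysctl.conf vm.swappiness 0
    set_kv /etc/sysctl.conf net.ipv6.conf.default.disable_ipv6 1
    set_kv /etc/sysctl.conf net.ipv6.conf.all.disable_ipv6 1
    sysctl -p                     # load the new kernel settings now
}
```

chkconfig iptables off only affects the next boot, which is why the sketch also stops the running service.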

Ok, ready for the installation. Here are the steps.
1. Download and Run the Cloudera Manager Server Installer
Log on as the root user on vmhost1. The whole installation is done as root.
Run the following commands.

   
wget http://archive.cloudera.com/cm5/installer/latest/cloudera-manager-installer.bin
chmod u+x cloudera-manager-installer.bin
./cloudera-manager-installer.bin

It pops up the following screen. Just click Next or Yes through the rest of the screens.
cdh_install_installer_1

If successful, you will see the following screen.
cdh_install_installer_finish

After you click Close, a browser window pops up pointing to http://localhost:7180/. At this moment, you can click the Finish button on the previous installation GUI and close it. Then move to the browser and wait for Cloudera Manager to start up. Note: it usually takes several minutes, so be patient.

2. Logon Screen
After the following screen shows up, log on as the admin user, with admin as the password as well.
cdh_install_logon

3. Choose Version
The next screen is to choose which version to use. The default option is Cloudera Enterprise Data Hub Edition Trial, which has a 60-day limit. Although Cloudera Express has no time limit, the Express version misses a lot of features I would like to test out. So I went with the Enterprise 60-day trial version.
cdh_install_version

4. Thank You Screen
Click Continue for the next Thank You screen.
cdh_install_thanks

5. Host Screen
Input vmhost[1-3].local, then click New Search. Note: make sure to use the FQDN. I had a bad experience not using the FQDN in an old version of the CDH installation, and I am not going to waste my time finding out what happens without it.

After you click New Search, the 3 hosts show up as in the following screen. Then click Continue.
cdh_install_search

6. Select Repository
For the Select Repository screen, the default option is Parcels. Unfortunately I had an issue using Parcels during the installation. It passed the installation step on all 3 hosts, but got stuck downloading the latest Parcel file. After looking around, it seemed the default release was the September version while the latest Parcel was pointing to the older August release. It looked like a version mismatch to me. Anyway, I am going to try the Parcels option again in the future. For this installation I changed to Packages. I intentionally did not choose the latest version, CDH 5.4.5. I prefer a version that was followed by a long gap before the next release. For example, there is about a one-month lag between CDH 5.4.3 and CDH 5.4.4. If 5.4.3 were not stable, Cloudera would have put out a new release a few days later rather than waiting a month. So I went with CDH 5.4.3.
Make sure to choose 5.4.3 for Navigator Key Trustee as well.
cdh_install_repos

7. Java Installation
For Java installation, leave the option unchecked, which is the default, and click Continue.
cdh_install_jdk

8. Single User
For Enable Single User Mode, I did NOT check Single User Mode as I want cluster installation.
cdh_install_singleUser

9. SSH Login Credentials
For SSH Login Credentials, input the root password. For Number of Simultaneous Installations, the default value is 10. It created a lot of headaches during my installation. Each host downloads its own copy from the Cloudera website. As the three VMs were fighting each other for the internet bandwidth of my host machine, a VM could wait several minutes to download the next package. If the wait exceeded 30 seconds, Cloudera Manager timed out the installation for that host and marked it as failed. I am fine with the timeout, but not happy with the next action. After I clicked Retry Failed Hosts, it rolled back the packages already installed on that VM and restarted from scratch. It could take hours to get back to the same point. A more elegant way would be to download once on one host and distribute to the other hosts for installation, and on failure to retry from the failing point. Although the total download is only a few GB per host, failed retries can easily make it 10GB per host. So I had to set Number of Simultaneous Installations to 1, limiting the installation to one VM at a time, to reduce my failure rate.
cdh_install_ssh

10. Installation
The majority of the installation time is spent here if you go with the Packages option. With the Parcels option this step is very fast, because the majority of the downloads happen on a different screen. The time spent in this step really depends on the following factors:
1. Your internet bandwidth. The faster, the better.
2. The download speed from the Cloudera site. Although my internet download speed can easily reach 12M per second, my actual download speed from Cloudera varied with the time of day. Most of the time it was around 1~2M per second. Not great, but manageable. But sometimes it dropped to 100K per second. That is when I had a higher chance of hitting the timeout and failing the installation. At one point I could not tolerate it any more, so I woke up at 2am and began my installation process. It was much faster: I could get up to 10M per second, with about 4~7M on average, and I saw only a few timeout failures on one host.
3. How many times the installation times out and has to be retried.

If successful, the following screen shows.
cdh_install_success

11. Detect Version
After the installation succeeds, it shows the version screen.
cdh_install_detectVersion

12. Finish Screen
Finally, I can see this Finish screen. Life is good? Wrong! See my comment in the Cluster Setup step.
cdh_install_finish

13. Cluster Setup
When I reached this step, I knew I was almost done. Just a few more steps, less than 30 minutes of work. After a long day, I went for dinner, planning to resume the configuration later. It proved to be the most expensive mistake I made during this installation. After dinner, I went back to the same screen and clicked Continue. It showed a Session Time Out error. Not a big deal, I thought, as the background process should know where I was in the installation. I opened the browser and typed in the URL, http://localhost:7180. Guess what: not the Cluster Setup screen, but the screen at step 4. I tried many ways and could not find a workaround. Out of ideas, I had to reinstall from step 4. What a pain! Another 7~8 hours of work. In my next installation I did not waste any time at this step and completed it as quickly as possible.

Ok, back to this screen. I want to use both Impala and Spark, and could not find a combination with these two other than all services. So I chose Custom Services and picked mainly the Core services plus Impala and Spark. Make sure to check Include Cloudera Navigator.
cdh_setup_service

14. Role Assignment
I chose the default, click Continue.
cdh_setup_role

15. Database Setup
Choose the default. Make sure to click Test Connection before clicking Continue.
cdh_setup_database_1
cdh_setup_database_2

16. Review
Click Continue.
cdh_setup_review

17. Completion
It shows the progress during the setup.
cdh_setup_progress

Finally it shows the real completion screen.
cdh_setup_complete

After clicking Finish, you should see a screen similar to the following.
cdh_cm_screen
Life is good right now. The powerful Cloudera Manager has many more nice features than three years ago. It was really worth my effort to go through the installation.
life_is_good