
.NET to Hadoop Connection using Kerberos Ticket

26 Mar 2016 | CPOL | 3 min read
.NET to Hadoop connection using Keytab file

Introduction

After two days of struggling, I finally figured out how to connect .NET code to Hadoop using a Hadoop keytab file. I could not find an article or solution online that covered this. The code is entirely my own approach, so please share any suggestions for improving it.

A keytab is a file containing pairs of Kerberos principals and an encrypted copy of that principal's key. A keytab file for a Hadoop daemon is unique to each host since the principal names include the hostname. This file is used to authenticate a principal on a host to Kerberos without human interaction or storing a password in a plain text file (source).

Hadoop Configuration File

The krb5.conf file contains Kerberos configuration information, including the locations of KDCs and admin servers for the Kerberos realms of interest, defaults for the current realm and for Kerberos applications, and mappings of hostnames onto Kerberos realms. (source)

How It Works

This article describes how to connect a .NET C# application to a Kerberos-authenticated Hadoop server. It covers only the connection, so it does not explain Hadoop concepts. Making a successful connection involves the following steps:

  1. Set up the Hadoop server configuration information
  2. Generate a Kerberos authentication ticket from the Hadoop keytab file
  3. Create an ODBC connection to the Hadoop server

Pre-Requisites

The following prerequisites must be installed before a connection can be established between .NET and Hadoop:

  1. Install MIT Kerberos for Windows from this link
  2. Install Microsoft Hadoop ODBC driver from this link

You need to have the following information:

  1. Hadoop Configuration file (krb5.ini)
  2. Keytab file (HDDev.keytab)
  3. Hadoop Server Host name
  4. Hadoop Server port number (the HiveServer2 default is 10000)
  5. Hadoop hostFQDN
  6. Hadoop Service Name
  7. Hadoop Principal account (explained below)

Detail

After installing MIT Kerberos, copy the Hadoop configuration file (krb5.ini) to C:\ProgramData\MIT\Kerberos5 (adjust the path to match your installation location).

Copy the keytab file to any convenient location. In my demo, I copied it to the project's Bin\Debug and Bin\Release folders.

Once MIT Kerberos is installed, you can generate a Kerberos ticket with the kinit command. The syntax is:

kinit -k -t HDDev.keytab hadoopDevPrincipal@HDP.DEV

In this syntax, HDDev.keytab is the keytab file. You can also specify the full path of the file in the command. Example: kinit -k -t "d:\test\HDDev.keytab" hadoopDevPrincipal@HDP.DEV

Connecting to Hadoop requires a Kerberos principal. kinit authenticates that principal using the key stored in the keytab file, which must be readable with appropriate permissions. In my demo, the principal is hadoopDevPrincipal@HDP.DEV, which is a made-up value for demonstration purposes, but it shows the format of a Kerberos principal account: name@REALM.

You can add further switches to the command to configure the Kerberos ticket lifetime and other options. For more documentation on the kinit command, please refer to this link.
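
As one illustration of adding a switch, the sketch below builds the kinit argument string with an optional ticket lifetime. `-l` is the lifetime option of MIT Kerberos kinit. The class name `KinitArguments`, the method `Build`, and the `10h` value are assumptions made for this example and are not part of the downloadable project.

```csharp
using System;

public static class KinitArguments
{
    // Builds the argument string for kinit: [-l <lifetime>] -k -t "<keytab>" <principal>.
    // lifetime is optional and uses kinit's duration syntax, for example "10h".
    public static string Build(string keytabPath, string principal, string lifetime = null)
    {
        if (string.IsNullOrWhiteSpace(keytabPath))
            throw new ArgumentException("A keytab path is required.", "keytabPath");
        if (string.IsNullOrWhiteSpace(principal))
            throw new ArgumentException("A principal is required.", "principal");

        // Quote the keytab path so that paths containing spaces survive.
        string args = "-k -t \"" + keytabPath + "\" " + principal;

        if (!string.IsNullOrWhiteSpace(lifetime))
            args = "-l " + lifetime + " " + args;

        return args;
    }
}
```

For example, `KinitArguments.Build(@"d:\test\HDDev.keytab", "hadoopDevPrincipal@HDP.DEV", "10h")` returns `-l 10h -k -t "d:\test\HDDev.keytab" hadoopDevPrincipal@HDP.DEV`.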

Now, we are ready to jump into the code and make a connection.

Step 1

Execute the kinit command, providing the Keytab file and principal account, to generate the Kerberos ticket.

C#
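using System;
using System.Configuration;
using System.Diagnostics;

// Build the kinit arguments: -k -t "<keytab path>" <principal>
string arguments = string.Format("-k -t \"{0}\\{1}\" {2}",
                Environment.CurrentDirectory, //Path of Bin\Debug directory
                ConfigurationManager.AppSettings["keyTabFileName"],
                ConfigurationManager.AppSettings["principal"]);
ProcessStartInfo psi = new ProcessStartInfo("kinit")
            {
                UseShellExecute = true,
                RedirectStandardOutput = false,
                RedirectStandardInput = false,
                RedirectStandardError = false,
                CreateNoWindow = true,
                WindowStyle = ProcessWindowStyle.Hidden,
                Arguments = arguments
            };
Process process = Process.Start(psi);
// Wait for kinit to finish so the ticket exists before the ODBC connection is opened.
process.WaitForExit();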
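
If kinit fails (wrong principal, unreadable keytab, unreachable KDC), the later ODBC connection fails with a less obvious error. One possible variant, sketched below, runs kinit without the shell so that its error output and exit code can be inspected. The class `KerberosTicket`, the method `Acquire`, and the `kinitPath` parameter are hypothetical names for this example. It also assumes that kinit returns a non-zero exit code on failure.

```csharp
using System;
using System.Diagnostics;

public static class KerberosTicket
{
    // Runs kinit with the given arguments and throws if it reports a failure.
    // kinitPath is a hypothetical parameter: pass a full path when "kinit" on the
    // PATH does not resolve to the MIT Kerberos executable.
    public static void Acquire(string arguments, string kinitPath = "kinit")
    {
        var psi = new ProcessStartInfo(kinitPath, arguments)
        {
            UseShellExecute = false,      // required in order to redirect streams
            RedirectStandardError = true, // capture kinit's error message
            CreateNoWindow = true
        };

        using (Process process = Process.Start(psi))
        {
            // Drain stderr before waiting so that a full pipe cannot block kinit.
            string error = process.StandardError.ReadToEnd();
            process.WaitForExit();

            if (process.ExitCode != 0)
            {
                throw new InvalidOperationException(
                    "kinit exited with code " + process.ExitCode + ": " + error.Trim());
            }
        }
    }
}
```

A call such as `KerberosTicket.Acquire("-k -t HDDev.keytab hadoopDevPrincipal@HDP.DEV")` would then surface a kinit failure as an exception before any connection attempt. If a JDK is installed, its own kinit may appear earlier on the PATH, in which case passing the full path to the MIT executable is the safer choice.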

Step 2

Create ODBC connection to Hadoop using Hadoop server information:

C#
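using System.Configuration;
using System.Data.Odbc;

// {{ and }} are escaped braces; {0}..{4} are filled from the appSettings values.
OdbcConnection conn = new OdbcConnection(
    string.Format("DRIVER={{Microsoft Hive ODBC Driver}};" +
                  "Host={0};" +
                  "Port={1};" +
                  "Schema={2};" +
                  "HiveServerType=2;" +
                  "AuthMech=1;" +
                  "KrbHostFQDN={3};" +
                  "KrbServiceName={4};",
                  ConfigurationManager.AppSettings["host"],
                  ConfigurationManager.AppSettings["port"],
                  ConfigurationManager.AppSettings["schema"],
                  ConfigurationManager.AppSettings["hostFQDN"],
                  ConfigurationManager.AppSettings["serviceName"]));
conn.Open();

AuthMech=1 specifies the Kerberos authentication mode. After this, Hadoop queries can be issued just as they are for SQL Server or Oracle:

C#
OdbcCommand cmd = new OdbcCommand("select * from Schema_Name.Table_Name;", conn);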
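
To show the query being executed and its rows read back, here is a minimal sketch that builds the same connection string and prints each row. The class `HiveOdbc` and its method names are assumptions made for this example. The connection string keys are the ones used above, and the code assumes a valid Kerberos ticket has already been obtained with kinit.

```csharp
using System;
using System.Data.Odbc;

public static class HiveOdbc
{
    // Assembles the connection string from the server details.
    public static string BuildConnectionString(
        string host, int port, string schema, string hostFqdn, string serviceName)
    {
        return string.Join(";", new[]
        {
            "DRIVER={Microsoft Hive ODBC Driver}",
            "Host=" + host,
            "Port=" + port,
            "Schema=" + schema,
            "HiveServerType=2",
            "AuthMech=1", // Kerberos authentication
            "KrbHostFQDN=" + hostFqdn,
            "KrbServiceName=" + serviceName
        }) + ";";
    }

    // Opens a connection, runs the query, and writes each row as tab-separated values.
    public static void PrintRows(string connectionString, string query)
    {
        using (var conn = new OdbcConnection(connectionString))
        using (var cmd = new OdbcCommand(query, conn))
        {
            conn.Open();
            using (OdbcDataReader reader = cmd.ExecuteReader())
            {
                while (reader.Read())
                {
                    var values = new object[reader.FieldCount];
                    reader.GetValues(values);
                    Console.WriteLine(string.Join("\t", values));
                }
            }
        }
    }
}
```

The `using` blocks close the reader, command, and connection even when the query throws, which matters for a web application that opens a connection per request.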

Using the Code

Download the Hadoop Connector.zip. Please replace the AppSettings in Web.Config with your Hadoop settings:

XML
<appSettings>
              <!-- Hadoop Settings -->
              <add key="host" value="hostname" />
              <add key="port" value="10000" />
              <add key="schema" value="Schema_Name" />
              <add key="hostFQDN" value="hostname.domain.com" />
              <add key="serviceName" value="Service_Name" />
              <add key="principal" value="hadoopDevPrincipal@HDP.DEV" />
              <add key="kerberosAquireTicketCommand" 
               value="kinit -k -t HDDev.keytab hadoopDevPrincipal@HDP.DEV" />
              <add key="keyTabFileName" value="HDDev.keytab" />
</appSettings>
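
A missing or misspelled key in these settings only shows up later as a null value inside the kinit arguments or the connection string. A small guard like the following sketch can fail earlier with a clearer message. `RequiredSettings` and `Get` are hypothetical names and are not part of the downloadable project.

```csharp
using System;
using System.Collections.Specialized;

public static class RequiredSettings
{
    // Returns the value stored under key, or throws if it is missing or blank.
    public static string Get(NameValueCollection settings, string key)
    {
        string value = settings[key];
        if (string.IsNullOrWhiteSpace(value))
            throw new InvalidOperationException("Missing appSettings key: " + key);
        return value;
    }
}
```

Since `ConfigurationManager.AppSettings` is a `NameValueCollection`, a call would look like `RequiredSettings.Get(ConfigurationManager.AppSettings, "host")`.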

In the code at line number 19, change Environment.CurrentDirectory to the relevant file path.

In the code at line number 67, replace the query statement with your own Hadoop query.

Summary

As stated earlier, this is my own approach, so please share any suggestions or optimizations that could be implemented.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)