#SSIS ODBC Connection to #Hortonworks #Hadoop

If you are interested in connecting Microsoft SSIS to Hortonworks HDFS, the first thing you need to do is download the appropriate driver(s):


I had to load both the 32-bit and 64-bit, when configuring the ODBC, be sure to run as Administrator.

After configuring the ODBC (turns out the 32-bit worked on my pc), you create a new project in SSIS, create a new Connections (ODBC), point to your newly created data source.

You can pull directly from a table or write a custom SQL statement as long as it meets the HIVE standards as HIVE is not truly SQL Compliant.

You can view the sample table here:

So in the SSIS package, I'm basically pulling from the ODBC source from Hortonworks HDFS and transform the results into a local instance of SQL-Server:

Running package successfully:

We query the result in SQL:

So to summarize, we installed the Hortonworks ODBC driver, configured it, created an SSIS package, created a SQL statement pointing to Hortonworks HDFS, ran the job and the results appeared in a SQL Server localhost table.

One thing to note, you may have to tweak your Hosts file:
While at the SQL Pass Summit 2013, I spoke with Adam Diaz at the Hortonworks booth who helped me with this setting.  Thanks!
And there you have it!

Happy Hadooping!

No comments:

Post a Comment

Note: Only a member of this blog may post a comment.