10/18/2013

#SSIS ODBC Connection to #Hortonworks #Hadoop

If you are interested in connecting Microsoft SSIS to Hortonworks HDFS, the first thing you need to do is download the appropriate driver(s):

http://hortonworks.com/download/download-archives/

I had to load both the 32-bit and 64-bit, when configuring the ODBC, be sure to run as Administrator.


After configuring the ODBC (turns out the 32-bit worked on my pc), you create a new project in SSIS, create a new Connections (ODBC), point to your newly created data source.


You can pull directly from a table or write a custom SQL statement as long as it meets the HIVE standards as HIVE is not truly SQL Compliant.


You can view the sample table here:


So in the SSIS package, I'm basically pulling from the ODBC source from Hortonworks HDFS and transform the results into a local instance of SQL-Server:


Running package successfully:


We query the result in SQL:


So to summarize, we installed the Hortonworks ODBC driver, configured it, created an SSIS package, created a SQL statement pointing to Hortonworks HDFS, ran the job and the results appeared in a SQL Server localhost table.

One thing to note, you may have to tweak your Hosts file:
 
 
While at the SQL Pass Summit 2013, I spoke with Adam Diaz at the Hortonworks booth who helped me with this setting.  Thanks!
 
And there you have it!

Happy Hadooping!

No comments:

Post a Comment

Note: Only a member of this blog may post a comment.