4/15/2013

So You Want to Become A Data Jockey? Read this!

So if you plan on learning the profession of reporting, you've got your work cut out for you.

First you'll have to learn relational databases.

And how to write SQL queries to pull data into a report.
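Here's roughly what that looks like, as a minimal sketch. The `orders` table and its rows are made up for illustration, and SQLite (bundled with Python) stands in for whatever database your shop actually runs.

```python
import sqlite3

# Hypothetical in-memory database with a made-up "orders" table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("East", 100.0), ("West", 250.0), ("East", 50.0)],
)

# The report query: total sales per region, sorted by region name.
report_rows = conn.execute(
    "SELECT region, SUM(amount) FROM orders GROUP BY region ORDER BY region"
).fetchall()

# "Pull the data into a report" (here, just printed lines).
for region, total in report_rows:
    print(f"{region}: {total:.2f}")
```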

Then you'll have to un-learn relational databases.

And do data warehousing using star schemas for fast data retrieval.
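A star schema is one fact table in the middle, with dimension tables hanging off it. A tiny sketch, assuming made-up table names (`fact_sales`, `dim_product`, `dim_date`) and again using SQLite as a stand-in for a real warehouse:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Dimension tables: descriptive attributes (hypothetical names).
    CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, category TEXT);
    CREATE TABLE dim_date (date_id INTEGER PRIMARY KEY, year INTEGER);
    -- Fact table: the measurements, keyed to the dimensions.
    CREATE TABLE fact_sales (product_id INTEGER, date_id INTEGER, amount REAL);
    INSERT INTO dim_product VALUES (1, 'Bikes'), (2, 'Helmets');
    INSERT INTO dim_date VALUES (10, 2012), (11, 2013);
    INSERT INTO fact_sales VALUES (1, 10, 500.0), (1, 11, 700.0), (2, 11, 40.0);
""")

# Typical star query: join the fact table to its dimensions, then aggregate.
star_rows = conn.execute("""
    SELECT p.category, d.year, SUM(f.amount)
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_id = f.product_id
    JOIN dim_date AS d ON d.date_id = f.date_id
    GROUP BY p.category, d.year
    ORDER BY p.category, d.year
""").fetchall()

for row in star_rows:
    print(row)
```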

Then you'll have to forget both relational and denormalized databases, and learn NoSQL.

That's shorthand for non-relational databases holding unstructured or semi-structured data, where the schema is formed on the fly by the developer.
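To picture "schema on the fly", here's a toy sketch in plain Python. The `collection` list is a hypothetical stand-in for a document store, not any real product's API. Notice that nobody ever declares a table; each record brings its own fields.

```python
import json

# Hypothetical "collection": each document is just JSON, no table definition.
collection = []
collection.append(json.dumps({"name": "Ann", "email": "ann@example.com"}))
collection.append(json.dumps({"name": "Bob", "twitter": "@bob", "tags": ["sql", "etl"]}))

# The schema lives in the application: the reader decides which fields matter.
docs = [json.loads(raw) for raw in collection]
fields_seen = sorted({key for doc in docs for key in doc})
print(fields_seen)
```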

Then you'll have to learn Big Data, so get your Java skills sharp to pull some Big Hadoop Data using MapReduce.

You'll be able to access Hadoop using Hive, a pseudo-SQL language that converts your queries to MapReduce under the hood, somewhat fast.
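If "under the hood" sounds mysterious, here's the general idea as a toy in plain Python. This is not what Hive actually generates, and nothing here touches Hadoop; the function names are made up. It just mimics the map, shuffle and reduce steps that a query like `SELECT word, COUNT(*) ... GROUP BY word` boils down to.

```python
from collections import defaultdict

# Made-up input; on a real cluster this would be files spread across nodes.
lines = ["big data", "big hadoop data", "data"]

def map_phase(line):
    # Emit a (key, 1) pair per word, like a mapper would.
    return [(word, 1) for word in line.split()]

def shuffle(pairs):
    # Group all values by key, which the framework does between map and reduce.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    # Collapse each key's values into one result, like a reducer would.
    return key, sum(values)

mapped = [pair for line in lines for pair in map_phase(line)]
counts = dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())
print(counts)
```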

But if you wait a little longer, you'll be able to write SQL that bypasses the Hive-to-MapReduce conversion and goes straight to the data, which is wicked fast (Cloudera Impala).

And next you can hook into Hadoop using ODBC drivers, to access it from Excel and from Microsoft's ETL tool, Integration Services (SSIS).

So like I said, if you would like to be a data jockey, you'd better start learning.

Because we haven't even mentioned graph databases, which are good for modeling relationships between entities, or predictive analysis, data mining, sentiment analysis and streaming data.

To summarize, life in the fast lane is never NULL.
