8/29/2015

From Reporting to Hadoop to Machine Learning to Black Box Algorithms

Back in the day, the DBA was the gatekeeper to the data.  Very tight access.  They had the keys to the kingdom.

Programmers were granted access to the applications, reluctantly.  First client server, then web and now mobile.

And then there were the report writers, or Business Intelligence people.  DBAs never liked them much.  Always pulling reports, slowing down the database, causing locks.  Writing crappy code.

Enter Hadoop.  I didn't see many, if any, DBAs making the leap into Hadoop.  Since they don't like writing SQL, and they don't program in Java, Python or Scala, or work in Hive or Pig, why would they like Hadoop?

Report Writers - Business Intelligence - Data Warehouse people, are they getting into Hadoop?  Some.  They have an understanding of the data, although unstructured and semi-structured data is a bit foreign.  They can mount the data into SQL tables in Hive and access it from a variety of tools, including ODBC, OData feeds, Power BI, etc.  Just turn the data into relational form and they're off and running.  But they don't necessarily program in Java, Python or Scala, from what I've seen.
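To make "turn the data into relational" a bit more concrete, here is a minimal sketch in Python.  It is not Hive: the standard library's sqlite3 stands in for the SQL layer, and the records and field names (user, event, amount, device) are made up for illustration.  The idea is only that once ragged, semi-structured records are flattened into fixed columns, a plain SQL query takes over.

```python
import json
import sqlite3

# Hypothetical semi-structured records: the fields vary from line to line.
RAW_LINES = [
    '{"user": "ann", "event": "login", "device": {"os": "ios"}}',
    '{"user": "bob", "event": "purchase", "amount": 19.99}',
    '{"user": "ann", "event": "purchase", "amount": 5.00, "device": {"os": "ios"}}',
]


def flatten(line):
    """Map one JSON record to a fixed (user, event, amount, os) row.

    Fields that are missing from the record become None (NULL in SQL).
    """
    rec = json.loads(line)
    return (
        rec.get("user"),
        rec.get("event"),
        rec.get("amount"),
        rec.get("device", {}).get("os"),
    )


def load_events(lines):
    """Load the flattened rows into an in-memory table (sqlite3 standing in for Hive)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events (user TEXT, event TEXT, amount REAL, os TEXT)")
    conn.executemany(
        "INSERT INTO events VALUES (?, ?, ?, ?)",
        [flatten(line) for line in lines],
    )
    return conn


def purchases_by_user(conn):
    """An ordinary relational query, the kind a report writer already knows."""
    cur = conn.execute(
        "SELECT user, SUM(amount) FROM events "
        "WHERE event = 'purchase' GROUP BY user ORDER BY user"
    )
    return cur.fetchall()


if __name__ == "__main__":
    print(purchases_by_user(load_events(RAW_LINES)))
```

In a real Hadoop setup the flattening step would more likely live in a Hive table definition or an ingest job, but the hand-off is the same: ragged records in, rows and columns out.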

So if the main players in the data space, the DBAs and the Report Writers / BI / Data Warehouse people, don't have a clearly defined entryway into the world of Hadoop, that could explain why there was more hype than real-life Hadoop growth.

From what I saw, the people working with Hadoop were out of college, at startups, or at major companies with the financial resources to bring in top developers.  Perhaps now more companies are jumping on the bandwagon.

It used to be, "What is Hadoop?"

Then, "What do I do with Hadoop?"

Now, Hadoop is another tool in the tool chest, for working with data.

The "shiny" newness wore off.

For companies entering the Hadoop space, I suppose they need a real world question to answer, a use case.  Then create a Hadoop sandbox, obtain a developer and admin, and start ingesting data.  Mash it around.  Look for insights.  Use those insights to run the business.

Reporting has always been past tense: the numbers explained what happened.

The hot thing now is Machine Learning, algorithms and statistical analysis.  This allows forward thinking: predictive analysis and forecasting.
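The difference between past tense and forward thinking can be shown in a few lines.  This is a rough sketch in plain Python, with made-up monthly sales figures: the report sums what already happened, and the forecast fits a straight line (ordinary least squares) and extends it one month ahead.  Real predictive work uses far richer models; the function names here are only for illustration.

```python
# Hypothetical monthly sales figures, made up for illustration.
MONTHLY_SALES = [100.0, 110.0, 120.0, 130.0, 140.0, 150.0]


def report_total(sales):
    """Past tense: add up what already happened."""
    return sum(sales)


def fit_line(ys):
    """Least squares fit of y = slope * x + intercept, with x = 0, 1, 2, ...

    Needs at least two data points.
    """
    n = len(ys)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def forecast_next(ys):
    """Forward looking: extend the fitted line one period past the data."""
    slope, intercept = fit_line(ys)
    return slope * len(ys) + intercept


if __name__ == "__main__":
    print("Report, total so far:", report_total(MONTHLY_SALES))
    print("Forecast, next month:", forecast_next(MONTHLY_SALES))
```

With these figures, which grow by exactly 10 each month, the report says 750.0 was sold and the forecast says to expect 160.0 next month.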

Looking forward, algorithms will become a commodity.  Just pick one off the shelf, integrate it into your app, and you're off and running.

There are already sites out there that let you leverage existing black box algorithms.

You can't expect people with 20 years of experience in IT to magically produce a PhD; it's impossible.  The train left the station two decades ago.  Instead, bring the complexity down to a level that Data Professionals can work with.

That's how I see it.  Time will tell.
