When I hopped aboard the Hadoop "express" around 2012, I really thought it would take over the world of data. There was so much hype and talk about expectations, progress, and adoption. Yet when speaking with industry folks, I found there weren't many use cases in production. Hadoop was mostly used as a storehouse to keep all the data, at relatively low cost, across a cluster of commodity hardware.
And as we know, it was primarily Java back then, and Java folks weren't traditionally Data folks, and Data folks didn't know or care to know Java. Catch-22. Add the administration side, managing volumes of hardware infrastructure plus the software required to keep the thing running, and there was no easy path for legacy SQL developers into the world of Hadoop.
And the hype continued: lots of new players, connectors, and new Apache products. Then the third of the top three vendors providing proprietary Hadoop experts and consultants sort of dropped off. And now it appears the two rivals, Cloudera and Hortonworks, have teamed up in a buyout / merger, consolidating the industry's expert Hadoop vendors.
Keep in mind, there's also a cloud-based version of Hadoop from Microsoft, called HDInsight, and Microsoft contributes open source code to the Apache projects.
I guess everyone is waiting to see the next steps in the life cycle of Hadoop: how customers will react, and how people will know which flavor to run, maintain, and support going forward. All in all, Hadoop is a great product, designed for a specific purpose: to store large volumes of data across commodity hardware, with fault tolerance and integration with other products. However, it never matured to the point of handling up-to-the-minute, transaction-level production databases, as in inserts, updates, and deletes in real time. Therefore, Hadoop is now another tool in the Data Tool Belt. A great product, for specific use cases.
And so it goes~!