It seems like open source applications are the mainstream today. So many new products are delivered through the Apache Software Foundation. Some do this, some do that. The communities of open source developers and committers range from small to mid-size to large. They govern themselves, ship scheduled releases, and anyone can contribute. It kind of makes the traditional software shop model seem legacy.
I remember working with the Microsoft Access database. The cool thing about it was that you could quite easily create some tables, load the data, or connect a Visual Basic app or Classic ASP app to insert, update and delete records. You could create Queries or Views and connect them to Reports, or go out through ODBC to, let's say, a Crystal Report, or build custom reports right in Access. You could connect to other data sources, such as Oracle, SQL Server, Excel, CSV or text files. You could also write custom VBA code, and call Access from Visual Basic through OLE. It did everything.
So I was thinking about the Internet of Things. You basically have a device out in the wild. Maybe a thermostat in your home, accessible from the cloud, that controls the room temperature.
How does that work? Well, it probably has a sensor in the device. And that hardware probably has an operating system. And a BIOS. And a version. That operating system hosts one or more applications, which have install dates, versions, file dependencies like DLLs and some APIs for remote calls.
And the device needs to capture metrics, store them locally, and either push them to the network or let the network pull them. So the device should have access to a network through a network card, a wireless connection or radio frequency.
That way packets of data can be transported to and from the device and tracked from the central hub. The hub can take heartbeat readings to ensure the device is active and verify it has a power supply. It can push updates to the software, the hardware BIOS and the network configuration, push application bug fixes, and make remote calls to sleep, reboot or shut down.
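The heartbeat idea above could be sketched roughly like this. It's only a minimal illustration: the `Hub` class, the 60 second timeout and the device names are all made up for the example, and time is passed in as a plain number so the logic is easy to follow.

```python
from dataclasses import dataclass, field


@dataclass
class Hub:
    """Hypothetical central hub that remembers when each device last checked in."""
    timeout_s: float = 60.0                      # assumed cutoff, not from any real product
    last_seen: dict = field(default_factory=dict)

    def heartbeat(self, device_id: str, now: float) -> None:
        # Record the time of the latest heartbeat from this device.
        self.last_seen[device_id] = now

    def inactive(self, now: float) -> list:
        # A device counts as inactive once its last heartbeat is older than the timeout.
        return sorted(d for d, t in self.last_seen.items() if now - t > self.timeout_s)


hub = Hub(timeout_s=60.0)
hub.heartbeat("thermostat-1", now=0.0)
hub.heartbeat("thermostat-2", now=50.0)
print(hub.inactive(now=90.0))
```

A real hub would follow up on the inactive list with a reboot call or an alert. Here it just reports which devices went quiet.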
The data captured by the central hub should be stored in some repository like a database or big data ecosystem. And there could be some streaming mechanism to watch the data as it flows in and trigger alerts downstream.
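As a rough sketch of that streaming step, a Python generator can stand in for the stream and raise an alert whenever a reading crosses a threshold. The `alerts` function, the 30 degree limit and the sample readings are assumptions for illustration only. A real setup would sit on top of a proper streaming platform.

```python
from typing import Iterable, Iterator, Tuple


def alerts(readings: Iterable[Tuple[str, float]], max_temp_c: float) -> Iterator[str]:
    """Yield an alert message for each reading above the (assumed) threshold."""
    for device_id, temp_c in readings:
        if temp_c > max_temp_c:
            yield f"ALERT {device_id}: {temp_c:.1f} C exceeds {max_temp_c:.1f} C"


# Made-up readings arriving at the hub as (device_id, temperature) pairs.
stream = [("thermostat-1", 21.5), ("thermostat-2", 35.0), ("thermostat-1", 22.0)]
fired = list(alerts(stream, max_temp_c=30.0))
for message in fired:
    print(message)
```

Because it is a generator, the readings are checked one at a time as they arrive, rather than after everything has landed in the repository.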
So that's basically the IoT ecosystem in rudimentary form.
So if we look at Hadoop for a moment, it's a parallel processing system with data distributed across a series of commodity servers. There are processes, daemons and jobs that keep track of where things are, what's running at the moment or scheduled to run, whether servers have gone down or data has been corrupted, etc. They run in a distributed network, send jobs out to the specific locations where the data lives, run them in parallel, and join the pieces together to return result sets. However, those servers are all joined in one specific network.
So what if we imagined the IoT devices out in the field as nodes in an external network, as in decentralized? The central hub knows the location of each device and its status, and communicates periodically to issue commands, pull data or maintain contact. That's sort of like a distributed architecture of commodity hardware connected through some network device. It's like a Hadoop system, but decentralized.
So what if we issued a command, similar to Map/Reduce? The job gets submitted and pushed out across the network to perhaps a million devices, where it grabs the information requested. That data is then reduced, and the result set is brought back to the central hub. It's basically a micro Hadoop, decentralized, running in parallel across multiple hardware devices. Think commodity servers, except they're actually sensors in devices in things.
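Here's a small sketch of that Map/Reduce idea, under some loud assumptions: the devices are faked with an in-memory dictionary, a thread pool stands in for the network, and the "job" is just a fleet-wide average temperature. None of the names come from Hadoop or any real IoT platform.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce

# Made-up fleet: each device holds its own local readings.
DEVICES = {
    "thermostat-1": [20.0, 22.0],
    "thermostat-2": [18.0],
    "thermostat-3": [25.0, 23.0, 24.0],
}


def map_on_device(device_id):
    # "Map" step: runs where the data lives and returns a small partial result.
    readings = DEVICES[device_id]
    return (sum(readings), len(readings))


def combine(a, b):
    # "Reduce" step: merge two partial (sum, count) pairs.
    return (a[0] + b[0], a[1] + b[1])


def fleet_average(device_ids):
    # The thread pool is a stand-in for pushing the job out to devices in parallel.
    with ThreadPoolExecutor(max_workers=4) as pool:
        partials = list(pool.map(map_on_device, device_ids))
    total, count = reduce(combine, partials, (0.0, 0))
    return total / count if count else None


avg = fleet_average(list(DEVICES))
print(avg)
```

The point of returning (sum, count) pairs instead of raw readings is that each device only sends back a tiny summary, which matters a lot more with a million devices than with three.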
Picture something like a self-contained Access database residing within each IoT device in the wild. It can store data, connect to other data, run queries, maybe display local reports, all running on an operating system with a connection back to the central hub.
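To make the "little database on every device" idea concrete, here's a minimal sketch using SQLite through Python's built-in `sqlite3` module. SQLite is my stand-in for Access here, and the table layout and function names are invented for the example.

```python
import sqlite3


def open_store(path=":memory:"):
    # An in-memory database for the demo; a device would use a file on local storage.
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE IF NOT EXISTS readings (ts INTEGER, temp_c REAL)")
    return conn


def record(conn, ts, temp_c):
    # Store one sensor reading locally on the device.
    conn.execute("INSERT INTO readings (ts, temp_c) VALUES (?, ?)", (ts, temp_c))
    conn.commit()


def local_report(conn):
    # A small "local report": the kind of summary the hub might pull on request.
    row = conn.execute(
        "SELECT COUNT(*), MIN(temp_c), MAX(temp_c), AVG(temp_c) FROM readings"
    ).fetchone()
    return dict(zip(("count", "min_c", "max_c", "avg_c"), row))


store = open_store()
for ts, temp_c in [(1, 20.0), (2, 22.0), (3, 24.0)]:
    record(store, ts, temp_c)
report = local_report(store)
print(report)
```

The device keeps the raw rows and answers queries locally, so the hub only has to ask for the summary.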
Sort of like a Hadoop for IoT. Open sourced. With a team of developers and committers. For others to leverage and integrate into current and new solutions.