2/27/2014

Public Data Identified by Namespaces to Create Neural Network #AI

What does the future of data look like?

Well for one thing, the past is not indicative of future behavior.

We've had data for a while now.  Flat files.  Relational Data.  Three Dimensional Cube data.  Large sets of data.

We have grown the size of the data.  And the complexity of the data.  And the format of the data.

However, if you add it all up, what's the one thing that's still missing?

Connected Data.  As in a Neural Network.  As in a million and a half "Data Silos".  Data not talking to each other.

Partly due to legacy infrastructures.  Vendors not communicating with other systems.  Proprietary data not viewable outside Organizations.  Inability to merge complex data sets between internal systems.  Many reasons.

What we need is a way to have data communicate with other data.  And how do we do that?

Expose the data to outside calls, similar to Web Services.  You give your data a uniquely qualified namespace.  You expose the Data Web Service to a Public list so people can find your data, and you specify a protocol so they can communicate and pull data on the fly.
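As a rough sketch of what publishing to such a list might look like, here is a minimal registry record in Python. The record fields, the registry itself, and the example URL are all assumptions for illustration, not an existing service or standard.

```python
def make_registry_record(namespace, description, wsdl_url):
    """Build a discovery record that a hypothetical public data
    registry could index so others can find and query the data."""
    return {
        "namespace": namespace,      # globally unique name for the data set
        "description": description,  # free-text summary for search
        "wsdl": wsdl_url,            # contract describing the data service
        "protocol": "SOAP",          # how consumers communicate and pull data
    }

record = make_registry_record(
    "UnitedStates.Florida.SafetyHarbor.BloomConsultingBI",
    "Customer demographics for BI reporting",
    "https://example.com/data?wsdl",
)
```

A search front end could then index these records by namespace and description, much like a web service directory.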

And you modify the way in which you store the data.  Instead of having a field called FirstName (a varchar string), you qualify it as "UnitedStates.Florida.SafetyHarbor.BloomConsultingBI.FirstName".  In addition, you add timestamps for Date Created, Date Modified, and Date Deleted.  The database would need to expand to allow an XML-like field to describe the data.  That way, anyone querying the data set can derive the necessary information.  That info would be contained in the exposed Public WSDL, similar to Web Service standards: http://www.w3.org/TR/wsdl.
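The self-describing field idea above could look something like this. The tag names and structure of the XML-like description are invented for the sketch; only the namespace-qualified field name and the three timestamps come from the post.

```python
from datetime import datetime, timezone

def describe_field(namespace, field, sql_type):
    """Produce an XML-like description of a namespace-qualified field,
    including created/modified/deleted timestamps.  Tag names here are
    illustrative, not a real standard."""
    qualified = f"{namespace}.{field}"
    now = datetime.now(timezone.utc).isoformat()
    return (
        f"<field name='{qualified}' type='{sql_type}'>"
        f"<created>{now}</created>"
        f"<modified>{now}</modified>"
        f"<deleted/>"  # empty until the value is soft-deleted
        f"</field>"
    )

xml = describe_field(
    "UnitedStates.Florida.SafetyHarbor.BloomConsultingBI",
    "FirstName",
    "varchar(50)",
)
```

Anyone pulling the data could parse this description to learn what the field is, where it came from, and when it last changed.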

So let's say I'm a Data Professional working at a job, and I need to query data.  I search the Public Data Records, enter the type of data I'm looking for, and it returns the top 100 sources.  I connect to the appropriate one using some type of SOAP XML SQL standard and join my internal data with the Public data (Cloud) in the SQL statement.  Instead of using the four-part naming convention, Server.Database.Owner.Table, I qualify it further with the namespace: Namespace.Server.Database.Owner.Table.  Then I run the query on demand, once I get past the authentication of the host Cloud data set.
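The five-part naming convention might be assembled like this. The server, database, and table names below are made up, and the join itself is only a string sketch; an actual implementation would need a driver that understands the extended name.

```python
def five_part_name(namespace, server, database, owner, table):
    """Extend SQL Server's four-part name (Server.Database.Owner.Table)
    with a leading public namespace.  Purely illustrative."""
    return f"{namespace}.{server}.{database}.{owner}.{table}"

# Hypothetical remote data set discovered via the public registry.
remote = five_part_name(
    "UnitedStates.Florida.SafetyHarbor.BloomConsultingBI",
    "CloudServer01", "SalesDB", "dbo", "Customers",
)

# Join local data against the namespace-qualified public table.
query = f"""
SELECT l.OrderId, r.FirstName
FROM LocalDB.dbo.Orders AS l
JOIN {remote} AS r ON r.CustomerId = l.CustomerId
"""
```

The extra leading segments are what let the query engine route the request out to the host Cloud data set instead of a local linked server.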

If that were to happen, you would have real time data talking to data in a Public Data Network.

And after that, the Neural Networks would have already mapped all the Public data sets and have access to them, and while crunching Machine Learning algorithms, they would be smart enough to query those data sets on the fly without the need for human intervention.

And that would help create a true Artificial Intelligence system running 24/7, learning unassisted, gathering up all the available data to mine, predict, and serve as a supercomputer storehouse of information.

I believe IBM is already moving in this direction with its renowned "Watson" computer.

I blogged about this concept of Namespaces back in November of 2012: Namespaces for Public Community Data

and here: Public #BigData Services Exposed on the Web

This is a possible scenario in the not-so-distant future.  Public Data available for on-the-fly querying, enabling Super Computers to practice true Artificial Intelligence in a giant Neural Network.

If you add RFID to the mix you have a true Internet of Everything.

Make it so, Number One!
