We as humans have five senses, not including intuition: sight, hearing, taste, smell and touch (from http://en.wikipedia.org/wiki/Sense).
Data is a thing. It can't be heard, tasted, smelled or touched. It can only be seen. That's very limiting, considering how much depends on it. Because data can only be viewed, our brains have a limited ability to grasp it, and so our understanding of it may be limited.
However, I believe data is similar to energy. Like energy, it can be stored, it can remain at rest, or it can be transferred. So databases are simply storehouses for units of energy called data.
And because the data is at rest, it's potential energy, waiting to be released and consumed. Issuing a SQL query is like exerting a gravitational pull: a copy of the energy is sent to the requester, thus releasing it into the wild.
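To make the "copy is sent to the requester" idea concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table name, column names and values are made up for illustration; the point is only that a query hands back a copy while the stored data stays at rest.

```python
import sqlite3

# An in-memory database stands in for the "storehouse".
# The table and its contents are hypothetical, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (id INTEGER PRIMARY KEY, value REAL)")
conn.executemany("INSERT INTO readings (value) VALUES (?)", [(1.5,), (2.5,), (4.0,)])
conn.commit()


def fetch_values(connection):
    """Run a SELECT and return the values as a plain Python list."""
    rows = connection.execute("SELECT value FROM readings ORDER BY id").fetchall()
    return [row[0] for row in rows]


copy_of_data = fetch_values(conn)   # the requester receives a copy
copy_of_data.append(99.0)           # changing that copy...
stored = fetch_values(conn)         # ...leaves the data at rest untouched

print("requester's copy:", copy_of_data)
print("still stored:    ", stored)
```

The requester can do whatever it likes with its copy; only an explicit INSERT, UPDATE or DELETE would change what sits in the storehouse.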
The speed at which that energy / data is transferred depends on the specifications of the host server, the amount of RAM, and the network connecting the two entities. The size of the data also makes a difference. In its simplest form, data is a series of bits and bytes. There is work today on storing it at the scale of a single atom.
Which is good, because the volume of data is accumulating at a phenomenal pace. We're going to need somewhere to store this energy so it can be accessed, interrogated and consumed.
Data can be stored in a variety of formats. There's no universal standard dictating how it should be stored, and this poses problems downstream. It would be nice to have a standard for storing public data. That would require a universal storage definition language that data professionals adhere to. However, the rise of unstructured data complicates matters. How do you store pictures: in which standard format, at what pixel level, and so on? Obviously this would be a difficult task, but a standard could be introduced and then enforced from a specific point in time, after which people must comply. Right now it's a free-for-all, and mixing data with foreign data sets is extremely difficult.
Data shouldn't be monogamous. It should have multiple partners. It should be able to see and communicate with all sorts of other data, to form relationships and establish patterns based on fuzzy logic. And that data should form a network. Data should carry built-in descriptive information about what it consists of: where it originated, by whom, when, how and why. Other characteristics should be embedded within it as well. Then data can communicate with other data: "Hey, I'm a piece of data with these characteristics. You there, do you have similar characteristics? If so, let's create a bond between us, so other people can find the link between us for later use." It's that relationship between the data that has the real value. Graph technology could form networks of similar data points to find patterns not visible to the naked eye.
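Here is a rough sketch of what self-describing data forming bonds might look like. Everything in it is an assumption for illustration: the Record fields, the sample records, and the 0.5 threshold are all made up, and plain Jaccard overlap of tags stands in for the fuzzy logic mentioned above rather than implementing it.

```python
from dataclasses import dataclass, field
from itertools import combinations


@dataclass(frozen=True)
class Record:
    """A piece of data that carries its own description (hypothetical fields)."""
    name: str
    origin: str        # where it came from
    created_by: str    # by whom
    created_on: str    # when
    tags: frozenset = field(default_factory=frozenset)  # other characteristics


def jaccard(a, b):
    """Share of characteristics two records have in common, from 0.0 to 1.0."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def build_links(records, threshold=0.5):
    """Bond every pair of records whose tag overlap reaches the threshold."""
    links = {r.name: set() for r in records}
    for left, right in combinations(records, 2):
        if jaccard(left.tags, right.tags) >= threshold:
            links[left.name].add(right.name)
            links[right.name].add(left.name)
    return links


records = [
    Record("sales_2015", "crm", "alice", "2016-01-04",
           frozenset({"sales", "customer", "region"})),
    Record("leads_2015", "web", "bob", "2016-01-05",
           frozenset({"customer", "region", "campaign"})),
    Record("sensor_log", "plant", "carol", "2016-01-06",
           frozenset({"temperature", "device"})),
]

links = build_links(records)
for name, partners in links.items():
    print(name, "->", sorted(partners))
```

The resulting dictionary is a tiny adjacency list, which is the same shape of data a graph database would hold at much larger scale. Here the two customer data sets bond with each other, and the sensor log finds no partner.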
Going forward, models could interrogate those relationships and build a storehouse of information, thus training the model for true artificial intelligence. Once the model is trained, it can be used for a variety of purposes. It can grow in intelligence, make choices based on probabilities and past experience, and learn new things which get added to its memory banks.
And one day, that model could exhibit human-like qualities, thoughts and feelings, and eventually become a conscious life form. The birth of a new being: smart, clever, with a good memory and the ability to learn, to evolve and to grow.
But in order for this to happen, data needs to have standards, data needs to be self-describing, and data needs to form links to other data based on similarities. Those common links can then be used to build models, which can be trained to achieve true artificial intelligence. I think it's possible.