11/27/2015

We Need to Automate Everything, Artificial Intelligence to the Rescue

Artificial Intelligence is a tough nut to crack.  The goal is to simulate human behavior.

So let's say we train a neural network to learn the behaviors, traits and characteristics of humans.

For example, we teach the machine the process of dispensing coffee at a coffee shop.  We train the model, then test, score and evaluate the results.
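
As a minimal sketch of that train-test-evaluate loop, here's a hypothetical example using scikit-learn; the features, data and satisfaction label are all invented for illustration:

```python
# Hypothetical sketch: train a model on past coffee orders and score it.
# The features, data and labels here are invented for illustration.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)

# Fake order features: [wait_minutes, order_complexity, barista_id]
X = rng.random((500, 3))
# Fake label: 1 = customer satisfied, 0 = not satisfied
y = (X[:, 0] < 0.7).astype(int)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = MLPClassifier(hidden_layer_sizes=(16,), max_iter=1000, random_state=0)
model.fit(X_train, y_train)                           # train
print(accuracy_score(y_test, model.predict(X_test)))  # test, score, evaluate
```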

And we find some startling results.  Although the process of taking a customer's order shouldn't vary, the results are all over the map.

Some customers are not satisfied with their drinks, some are oblivious and some are very satisfied.  But why is that?  If the process never varies, in theory we should see consistent results every time.

So if we dive a little deeper, we find some interesting things.  First, there are many variables.  Some are known, others not known.

For example, client A got decaf coffee when he ordered caffeinated.  Let's review the tape.  Well, barista 1 took the order, processed the order and delivered the order.  According to the tapes, they performed their job as expected.  Let's ask barista 1 a few questions.

Did you perform your job as expected?  Yes, the order was taken in a timely manner, processed as expected, and delivered in a timely manner.  Yes, but you gave decaf when they ordered caffeinated.  Oh really?  I'm sorry, my bad.

This could be chalked up to many factors, some obvious.  Overworked.  Not paying attention.  Read the order incorrectly.  Or it could be something else.

Let's look for the non-obvious.  Perhaps the client looked like the barista's ex, a chance to get back at them.  Perhaps the barista is late on the rent, drank too much last night, and isn't focused on the job.

Perhaps they are unknowingly upset with themselves about their life situation: they should have studied more in school, and now they're in a dead-end job.  We'll make sure everybody is as miserable as me.  How about a nice hole in the bottom of the cup, so you can enjoy the coffee dripping when you get to your car?  How about we forget to include the donuts with your order?

Or the opposite, how about we give this person exceptional service?  Because they look like me or they have similar tastes or interests.  

Let's apply this rule to everything, everywhere: let's vary our behavior and service to our clients based on internal biases, prejudices, beliefs and opinions.

Let's vary the outcome based on slight deviations from the actual process.

First off, many of these incidents don't get reported.  Second, nobody investigates why.  Third, the person may not even be aware of their errors.  So nothing gets corrected.

This scenario could be described as a black box.  And we typically don't know what happens in a black box.  Magic.

Yet if the process is identical and the results vary greatly, we must account for the variations of end results.  We need to explore the black box a little deeper.

And when we look closer, we see people performing their job descriptions as expected, yet with a bit of undocumented wiggle room.  Which results in varying outcomes.

Loose lips sink ships.  And varying from the process skews results.  And the results are based on undocumented internal bias.

Once we identify the patterns, the trick is to correct them.  How?  

Implement guidelines such that there is no wiggle room for error.  Remove the openings for deviation.  By following the prescribed pattern of actions, you can consistently provide results within expected statistical ranges.


Barista: Listen, we know you are paying the same amount for your coffee, but I don't care for you, so I'm going to vary your order just slightly so that you get substandard service or product.

Personally, I find this scenario to be widespread and prevalent in our society.  You can apply this formula to just about everything, everywhere.  Yet hardly anybody knows it exists or is aware of it.  A silent, invisible signal affecting the outcome of everything.

I find it to be the source of chaos, confusion, corruption, and a vehicle to spread negativity into the world.

When duplicating human behavior, we need to dive into the behaviors and processes to investigate.  We have to account for the variations in results.  And look at the reasons behind them.  I believe those deviations stem from bias and discriminatory factors within the system, undocumented and unaccounted for, yet present.

We need to remove these variations from the equation.  Through process: documentation, workflows, business rules, implemented through technology and algorithms.  To remove the unknowns.  For consistent patterns and results.

Then monitor the results and modify to self-correct on the fly.
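
As a toy sketch of what that could look like in code, assuming an invented coffee-order workflow and a made-up satisfaction metric: the process leaves no room for per-order improvisation, and a monitor flags when results drift outside the expected range.

```python
# Toy sketch: a codified workflow with no wiggle room, plus monitoring.
# The steps, thresholds and data are invented for illustration.
from statistics import mean

WORKFLOW = ["take_order", "brew_exact_roast", "verify_order_matches_ticket",
            "deliver_order"]

def process_order(order):
    """Every order runs the exact same documented steps, in order."""
    for step in WORKFLOW:
        order["log"].append(step)   # no undocumented deviation possible
    return order

def monitor(satisfaction_scores, expected=0.9, tolerance=0.05):
    """Flag drift so the process can self-correct on the fly."""
    observed = mean(satisfaction_scores)
    if abs(observed - expected) > tolerance:
        return f"ALERT: satisfaction {observed:.2f} outside expected range"
    return "OK: results within expected statistical range"

order = process_order({"item": "caffeinated coffee", "log": []})
print(order["log"])
print(monitor([0.95, 0.92, 0.88, 0.91]))
```

The point of the sketch is the design choice: the behavior lives in the documented workflow, not in the worker's head.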

By streamlining processes, we can ensure quality service across the board.  This should reduce the cost of correcting problems downstream and improve customer retention.

If we matched human behavior 100%, we'd need a function to simulate stupidity, biases, prejudices, hatred, vengeance, evil, etc.

We need to automate everything.  Remove the human factor, which is tainted, biased, prejudiced and inconsistent.  I think this is the key to fixing a lot of systemic problems in every aspect of everything on the planet.

Automate.  Remove bias.  Streamline.  Measure.  Repeat.

And the vehicle to automate this is called Artificial Intelligence.

11/23/2015

The Artificial Intelligence Gap

In the world of Artificial Intelligence, we have this image of computers rising up, dethroning humanity as the alpha species on the planet.

They call it the Singularity.  The moment computers wake up.  Become conscious.  Also known as Strong AI.

It could happen.  But probably not for a while.

So where are we today?  We have gotten pretty good at identifying patterns in data through a technique called Supervised Learning.  We can read mounds of data, where the data points are sufficiently labeled, and learn mappings between inputs and outcomes as paired values.  Those mappings are then used as the basis for predictions about future behavior.
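
As a minimal sketch of that labeled-pairs-to-predictions pipeline, here's an example using scikit-learn and its bundled iris dataset (my choice of illustration, not anything from the post):

```python
# Minimal supervised learning sketch: labeled pairs in, predictions out.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)          # features paired with labels
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)                # learn the input-to-label mapping
print(model.score(X_test, y_test))         # accuracy on unseen examples
```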

We also have narrow or weak AI, which is basically Siri or Cortana.  Personal assistants that can return information based on voice recognition commands, kind of like Star Trek.

The part that is most difficult is Unsupervised Learning.  Throwing data at an algorithm and having it deduce value and insights without human intervention or labels attached to the data.
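
Clustering is the classic unsupervised example: the algorithm groups unlabeled points on its own.  A small sketch, again assuming scikit-learn and synthetic data:

```python
# Unsupervised learning sketch: no labels, the algorithm finds structure.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# Unlabeled data: two blobs of points, but we never tell the algorithm that.
data = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(5, 1, (50, 2))])

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(data)
print(kmeans.labels_[:5], kmeans.labels_[-5:])  # groups discovered from data alone
```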

There has been considerable progress in AI machines learning by teaching themselves in a simulated environment.  However, those AI systems are basically experts in a single domain, not multiple domains.  Likewise, the environmental variables within their world are limited to some degree.

So we have advanced the field of Artificial Intelligence a lot since the 1950s, and since the AI Winter and funding drought of the 1980s.  In the 2000s we've witnessed cheaper hardware, better processing power, advanced algorithms and better storage capacity.  As well as advanced AI software in the hands of mere mortals, as in not PhD academia.

Will computers rise to the point that they displace humans?  Not yet.  But I wouldn't count them out.

New Azure Virtual Machine Preconfigured for Data Science

Microsoft Azure is a platform to do just about anything.  They have so many software tools to choose from, it's an entire ecosystem on the web.

Since I've been programming on the Microsoft stack since Visual Basic 4.0, I'm a definite fan.

Today there was a blog post: http://blogs.technet.com/b/machinelearning/archive/2015/11/23/announcing-the-availability-of-the-microsoft-data-science-virtual-machine.aspx introducing a new product, a Machine Learning pre-configured Virtual Machine.

So naturally I logged on, and proceeded to create the new VM:

[Screenshot: creating the new Data Science Virtual Machine in the Azure portal]

Deploying:

[Screenshot: deployment in progress]

Now we start the Virtual Machine:

[Screenshot: the Virtual Machine up and running]

So we're up and running.  And it appears this new Virtual Machine has lots of Data Science goodies to get started with.

Including an Azure Storage application to move files up to Azure Storage, and Power BI Desktop for reporting and visualizations, as well as connecting to different data sources and massaging the data.

It's got the Revolution R application.  SQL Server 2014 limited edition.  Visual Studio 2015.  Python.  And links to Azure.  And PowerShell.

So if you need a server in a hurry, this is a great option.  It's funny though, I already have every application installed on my laptop.

So there you have it~!

11/20/2015

How to Solve the Rubik's Cube

If a tree falls in the woods, and nobody's around to hear it, did the tree actually fall?

Good question.  Maybe it did.  Maybe it didn't.

When I was in the 5th or 6th grade, I solved the Rubik's cube.  Some people knew, some didn't.

When I moved to Florida, nobody knew I could solve it.  It basically never happened.  In fact, they placed me in remedial math, with students who couldn't add or subtract.

That was a real treat.  In fact, I basically went the next 5 years without speaking or raising my hand to contribute in class, although I was on the honor roll every quarter.

Not one teacher made an effort to know a thing about me.  And I didn't offer any information to the contrary.

I would say: if a kid can solve the Rubik's cube and nobody's around to see it, did it ever happen?

Well, yes actually.  Here's a video I created on how to solve the cube:

[Video: how to solve the Rubik's Cube]

So if any of my teachers from middle or high school are still around, have a watch.  This would probably be the first time you realize what my voice sounds like.

And so it goes~!

PS. As a follow-up, I purchased a copy of Camtasia Studio on Cyber Monday at 30% off.  Good stuff~

11/16/2015

Internet of Things through Azure Streaming Data and Azure Event Hubs

After getting carried away by the Big Data hype, I was more than skeptical about the Internet of Things.

Sensors in every piece of equipment.  Sure, but how does that affect me?  I only work with the data.   I don't build washing machines or have access to sensors.

True, but from a data professional's perspective, the data is where it's at.  And where is the data?  It's going to be everywhere.  From everything.

Little pulses of information.  Sent continuously.  From just about anything.  How will the data get transmitted?  Well, it sounds like radio frequencies connected to a local hub that routes the messages on to a central hub.

Could be millions of incoming messages per second.  And there are tools to assist with the streaming of these messages.  One is actually called Streaming Analytics, in Azure, from Microsoft.  An analytics platform to watch the data in flight.  Look for anomalies, send alerts, etc.
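
To make "watching the data in flight" concrete, here's a toy Python stand-in, not the actual Streaming Analytics service: it watches a simulated stream of sensor pulses and flags readings that drift from a rolling average.  The device name, window size and threshold are all invented.

```python
# Toy stand-in for stream monitoring: flag anomalies in simulated sensor pulses.
# The sensor, window size and threshold are invented for illustration.
import random
from collections import deque

window = deque(maxlen=20)   # rolling window of recent readings

def on_message(device_id, reading):
    """Process one in-flight message; alert if it deviates from recent history."""
    if len(window) == window.maxlen:
        avg = sum(window) / len(window)
        if abs(reading - avg) > 3.0:          # crude anomaly rule
            print(f"ALERT {device_id}: {reading:.1f} vs rolling avg {avg:.1f}")
    window.append(reading)

random.seed(0)
for i in range(200):
    value = random.gauss(70.0, 1.0)           # normal pulses around 70
    if i == 150:
        value = 90.0                          # injected anomaly
    on_message("washer-42", value)
```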

And how will these messages find their way?  Event Hub.  Another Azure product.  Similar to Service Broker, which is a message-queue-like feature baked into SQL Server: it doesn't require the source or destination to be in the same network, and delivery is guaranteed, in the order in which messages were sent.  Event Hub is built on a concept called "partitions", designed to handle the blocking issues seen with Service Broker.
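
The partition idea can be sketched in a few lines: each sender key hashes to a fixed partition, so a busy or blocked partition doesn't hold up the others.  A simplified illustration, not the real Event Hub internals:

```python
# Simplified illustration of partitioned routing (not the real Event Hub internals).
from hashlib import md5

NUM_PARTITIONS = 4
partitions = {p: [] for p in range(NUM_PARTITIONS)}

def route(device_id, message):
    """Same device always lands in the same partition; partitions work independently."""
    p = int(md5(device_id.encode()).hexdigest(), 16) % NUM_PARTITIONS
    partitions[p].append((device_id, message))
    return p

for device in ["washer-1", "dryer-7", "fridge-3", "washer-1"]:
    print(device, "->", "partition", route(device, {"status": "ok"}))
```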

The data professional will set up the infrastructure on the Azure side, and then, once the data starts flowing in, there's lots to identify in the streaming data.  The incoming messages may not be verbose; it's like being eaten alive by little mosquitoes, tiny, yet when combined into a swarm, potent.

Now I see the potential impact of the "Internet of Things".  Monitoring millions of devices in real time, 24/7.  Processing each message and detecting anomalies, then triggering some kind of action downstream.

So perhaps Big Data wasn't all hype after all.  Because these little messages of streaming data from IoT will surely amplify our data, exponentially.

And there you have it~!

11/15/2015

What is Cortana Analytics Suite from Microsoft Azure?

Microsoft Azure has many tools for developers.  One of the new suites to appear is the Cortana Analytics suite.  It's comprised of multiple technologies, some you may already know.

Cortana Analytics is a combination of:
  • Azure SQL Database
  • Azure HDInsight
  • Azure Machine Learning
  • Azure Streaming Analytics
  • Azure Event Hub
  • Azure Data Lake Analytics
  • Azure Data Factory
  • Azure Data Catalog
  • PowerBI
Many of you are probably familiar with Azure SQL Database.  An online version of SQL Server, either run entirely by Azure or a version you maintain and support yourself in a VM.

Azure HDInsight is the Microsoft cloud version of Hadoop.  It's compatible with the Apache Foundation Hadoop features and scales nicely.

Azure Machine Learning is a sandbox to build experiments using Machine Learning algorithms.

Azure Streaming Analytics allows developers to watch data in flight, typically working with Azure Event Hub.

Azure Data Lake Analytics is a new product currently in Preview.  It gives big data developers a new platform to store and query big data across a variety of languages.

Azure Data Factory is a service for automating data movement in the cloud.

Azure Data Catalog is a cloud solution to keep track of your data sources and their metadata, configure who has access to which data sources, and allow searching and tagging for a big picture of your data ecosystem.

PowerBI is a tool for bringing in data, mashing it up with other data sources, adding business rules, staging the data, creating reports and dashboards, and collaborating on the data to derive insights.  There are three flavors: the traditional Excel add-ins, Power BI Desktop and PowerBI.com.

So there you have it, Cortana Analytics Suite.  An ecosystem based in Azure cloud to work with data from just about any angle.  To derive insights and run your business.


Mountain Living