2/28/2017

What is your Most Valued Asset

What is your greatest asset?  Most financial advisors will tell you your home is your most valued asset.  Or your cars.  Or airplanes.  Or portfolio.

I'd say those are indeed great assets.  However, they're not your greatest.

Your most important asset is your mind. 

It's completely expandable and elastic.  Our brains have no limits.  We use only a fraction of our brain capacity for the duration of our lives.

If we could visualize people's brains, some might be 500 pounds overweight, because they are fed garbage or never used.  Similar to eating junk food, we feed our brains hours of television and never read books or attempt to learn new skills.

Yet others may be similar to a high-powered racecar, because they are exercised and finely tuned.  Like the award-winning sculpted physique of Arnold Schwarzenegger.

The key takeaway: each of us has a brain.  It's our choice how we develop it.  We can exercise our minds by learning a musical instrument or a foreign language.

The task of memorizing facts is admirable, yet not required for many occupations.  I'd suggest most jobs could be learned within a few weeks and vary only slightly over time.  People get into a groove and become deathly afraid of change.  How beneficial is that for the mind?

Once you determine that your brain is your best asset, and you take steps to develop, grow and maintain it, you'll soon realize that the restrictions you face in life are self-imposed.

If you depend on teachers, family or life circumstances for your outcome in life, you become dependent on the system.  And blame everyone within reach for your troubles.

Sorry to break the bad news.  You determine the outcome in life.  And it starts with training the mind.  Because the mind is your most valued asset.  Learning is the key to training the mind.  And there are no limits on learning.  The more you learn, the more in demand you become.  The more in demand, the more opportunities.  The more opportunities, the more freedom.  It also becomes apparent that the more in demand you become, the more salary you earn.

With more money, sure you can purchase a home, cars or airplane. Yet it's the mind that builds the foundation.

So are you going to feed your brain junk food?  Or train it to change your life?

Your move.

In Hot Pursuit of Artificial General Intelligence

Artificial Intelligence is making strides in the world today.  AI is baked into everyday web sites to predict, classify, cluster and churn through data.  The thing to remember is this: today's AI is considered "weak".

Computers are not self-aware, they are not living beings, and they do not have personalities.  AI that could do those things is considered "general" AI, or AGI.

The experts know the truth of the matter: AGI is a long way off, if not impossible.

The reasons are many.  We are attempting to mimic the human brain, and nobody really understands the true inner workings of the human brain.  And by the way, should it mimic a male or a female brain?  Nobody seems to know.

Is personality based on DNA and genes handed down through generations, or is personality derived from culture?  No concrete answers.

Believe it or not, Humans may not be the most intelligent species in the Universe.  Attempting to mimic the Human thought process may not be a high enough achievement, sorry to break the news.

If AGI beings were developed and were to interact with Humans, they would need to understand our dynamics.  As in, Humans tend not to base their lives on logic.  Humans are capable of various behaviors such as envy, greed, revenge and favoritism.  It may be difficult for AGI beings to understand how we tick, and they may find our behavior quite bizarre.

Your behavior does not compute.  Input parameters do not align with expected output.  The Human processors must have a bug or two and are a few versions past due on their service packs.

Corporations are man-made inventions that possess certain characteristics of Humans.  So too could AGI beings.  Autonomous creations that mimic Humans, yet lack accountability.  Do we have guardrails in place to handle downstream anomalies that may arise?

The wheel was a great invention.  As was electricity.  AGI is a technology in pursuit of the mainstream, and has been for 50+ years.  Like any tool, it can benefit or hinder mankind.

We've already stated that Humans tend to behave in patterns that defy logic.  And perhaps the main concern is who controls the tools and what their intentions are.

Either way you slice it, the pursuit of AGI will continue until solved.

And so it goes~!

2/21/2017

The 10 Worst College Majors



My Anthropology / archeology degree ranked #1.

After graduation, I worked temp jobs at minimum wage.  Luckily, I was a self-taught programmer starting at age 13 and clawed my way into an IT department.

2/18/2017

Tableau Parameter Modification Using Calculated Fields and Filters with Conditions

Working on a Tableau project recently, we discovered a bug regarding Parameters.  The base Dashboard had Parameters for Year and Quarter, each populated from a list:

Year:
2016
2017
2018
2019
2020

Quarter:
Q1
Q2
Q3
Q4

Looking through the code, the Year filter was pointing to the incorrect database field from Salesforce.  Thus, when the Parameters changed, the reports displayed incorrect data.

A Calculated Field set the Year field to the passed in Parameter:



In order to correct the Parameter behavior, I located the correct field to filter on, then created a Calculated Field to obtain the Year as follows:
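
Something along these lines; the original screenshot isn't shown here, and [Close Date] is a stand-in for the actual Salesforce date field:

// Calculated Field "Year Calc" (name and field are illustrative)
YEAR([Close Date])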




Then I added the new Calculated Field to the Filters shelf, opened it, and under "Condition" added the following logic to filter the data set to rows where the Year part equals the passed-in "Year" Parameter:
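
Roughly, with [Year Calc] being the hypothetical Calculated Field above:

// keep rows where the date's Year matches the selected [Year] Parameter
[Year Calc] = INT([Year])
// INT() is only needed if the [Year] Parameter is a string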




The Quarter field was very similar.  I created a new Calculated Field for Quarter to obtain the Quarter fragment of the date field:
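
Again a sketch, using the same stand-in date field:

// Calculated Field "Quarter Calc" returns 1, 2, 3 or 4
DATEPART('quarter', [Close Date])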






Next, add the new Calculated Field to the Filters shelf, open the "Condition" tab and add the following logic:
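
A sketch of that Condition formula, with [Quarter Calc] as the hypothetical field above:

// "Q3" becomes "3", cast to Integer, then keep quarters up to the selection
[Quarter Calc] <= INT(MID([Quarter 1], 2))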




Essentially, it strips the "Q" character from the Q1, Q2, Q3 or Q4 value passed in from the [Quarter 1] Parameter the user selects, converts the remainder to an Integer, and filters the Quarter of the date field to values less than or equal to that Integer.

So if the user selects Q3, stripping the "Q" results in 3 as an Int, and the date's Quarter field is filtered to 1, 2 or 3, since each is less than or equal to 3.  Quarter 4 is excluded because 4 is greater than 3.

And these modifications produced accurate results in the Tableau Dashboard when the user changes either the [Year] or [Quarter 1] Parameter.

Hope that helps in your Tableau development.  Thanks for reading~!

2/15/2017

Intro to Statistical Learning Notes from Online Course

When thinking about machine learning there's a lot going on.
 
Inference attempts to understand the relationship between the predictors and the results.  If we send in a value of 10 for parameter 1, the result is something.  If we send in a value of 20, the result is something else.
 
Prediction attempts to fit the model such that the relationship between the predictors and the results can be used to accurately predict future results.
 
Both of these fall under Supervised Learning, which typically has Predictors and a Response.  Linear Regression and Logistic Regression are the classic versions.  Newer techniques include GAMs, Boosting and Support Vector Machines.
 
The alternate approach is known as Unsupervised Learning.  UL also has Predictors but no Response.  Basically it attempts to organize the data into buckets to understand relationships and patterns, known as Clustering or Cluster Analysis.
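
As a minimal illustration of Clustering in R (simulated data using base R's kmeans(); not from the course materials):

set.seed(2)
x = matrix(rnorm(50 * 2), ncol = 2)        # 50 unlabeled observations, 2 features
x[1:25, 1] = x[1:25, 1] + 3                # shift half of them so two natural groups exist
km = kmeans(x, centers = 2, nstart = 20)   # organize the data into 2 buckets
km$cluster                                 # which bucket each observation landed in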
 
There is actually another approach, known as Semi-Supervised Learning, which combines the two.
 
Another point of interest is the differentiation between Flexibility and Interpretability.
 
Some methods are restrictive and inflexible, yet easy to interpret, such as Least Squares or the Lasso.  These are typically associated with Inference and Linear Models.
 
At the opposite end, methods like thin plate splines are flexible yet more difficult to interpret.  Flexible models are associated with Splines and Boosting methods, and seeing the relationship between predictors and results is rather difficult.
 
Parametric Methods have a two-step approach: 1. assume a form for the relationship between the data points, often linear; 2. apply a procedure to fit or train the model using training data.  One possible effect is overfitting the model, when the results are too accurate because they account for the noise or errors too closely.
 
Non-Parametric Methods attempt to estimate the data points as closely as possible and typically perform better with more data.  The Thin Plate Spline is one method for fitting the data.  It too can be overfit.
 
Another topic is Quantitative and Qualitative.
 
Quantitative involves numerical values; it has the root "quantity" to help you remember.  These are Regression problems, such as Least Squares Linear Regression.
 
Qualitative involves Classes or Categories.  These classes are sometimes binary, as in True/False, Yes/No or Male/Female, or have more levels, such as Group1, Group2 or Group3.  These are Classification problems, such as Logistic Regression.
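
A quick R sketch of the distinction on made-up data: lm() fits a quantitative (Regression) response, while glm() with family = binomial fits a qualitative (Classification) response:

set.seed(1)
hours = runif(50, 0, 10)                          # predictor
score = 50 + 4 * hours + rnorm(50, sd = 5)        # quantitative response -> Regression
passed = factor(ifelse(score > 75, "Yes", "No"))  # qualitative response -> Classification
reg.fit = lm(score ~ hours)                       # Least Squares Linear Regression
cls.fit = glm(passed ~ hours, family = binomial)  # Logistic Regression
summary(reg.fit)
summary(cls.fit)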
 
The main takeaway is there is no one silver bullet to apply to every data set.  It's the responsibility of the analyst to decide which approach works best for a particular situation as results can vary.
 
For Regression problems, the Mean Squared Error or MSE can determine the quality of the results.  It's most meaningful when computed on test data rather than the training data.  The lower the MSE the better, as fewer errors translate to more accuracy.
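
A small sketch of computing a test MSE in R on simulated data, scoring only the held-out rows:

set.seed(1)
x = rnorm(100)
y = 2 * x + rnorm(100)                          # true relationship plus noise
train = sample(100, 70)                         # 70 rows for training, 30 held out
fit = lm(y ~ x, subset = train)                 # fit on the training rows only
pred = predict(fit, data.frame(x = x[-train]))  # predict the held-out rows
mean((y[-train] - pred)^2)                      # test MSE: lower is better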
 
Errors come in two parts.  Reducible Errors come from the functions and parameters in the equation, our estimate of the true relationship.  Irreducible Errors are the downstream noise, known as Epsilon, that no model can capture.  Reducible Errors can be tweaked; Irreducible Errors can not.
 
One way to reduce the Reducible Errors is to balance Bias and Variance.  Flexible models tend to have higher variance and lower bias, while inflexible models tend to have lower variance and higher bias.  A regression model should retain some error; if it fits the training data perfectly, the result is overfitting.
 
The Bayes Classifier falls under the Classification spectrum and is based on Conditional Probability.  The segment of the chart where the probability is exactly 50% is known as the Bayes decision boundary, and the lowest possible error rate is termed the Bayes error rate, similar to the irreducible error.  The classifier always assigns each observation to its most probable class.  Although this method is highly accurate, it's difficult to apply in real-life scenarios because the true conditional probabilities are rarely known.
 
K-Nearest Neighbors attempts to estimate the conditional distribution from nearby observations and then classifies each point to the class with the highest estimated probability.  Although a simpler method, it is often fairly close in accuracy to the Bayes Classifier.
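
And a short KNN sketch using the knn() function from the "class" package, again on simulated data:

library(class)
set.seed(1)
train.X = matrix(rnorm(100 * 2), ncol = 2)                              # training predictors
train.y = factor(ifelse(train.X[, 1] + train.X[, 2] > 0, "Yes", "No"))  # known classes
test.X = matrix(rnorm(20 * 2), ncol = 2)                                # new points to classify
knn(train.X, test.X, train.y, k = 3)                                    # vote among the 3 nearest neighbors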
 
This blog post attempts to summarize the online course I'm attending, Stanford's Statistical Learning.  I'm paraphrasing and identifying the key concepts in an effort to organize and remember.  I use this technique to learn and self-teach, and in no way are these my original thoughts.  I'm reading from the assigned book for the course, "An Introduction to Statistical Learning" (part of the Springer Texts in Statistics series), found here, and the authors deserve all the credit!  The course can be found here, and I highly recommend it.

Stay tuned for more blog posts and thanks for reading~!

2/14/2017

Attending Stanford Statistical Learning MOOC

I signed up to attend the Stanford Statistical Learning course.  It's free and self paced.  There's a lot to learn.

The course is structured in a series of lectures.


In each lecture, one of the two professors explains various aspects of Machine Learning.

Then you can view the Lecture Slides at your own pace.


Then the R Sessions, as this course uses the R statistical language.


And to get up to speed on the R Statistical Language there are e-books available for download and consumption.


There's also a tab to review your progress, as there are quizzes and the passing score is 50%, as well as a Discussion tab to see previous questions and answers on course-related material.

Overall, I'm enjoying the class so far. 

First, you have to get an understanding of the concepts of Machine Learning.  Linear Models, Classification, Clustering, Supervised and Unsupervised Learning. 

Second, the course uses advanced math to explain how the algorithms work.  If you aren't up to speed on advanced math, it's a bit of a challenge. 

Third, you learn the R Statistical language from reading the e-books.

Fourth, you learn to tie the Machine Learning to Math to Statistics using the R Language in your chosen IDE for the complete picture.

Suffice to say, the course is a challenge even for those who have worked with data for many years.  Perhaps that is why Data Science is a challenge and the supply of qualified technical talent is below what's needed.

With that said, I believe Machine Learning is the hot topic today, even surpassing Big Data.  When combined, it's quite an arsenal of skills.  Plus working with data files and databases.  And translating business questions into projects that generate outcomes and produce insights viewable in Data Visualizations.  And generating Models for re-use, combined with real-time calls from web applications to produce statistical probabilities, is kind of huge.

I'm going to continue to go through the course and learn as much as possible.  Realistically, just trying to get the big picture as a foundation, then fill in the holes over time.

And there you have it~!

2/13/2017

Intro to Visual Studio 2015 R Ingesting Data Files

This blog post is the second in a series on the Intro to Visual Studio 2015 R Project.  In this post we'll discuss data and how to connect to files.
 
We start off in the Visual Studio 2015 IDE.  From the R Interactive window, we enter the command:
 
getwd() to get the "working directory" of our project.
 
Once we know that, we can add our files to the folder, or we can set the working directory using the command (note that R paths need forward slashes or doubled backslashes):
 
setwd("C:/Users/jbloom/Desktop/Statistics/data")
 
Now that we've defined our folder, we can add our files there.
 
Back in the IDE, we enter our command to ingest the contents of our file.  In this case, we start off with a CSV, or comma-separated value, file:
 
mydata = read.csv("Income1.csv")
 
The variable "mydata" contains the contents of our file.  If we want to view the contents, we type the command:
 
fix(mydata)

This spawns a new pop up window displaying the data:


You can double-click a cell, change the value by copying, pasting, deleting or typing manually, then close the window and reopen it; you'll see the new value(s) still appear.

Now we try a .txt file:

Auto = read.table("Auto.txt", header = T, na.strings = "?")

fix(Auto)


To see its dimensions, type:

dim(Auto)

we get back:

[1] 397 9

To see the first 4 rows:

Auto[1:4,]


To get rid of rows with missing values (NA), then check the dimensions again:

Auto = na.omit(Auto)
dim(Auto)

[1] 392 9

You may notice 5 fewer rows than before, due to the rows that contained NA values.

To display the field names:

names(Auto)

We get:


[1] "mpg" "cylinders" "displacement" "horsepower" "weight" "acceleration" "year" "origin"
[9] "name"    
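
As a quick sanity check on the ingested data (an extra step beyond the above, but handy), you can also inspect the structure and summary statistics:

str(Auto)       # data type and a preview of each column
summary(Auto)   # min/max/quartiles for numeric columns, counts for factors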

And that sums up our Intro to Visual Studio 2015 R Ingest Data Files.
 
To reference the first blog in the series, check out:

http://www.bloomconsultingbi.com/2017/02/intro-to-visual-studio-2015-r-project.html

Thanks for reading~!

Intro to Visual Studio 2015 R Project

Last Friday I attended a team meeting where the presenter spoke on R for Microsoft.  I had seen a similar talk a few months ago during the same team meeting.  However, I didn't realize you can build R projects in Visual Studio.
 
So to get started, download the R framework for Visual Studio 2015 here.
 
Run the package, then open the Visual Studio 2015 IDE and click New Project; you'll see a new entry for "R":
 
 
Create the project, and you'll see the Solution was created:
 
 
In the R Interactive window, you can get Help by entering ? followed by the topic:
 
 
 
The Help window displays information on Array:
 

 
Next, I attempted to create a Vector, which is similar to an Array, or a Collection back in our Visual Basic days: unbound, in that it has no predefined max limit:
 
 
I defined the variable "x" equal to a Vector, using c(), with the values 1, 2, 3, 45, then pressed Enter.  To read back the contents of Vector x, simply type x and press Enter.
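
Since the screenshot isn't shown here, a reconstructed sketch of those commands:

x = c(1, 2, 3, 45)   # combine values into a Vector and assign to x
x                    # read back the contents of x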
 
Calculations can be performed using our stored Vector x.  Here we declare variable "y" equal to Vector x times Vector x.  Because the number of values is the same, it multiplies element by element: x[1] * x[1], or 1 * 1 = 1.  Then x[2] * x[2] = 4, x[3] * x[3] = 9, and so on:
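
Again reconstructed, roughly:

y = x * x   # element-wise multiply: 1*1, 2*2, 3*3, ...
y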
 
 
We can display the length of our Vectors with the length() command:
 
 
To List our variables type ls():
 
 
To remove our variables type rm(x,y):
 
 
Next, we can declare a Matrix.  What is a Matrix?  Type ? matrix:
 
 
And to assign a new variable x to a Matrix with 2 rows and 2 columns, holding the values 1, 2, 3, 4:
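
A reconstructed sketch of the Matrix assignment:

x = matrix(data = c(1, 2, 3, 4), nrow = 2, ncol = 2)   # 2 rows, 2 columns
x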
 
 
 
Square root:
 
 
And finally, from the sample code from the Stanford Statistical Learning class I'm attending:
 
 
Next, we can plot to the screen by creating 10 random x values and 10 random y values:
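
Reconstructed, this is roughly:

x = rnorm(10)   # 10 random values for x
y = rnorm(10)   # 10 random values for y
plot(x, y)      # scatter plot of the points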
 
 
Results:
 
 
Set the plot type: "p" for points, "l" for lines, "b" for both:
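
Roughly:

plot(x, y, type = "p")   # points
plot(x, y, type = "l")   # lines
plot(x, y, type = "b")   # both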
 
 
And that sums up our Intro to Visual Studio 2015 R Project.
 
 
Thanks for reading~!


2/12/2017

Expanding Technology & Production Support vs Development

Here are 48 new technologies you'll need to learn in order to stay current.  Of course, you'll never program in them, as your org is a few releases behind the current version.

But seriously, it's more like 52 new technologies.

Looking back, I've always been in pursuit of cutting edge technology.  The brutal honesty is you program in the languages that your company programs in.

Looking for x # of years in y technology.  Well, I learned it on my own, but haven't programmed in production.  Or you can do what most people do, just make something up.  Yah, I have x years in that language.

The other thing: you also program at the level of technology of the region you live in.  If you're in Silicon Valley or a hot region, good chance you'll have opportunity to program in cool stuff.  Once learned, there are probably other orgs to jump to.  Or the opposite, nobody in the region is doing cutting edge stuff.

Another viewpoint: production support.  How many devs like to support other people's code for a living?  Hello, anybody there?  Nobody likes to support other people's code.

I don't mind production support actually.  I like to step through code and see what it's doing.  Most of my career has been support actually.  Mixed with enhancements to existing apps.  And some new development.

The last 5 years or so has been new development.  Because I've been consulting.  Writing new code for clients.  In fact, last year I wrote Crystal Reports and Business Objects for one client, full life cycle Microsoft Business Intelligence solution for another, a full life cycle Hadoop & SQL Server BI solution for another and finally some new Tableau development.

This year has been mostly Tableau Production Support.  Full circle.

Any way you slice it, you have to sell what the customers are buying.  So you have to know a bit about everything.  So get busy learning the 52 new technologies; I'm fairly confident tomorrow there will be more to learn.

And so it goes~!

2/07/2017

Balancing the Books

Programming is fun.  Although I was hired to write Crystal Reports, I was tasked with writing the Month End reports.

Simply convert the Visual Basic code to T-SQL.  The original developer used inner and outer loops.  While stepping through the code, I noticed an error.  Raised it up the flagpole.  I was told those results were the basis of the underlying data on how the month end ran.  There's no way it could be incorrect.  Okay, thanks.

A few days later, I was asked to show the error.  There was a bug.

The month end was a series of reports based off a snapshot of the data.  Each month the data would get added.  And the numbers had to match.  Month to date.  Year to date.  Inception to date.  The numbers rarely matched the first go at it.

When the numbers were off, it was never a simple missing entry.  It was typically a series of errors that, when combined, caused the difference in numbers.  How did one find the offset?  By looking through all the data.  Swimming in data.  Every month.

Perhaps a code change went in mid-month and caused the bug.  Had to track it down, fix it and re-run the month end.  The month end ran for many hours due to the volume of data and number of reports.  It was run the last day of the month, after the close of business.  I'd kick off the process around 5pm when people were heading home.  And worked through the night.  I had 10 days to balance the book of business for two companies.  I did this for 3 years.  It was a great company.  They treated me well.  But I got burned out.  And resigned with no job lined up.  They offered to keep me on as a consultant, but I needed a break.

What did I do during my burnout phase?  Nothing.  I would ride down to the pier on my bicycle.  Hang out by the ocean.  And smoke cigars.  There were some regulars down at the pier.  We started hanging out.  Since I had no schedule, we'd sit around and talk, fish, throw the cast net, grab some snacks at the convenience store.  It was awesome.

There was one guy there that I would chat with from time to time.  We got to talking about tennis.  I mentioned that I played years ago.  Why aren't you playing now?  I don't know.  Why not give it a try?  So I rode my bike down to the tennis club.  Walked in.  Can I play tennis?  Do you know how?  Yes.  Okay, we're having a round robin at 1pm, why don't you come back and we'll get you on the courts.

At 1pm I drove over with some old rackets.  I introduced myself and headed out to play.  Was a bit rusty.  Got the hang of it again.  I seemed to win every round and made it to the top court.  Afterward, the owner of the club came over and asked if I wanted to join the club.  Sure.  Okay, we'll set you up on the Monday night men's league.  I partnered with a nice guy, who owned a software company.  We won the first season.  And the second.  He offered me a job to write Crystal Reports.  I ended up not taking the job.

One day a lady from another club wanted to warm up so we hit for a while.  She asked why I wasn't playing better players and offered to get matches with some of the better players in town.  So I signed up for a singles league at a level one below the very top.  People said it was too difficult a division and I wouldn't win any matches.  Sure enough, the first match was on red clay and the guy was good.  I was down 0-6, 0-5, one game from defeat.  During the changeover,  I placed a towel over my head and the reality sunk in.  I decided to loosen up and see what happens.  I went out and won a game.  Then another.  Then the set.  Tied at 1-1.  Then went on to win the next set 6-0.

From there, I won all the matches for the season.  And then won the playoffs to win a free pair of sneakers.  I spoke with the newspaper about the win and it appeared in the paper.

I started playing tournaments and got a state ranking.  And played a few national tournaments and got national ranking.  And then found somebody to let me teach.  They sent me to get certified as tennis instructor.  So that's what I did.  Taught tennis and did websites on the side.  And hung out down at the pier.

I moved into a new place.  My neighbor had a golden retriever and I would offer the dog snacks.  Soon I removed a few boards in the backyard fence so the dog could travel between the two villas.  Then one day, a few of the neighbors were going down to the pier to watch a concert.  Just as we were about to walk down, we looked up, a double rainbow.

When we got there, most of the neighbors took off and it was just me and my neighbor who owned the dog.  I offered to buy her a hot dog and soda and we ate on a bench and listened to the music.  Then we walked home and chatted the entire way.  It was sort of late when we got home.  I saw on the computer that a good movie was playing and knocked on my neighbor's door.  Would you like to go see a movie?  So we went.  Our first date.

After some time dating, I proposed and she said yes.  So we got married.  That's when I decided to go back to work as a computer programmer.  I found a consulting job which lasted a few months.  That job ended when the company got bought out by one of their clients, so I found another gig writing Crystal Reports.  Then got hired to write Java code.  And I've been programming ever since.

None of this was planned out.  Sort of just happened.  That job I had balancing the books was a good learning experience.  There was a lot more responsibility than just writing code.  In hindsight, I probably should have gotten a backup in order to take time off occasionally.  Although if I hadn't gotten burned out, things surely would have turned out differently.

And there you have it~!

2/03/2017

Attending Statistical Learning Online Course

Although I programmed a computer in the early 1980s on the IBM PC, I didn't major in computers in college.  After graduation, I gravitated towards computers via Reporting.  Specifically Crystal Reports.

Back in 1995, there were no resources available for learning Crystal Reports.  No online books, no User Groups, no Forums.  How did I learn?  We had a consultant in town from one of the bank acquisitions and we got a conference room for an entire day.  We went over every single feature in Crystal Reports Developer 5.0 and I picked his brain.  That was my formal training.

Data Science wasn't a thing then; the internet was just ramping up.  Machine Learning as an occupation didn't really exist.  Although I knew about statistical models.  I was a bank underwriter and we moved over to a new software package that used Models to predict outcomes based off age, gender, years on job, years at residence, plus credit reports.  It based the model on historical data, and the software cost a bundle.

Fast forward to today.  The amount of available online tutorials is staggering.  If you want to learn a new technology, go find a course.  Right now I'm attending the Statistical Learning course from Stanford University: no cost, self-paced, with a free Statistics e-book plus examples.  Although the math is at a high level, you can learn a lot from the videos.  The nice part about this particular course is that they utilize the R language (free and an industry standard with 1000s of built-in functions) rather than MatLab.

http://www-bcf.usc.edu/~gareth/ISL/

https://www.r-bloggers.com/in-depth-introduction-to-machine-learning-in-15-hours-of-expert-videos/

During college, I was a business major, sort of, for the first 2 years.  And I attended Statistics class.  And I enjoyed that class a lot.  And tucked it away for 30 years.  Turns out Statistics is the bread and butter of Data Science.  And by that, I mean you have to know what algorithm sets are available, what each does, and how to apply, tune and analyze them.  Today's software does much of the heavy lifting, so you don't necessarily have to know the underlying math.

Statistical Learning, to a large degree, is a black box.  Input.  Functions.  Output.  Regression.  Classification.  Clustering.  Supervised Learning, where you have labeled data and derive a target outcome.  And Unsupervised Learning, where data is not labeled and you just try to organize it into buckets.

Of course, you have to have clean, non-duplicated data, a valid question to answer, an understanding of the available statistical methods, and how to apply and adjust them as necessary, to formulate a result that is easy to interpret and fairly accurate.  The result shouldn't be too accurate, or "over fit", nor inaccurate, or "under fit".  You want some degree of error in order to apply the model to future data sets and get accurate results.

When I worked with Crystal Reports in 1995, not many people were doing reporting for a living.  Basically a one-eyed man in the land of the blind.  The market has grown, on steroids, and it's flooded with resources, vendors and tools, and new markets have emerged.  And it's all due to the fact that software is now in the hands of everyday people, along with improvements to hardware and processing power, and the availability of data sets and online courses to train the next generation.

In this field, we never stop learning.

And so it goes~!

Thanks for reading!

Mountain Living