Data Lakes Require Data Captains of the Hadoop Ships

If Hadoop is a Data Lake, then Developers must have good boats in order to sail the rough seas.

Hadoop stores structured, semi-structured and unstructured data.  Data can be added by a variety of means, and not every developer is going to know where all the data resides, what it contains and how it integrates into the ecosystem.

Developers must rely on the metadata layer called HCatalog.

In addition to that, the data types can vary for the same field depending on whether the developer used Pig or Hive.

Data gets added at different intervals, batches run at different times and days, so knowing how fresh the data is may also prove to be a problem.  For example, your sales lead data may get updated hourly, but the financial data gets loaded on Tuesday and Thursday at midnight.  Consumers of the data will need to know the data refresh patterns.

And what if jobs fail?  You may think the data is running along fine, until a user calls in with a complaint that the data is out of date on their reports.  Creating alerts may be necessary to prevent excessive tickets in the to-do list.
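
As a minimal sketch of such an alert, the Python below flags data sets whose last successful load is older than their expected refresh window.  The data set names, the age limits and the `stale_datasets` helper are all hypothetical, made up for illustration; a real setup would read load times from the scheduler or the metadata layer.

```python
from datetime import datetime, timedelta

# Hypothetical refresh expectations: how old each data set may get
# before consumers should be warned. Names and limits are illustrative.
MAX_AGE = {
    "sales_leads": timedelta(hours=1),   # expected hourly
    "financials": timedelta(days=5),     # Tue/Thu loads, longest gap is 5 days
}


def stale_datasets(last_loaded, now):
    """Return the sorted names of data sets whose last load is too old."""
    stale = []
    for name, limit in MAX_AGE.items():
        loaded_at = last_loaded.get(name)
        # A data set with no recorded load is treated as stale too.
        if loaded_at is None or now - loaded_at > limit:
            stale.append(name)
    return sorted(stale)


if __name__ == "__main__":
    now = datetime(2015, 1, 7, 9, 0)
    last_loaded = {
        "sales_leads": datetime(2015, 1, 7, 6, 30),  # 2.5 hours old
        "financials": datetime(2015, 1, 6, 0, 0),    # loaded Tuesday at midnight
    }
    for name in stale_datasets(last_loaded, now):
        print("ALERT: " + name + " has not refreshed on schedule")
```

The same check could run after each batch window, so a failed job gets noticed before a user does.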

Many of these issues have already been addressed with current day Business Intelligence models.  Hadoop adds an extra layer and developers need to consider this.

The Data Lake adds value to organizations, as well as complexity.  So to stay ahead of the curve, the Data Captains of the Hadoop Ships must be cognizant of the extra layers and prepare for smooth sailing.


Technology is a Good Thing

So what's on the agenda for 2015?

Perhaps it's the year we inch closer to true artificial intelligence.  Machines that are self aware.

Perhaps we get more acquainted with nanotechnology.  Wouldn't it be nice to produce any physical object by assembling the correct atoms in formation?

Perhaps we identify intelligent life in the Universe?  We'll have to tidy the place up a bit if we are expecting visitors.

Perhaps our robotics technology becomes more advanced.  Robots to perform tasks, more personalized assistants, automated cars and planes and buses.

Perhaps we develop flying cars, soar above the highways, like George Jetson!

We've already got 3D printing, even sent a wrench into space.

We have wearable devices.

And new Virtual Realities.

And sensors transmitting tons of data via radio frequency, called the Internet of Things.  Going to spawn super mega data.

I believe there are two basic camps when it comes to the future.

Those who are pessimists, who see technology being used for personal gain.  Such as more control, surveillance, hoarding resources, throwing the average worker into the garbage with automation.

And those who see technology being used for good.  To allow more free time to do the things that are fun, for simplifying and streamlining boring processes, for uniting mankind and raising our awareness to a higher level.

And I suppose there are those who believe a bit of both.  To be honest, it's very difficult to predict the future, especially those things that haven't happened yet.

Personally, I think technology is a good thing.  It can be used for good or evil.  The choice is ours~!


Modeling Data

This week I modeled a database.  The client provided a series of reports.  I went through each of the reports.  Listed the fields.  Determined if they overlapped.  Put them into buckets of things or measures.

The things became the dimension tables.

The measures became the fact tables.

Within a short time I had a bus matrix, which connects the dims with the facts by fields.

And that took less than a day, based on a limited subset of reports.
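
The bucketing step can be sketched in a few lines of Python.  This is only an illustration under stated assumptions: the report names, the fields and the choice of which fields count as measures are invented, not the client's actual reports.

```python
# Hypothetical report inventory: each report lists the fields it shows.
REPORTS = {
    "Monthly Sales": ["Customer", "Product", "Date", "Sales Amount"],
    "Inventory": ["Product", "Warehouse", "Date", "Units On Hand"],
}

# Assumed classification: these fields are measures, the rest are "things".
MEASURES = {"Sales Amount", "Units On Hand"}


def build_bus_matrix(reports, measures):
    """Map each fact (measure) to the set of dimensions it appears with."""
    matrix = {}
    for fields in reports.values():
        dims = {f for f in fields if f not in measures}
        for f in fields:
            if f in measures:
                matrix.setdefault(f, set()).update(dims)
    return matrix


def format_bus_matrix(matrix):
    """Render the matrix as text: one row per fact, one column per dimension."""
    dims = sorted(set().union(*matrix.values()))
    lines = ["fact | " + " | ".join(dims)]
    for fact in sorted(matrix):
        marks = ["X" if d in matrix[fact] else "-" for d in dims]
        lines.append(fact + " | " + " | ".join(marks))
    return "\n".join(lines)


if __name__ == "__main__":
    print(format_bus_matrix(build_bus_matrix(REPORTS, MEASURES)))
```

Dimensions shared by more than one fact, such as Product and Date here, are the ones that tie the fact tables together.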

So the quantity of reports the client decides they want in the first phase will determine which tables go into the Data Warehouse.  Which means they'll need to be mapped back to the source system.  Extracted, Transformed and Loaded.

In addition, I had a few meetings with the software vendors who host the data in the cloud.  One company offers an ODBC Client which you can download and VPN into.

The other vendor offers connectivity through Web Services.  Luckily, Microsoft SSIS can connect with a standard component, with credentials.

Perhaps our solution can be hosted in Azure, with a VPN and Active Directory.  And putting the reports in Power BI is definitely a possibility.

So that would be a fun project.  We'll see what develops~!

My Intro to the Internet of Things

I don't know much about the internet of things.  Supposedly, tiny mini sensors will reside embedded in just about every possible device.

The sensor will transmit, through micro radio frequency, back to its source, via the internet, using the IPv6 protocol, as we're bound to run out of valid IPv4 addresses.

This will increase the volume of data incomprehensibly.  So they need to find better ways to store all this stuff.  So it can be accessed and harnessed.

That way, they can keep track of what's happening, what already happened, and what's going to happen.  A pulse of the system, every system, every second of the day.

I definitely see the upside, to be used in shipping, appliances, healthcare, traffic flow, etc.  We can surely benefit from monitoring all the systems in real time.  Providing they find a common framework to get this up and running, sooner rather than later.

What are the downsides?  Well, if they decide every person, plant and thing must have sensors, that could possibly be an infringement of personal space and a violation of our freedom.  Other than that, seems okay.

I'm not exactly sure who's working on this technology at the moment, other than the big companies with the capital to do research and development.  I wouldn't know where to begin to download some trial version and hook a sensor up to my three dogs, to track when they need a potty break.

But I do think this technology will blossom over the years.  What I like about it is that its primary use is not to generate sales or increase leads like Business Intelligence.  I think it's more about control of the system from a centralized command center.  I see it run by the big companies, accessed by the govt to keep tabs, you know, for security's sake.  Question is, who are the suspects?

I'd like to learn more about this new technology.  Seems like the future.

Automation of Jobs in the Near Future

In the Beginning

Before you can walk, you must learn to crawl.  So once humans developed the ability to walk, they probably learned how to run next.  But at some point, to become more efficient, a new form of mobility was necessary.

Enter horses, stage left.  Horses gave us the ability to travel long distances, carry heavy loads, pull carts and work the fields on the farm.

However, driven by our never ending need to progress, the automobile arrived, and Henry Ford's assembly line put it within reach of the masses.  We now had the ability to travel faster, farther and with less effort.  This opened up new jobs, such as car repair, road side gas stations, triple A services, etc.  Not sure if it was good or bad, but the horses were not needed so much after that.

And we have progressed with our cars, integrated into our lives, become part of our personalities.  Except we now have traffic jams in every city, high gas prices, expensive insurance.

Technological Advances

On another note, technology has allowed us to transform from a hunter-gatherer society to an agricultural one and then to the industrial revolution.  Until now.  The recent advances in electronic technology have spawned countless occupations, streamlined processes, connected people and places, and given us the mighty internet.

More recently, advanced computers have the ability to perform tasks once held by humans.  In fact, more and more jobs will become automated.  So as technology replaces jobs, the workforce won't be as strong.  More people will be out of work or taking lesser jobs.  This is a fact of life.  However, it won't be just the labor jobs that become automated; many white collar jobs will be on the chopping block as well.

If you look at the next generation, how they're progressing along, you'll notice they are not learning highly advanced critical thinking.  They are being force-fed answers to tests that have no significance or impact on their ability to sustain work once they become working age.  They are, in a sense, mindless robots.

Ah, robots.  Machines to perform tasks or services.  More and more, these devices will enter the workforce, to replace regular jobs.  They don't require health insurance, you can work them 24 hours a day 7 days a week, they don't take smoke breaks or lunch breaks or weekends off or vacations or holidays.  If you don't see this picture of the future, you're not looking closely enough.


Today's youth, once they reach adulthood, have the option of attending college.  College is a great way to get a degree to get your first job.  However, because the baby boomers forgot to save for retirement, they are forced to keep their jobs.  New adults have stiff competition to enter the workforce.  And college prices have probably tripled since I went to school, so students exit school with a six figure debt load.

Those who don't attend college, who don't have any real skills, will be battling for minimum wage jobs, scratching and clawing, maybe having multiple p/t jobs.

Technology will soon take many of the lower rung jobs.  However, in my opinion, I don't think all jobs will become assimilated.  There will be a need to keep the system running.  Which will require technology people.  But who knows, that could become automated as well.

I don't think there's any way to stop the current trend of automation.  Technology is everywhere, it provides us with ways to stay connected, and eases our lives tremendously.

But if half the population is out of work, you have to wonder where this is taking us, our culture, as homo sapiens, in our evolutionary flight.

Have we peaked as a society?  Are our best days behind us?  Where is civilization heading?  Will the economy support the current population with fewer jobs?  How will people survive?  Where do we go from here?

My Perspective

For me, I'm not too worried.  I think critical thinkers will still have a future.  If you can problem solve and get things done, there will be a need.  It's the people in the comfortable positions, with lots of tenure and overhead costs, perhaps those people need to do some deep thinking to find out how to best navigate the coming trend of automation and lack of jobs.

Change is coming.  Are you ready?


Service Broker Exchanging Messages Between Two Servers

On our plane ride back from Germany, 10+ hours, I played a game of electronic chess.  And I lost.  So I watched a movie or two.  Later in the flight I played again.  Figured nothing to lose, except losing of course.  I exchanged a few pieces, then lost my queen, figured the game was over.  But I hung in there, soon got their queen, and some more pieces, then promoted my 2 pawns to 2 queens, and I had the machine at checkmate.

So on my latest project, the goal was to take a previous developer's code and get it working.  And I got it working, even created some automated scripts.  Except the goal was to get Service Broker talking from one database to another on different database servers.

So I requested a few VMs to be built, sure enough, got them pretty quick.  Except the same code didn't work across servers.  So I scoured the internet, reading error messages, looking at sample code.  So many settings.  User permissions.  Endpoints.  Routes.  Services.  Queues.  Certificates.  I spent 11 hours on it today.

And sure enough, at the eleventh hour, I found an error message about network issues, so I enabled the SQL Server Browser service, modified a few settings and sure enough, the two database servers were exchanging messages back and forth, using certificates for security.  Checkmate.

Quite a battle.  I almost gave up a few times.  So tomorrow I'll get the servers to talk in the other direction.  And then add in a 3rd server for a 6 way communication frenzy.

That was a tough one~!

Here's the code... http://www.bloomconsultingbi.com/2014/12/service-broker-code-example-end-to-end.html


Artificial Intelligence May Already Be Here

When we think about Artificial Intelligence and Robots, we naturally assume the Robots to assimilate humanoid characteristics.  A head that sits on a neck connected to shoulders with two arms and two legs.  With a brain that thinks and behaves similar to humans.

But perhaps we should not view Robots in this fashion.  Maybe we should disassociate the human form from that of the robot.  They may have wheels instead of legs.  They may have multiple appendages instead of two arms.  They could have eight like an octopus.  And why only two eyes, why not thousands of lenses like the compound eyes of the bumble bee.  And why not have echolocation like the bats.

We would not want to limit the body type to mirror humans, as we were formed to survive based on current environmental conditions and have evolved over time.  Robots don't need to evolve, they can be built precisely from the beginning.

We currently have Intelligent devices in society today.  Have you ever stopped for a red light, waited a while, then driven on when it turned green?  Is that not a form of Intelligent behavior?  What about IVR phone systems?  They can interpret key pad input and voice commands and route your call accordingly; they can provide information and even perform services.  Is that not a form of Intelligence?

Why must we associate Artificial Intelligence with Robots which mimic human form?  You can have AI without a body at all, or a modified version.  Do we not already have derivatives of AI in action today?  I think we do.

And think about computers with Neural Networks, which can be trained to understand past events and predict future behavior by applying weights to neurons and connecting them to other neurons.  A Neural Network simulates an actual brain but lacks a certain 'Intuition' at this point in time; it uses pure logic.  Yet humans can make quick associations in their minds to derive conclusions, without knowing the actual steps they took to get there, a mixture of seeing pictures of events in their mind and recalling memories to produce a result.

So Neural Networks are a form of AI, and so are pattern recognition systems, predictive applications and programs that learn on the fly with no input.  Researchers recently had a computer learn a few Atari games on its own, by figuring out the rules, achieving better results than humans in some instances.

What would it take to have an AI system pass the Turing Test?  Ability to reason, process information, infer answers based on analogies, etc.  The test doesn't say anything about physical form, just a way to receive questions in the form of input devices, and a way to process that information to respond with the correct results in the form of output.

I don't necessarily think we need to mimic the human brain.  I think we need alternative ways to achieve the same results.  And that would be based on the current technology.  Of Big Data, Neural Networks, Deep Learning, fast Processors.  But in order to scale, it would need to have multiple Neural Networks tied together through a network.  Each Neural Network specializing in its own domain expertise and then linking them all together.  So you could have one that knows plants.  One that knows Trees.  Another that knows Mammals.  Birds.  Reptiles.  Stock Markets.  Politics.  Religions.  Cars.  Boats.  Etc.  Link them all together for a conglomerated humongous Neural Network.  Constantly fed new information.  An All-Knowing Encyclopedia Brain like device.  You could call it HAL 9000 from 2001: A Space Odyssey, but that one didn't go so well.  It wouldn't necessarily have a body, but it could.  You could connect Robots to the Brain like structure so they all have access to the same repository, in real time.  Kind of like the Borg, constantly assimilating information.

I think the definition of Artificial Intelligence should be re-clarified.  I think aiming to mimic the human brain is too low a standard, we should reach for something higher, and in order to do that we need to take alternative approaches.  We may never figure out how the brain works, how nature works, so in the meantime, let's figure out a way to get true AI by any means possible.

And here's a follow up post ... http://www.bloomconsultingbi.com/2015/02/ready-or-not-automation-and-robotics-on.html


When the Robots do Rise

Will the computers rise on planet Earth to form a new race of robots?

Well, maybe.  There's a good chance that robots will become part of our culture at some point.

And when they do, are they going to be equal to humans?  Or servants?  Or will they squash us out of existence?

It could go either way.  If you look to the future, through the lens of the movies, I'd like our robots to be like those of Star Wars.  Have a watch: http://youtu.be/OlpOaaM0wV0

I think robots should work alongside humans.  They are in fact better at many things.  Perhaps stronger physically, they probably don't require sleep, they can do math computations as well as any computer, better in fact if they have built in neural networks, they don't take smoke breaks, as seen in the video above, and they don't require vacation or health care.

So why shouldn't they be part of society?  Should they be entitled to vote?  What if they die, is there a funeral?  What if they commit a crime, or kill someone?  What if your robot gets stolen or abducted?  Can they reproduce?  If so, are they responsible for the care of their youth until they graduate from High School?  Can they be bought and sold?  Do they have birth certificates?  Can you use the left over pieces of a dead robot to form or repair another robot?  How often are they serviced?  Do they require reboots every day, month, year?  Are they connected to other robots like the Borg of Star Trek?  Can they get married?  Divorced?  Do they have gender?  Race?  Ethnicity?  Is their age based on human years?  Can they be upgraded if they are obsolete?  Can you euthanize a robot?  Do robots get paid for services rendered?  Do robots have different class levels?  Will robots have a midlife crisis?

These are all considerations that must be addressed, sooner rather than later.

Now what if the robots are not so nice?  What if they become like RoboCop, where they become the police force?  What if they are designed for military purpose?  Or used to keep humans in line?  What if the computers rise above humans and decide to eliminate the bipedal Homo sapiens race?  Bad, very bad idea.  Possible though.  If we aren't careful.

Every day we get closer to the reality of robotics and artificial intelligence.  I do believe robots will be created, they will be integrated into our society, and it's up to us, now, to determine their role and purpose and rights and responsibilities.

It would be a terrible shame if we go through the effort to create these beings, and they in turn, wipe us off the planet.  Wouldn't you agree?

My Ideal Profession #AI or Think Tank

What do I do for a living?  I type onto this keyboard thing.  Create programs.  Solve problems.  Provide value.

I do that working as a consultant.  Because many organizations have a need to view their data in a variety of ways.

My real passion is to work on advanced topics, such as Artificial Intelligence.  I like that topic as it consists of many subtopics.  And the fact that it's a profound problem, one that has existed for over half a century.  And some of the greatest minds have worked on solving it.

I had a chance to watch some videos from an MIT professor, about 6 or 7 months ago.  I got about half way through, and then my full time job took more effort and I lacked the free time to explore further.

The other thing I've always wanted to do was to work in a 'think tank'.  Being around bright minds, having the freedom to roam different ideas, find solutions using unorthodox means, being challenged intellectually.

Don't get me wrong, I'm challenged on my daily job every single day.  But the challenge is more to meet deadlines, complete tasks in specified time, produce a working product, meet with clients, etc.

I think that solving real life problems, using any means available, for the greater good seems like a fun thing to do.  And having fun is the key to creativity.  Having a curious, playful mind, is a great asset and it would be great if there were more opportunities to explore different avenues where original thinking were an asset, not a liability.  The business world prefers cookie cutter mentality.

So to summarize, computer programming allows as much creative thinking as possible while earning a decent living.  However, if I had my choice, I'd greatly like to work in a 'think tank' setting, where originality, creativity and problem solving were the core functions.  And my main topic of interest is that of Artificial Intelligence, a very challenging topic, which deserves the attention of critical thinkers who can also switch over to free flowing creativity.

And that's what I think~!

Watched a good video: Richard Feynman Computer Heuristics Lecture

I excelled in mathematics in the 2nd grade and was moved to the 3rd grade for advanced math.

In the 5th or 6th grade, I solved the Rubik's Cube.

In the 8th or 9th grade, I started with computers.  Learned on the original IBM PC at home and TRS-80 at school.  It was cutting edge back then.

I've had a foundation in math and logic, learned at an early age.  So I have an interest in learning about smart people and their ideas and how they relate to modern day issues.

So today I found a good YouTube video on one of my favorite scientists, Richard Feynman.  He worked on the Manhattan Project building the atomic bomb.  A well respected Mathematician and Physicist who won the Nobel Prize.

What's interesting, later in his life, he displayed a knowledge and interest in Computers, as seen in this video:

Richard Feynman Computer Heuristics Lecture: http://youtu.be/EKWGGDXe5MA

He explains computers as advanced filing systems that follow instructions, dumbed down at the lowest level.

He has opinions about whether computers can think like humans; he says no.  They work efficiently.  However, they don't think.  They are going to perform the same functions as humans, and do them better, faster and differently.  However, the result is the same.

Computers can do some things better than humans, like work with lots of numbers, manipulate them, etc.  However, humans can recognize things and patterns, which had not yet been put into a procedure; keep in mind, this video was done in 1985.  We do have pattern recognition at this time in modern day, but it's not perfected.

Are computers harmful?  Is a knife harmful?  Potentially yes.  Always has been and always will be.  The question is who has the knives and how they're used.  And the attitude of the leaders of the society.

Listen here: http://youtu.be/EKWGGDXe5MA?t=1h2m10s

And a comment on Big Brother: http://youtu.be/EKWGGDXe5MA?t=1h3m26s

So the rise of information has been a concern for a very long time; it's nothing new.  So will the development of Artificial Intelligence destroy mankind or give rise to Big Brother?  Time will tell.

I've listened to some of Feynman's earlier work and later videos.  What he emphasizes is to have fun, make things interesting, make work playful.  That keeps you interested in the subject at hand.  He also has a ferocious appetite for solving problems and likes to have dedicated time of solitude to work on problems.  He likes to take hypothetical problems and turn them into real world problems, to keep himself interested.

At the end, he does believe that we are closer to getting computers to think, however it exposes weaknesses in human thinking.

I think the guy is a genius~!


Hadoop Could Read from the Operating System

HDFS is a file system that sits on top of the operating system.  You store files in the HDFS so Hadoop can recognize them.  The Metadata goes into HCatalog, so future programmers know what data resides where.

Why not have a Hadoop like system read a layer below HDFS, the actual file system.  Let it read all the contents of your computer, self index, and make that data available through Hadoop.
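
To make the idea concrete, here is a rough Python sketch of the "self index" step on a single machine.  It is not Hadoop code and uses no Hadoop API; it simply walks a directory with the standard library and builds a small catalog of file metadata, which is the kind of record a layer like HCatalog could then hold.  The function names and record fields are assumptions for illustration.

```python
import os


def index_directory(root):
    """Walk a directory tree and return one metadata record per file."""
    catalog = []
    for folder, _subdirs, files in os.walk(root):
        for name in files:
            path = os.path.join(folder, name)
            try:
                info = os.stat(path)
            except OSError:
                continue  # skip files that vanish or cannot be read
            catalog.append({
                "path": path,
                "extension": os.path.splitext(name)[1].lower(),
                "bytes": info.st_size,
                "modified": info.st_mtime,
            })
    return catalog


def search(catalog, text):
    """Return the paths whose file name contains the given text."""
    text = text.lower()
    return [rec["path"] for rec in catalog
            if text in os.path.basename(rec["path"]).lower()]


if __name__ == "__main__":
    catalog = index_directory(".")
    print(len(catalog), "files indexed")
    for path in search(catalog, "report"):
        print(path)
```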

Then you can link together all computers in your organization, in real time.  I don't see why not.  Then Hadoop can read in all files of every computer.  An internal brain of the organization.  People could search for stuff, do analytics, and it could serve as a file backup system, archived forever.  And why not add the email exchange server as well.  All email that flows through an org is owned by the org, so technically it's property of the org.  Perhaps lock down specific users' folders, for managers and VPs and such, for sensitive information.

I don't see why the world of Hadoop should be limited to reading from just the HDFS system.  Open up the hard drives of the operating systems as well for greater insight.

Artificial Intelligence and Our Moral Obligations

The Manhattan Project brought together the best and brightest minds of the generation.  Their goal was to assemble the 'atomic' bomb.  The scientists involved were held in the highest regard for their knowledge and expertise.  They were focused on completing the assigned project.  Which may or may not have left time to look further downstream.  As to the moral implications of their work.

When the bombs were eventually dropped, celebration ensued among the masses; America had won the war and displayed its mighty power for all the world to see.  How did the scientists feel about their undertaking?  We don't know really, as many or most have since died off.  We do know that many of them were pacifists and strong opponents of war and killing.  Yet they worked on the project regardless.

Which led us into the cold war with Russia.  And with the nukes available today, we could potentially destroy the planet.

The ideas behind Artificial Intelligence have been around since the 1940s, with Alan Turing as a founding figure.  Although Charles Babbage designed the first mechanical computer, Turing helped solve the problem of deciphering the German Enigma messages during the war.  He used logic as the basis of having a machine interpret instructions to solve problems which would take a human a very long time.  After the code was cracked and the war was over, he went on to architect his vision of the brain machine, called the ACE.  It took 5 years for the first prototype, and he is considered the father of Artificial Intelligence.

Artificial Intelligence, or AI as it's commonly referred to, did not make much progress during the 1970s, 80s and 90s.  However, due to the increase in processing power of computers and the large volumes of data, AI has been making a strong comeback.  We have built some massive Neural Networks, which are sort of computer brains which you can teach; they remember events and outcomes and learn over time.

We have facial recognition systems, we can predict events in the future based on past performance, we can answer many many questions as in IBM's Watson computer.  AI has made leaps and bounds.

But for a computer to really have AI capabilities, it must pass the Turing Test, which is to have judges ask the respondent a series of questions.  In order for the machine to pass, the judges must not be able to differentiate between the computer and a real person.  To this date, no computer has officially passed the test.

However, once it does, it will have crossed over into the Singularity.  When computers take on a life of their own, when they have feelings, when they can determine meaning behind phrases, when they can interpret and display real emotions, they will give the humans a run for their money.  For one thing, robots with AI brains would displace many aspects of the bipedal apelike creature called Homo sapiens.  Robots don't take vacations, smoke breaks, require insurance, can work 24/7, don't have to be fed, so you can clearly see, there are benefits in the rise of the Robotic AI beings.

Now some people will state that there is nothing to fear.  For me, I'm all for new advances in technology.  However, like the scientists who created the A-Bomb, there are moral implications for creating a new race on planet earth.  And those obligations must not be tossed aside for profit at all costs.  Like the violinists who played while the Titanic sank, so too are we allowing AI to be developed without some guidelines in place.  For when the time comes, in the not too distant future, that we develop true AI, it will be too late.  As we all know, technology can be used for good or evil, depending on who's in charge.  If Humans are to remain the alpha on planet earth, we really should go with caution on developing Artificial Intelligence and a new race of Robots who could replace us.

And there you have it, my two cents worth.

My Intro to SQL Server Service Broker

SQL Server Service Broker is a tool developed by Microsoft for exchanging messages.  It has been built into SQL Server since the 2005 release, including SQL Server hosted on Azure virtual machines.  It runs by configuring many objects and then issuing the SEND and RECEIVE statements.  Being that it's built into SQL Server, it can run stand alone or connect to external servers within or outside the network.  Messages can be encrypted.  They have an expiration date.  They are based on the smallest unit of work, called the conversation.  Conversations can be reused to increase performance.

Objects consist of creating a master key encrypted by password, creating Endpoints on each remote server, Routes which are the road map to guide the message to its destination, Services, Queues, Contracts and Message Types.  They are listed from highest to smallest level of magnitude and are usually created in a specific order, as some objects have dependencies on other objects.

You can also create and specify Keys, which get created and then swapped between machines to protect who can send and receive messages.  Message Types are used to specify the structural content of the messages.  You can actually send binary files as messages.  A contract specifies who and what can be exchanged.

Messages are sent in a specific order as well.  And being an asynchronous process, when the messages arrive at the destination server, they are stored in a Queue, and processed in the order in which they were sent.  If a destination server goes down for some reason, upon returning, you can start the process manually (the default is not to automatically resume processing) and the system will pick up where it left off, quite a nice feature.
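
To illustrate the ordering guarantee, here is a toy Python model of a queue that releases messages in send order within each conversation, even when they arrive out of order.  It is only a sketch of the concept, not how Service Broker is implemented; the `OrderedQueue` class and the sequence numbers starting at 0 are assumptions made for the example.

```python
from collections import defaultdict


class OrderedQueue:
    """Toy queue that releases messages in send order per conversation."""

    def __init__(self):
        self._pending = defaultdict(dict)  # conversation -> {seq: body}
        self._next_seq = defaultdict(int)  # conversation -> next seq to release

    def arrive(self, conversation, seq, body):
        """Record an arriving message; return the messages now deliverable."""
        if seq < self._next_seq[conversation]:
            return []  # duplicate of a message that was already released
        self._pending[conversation][seq] = body
        ready = []
        # Release the longest unbroken run starting at the expected number.
        while self._next_seq[conversation] in self._pending[conversation]:
            n = self._next_seq[conversation]
            ready.append(self._pending[conversation].pop(n))
            self._next_seq[conversation] = n + 1
        return ready


if __name__ == "__main__":
    q = OrderedQueue()
    print(q.arrive("order-42", 1, "second"))  # held back, waiting for seq 0
    print(q.arrive("order-42", 0, "first"))   # releases both, in send order
```

Holding back the early arrival is what lets the receiver process "first" before "second" every time.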

Once the message is received, you must do something with it.  You can activate a Stored Procedure to pull in the received message and process it.  Exchange between Source and Destination can be one message in one conversation or multiple messages within a single conversation or multiple messages within multiple conversations.  Upon completion of a conversation, it must be Ended by both servers, as each side has to close its end of the dialog.

If you were to try and build this type of system yourself, it would be very difficult and probably not as thorough or efficient, because Service Broker is built into the Database at the core level.  However, there is no real interface, no real way of indicating a message is corrupt or errored out and no real way of monitoring the activity.  These are quite a few downsides, in addition to the fact that Microsoft has never really marketed the application except perhaps to the world of DBAs.

It's a bit of a challenge to set up, as well as monitor, but if done properly, it's a fully functioning messaging system built into the SQL Server database which is highly scalable, efficient, asynchronous where messages are processed in the order in which they were sent, every single time, guaranteed.


Finding Insights from Data through Data Warehouse or Data Scientist

Scenario:  Try to guess what sport I'm talking about.

"He touched the ball 3 times"

Hmm.  Could be football, soccer, baseball, basketball, rugby, bowling.

It couldn't be archery, badminton, darts, wrestling.

We have enough information to deduce certain items but not enough to clarify entirely.

"He effortlessly passed the ball each time."

We can further reduce the list of possible sports by eliminating: Bowling and Baseball.

Because we have more information to illuminate the underlying answer.

"His last touch, he scored with a header on goal."


Finally, we have enough information to know what sport: Soccer.

We had to interpret the data based on our current knowledge and understanding of the domain, to accurately assess the correct answer.

We turned information into knowledge, by asking questions, iterating over time, to enhance our interpretation, based on our current knowledge and experience.

We were never told specifically what sport was being played.  We used our intuition.
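The narrowing-down process above can be sketched in a few lines of Python.  The traits assigned to each sport are my own assumptions, chosen only to mirror the three clues in the scenario; they are not a real data set.

```python
# Hypothetical traits per sport, picked only to match the clues in the scenario.
SPORTS = {
    "football":   {"ball", "passing"},
    "soccer":     {"ball", "passing", "header"},
    "baseball":   {"ball"},
    "basketball": {"ball", "passing"},
    "rugby":      {"ball", "passing"},
    "bowling":    {"ball"},
    "archery":    set(),
    "badminton":  set(),
    "darts":      set(),
    "wrestling":  set(),
}


def narrow(candidates, clue):
    """Keep only the sports whose traits include the clue."""
    return {sport for sport in candidates if clue in SPORTS[sport]}


candidates = set(SPORTS)
for clue in ["ball", "passing", "header"]:
    candidates = narrow(candidates, clue)
    print(clue, "->", sorted(candidates))
```

Each clue removes candidates, but note that the traits table is the "domain knowledge".  Without it, the clues are just text.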


When companies initiate a project to build a data warehouse, they are building an infrastructure to align the data, to allow for interpretation, to find answers to questions, so they may act upon them, to streamline processes, reduce costs or increase sales.

Getting to "action" is not built into the solution.  That takes human input.

If you don't have someone to interpret the data into insight, it's just a bunch of information.

You may know which sport is being played, you may reduce certain sports based on the data presented, but to find the correct answer, you will need to do the interpretation yourself.

Enter Data Scientist, Stage Left:

Data Scientists can provide that valuable insight.  His or her role is to align the data, create the reports AND do the analysis.

Going beyond the "Traditional Data Warehouse" role, they run algorithms to analyze the data, apply their domain knowledge to the data to recognize trends and make recommendations.

By contrast, a DW developer only has to understand the business flow well enough to model the data warehouse, apply business rules and possibly create Reports and Dashboards.  The interpretation is up to the Data Warehouse owners.

Data Warehouse vs. Data Scientist: 

Data Warehouse developers don't necessarily need to know Statistics, algorithms and mathematics, whereas a Data Scientist does.  When you hire a Data Scientist, you are essentially getting Analysis, Interpretation and Recommendations based on the current data sets, perhaps mashed with external data.

The Data Warehouse is kind of like doing your own taxes.  You had better have internal staff who know the rules, look for deductions and follow the ever changing rules to stay compliant.

Hiring a Data Scientist is like hiring a CPA.  They know the intricate rules, they can assist in determining the best strategies by analyzing your financial picture for the prior year, and they make recommendations for future financial arrangements.

Nuggets of Gold:

Why do we go through the motions of building Data Warehouses and Analyzing data?  To find Insights.  Insights are the nuggets of gold, lying deep within your databases, waiting patiently to be mined, either by internal staff or a data scientist.  When mining for insights, you may find pieces of gold here or there, but you may also find veins of rich information.  Wouldn't that be nice?

So get digging~!


Basic .net Functionality Hasn't Changed Much

I've been working on my latest project.  Learning Service Broker.  So I inherited some documents which had all the commands scattered, which I was able to consolidate into 2 working scripts.

So now I'm able to send commands from a local SQL Server 2012 database, to another instance of SQL Server 2012, via Service Broker.  However, this is running locally on my laptop, so there are no ports to select, as Service Broker can only have one defined Endpoint.

Once I get the remote standalone instances on VMs, I'll be able to configure each server to use its own Endpoint and the scripts will need to be modified slightly.

From there, I'll need to configure all 3 VM SQL Servers to talk to each other, which equals 6 scripts in total.
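The count of 6 comes from every server needing a script for each other server it talks to, so 3 x 2 directed pairs.  A quick Python check, assuming one script per direction and using made-up server names:

```python
from itertools import permutations

servers = ["VM1", "VM2", "VM3"]  # hypothetical names for the 3 VM SQL Servers

# Every ordered (source, target) pair needs its own script.
routes = list(permutations(servers, 2))
for source, target in routes:
    print(f"{source} -> {target}")
print(len(routes))
```

In general that is n * (n - 1) scripts for n servers, so a fourth server would bring the total to 12.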

And then there's a .net application written in C#.  Its job is to create random data for load testing purposes.  This morning, I ran the application and it completed with no errors, except I don't have any data locally, so I can't see the results.

However, it's good to know that I can sit down, open a .net application, run it, see what's occurring, step through the code and troubleshoot if need be.  Reason being, I'm not really a .net programmer full time.  Press F11 to step through code, F9 to place a breakpoint, F5 to run till the next breakpoint.  There are references, calls to the database via Connection Strings, etc.  These functions haven't changed much since the early 2000s.

It's 9:42am in Germany right now.  Looking at my laptop time, it's 3:42am.  Think I'll head downstairs to the market to get some coffee.



Latest Power Query Connects to Analysis Services

Power Query for Microsoft Excel now supports connection to SQL Server Analysis Services cubes.

You can download the latest release here:

Upon loading Microsoft Excel, it will prompt you to close programs while it installs the new features.

Once loaded, you can see the Power Query tab across the top.  I'm using Excel 2013, so it was already there, just updated to a newer version.

We'd like to get some external data from our SSAS Multidimensional cube, so we click on the button "From Database Get External Data":

Then we select "From SQL Server Analysis Services".

We locate the name of our Analysis Services Server.

Next we're prompted to Use Your Windows credentials to access Analysis Services, click SAVE:

On the right hand side of the Excel screen, you'll see the Navigation pane containing our SSAS Multidimensional Cube:

We selected Measures and a Dimension and clicked the Load button, it starts chugging away, to pull the data into our Excel Workbook solution:

After a brief moment, voilà, we have data from our cube:

Now you can edit the query by clicking the Edit Query button.  On the right hand side, you'll see the list of steps used to pull in the data:

We then clicked the Settings button to the right of the Added Items and a window pops up:

Added another field called Orders Count and the field got added to the list of data:

By clicking the down arrow on any of the fields, a window pops up to sort the data, or remove specific entries:

Click the Close and Load button:

Excel re-loads the data containing the new field:

And the field now displays in our Workbook:

No MDX required!

We select the "Load to Data Model"... Click Load... we get a warning box:

Click continue, it loads data into a Model:

Then hovered over the Workbook Queries, right clicked, and said "Send to Data Catalog"

After supplying credentials, a screen pops up:

And the Sharing screen:

Notice the box in Yellow, "This query uses a local data source and may not be available to others".

I publish and log into PowerBI.  Navigate to My Queries and sure enough, there it is...

I click the ellipses...

See the Analytics...

I don't have a Data Management Gateway established at this time, so that's as far as I'm going to take it at this point.

Hope you enjoyed and learned a bit.

Happy coding~!



Passed the 5th test today... Microsoft® Certified Solutions Expert: Business Intelligence

How About Cross Platform SQL Server?

In the world of Microsoft, big news recently:

Opening up Visual Studio and .NET to Every Developer, Any Application: .NET Server Core open source and cross platform, Visual Studio Community 2013 and preview of Visual Studio 2015 and .NET 2015

For those programmers working with .Net, this is a huge new way of doing things.  Open Source.  Cross Platform.  This should really open avenues to bring in more developers from some of the 'free' languages.  Write once, deploy anywhere.

That got me to thinking, why not do the same with data.  How about SQL Server running Cross Platform?  On Linux and Unix operating Systems.

Only makes sense, MySQL is Cross Platform.

What do you think?  Possibility?


Next Project in the Queue

Service Broker for SQL Server.

I saw a demo recently at the SQL BI User group.

Now I get to work with the technology on my job.  Got a new project assigned this evening.  It involves taking a proof of concept and getting it to work in Production.  Also add a new endpoint.

It's basically a queue / messaging system built into all versions of SQL Server, including Azure.

You send a message to one or more endpoints by establishing a conversation.  The message conforms to specific formats, and after the conversation ends, it shuts down.

And you can send multiple messages, in any order, and the receiving machine treats the entire batch as a transaction, which can roll back.

If the receiving machine goes away, the messages stack up in the sending machine.
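The "messages stack up in the sending machine" behavior is a store-and-forward pattern.  Here is a rough Python sketch of the idea.  The Sender class and its receiver_up flag are invented for illustration and have nothing to do with the actual Service Broker objects.

```python
from collections import deque


class Sender:
    """Toy sender that holds messages until the receiver is reachable."""

    def __init__(self):
        self.outbox = deque()   # messages waiting on the sending machine
        self.delivered = []     # what the receiver has accepted, in order

    def send(self, body, receiver_up):
        # Every message goes into the outbox first.
        self.outbox.append(body)
        if receiver_up:
            self.flush()

    def flush(self):
        # Deliver everything that stacked up, oldest first.
        while self.outbox:
            self.delivered.append(self.outbox.popleft())


sender = Sender()
sender.send("order 1", receiver_up=True)
sender.send("order 2", receiver_up=False)   # receiver went away
sender.send("order 3", receiver_up=False)
backlog = len(sender.outbox)                # messages stacked up on the sender
sender.flush()                              # receiver is back
print(backlog, sender.delivered)
```

Nothing is lost while the receiver is away, the messages just wait their turn on the sending side.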

That's the basics anyway.  I'm excited to have a project to bill hours against and we need to do a good job on this because we now own the process.  Which means I'll have work to do while in Germany the next two weeks.

And I get to work with people from other countries, like Germany and Australia.

Should be fun~!


Why Change Careers Every Few Years

I am not the greatest business intelligence person out there.  Nor the greatest programmer.  Nor the smartest.  I have trouble remembering people's names, passwords and memorizing things.  Chances are I have a mild form of dyslexia.

So it would have made sense to learn something, get good at it, and stick with it.

I could have stuck with approving loans, as a credit analyst, branch manager, or a mortgage broker.

I could have stuck with digging holes as an Archaeologist.

I could have stuck with teaching tennis at a club.

I could have stuck with .net, had plenty of years with Visual Basic, ASP and .net.

I could have stuck with Crystal Reports, SAP Universes, Xcelsius and Business Objects.

I could have stuck with Java, JSP and Web Services.

I could have stuck with Oracle, PL/SQL.

I could have stuck with Supervising SSRS developers.

I could have stuck with SQL Server Reporting Services.

So why didn't I?  Each of these careers could have been sufficient to live a good life.

I guess it comes down to a few things, primarily, learning new skills.

If you're always learning, you don't get stale.

If you don't set challenges, you get bored.

I now build data warehouses, business intelligence solutions, SQL Server related technologies, Analysis Services Cubes, and Dashboards and perhaps some Hadoop.

Am I the greatest at each?  No.

Decent?  Yeah.

But here's the skinny.

If I don't know something, I learn it, fast.

And I have other skills, soft skills, which I depend on.

Problem solving.


Driving projects home to completion, finishing the last 10%, shipping.


Attention to Detail.


Taking responsibility.

Accepting projects that others don't want.

Learning on the side.





My new job is another challenge along the path.  Are the required skills so complex, that if locked in a room for 10 years, I wouldn't be able to grasp and understand?  No.  I can learn whatever they throw at me.

So am I nervous about my new role as Business Intelligence Architect for a major consulting firm?

Sure.  However, based on the track record, I think it will turn out just fine.


Data Solutions have Incrementally Compounded

The world of data has fragmented and splintered into many tiny pieces.  Back in the 1990s, when I was a report writer, we had data in Oracle or SQL Server and a GUI tool called Seagate Crystal Reports to pull the data, export it and send it to the users.

Very simplistic.

Now we have a plethora of options, quite overwhelming.  If you were to look at the Microsoft offerings alone, you'd quickly see at least 20 variations.

For storing the data, we have SQL Server 2014, Excel, Access, Blob Storage, DocumentDB and Hadoop.

For pulling the data, we have SSRS, Power View, Power Pivot, Power Query, T-SQL, MapReduce, HiveQL, Machine Learning, SSIS, MDX and Excel.

There's on-premise, Azure, Office365, PowerBI and Hybrid solutions.

So the variety of options can be customized to fit any client.  However, this creates an opportunity for Data Professionals to learn the gamut of offerings and stay up to speed on new developments.

My approach is to learn as much as possible to keep a wide view and focus on specific technologies as it becomes necessary.  The other approach is to become deeply knowledgeable of a few specific technologies.  These features listed above are the hard skills.

The soft skills include project management, Agile Methodology, gathering specifications, Networking, Active Directory, Virtual Networks, Visio diagrams, Documentation, attention to detail, problem solving, etc.

And then there's other skills like completing the last 10% of a project, staying within budget, meeting client expectations, proper email etiquette, entering hours in timesheets, meeting new clients for potential sales.

And then there's Architecture.  Recommending the best solution to specific clients based on needs.  Knowing hardware specifications, hard drives, memory, virtual vs. physical, Cloud vs. On-Premise, Source Code repository solutions, Network, ITIL Methodology and Architecture frameworks.

So you can see there's quite a bit of a knowledge curve in getting up to speed in the world of data.  And the learning never ends.  It's a challenge and opportunity and a great way to earn a living.

Gone are the days of simple queries to pull data into a report to send to a client.  The Data Ecosystem has incrementally compounded for the better.


#Pentaho #Kettle #PDI CE Offering

There's a product from Pentaho to perform Extract, Transform and Load, ETL / ELT, which you can use on Linux or Windows, called Kettle.

There's the free Community Edition CE:

Download latest version of Kettle CE Community Edition:

As well as the Enterprise Edition with a 30 day trial:

I wanted to load on a Linux environment, so I downloaded a Hadoop VM from Hortonworks CentOS Sandbox version:

Link for installing Kettle:

So I logged onto my Hyper-V Hadoop Sandbox single node cluster as root/Hadoop.

I was logged in as root and created a new user called pentaho:
To switch to the pentaho user:
su - pentaho

And back to root:
su - root

I created a folder in the root called /Pentaho.

There's no easy way to copy the zip file from the host computer to the Hyper-V VM, so I used curl:

curl 'http://tcpdiag.dl.sourceforge.net/project/pentaho/Data%20Integration/5.2/pdi-ce-'

Checked to see the Zip file was downloaded as Archive file, not HTML:

To unzip the Zip file, first install unzip from the command line on the Hadoop VM:
yum install unzip -y

Then key in the command:
unzip pdi-ce- -d /Pentaho
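The "is it really a Zip file or an HTML page" check and the unzip step can also be scripted.  Below is a small sketch using Python's standard zipfile module.  The extract_if_zip function and the file name in the usage line are my own inventions, not part of Kettle or the steps I actually ran.

```python
import zipfile
from pathlib import Path


def extract_if_zip(archive_path, target_dir):
    """Extract archive_path into target_dir only if it is a real zip file.

    Returns the list of member names, or None when the file is not a zip
    archive (for example, an HTML error page saved under a .zip name).
    """
    if not zipfile.is_zipfile(archive_path):
        return None
    Path(target_dir).mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(target_dir)
        return archive.namelist()
```

Usage would look something like extract_if_zip("download.zip", "/Pentaho"), where download.zip is a placeholder for whatever the downloaded file is called.  A None result means the download needs another look before going any further.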


Files were loaded:

Next step is to get Kettle PDI running...

In the meantime, I wanted to explore the Windows version and sure enough the same bits work for both Linux and Windows, in my case, Windows 8.1.

You'll want to click on the spoon.bat file:

And within a few seconds, the app loads, no installation required:

There are existing Samples to get started:
There are tons of transformations:

And the thing to notice is the Big Data Transformations:
And some connection types:

To create a Database Connection to Hadoop, follow the instructions at this URL:

What I like about the Pentaho Kettle PDI solution is the ability to get installed and up to speed quickly.  Once the application is running, you have an arsenal of Extract, Transform and Load functionality at your disposal.  And the best part is there's little to no actual coding.  Its drag and drop WYSIWYG interface allows Rapid Application Development of robust solutions.

This blog covered how to get started and some of the features of the Community Edition of Kettle PDI from Pentaho.

Happy coding~!


Microsoft Data Offerings in the Cloud

This week I spent some time learning more about the Microsoft Cloud offerings.  Azure has quite the number of tools for getting a fully functioning Data Warehouse running in a short time.

First, you can provision a PaaS (Platform as a Service) SQL Database in Azure Cloud in about 4 seconds, that is, to provision it, not counting the time to create the Azure account, log in, etc.

You have a number of database options to choose from.  What I liked was the SQL Server 2012 SP2 Data Warehouse version, designed for speed.  And they have the 2014 version as well, with many cool new features.

They also have IaaS (Infrastructure as a Service), which allows you to create a VM in Azure, on which you are free to load your own software.  You can provision a pre-configured Data Warehouse version of SQL Server using a PowerShell script, or you can load your own Database.

However, you must purchase or supply your own licenses, such as Windows and SQL Server.  You may be able to port your On-Premise licenses, but they must remain in the cloud for at least 90 days I believe.

By loading SQL Server, you have at your disposal MOST of the OnPremise data tools such as SQL Server Data Tools to build SSIS, SSAS and SSRS solutions.  You have the SSISDB Catalog to store your SSIS packages, you have your SQL Agent, along with a fully functioning SQL Server Reporting Server SSRS web, with Data Subscriptions available.  And it pre-loads Analysis Multi Dimensional with the option of activating the Tabular Model.

It does not however have Performance Point.

And you can set up an FTP site to push your data files to the Cloud if you choose.  And you can activate Active Directory Federation Services to integrate your On Premise AD.

You can set up a Virtual Network to establish a seamless link to your On Premise users.

You can set up a Static IP Address.

What I like is the option of pointing your On Premise Excel to the Cloud for Power Pivot, Power View and Power Map using 2013 or 2010 with the Add Ins.  It seems like a cost savings to get clients to the Cloud sooner rather than later, with the option of adding on later.

And by adding on, that opens up the arsenal of Office365.  For a subscription price, you get SharePoint, full office online or available for download, Lync, integrate with Active Directory Federation Services and a ton of other stuff including 24/7 support, all depending on which version you select.

And to top that off, you can also purchase a PowerBI subscription, which opens up Self Service offerings.  You can point to On Premise to pull data, without having to store the data in the cloud.  You can host Excel files in PowerBI, manually refresh or set up Automatic Refreshes.  And with the Data Management Gateway, you can securely link data to PowerBI.

And PowerBI will point to your Azure OLTP data, and I believe they recently announced the option of pointing to Analysis Services, which is a real selling point to clients.

So I touched on a few of the basic Microsoft Cloud offerings around Data, SQL Server, Office365 and PowerBI.  With the Cloud, you no longer have to host this stuff On Premise, which means faster time to production, faster insights, security, Active Directory integration and pushing and pulling of data.  And it all runs in the Cloud, which means Disaster Recovery, which is a major selling point as well.

I hope to get a client or two up into the Cloud to find out where the gotchas are and to better understand the pricing models, as they have changed a few times and there are so many factors and options available.  Suffice it to say, depending on your client's budget, you can really pick and choose a custom architecture to satisfy almost any configuration.

And then throw in HDInsight, Blob Storage and Machine Learning.  Microsoft has, to say the least, got its stuff together with its Data offerings in the Cloud.


Microsoft .Net Opens Up

Believe it or not, my first programming language as a computer professional was Microsoft Visual Basic.  I worked with version 4, then 5, then 6.  That was around 1995 or so.  Kind of dates me a bit, but nonetheless, I thought it was a great programming language.

And back then, Java was starting to build in popularity.  And if you compared VB to Java, you clearly saw the differences.  Java was truly Object Oriented.  It was a real programming language.  And that's what I wanted to program in.  Because it ran on any operating system, not just Windows.

And even back then, I wondered why Microsoft didn't port VB to other operating systems.  Why didn't they have a Microsoft Virtual Machine similar to Java Virtual Machine?  Because they were proprietary. And they had enough market share.

So you could create a procedure in VB, then convert it to a Function, then move all the functions to a Class, and bundle the Classes into a DLL, then put the DLL in Distributed DLL.  And that could be called from a new web language called Microsoft ASP.  I worked as an ASP developer for years, yet it wasn't Object Oriented.

Next was .net.  An entirely new IDE, new languages and the thing was, they had a language called Visual Basic.net.  Yet if you looked under the covers, the name was the same, and that was about it.  Because now it was Object Oriented and Microsoft said you must now learn this or else, no choice here.  Visual Basic was soon deprecated and they removed much of the knowledge base off the internet and there was no turning back.  Yet .net was still proprietary.

And now, November 2014, Microsoft has finally succumbed to opening up the .net languages to Open Source.  And it will run on most operating systems.  Because of the new CEO, the changes had to come to compete with up and coming languages.  Or lose market share.

Was it a good choice?  You betcha.  I think it should have happened decades ago, better late than never tho.  The Cloud is now the operating system, and Mobile is the new PC, and if you want to compete in the new era, you have to let the programmers use your development software, it's basic supply and demand.  Give the languages to the masses, so they can host on your Azure platform and you clean up on all the paid services.

You see, Microsoft was late to the Reporting / Data world back in the mid 1990s when Seagate Crystal Reports was the only game in town.

Then they missed the boat on the Internet and had to play catch-up.

Now Open Source is forcing .net to run on any platform as Open Source software.

I've worked as a Microsoft Developer for close to 20 years and I still believe they are as strong as ever.  I wouldn't count them out just yet.  They've got Windows, Office, .net, Azure, Gaming, HDInsight, Machine Learning, SQL Server, SharePoint, PowerBI, Office365, etc.

I'm just trying to keep up with all the changes, which never end.  I see this latest change as a win.  Better late than never.

Here's a prior post: Microsoft's Data Centric World View

One more: #Microsoft #MSBI Offerings Tough to Beat

Here's the article on the latest news...


Installing #HDInsight #Hadoop to Windows via Web Platform Installer

Today I'm going to document the installation of HDInsight Emulator on Windows.

First step is to go to the URL: http://azure.microsoft.com/en-us/documentation/articles/hdinsight-get-started-emulator/

This will get you started. 

In doing so, I downloaded Microsoft Web Platform Installer:

There are a few pre-requisites for loading:

So I began the Power Shell for Azure installation:

As you can see, Power Shell for Azure was loaded:

Then did a search in Web Platform Installer for HDInsight and the entry appeared:

Clicked on the Add button... then the Install button...

Clicked "I accept" and the installation began...

This part took some time, to download and install all 5 sections... although the install job did all the work...

Then it configured and finished...

A few icons got loaded onto the Desktop...

And there are a few new directories in the c:\ root...

From the c:\hdp directory, I double clicked the Start_Local_HDP_Service.cmd...

Double Clicking the NameNode shortcut on the desktop, the web page appeared...

After clicking on YARN Services, the web page appeared...

From the Hadoop command line, instantiated the RunSmokeTest job, the job executed...

Which created some folders in the file directory...

Reading through the verbiage, many of the tests passed...

And running hadoop fs -ls gets the following...

And there you have HDInsight up and running in less than a half hour on a Windows 8.1 machine.

Hope you enjoyed the blog and you can appreciate the ease with which it gets installed, without the tedious process of installing all the pre-requisites which Hadoop traditionally requires.

Thanks for reading~!


Mountain Living