Corporate Structure Similar to Chess Board

If you think about it, the corporate structure in any organization is similar to a chess board.

You have the King, your CEO, at the top.  Very well protected.
The Queen, versatile yet powerful, your SVPs.
The Knights can jump and wander all around the board, your middle managers.
Your Rooks, straight forward, deliberate with much strength, your supervisors.
And your Bishops, moving diagonally, can zip across and capture like nobody's business.

And finally, the front line of pawns, your help desk support, your programmers, your sales & marketing and accounting.

The Pawns do most of the grunt work and are somewhat expendable; not much effort is placed on them, since they rarely cross the entire board and are limited in their movement.  Yet some pawns can deliver checkmate, so they can play a pivotal role in your org.

One team, working together with a common goal, perhaps.

So where do you fit in the chessboard of your org?


Who's Listening? #BigData

The thing is, had the powers that be gone through the proper procedures and asked for permission in a public forum, before those authorized to grant it, it would never have happened.

That's why the "Data Tapping" occurred in secrecy without formal permission.

You see, it's easier to ask for forgiveness than permission.

And if you cloak it in "National Security", who's going to question it?

And now that the cat's out of the bag, who can stop it?

It's all happening behind the curtains.

So public figures are telling people that the "Data Tapping" is simply looking for patterns of communication.

And perhaps crossing the lines of surveillance by reading people's private emails, texts, etc.

Big Data, if used for dark purposes, is a weapon.  Unfortunately there's a good chance this is the case.  Even more unfortunate, they're using it against the people they are supposed to be protecting.


Self Service Quality Assurance Team

Self Service is the greatest thing since sliced bread.

You can slice it, dice it, puree it, boil it, toast it and even grill the data you need to identify insights to help run your business.
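To make the slicing and dicing concrete, here's a minimal sketch (the rows, dimensions and amounts are all made up for illustration) of aggregating the same data along different dimensions to surface an insight:

```python
from collections import defaultdict

# Toy sales rows; "slicing and dicing" here just means grouping the
# same data along different dimensions.
rows = [
    {"region": "East", "product": "Widget", "amount": 100},
    {"region": "East", "product": "Gadget", "amount": 250},
    {"region": "West", "product": "Widget", "amount": 175},
]

def dice(rows, dimension):
    """Aggregate the amount measure along any chosen dimension."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[dimension]] += row["amount"]
    return dict(totals)

print(dice(rows, "region"))   # totals by region
print(dice(rows, "product"))  # same data, diced by product instead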

Uh, one question when you have a minute.

When you were mashing up your data sources, joining disparate data sets and creating pretty dashboards, did you happen to have the Self Service Quality Assurance team look over your numbers?

Because last I heard, your cool Visualization was just presented to the CEO and Board of Directors, and it seems they have some concerns about the numbers.

You see, we have an entire Business Intelligence department, people who went to school for years, got degrees, attend seminars on weekends, read blogs and are very analytical, and they have the official numbers for the organization.

Your Visualization, created in this fancy expensive software, although pretty and colorful, is devoid of accuracy and contains flagrant errors.

Although you created it between meetings, sales calls and running the business, your effort is commendable and highly appreciated.

Just one thing though, don't ever publish your reports / dashboards to the corporate intranet site without permission.

Have a great day!

What if the Internet Went Down?

Here's something.

What if the entire internet went down?

For an entire day.

Boy that sure would be havoc and mayhem.

It's just a hypothetical question.  I don't see how it possibly could.

But what if it did?  We have adapted to the internet like a frog slowly boiled in water.  The water got hot when we weren't looking.

And now we can't live without it.  Sure, there are some people who stay away from computers and they'll never adapt.  Over time this number will dwindle.

Our entire lives are dependent on this free network full of massive information connecting people globally.

Could you possibly imagine the internet going down for an entire day across the entire globe?

Data is Alive

Data is alive.  It's a living thing.  It grows.  It multiplies.  It has depth.  Structure.  Free form.  Unlimited capacity.

People are hung up on the term Big Data.  If you think about it, it's actually "Macro Data".  And its opposite is "Micro Data".  Similar to the structure of the universe.

You add up all the Micro to derive the Macro.

You can view data through a microscope or a telescope.

That would describe the "Volume" of data.

However, there's also "Velocity".  How fast is the data growing?  What is the rate at which the data increases in size?

Lastly there's "Variety".  And to me this seems to be the best attribute of data.  What we are really after is "Insight".  In order to derive the most insight, it helps to have lots of data or speed of new data, but the key factor in generating insight is Variety.

What we are really after is the "Holistic" view of the data.  We don't just want to know the frequency in which you called the Support Center, and the reason, how long it took to solve the problem, by whom, etc.

We want to know what products you already own, so we can tailor your customer experience to supply additional products.  We want to know your bank account info to know how much you can afford.  We want to know your purchasing habits to see how frequently you buy.  We want to know the demographics of your company, your location, your partners, your investors, your employee representation, your political beliefs, your religious beliefs, how many kids you have, did you go to college, what kind of car do you drive, do you drink milk, wine or whiskey, where did you attend high school, who were your teachers, what were your grades.

In the world of data, the more data sets available, the better.  And integrating those data sets is the key.

Because every database has its own unique identifiers, mashing data is perhaps the most logistically challenging aspect.
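A tiny sketch of that identifier problem, with hypothetical CRM and billing systems and a hand-built crosswalk table mapping one system's key to the other's (all of the names and keys here are invented; building and maintaining the crosswalk is usually the hard part):

```python
# Two systems identify the same customer differently; a crosswalk
# maps one system's key to the other's so the sets can be joined.
crm = {"CUST-001": {"name": "Acme Corp", "region": "East"}}
billing = {"90210": {"balance": 1250.00}}

# Hypothetical crosswalk: CRM id -> billing id.
crosswalk = {"CUST-001": "90210"}

def mash(crm_id):
    """Join a CRM record to its billing record via the crosswalk."""
    rec = dict(crm[crm_id])
    rec.update(billing[crosswalk[crm_id]])
    return rec

print(mash("CUST-001"))  # one holistic record from two systems
```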

I'm a firm believer that if you want to get good at something, watch how the experts do it.  And who are the data experts, in my opinion?  The NSA.

They have access to all data.  They can mash it up.  They can keep mega warehouses of everything about you, in real time.  Their goal is National Security (perhaps).

Except they are using data as it was meant to be used.  To unite the Micro Data to form Macro Data to create Holistic Data, that which comprises All Data.

To create a blueprint, a hierarchy, a network, a risk factor.  A neural network of everything, connecting all the dots, dotting every I and crossing every T.

To simulate the Universal Mind.  http://www.mind-your-reality.com/universal_mind.html

Like any tool it can be used to unite, heal, and grow.  Or can be used for dark purposes.

Either way, the goal is the same.  Holistic knowledge based on All Data available at any given time.

The insight part is more difficult, because bias, experience, motives get in the way.

Like an Artist there is room for interpretation.  Like a Scientist, there is logical practical approach to solving problems.

And the blend is done through the office known as the Chief Data Officer, the go-between for IT and the Business, who reports directly to the CEO.  The CDO steers the organization's Business Intelligence program, can leverage internal employees or contract out, is responsible for data quality and data governance, stays current with data trends, can hire Data Scientists, knows the business and includes Resident Business Process experts.  It will be the most important position in the org.

We are witnessing the Data Revolution, where every action will be recorded electronically, for mining purposes.

There is no more land to conquer and plunder (on this planet), we've used up most of the natural resources, so the next logical step is to create a new resource to mine, and that is Data.

And the Data is Alive.


Azure SQL Reporting Services In the Cloud Going Away

Recently I saw this article in the social stratosphere.

Windows Azure SQL Reporting Services will be going away.


However, Microsoft has provided an alternative, running SSRS Reporting Services on a VM in the Cloud.

I feel that approach gives the user more flexibility and more control.  Spin up the server when you need it, shut it down when you don't.

And there's less of a learning curve because most developers already know SSRS and how to maintain the server.

Except what message does that send to the community?  How important is Reporting Services going forward?  What will take its place, or will there be much development to SSRS going forward?

Perhaps the new PowerBI integrated into SharePoint in Office 365 may be a new direction.

At this point, current Azure SQL Reporting Service users have some time to migrate off the cloud.

There's always change, right?


Successful Project

For our latest project, we were tasked with resolving performance issues with Fuzzy Logic in SSIS.

Apparently, Fuzzy Grouping and Fuzzy Lookup tax the server heavily.

Both components do a great job for small data sets, except when the numbers increase to, let's say, 13,000,000 rows.
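To get a feel for why this gets expensive, here's a rough sketch using Python's difflib (this is not the actual algorithm SSIS uses, and the names are made up): grouping means scoring every candidate pair, so the comparison count grows roughly with the square of the row count.

```python
from difflib import SequenceMatcher

def similarity(a, b):
    # Ratio in [0, 1]; Fuzzy Grouping/Lookup use their own scoring,
    # but the cost profile is similar: every candidate pair gets scored.
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

names = ["Jon Smith", "John Smith", "Jane Doe"]

# Grouping compares pairs: n rows -> n*(n-1)/2 comparisons,
# which is why runtime explodes as row counts grow.
pairs = [(a, b, similarity(a, b))
         for i, a in enumerate(names) for b in names[i + 1:]]
best = max(pairs, key=lambda t: t[2])
print(best[:2])  # the most similar pair of names
```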

What it does is store everything in TempDB temp tables using cursors (ugh!), and the package we were supposed to troubleshoot looped 16 times x 4 for 48 passes over the 13 million rows.

It ran in 5 days before we got there.  For both Fuzzy Grouping and Fuzzy Lookup.

Our plan of attack: investigate everything.

Hardware, software, SQL Server, Memory usage, Server configuration, background processes, VM settings, Host settings, Indexes, Queries, Locks, Blocks, Latches, Memory allocation, Partitioned tables, tweaks to SSIS components, Parallelization, etc., etc.

The first week we investigated, baselined and documented.  The second week as well.  Then we provided the Client a document with findings, recommendations and suggestions.

Third week we tried various combinations of suggestions.  Fourth week, we were still getting errors while running the package.  We ran Perfmon each run to gather metrics.

At the end of the fourth week of the contract, we finally had some success.  A combination of all the recommendations reduced the package runtime to just over 11 hours, down from 5 days.

Our job was to get it under 24 hours so they could run it twice on the weekend if necessary.

So we succeeded with the contract, on budget and on time.

I still think that Fuzzy logic is not the most efficient tool in the shed, however, it serves a purpose.

I enjoyed the contract.  And I believe next week I'll start with a new client, building a Data Warehouse from scratch with some Crystal Reports.  Can't wait!

Presenting to Audiences

I presented two sessions this past weekend at SQL Saturday Tampa BI / Big Data edition.  One on Big Data and one on Office365 PowerBI.

A few weeks prior I presented on Intro to SQL and Big Data at the local IT Pro Camp.

So I was thinking, am I an expert in each of these topics?

What is an expert? 

Because someone who claims to be one usually isn't.  Same with self-proclaimed Visionaries.

Big Data is hot right now and the industry is constantly changing.  Surely there are people who know the ins and outs and minute details at a deeper level.

PowerBI is brand new so the knowledge I do have from experimentation is okay to present to those who haven't been exposed to it.

And my Intro to SQL was very basic.  However, when someone in the audience asked whether it gets more complicated, I responded that these are the basics, like notes on a scale: you can play Twinkle Twinkle or you can play Mozart, depending on your skill level.

I wouldn't consider myself an expert at anything really; I know what I know, and there is so much to learn about everything.

However, I feel I know enough about a variety of topics to speak intelligently to an audience for a period of time.

I went looking for the guy who knew everything about everything and I still haven't found him yet.  In the meantime, I'll gladly discuss the knowledge I do have and give back to the community.


#SQLSat248 Tampa BI - Big Data Edition Recap

SQL Saturday #248 was yesterday.  It was a great event.  I was a volunteer and presenter and I attended 2 of the pre-conferences.

On Thursday, Bill Pearson showed us Power Pivot.  I have to say Bill is a great presenter and knows his material well.  Articulate and a great story teller.

On Friday, I attended Tim Mitchell's SSIS presentation, a definite SSIS wizard.

Saturday the day started at 5am, picking up the Krispy Kreme donuts and bringing some of the supplies to the USF College of Business Building, including 12 cases of water, coffee cups, 10 dozen cookies, 4 cases of chips, etc.  The volunteers pitched in getting everything set up and checking people in, and did a great job.

I presented on Big Data at the 10am hour.  I had two Virtual Machines running in Hyper-V.  Hortonworks Sandbox 2.0 and Hortonworks HDP 1.3 for Windows.  I went through some slides and then a demo on Hive, Hue, moving data from Hadoop using ODBC through SSIS and then through Excel Power Query.  Overall I was happy with the presentation.

Lunch was great, tacos from Tia's Tex Mex.

Then I was a bit nervous about the 4pm presentation I was giving on PowerBI.  First off, half the lecture was doing the demo, connecting to Office365 in the Cloud.  To prepare, I charged my Verizon MiFi in order to have a good connection.  However, while presenting the demo, the connection got lost and there was about 5 minutes of "dead air".  The audience was patient with me until the connection was restored.  I showed off Power Map, Q&A, Power Query to Hadoop HDFS, creating a Gateway and Data Source to an On-Premise SQL Server 2012 database.  Granted, some of PowerBI is still in "Preview" mode.  The audience asked some questions and seemed interested in the topic; maybe next time I'll have a backup plan in case the internet connection goes down again.

Then it was raffle time, many Vendors giving away some great stuff.  There was good energy throughout the day.  I manned the Agile Bay booth on and off so I got to speak to many of the attendees.  And I chatted with the volunteers while getting ready to present.  And then it was cleanup time which went quick and easy.

I thought the venue was terrific; many USF students attended, and the presentation by the USF professor on Big Data was worthwhile.  I had a chance to speak with him afterwards; we discussed Big Data present and future.  It seems they do a lot of work with businesses to analyze data and write papers.

And a big "Shout Out" to Jose Chinchilla for another exceptional SQL Saturday Tampa Bay BI / Big Data edition!

Everyone get ready for the next SQL Saturday in Tampa, February 2014!



Top 10 Work Related Observations

  1. Workers on the lower rungs carry most of the heavy lifting for a fraction of the pay
  2. Many full time employers hire skills from the outside instead of training from within
  3. In some companies the only real way to get a big salary increase is to leave and come back
  4. When climbing the ladder, it really is who you know
  5. Most workers get labeled early in their careers and it's tough to shed the stereotype
  6. Workers who excel in the technical space have difficulty moving up the ladder
  7. If you do great work, you'll end up doing most of the work
  8. It's best not to accept a counter offer when resigning from a position
  9. There may be more pressure at the top, but that's where the bonuses and stock options are
  10. Everyone is replaceable


Store All Data in #Hadoop

If you've been keeping up with things lately, you'll notice that Hadoop is taking off like wildfire.

And why is that?  Because it can handle huge volumes of data, a variety of different kinds of data and data acquired at rapid rates.

And how does it process data?  Batch oriented parallel processing across commodity hardware.
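A toy sketch of that batch-oriented model: a word count done as map, shuffle and reduce phases in plain Python.  Hadoop distributes these phases across commodity machines; this single-process version just shows the shape of the computation.

```python
from itertools import groupby

def map_phase(line):
    # Map: emit a (key, 1) pair for every word in the line.
    return [(word.lower(), 1) for word in line.split()]

def shuffle(pairs):
    # Shuffle: sort by key and group each key's values together.
    pairs.sort(key=lambda kv: kv[0])
    return {k: [v for _, v in grp] for k, grp in groupby(pairs, key=lambda kv: kv[0])}

def reduce_phase(grouped):
    # Reduce: collapse each key's list of values into a single count.
    return {k: sum(vs) for k, vs in grouped.items()}

lines = ["big data big insight", "data at rest"]
mapped = [kv for line in lines for kv in map_phase(line)]
counts = reduce_phase(shuffle(mapped))
print(counts)  # word -> frequency across all input lines
```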

SQL on Hadoop
So over the past year, all the excitement has surrounded SQL on Hadoop.  There's one version called Impala, which is fast because it bypasses Map-Reduce altogether.

And there's another version called Stinger, which speeds up the queries by re-architecting some of the back end, preparing processes in advance so Map-Reduce doesn't have to spin up processes every time as well as leveraging memory.

So you can apply a metadata layer on top of HDFS; that metadata resides in the Hive metastore, exposed through HCatalog.  The metadata is a framework for getting at the underlying data, and a table can be Managed (the data resides in the Hive warehouse directory) or External (the metadata is a pointer to the actual data).

What's Next
So what's next for Hadoop?  My guess is it's gaining ground on traditional Databases.  Perhaps Hadoop in the future will contain 'all data', including transactional.

The main barrier so far, in my opinion, is that you can only insert data; you really can't do Updates or Deletes to the raw data in the files, you can only append to it.
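A rough sketch of the common workaround, assuming a latest-wins convention (the keys and records here are invented): model an "update" as an appended row carrying a higher sequence number, then resolve the current view of each key at read time.

```python
# Append-only log: an "update" is just a new row with a higher
# sequence number for the same key; nothing is rewritten in place.
log = [
    ("cust-1", 1, {"status": "active"}),
    ("cust-2", 1, {"status": "active"}),
    ("cust-1", 2, {"status": "closed"}),  # the "update" to cust-1
]

def latest(records):
    """Collapse an append-only log to the current view of each key."""
    current = {}
    for key, seq, value in records:
        if key not in current or seq > current[key][0]:
            current[key] = (seq, value)
    return {k: v for k, (s, v) in current.items()}

print(latest(log))  # current state per key after applying the log
```

Periodically compacting the log down to its `latest()` view keeps read-time work bounded, which is essentially what later Hadoop-era tools do under the covers.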

There is work going on to make HiveQL ANSI-compliant, which means it will be very similar to traditional databases.  This will reduce the need for complex Map-Reduce jobs and allow more developers to get up to speed quickly, as well as leverage decades of experience writing SQL.

One Location
Think about the scenario: if you already have the transactional data within Hadoop, there's less need for importing and exporting huge volumes of data, which will speed up development time.  And you won't have to structure the data on 'write'; you can structure the data on 'read', if ever.

Hadoop and Business Intelligence
If you think about it, the Hadoop ecosystem contains just about every facet of Business Intelligence today.  You can store data, cleanse data, ETL data, report on data, create dashboards on data, you can mine it, use it for predicting and clustering, and you can machine learn with it.

The underlying processes for both traditional databases and Hadoop are similar.  The main difference: traditional databases max out at some point because of volume and processing power, and that's where Hadoop gets started.  So if Hadoop can handle lower-volume transactional data, it can really do both functions, thus less of a need for a traditional database.  Perhaps it wouldn't extinguish them, just offer more functionality in a single ecosystem.

And we still use the mainframe today, as we use data warehousing, as we use traditional databases.  In the world of IT, nothing really goes away.  However Hadoop offers a lot of the things we need to work with data and it's gaining traction every day.

All Data
So the hype is actually turning into everyday processes and real people are getting up to speed quickly.  Time will tell how things pan out, as anything could happen.  Just saying that one day in the not too far future, Developers may be using Hadoop a lot more than they expected.