Problem Solvers do it in the Details

What are some ways to solve a problem?

You can ignore it.  Simply pretend it doesn't exist.  Maybe someone else will take ownership and make it go away.

You can see this pattern at work and in daily life.

Why do people ignore problems?  Because problems are big, hideous monsters that lurk in the unknown.  Scary.

Well, perhaps the problem is indeed scary.  But I'm not going to solve it yet; I just want to see the problem first.

How does one see the problem?

By observing.  Take notes, mentally or on paper.  Look at all aspects, from different angles.

Next, look for connections.  Dependencies.  Disconnected and disjointed pieces.

Next, try to identify pieces you've seen before, and maybe do some quick research on pieces never seen before.

Next, look for what's not there.  Why is it missing?  Should it be there?  What downstream effects could its absence cause?

Gather up as much information as possible, a holistic view.

Then, let it sit.  Let the mind rest.  Study the problem internally during the day, when the mind is open and relaxed.  Let the ideas swirl around, from one thought to the next, in no particular order; it's actually better to jump around and flow through the ideas.

Try different mental simulations, what-if scenarios, high-level thinking, and low-level details, around and around.  Then let it sit.  Maybe research some more, maybe not.

Revisit the problem.  Go through all the steps again.  Observe and study.  A methodical approach.  When I do this, that happens.  Why, or why not?  What does this do?  Why is this failing?  What could cause that?  Remove any pieces from the equation that are NOT the issue.

This is the art of brainstorming a problem, where there are NO WRONG ANSWERS.  Try, fail, analyze.  WHY?  Try, it works sometimes, analyze.

At some point, not sure when, through analysis, thought, meticulous observation, attention to detail and the big picture, research and time, a relaxed mind will eventually reach a conclusion that is accurate and correct.  The answer will become convincingly apparent, and repeatable, and it will seem obvious, staring you in the face, almost heckling you; it was right in front of you the entire time, just waiting patiently to be discovered.  And you and the problem will enjoy a brief moment of introduction, perhaps a wink issued by the problem: you found me.  Nice job.

People who do not solve problems, at work or in personal life, are not aware of such practice.  Because it's a heck of a lot easier to avoid problems: push them under the rug, throw them over the fence, pretend they don't exist.  Yet this practice tends to amplify things, and they repeat in life, until one finally faces the UNKNOWN.

The problem wasn't as scary as imagined.  It was a phantom, a boogeyman that lingered in the night.  Problem solving is an art as well as a science, based on interrogation, observation, research, throwing away all preconceived bias, rolling up the sleeves, and facing the problem head on.

Based on this narrative, one might expect the author to have no problems, having solved everything over time.  And one would be incorrect.  Some problems from long ago still remain, new problems to be solved develop over time, and perhaps some new ones will be created in the future.  Some are works in progress, tinkered with during odd periods of the day, yet to be resolved.  Some problems have been solved correctly; some were indeed intricate and complex, with multiple layers of dependencies and hooks, and took multiple decades to solve.

In life, we can count on change, and like any good video game, there's always something jumping out from nowhere, into our path, which we need to solve.  If you think you get to a certain place in life and everything is perfect, that's false thinking, not based in reality.  The fact of the matter is, it never ends.

Instead of avoiding problems to be solved, you can approach them from a different angle.  Problems are simply opportunities that, when faced, can dissolve into air, simply by confronting, studying, observing, analyzing, and letting the relaxed mind do what it does; sort of a black box approach, churning all the details into patterns, anomalies, and eventually solutions.  This applies to life as well as work.

If you can solve problems, chances are you can find a career in technology, where your skills could be of value.  Because problem solvers remove blockages and get things flowing again.  And the world today could sure benefit from more solutions to the growing number of problems.

So don't delay.  Sign up today.  Money back guarantee if not 100% satisfied.  Just 3 easy payments.  Not available in all states, some exclusions do apply, please contact a professional if problem persists for more than three decades.

The secret to problem solving is applying a playful approach to a rigorous analysis through observation.  And sometimes luck.

People often ask me, why didn't you do better in school?  My response: I sat next to the dumb kids in class.  If you're going to copy off someone, make sure they know the right answers.

Thanks for reading~!


Domain Knowledge is Key to Successful Data Projects

Data is a primary factor in decision making.  All shapes, sizes, and formats.  Standalone data is good, but it's only part of the story.

What is domain knowledge?  It's how the business operates.  Perhaps industry specific, or company specific, or department specific.  All the rules that exist, including product information, customer information, market knowledge, and legacy information, all under the domain knowledge umbrella.

What sits between domain knowledge and tangible nuggets of information for decision making?  Well, obviously the data, blended with domain knowledge in the form of business rules.

The trick is to know the data inside and out, and to know the domain knowledge.  How is that information obtained?  By meeting with the business and asking them what they do.  Yes, I see that; can you explain this?  What do you mean when you say that?  How does this happen?  What are the exceptions?  You won't find that information in some document or manual, yet that's the gold mine.

You see, that information is dissected and translated into rules.  Those rules get spliced into the flow of the data, typically in the Extract, Transform and Load process.  You'd be surprised how many business rules are nestled in Excel files that call other workbooks, with VLOOKUPs that call other VLOOKUPs 9 layers deep, referencing a file on an accountant's desktop, none of it documented or in a source code repository.
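To make that concrete, here's a tiny, hypothetical sketch of a business rule captured in the Transform step of an ETL flow, where it can be documented and version controlled instead of buried in a spreadsheet (the rule, field names, and numbers are invented for illustration):

```python
# Hypothetical business rule, lifted out of an undocumented spreadsheet
# and into the Transform step, where it can be commented and source controlled.

def apply_discount_rule(order):
    """Business rule (captured by meeting with the sales team, not from any manual):
    wholesale customers get 10% off orders of 100+ units,
    EXCEPT on clearance items, which are already marked down."""
    discount = 0.0
    if order["customer_type"] == "wholesale" and order["units"] >= 100:
        if not order["clearance"]:
            discount = 0.10
    order["net_price"] = round(order["unit_price"] * order["units"] * (1 - discount), 2)
    return order

# A couple of staged rows flowing through the Transform step.
orders = [
    {"customer_type": "wholesale", "units": 150, "unit_price": 2.00, "clearance": False},
    {"customer_type": "retail",    "units": 150, "unit_price": 2.00, "clearance": False},
]
transformed = [apply_discount_rule(o) for o in orders]
```

The point isn't the rule itself; it's that the exception ("except clearance items") lives next to a comment explaining where it came from, instead of 9 VLOOKUPs deep.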

Having a solid data framework is not enough.  Having documented business processes to extract the domain knowledge is key, in addition to data governance, a data dictionary, data stewards, master data management, change management, and documented code.

Documented code is also key.  Have you ever opened a stored procedure 6,000 lines long, calling views that call views, with temp tables calling temp tables, and cursors that loop through tables, without a single line of comments to let the next developer know what's happening?  As you may be aware, SQL developers have an upward trajectory and can find new jobs fairly quickly, so you may be the 4th developer to touch the code; everyone else is gone, and you have no idea where to begin.

When I started my first coding job, the senior developer would sit at my desk, open the code, and walk through it line by line.  Picking out flaws: why are you doing this here, when over there you did it like that?  Why are you trying to be fancy with the code?  Better to write a function in 25 lines that's easy to understand than 5 lines that nobody can interpret.  Comment everything, trap all errors and flow them upstream, and make it easy for the next maintenance coder to maintain this thing.
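In that spirit, a small invented example: the same task written the "fancy" way and the maintenance-coder way, with comments and errors trapped:

```python
def clever(xs):
    # The "fancy" one-liner: technically correct, hard for the next coder to parse.
    return [x for x in xs if x == max(set(xs), key=xs.count)]

def most_common_runs(values):
    """Return every occurrence of the most frequent value in the list.

    Written the maintenance-coder way: small named steps, commented,
    with bad input raised upstream rather than swallowed.
    """
    if not values:
        raise ValueError("most_common_runs: empty input")
    # Step 1: count how often each value appears.
    counts = {}
    for v in values:
        counts[v] = counts.get(v, 0) + 1
    # Step 2: find the value with the highest count.
    winner = max(counts, key=counts.get)
    # Step 3: return each occurrence of that value.
    return [v for v in values if v == winner]
```

Both return the same answer; only one of them will make sense to the 4th developer who inherits it.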

Domain knowledge is an underrated asset, held by internal employees, infrequently documented, littered throughout the code across various departments.  Getting to that information is the task of a data consultant.  And then translating it into business rules that get documented and applied in code.

If you can do that, you can solve the client's problems, by deciphering business rules into code, to obtain nuggets of golden information, to make decisions and grow the business.  That's the way I see it.

And here's my dog Sammie, eating a shoe~!


Data Catalogs will Create Structure and Fortify your Org

We've seen the rise of the report writer, then the data modeling OLAP Cube developer, then the Self Service proliferation of data to the masses, then big data, cloud data, streaming data, machine learning algorithmic data, unstructured and semi-structured data, and even graph data.

We've seen the rise of the Chief Data Officer, a much-needed position to fill the gap around who owns the data.  IT doesn't own the data; the business does, so we're told.  The CDO regulates the data, decides what vendor products to use, determines what staffing is needed, whether to build in-house or purchase pre-canned software, what licensing and maintenance agreements to sign, and what systems to shut down because they're outdated, broken or cost hogs.  And lastly, the CDO documents the flow of data, from low-level process to high-level enterprise, into diagrams and data dictionaries.

In my opinion, Data Dictionaries have risen to the top of the heap, or at least they should have.  Because Data Governance should be a huge priority, right behind security, and because new laws are coming down the pike to enforce data privacy.  Not only that, Data Dictionaries are the glue that bonds the entire organization.

You see, the data flows in and out of every department.  What makes the data folks valuable is that they look at the entire picture, document it, and build data flows to extract and roll up ALL the data.  That insight into the deep bowels of the org is what provides the value.  Because they see all the tentacles, the nasty processes, sometimes manual, many times actually, and the business rules never documented, or layered in some Excel or Access application 9 layers deep.

In order to flourish, have competitive advantage, and soon comply with data mandates, every organization needs to be cognizant of the concept of the Data Catalog.

A Data Catalog knows where all the data lives, in real time.  It knows where the duplicate fields are, it knows the data types, it knows where the data got created, and perhaps how it flows through the system, with each hop along the way.  Some data dictionaries provide a glimpse; they're easy to spin up, not easy to maintain, but they do the job.
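A toy sketch of the concept (the table and field names are invented, and real catalogs, cloud or otherwise, do far more): scan table metadata, record each field's type and home table, and flag field names that show up in more than one table:

```python
# Toy data catalog: index hypothetical table metadata and surface
# field names duplicated across tables (candidate "same data, two homes").
from collections import defaultdict

schemas = {
    "crm.customers":  {"customer_id": "int", "email": "varchar(100)"},
    "sales.orders":   {"order_id": "int", "customer_id": "int", "amount": "decimal(10,2)"},
    "mkt.newsletter": {"email": "varchar(255)"},
}

catalog = defaultdict(list)           # field name -> [(table, data type), ...]
for table, columns in schemas.items():
    for column, col_type in columns.items():
        catalog[column].append((table, col_type))

# Fields living in more than one table; note email even disagrees on length.
duplicates = {field: homes for field, homes in catalog.items() if len(homes) > 1}
```

The advanced products automate exactly this kind of scan on a schedule, against the real system catalogs instead of a hard-coded dictionary.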

The more advanced Data Catalogs are perhaps Cloud based, with built-in intelligence in the form of machine learning, to scour the internal data ecosystem regularly and extract the metadata in every corner of the business.  They can scan database logs to look for heavily used tables or queries or fields.

Companies need structure around the data, which encompasses the processes and people and flow of business.  Data Catalogs fill a hole in the data ecosystem: they document the data, so IT and the Business know and understand it, in order to extract knowledge, gain insight, run the business, lower costs, and streamline processes.  Documenting processes seems rather boring, perhaps; that's why some vendors have gone a step further and automated chunks of it.  Because having the data updated automatically takes the heavy lifting off full time resources, so they can tend to other issues and tasks.  And by automating, it gets done regardless of roadblocks.

One last thing: having a Data Catalog helps the Data Governance team do their job.  Which is to structure the org's data, dictate definitions of things, hierarchies, processes, and when and how new elements are added or deprecated, to create a holistic, company-wide roadmap of the data ecosystem.  As mentioned on this blog in the past, new laws are here from Europe, perhaps Canada, and soon others, to make sure your org has structure in place to keep customer data private and out of the hands of hackers.  And to purge data when no longer needed.  And to account for 3rd party handlers of data.

So a Data Catalog is not just a cool new concept; it should be baked into the heart of every organization, to assist and steer the direction of the org, to tap into the pulse, to make assessments on a set frequency, so you can pivot on a dime if the situation demands.

Data Catalogs structure your organization's data and unite departments.  They're the glue that bonds the org together.

Thanks for reading~!

Getting Started with SQL Server vNext Early Adoption Program Big Data Analytics Step 1

SQL Server 2019 is right around the corner.  Microsoft offers the ability to sign up for early adoption.  Simply go to this URL and register; if you are granted access, you'll receive an email such as this:

Here's a list of available options to experiment with:

Let's get started with Big Data Analytics.  

We can use this document, which has a good example to experiment with:

Configure Azure Kubernetes Service for SQL Server 2019 (preview) deployments

Logged on to Azure Portal using current Login:

It says to create a resource group using Bash:

Initiated a Bash shell window, it prompts to Create Storage:

So we've connected to Bash from within Azure Portal online window:

We're prompted to connect our Device and Login to Cross-Platform Command Line Interface in Azure:

Login to Azure Account:

az login
az account list
az account set --subscription
az group create --name sqlbigdatagroup --location eastus
az aks create --name kubcluster \
--resource-group sqlbigdatagroup \
--location eastus \
--generate-ssh-keys \
--node-vm-size Standard_A2 \
--node-count 3 \
--kubernetes-version 1.11.3

Proceed to enter commands in Bash, select our Subscription:

Set Account, then Created a Resource Group:

Next, attempt to Create a Kubernetes cluster:

We enter the command, it's Running...

Error.  The type of VM we selected exceeds our capacity:

Looking at the Pricing Structure in Azure, we need to select a different plan option:

Tried a different plan; still an error...

So we pick one from the available list "Standard A1", which may be Previous Generation, not sure exactly:

And so, the Bash command completed.  Success!  It generates JSON output, which we copy and store in Notepad.  It created quite a lot of Services / Resources:

Next, we connect to our newly created Cluster:

az aks get-credentials --resource-group=sqlbigdatagroup --name kubcluster
kubectl get nodes

Enter the command in Bash:

Next command, get kubectl nodes:

So we've completed the first step.  

Next Steps...


Download the CLI 2.0 for Azure:

Follow along for Step 2 of the process...


Getting Started with SQL Server vNext Early Adoption Program Big Data Analytics Step 2

Now that we've completed Step 1 of Getting Started with SQL Server vNext Early Adoption Program Big Data Analytics, we're ready to push forward with Step 2.  So let's begin.

Quickstart: Deploy SQL Server big data cluster on Azure Kubernetes Service (AKS)


To install the command line tool, we go to this link.

We'll install using PowerShell...

Install-Script -Name install-kubectl -Scope CurrentUser -Force
install-kubectl.ps1 'C:\Program Files\kubectl'

First, had to create an account with https://www.powershellgallery.com/

Initiated the download...

Installed in local computer:

Now, install Python on Client machine...



Installed Successfully!

Verify Connection to AKS...

Install Pip:  https://pip.pypa.io/en/stable/installing/

From Azure Bash window: kubectl config view

Now set the Environment Variables in DOS Command Prompt - (don't include quotes in Password):

Next we create our SQL Server Big Data Cluster within our existing Cluster:

Now Deploy a Big Data Cluster:

mssqlctl create cluster

mssqlctl create cluster sqlbigdataclusterjb

Executed command from both DOS and PowerShell ISE, both run as Administrator:

mssqlctl create cluster sql2019bigdata

Received an error at this point:

2018-11-19 18:52:49.0138 UTC | INFO | Creating cluster...
Traceback (most recent call last):
  File "c:\program files (x86)\python37-32\lib\runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "c:\program files (x86)\python37-32\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "C:\Program Files (x86)\Python37-32\Scripts\mssqlctl.exe\__main__.py", line 9, in
  File "c:\program files (x86)\python37-32\lib\site-packages\mssqlctl\__init__.py", line 1313, in main
  File "c:\program files (x86)\python37-32\lib\site-packages\mssqlctl\__init__.py", line 804, in createCluster
  File "c:\program files (x86)\python37-32\lib\site-packages\mssqlctl\__init__.py", line 963, in checkAndSetKubectlContext
    current_context = check_output("kubectl config current-context".split()).decode("utf-8").strip(" \t\n\r")
  File "c:\program files (x86)\python37-32\lib\subprocess.py", line 389, in check_output
  File "c:\program files (x86)\python37-32\lib\subprocess.py", line 466, in run
    with Popen(*popenargs, **kwargs) as process:
  File "c:\program files (x86)\python37-32\lib\subprocess.py", line 769, in __init__
    restore_signals, start_new_session)
  File "c:\program files (x86)\python37-32\lib\subprocess.py", line 1172, in _execute_child
FileNotFoundError: [WinError 2] The system cannot find the file specified

And that's as far as I'm taking this for now.  The Python error has stopped us in our tracks.  Judging by the traceback, mssqlctl shells out to kubectl ("kubectl config current-context"), and Windows can't find kubectl on the PATH, which would explain the FileNotFoundError.

Thanks for reading~!

The Modern Data Ecosystem is the New Kid in Town

What's better than creating something out of nothing?  As kids we learn to build with blocks.  Such a rudimentary game, yet the underlying principles are timeless.

You've got different shapes and sizes.  You can stack high, wide, whatever you want.  These building blocks are the underlying foundation for today's shift in technology.

ETL is a block.  So are source systems.  Staging databases too.  And Data Warehouses.  Self-contained units of substance.  You can build each block traditionally, on-premise.  Or you can take pieces and apply the same block in the cloud.

Source system on-premise.  ETL to the Cloud; just lift and shift (yes, another overused term today).  There are lots of connectors to lots of data sources, run from on-premise or Cloud.  Just pick up the block and move it to the Cloud.  Store it in a Data Lake, use bits or pieces as needed.

From there, the Staging block fits nicely, many options to choose from, same with Data Warehouse.  Merge with other data sets.  Options are limitless.

Each stage along the trail is simply a block, and the way technology is evolving, it doesn't matter where it lives.  Technology has broken each piece into re-usable patterns with no predefined home.

A good example is SQL Server Managed Instance, which basically took the high-end version and ported it to the Cloud.  And of course, there are other options that run as a service.  There is a solution for almost every scenario.  Or take bits and pieces and apply them to your unique environment.

But not only that.  That's the easy part.  All the blocks play nicely with the other parts.  It's a smorgasbord of components that adapt and connect with other pieces of technology.  The vendors are staying a step (many steps) ahead of the developers, with high-level architecture in mind.  The new Modern Data Warehouse has links to the traditional Data Warehouse, yet the MDW has grown exponentially.

It really is mind blowing when you think back to the days of a single reporting tool, against a single database, with a single select statement, running against a production database, slowing down the entire application for the entire company.  Is someone running a large query or month end in production?

Those who work in data and carved out a niche in a single specific technology will need to revamp their game.  DBAs, Query Tuning, Report Developers.  The world of data is in completely new territory, and there's no turning back.  The Modern Data Ecosystem is the new kid in town.


Getting Started with Azure Data Studio Extension SERVER DASHBOARD

Azure Data Studio is a new product similar to SQL Server Management Studio, yet way cooler.  You can manage traditional SQL Server Databases and write T-SQL Queries, and there are new features as well.

Azure Data Studio has a plethora of Add-In features.  

In this demo, we're going to outline a new feature: the Server Dashboard for Azure Data Studio.

You can find the tutorial steps here as well.

After opening Azure Data Studio, we click the "Extensions" button on the left-hand side panel and select "Server Reports":

Next, we execute a statement to alter our database, setting QUERY_STORE = ON:

I had to add permissions to the local SQL Database to execute the query, but it did run:

Next, in Azure Data Studio, here's a cool new feature: select View from the dropdown list, then select "Command Palette...".  A window opens and a list of commands appears alphabetically.  Type in "SETTINGS" and select "Preferences: Open User Settings":

Next, select Dashboard Database Widgets:

Next, follow the web tutorial instructions:

In Azure Data Studio, you will see 2 panes, left and right.  The left-hand side is not editable (you can't eat it!), so what I had to do was copy the contents from the left-hand pane, paste into the right-hand pane, and splice carefully, or the JSON will show a red highlight indicating it's not formatted correctly:

When you close out, it will prompt you to save, or you can press Ctrl+S:

Next, back in the Servers window, select the Server, right-click, and select "Manage":

A window opens to the right, containing our new widget "SERVER DASHBOARD":

Not much activity on this database; if there were, you could analyze deeper.

And that concludes our demo of the Azure Data Studio Server Dashboard extension.  There are many other Extensions to download and install as well, so this should keep us busy for a while.

Thanks for reading~!


Getting Started with Dynamic Data Masking in Azure SQL Database

Azure SQL Database provides Dynamic Data Masking out of the box. 

This blog reviews the commands required to set up Dynamic Data Masking, and bundles in Row Level Security as well.

First, we logged into Azure Portal with valid credentials and subscription, then selected Azure SQL Database:

Next, we created a new database.  The wizard prompted for information:

Next, we created a new Resource Group for this demo and created a new Server and Database:

Our newly created Azure SQL Database was provisioned:

We already had the new Azure Data Studio downloaded and installed on the laptop.  It's a new tool similar to SQL Server Management Studio (SSMS).

We connected to our newly created Azure SQL Database.  Azure SQL Database requires a firewall setting, so we need to add our current IP Address to allow access from the remote laptop:

Using traditional SQL, we ran some scripts to test the Data Masking functionality, as follows:

------------------------------
-- DROP USERS IF THEY EXIST
DROP USER IF EXISTS TestUser;
DROP USER IF EXISTS TestUser2;
DROP USER IF EXISTS TestUser3;
-- DROP TABLE IF EXISTS
DROP TABLE IF EXISTS dbo.Membership;
------------------------------
-- CREATE DATABASE TABLE FOR TESTING
CREATE TABLE dbo.Membership
  (MemberID  int IDENTITY PRIMARY KEY,
   UserName  varchar(100) NULL,
   FirstName varchar(100) NULL,
   LastName  varchar(100) NULL,
   Phone     varchar(12)  NULL,
   Email     varchar(100) NULL,
   SSNbr     varchar(11)  NULL);
------------------------------
-- INSERT 3 ROWS, 1 FOR EACH USERNAME
INSERT dbo.Membership (UserName, FirstName, LastName, Phone, Email, SSNbr) VALUES
  ('TestUser2', 'Roberto', 'Tamburello', '555.123.4567', 'RTamburello@contoso.com', '123-45-6789'),
  ('TestUser3', 'Janice', 'Galvin', '555.123.4568', 'JGalvin@contoso.com.co', '9876-54-321'),
  ('TestUser', 'Juan', 'Mu', '555.123.4569', 'ZMu@contoso.net', '555-44-3211');
------------------------------
-- CREATE A SCHEMA FOR SECURITY
CREATE SCHEMA Security;
------------------------------
-- CREATE A FUNCTION TO TEST ROW LEVEL SECURITY
CREATE FUNCTION Security.fn_securitypredicate(@Username AS sysname)
  RETURNS TABLE
  WITH SCHEMABINDING
AS
  RETURN SELECT 1 AS fn_securitypredicate_result
  WHERE @Username = USER_NAME() OR USER_NAME() = 'TestUser2';
------------------------------
-- CREATE A SECURITY POLICY THAT REFERENCES ABOVE FUNCTION
CREATE SECURITY POLICY FilterDemo
  ADD FILTER PREDICATE Security.fn_securitypredicate(UserName)
  ON dbo.Membership
  WITH (STATE = ON);
------------------------------
-- CREATE TEST USER 1
CREATE USER TestUser WITHOUT LOGIN;
GRANT SELECT ON dbo.Membership TO TestUser;
-- CREATE TEST USER 2
CREATE USER TestUser2 WITHOUT LOGIN;
GRANT SELECT ON dbo.Membership TO TestUser2;
-- CREATE TEST USER 3
CREATE USER TestUser3 WITHOUT LOGIN;
GRANT SELECT ON dbo.Membership TO TestUser3;
------------------------------
-- RUN SELECT STATEMENT AS USER 'TestUser', RETURNS 1 ROW
-- AS ROW LEVEL SECURITY IS IMPLEMENTED
EXECUTE AS USER = 'TestUser';
SELECT USER_NAME() AS UserQuery, 'Run as User=TestUser' AS QueryType, * FROM dbo.Membership;
REVERT;
------------------------------
-- RUN SELECT STATEMENT AS USER 'TestUser2', RETURNS 3 (ALL) ROWS
-- AS ROW LEVEL SECURITY IS IMPLEMENTED
EXECUTE AS USER = 'TestUser2';
SELECT USER_NAME() AS UserQuery, 'Run as User=TestUser2' AS QueryType, * FROM dbo.Membership;
REVERT;
------------------------------
-- RUN SELECT STATEMENT AS USER 'TestUser3', RETURNS 1 ROW
-- AS ROW LEVEL SECURITY IS IMPLEMENTED
EXECUTE AS USER = 'TestUser3';
SELECT USER_NAME() AS UserQuery, 'Run as User=TestUser3' AS QueryType, * FROM dbo.Membership;
REVERT;
------------------------------
The first query returns 1 row, run as "TestUser".

The second query returns 3 rows, run as "TestUser2".

The third query returns 1 row, run as "TestUser3".

The reason being, we implemented a function to handle Row Level Security.  Row Level Security is another security feature available out of the box in Azure SQL Database.  You can see the LastName and SSNbr fields are not masked, yet.
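The predicate logic is simple enough to restate outside of SQL.  A rough Python simulation (not how SQL Server evaluates it internally) of why the three queries return 1, 3, and 1 rows:

```python
# Simulate the row-level security filter predicate from the script above:
# a row is visible when its UserName matches the caller, or the caller is
# TestUser2 (who the predicate allows to see everything).
rows = ["TestUser2", "TestUser3", "TestUser"]   # UserName column of the 3 inserted rows

def visible_rows(current_user):
    return [u for u in rows if u == current_user or current_user == "TestUser2"]

counts = {u: len(visible_rows(u)) for u in ("TestUser", "TestUser2", "TestUser3")}
```

TestUser and TestUser3 each match only their own row; TestUser2 passes the second branch of the predicate for every row, hence 3.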

To get Dynamic Data Masking, we add the following code snippets:
-- APPLY DATA MASKING TO LASTNAME FIELD OF MEMBERSHIP TABLE
ALTER TABLE dbo.Membership
  ALTER COLUMN LastName varchar(100) MASKED WITH (FUNCTION = 'default()');
-- APPLY DATA MASKING TO SSNBR FIELD OF MEMBERSHIP TABLE
ALTER TABLE dbo.Membership
  ALTER COLUMN SSNbr varchar(11) MASKED WITH (FUNCTION = 'partial(0,"XXX-XX-",4)');
------------------------------
-- RE-RUN SELECT STATEMENT AS USER 'TestUser', RETURNS 1 ROW
-- 2 FIELDS ARE MASKED AS EXPECTED
EXECUTE AS USER = 'TestUser';
SELECT USER_NAME() AS UserQuery, 'Run as User=TestUser' AS QueryType, * FROM dbo.Membership;
REVERT;
------------------------------
-- RE-RUN SELECT STATEMENT AS USER 'TestUser2', RETURNS 3 (ALL) ROWS
-- 2 FIELDS ARE MASKED AS EXPECTED
EXECUTE AS USER = 'TestUser2';
SELECT USER_NAME() AS UserQuery, 'Run as User=TestUser2' AS QueryType, * FROM dbo.Membership;
REVERT;
------------------------------
-- RE-RUN SELECT STATEMENT AS USER 'TestUser3', RETURNS 1 ROW
-- 2 FIELDS ARE MASKED AS EXPECTED
EXECUTE AS USER = 'TestUser3';
SELECT USER_NAME() AS UserQuery, 'Run as User=TestUser3' AS QueryType, * FROM dbo.Membership;
REVERT;
------------------------------
Here's the output from all 3 queries. You can see the Dynamic Data Masking. Fields are LastName & SSNbr.
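As a rough illustration of what partial(0,"XXX-XX-",4) does, keep 0 leading characters, substitute the padding string, keep the last 4, here's a Python approximation (this is just the shape of the rule, not the server's implementation):

```python
def partial_mask(value, prefix, padding, suffix):
    """Approximate SQL Server's partial() masking function: keep `prefix`
    leading characters, insert the `padding` string, keep `suffix` trailing
    characters."""
    head = value[:prefix]
    tail = value[len(value) - suffix:] if suffix else ""
    return head + padding + tail

masked = partial_mask("123-45-6789", 0, "XXX-XX-", 4)   # -> "XXX-XX-6789"
```

The default() function, by contrast, fully masks the column with a type-appropriate default (e.g. "xxxx" for strings), which is why LastName shows no real characters at all.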

------------------------
-- TO GET A LIST OF FIELDS THAT HAVE DYNAMIC DATA MASKING, RUN THIS SCRIPT
SELECT c.name, tbl.name AS table_name, c.is_masked, c.masking_function
FROM sys.masked_columns AS c
JOIN sys.tables AS tbl
  ON c.[object_id] = tbl.[object_id]
WHERE is_masked = 1;
------------------------
Source code for this demo was found here. Thanks for reading~!