11/27/2018

Domain Knowledge is Key to Successful Data Projects

Data is a primary factor in decision making. It comes in all shapes, sizes, and formats. Standalone data is good, but it's only part of the story.

What is domain knowledge? It's how the business operates. It may be industry specific, company specific, or department specific. All the rules that exist, including product information, customer information, market knowledge, and legacy information, fall under the domain knowledge umbrella.

What sits between domain knowledge and tangible information nuggets for decision making? The data, obviously, blended with domain knowledge in the form of business rules.

The trick is to know the data inside and out, and to know the domain knowledge. How is that information obtained? By meeting with the business and asking them what they do. Yes, I see that, but can you explain this? What do you mean when you say that? How does this happen? What are the exceptions? You won't find that information in some document or manual, yet that's the gold mine.

You see, that information is dissected and translated into rules. Those rules get spliced into the flow of the data, typically in the Extract, Transform, and Load process. You'd be surprised how many business rules are nestled in Excel files that call other workbooks, with VLOOKUPs that call other VLOOKUPs 9 layers deep and reference a file on the accountant's desktop, none of it documented or in a source code repository.
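To make that concrete, here's a minimal sketch in Python of one business rule living in the transform step, in code instead of a buried VLOOKUP. The rule itself, the field names, and the thresholds are all made up for illustration; a real rule would come out of those conversations with the business.

```python
# Hypothetical business rule, the kind you only learn by asking the business:
# "Wholesale customers get 10% off any order of 1000 or more,
#  except clearance items, which are never discounted."
WHOLESALE_DISCOUNT = 0.10
WHOLESALE_MINIMUM = 1000.00


def apply_wholesale_discount(order):
    """Transform step: return a copy of the order with net_amount added."""
    qualifies = (
        order["customer_type"] == "wholesale"
        and order["amount"] >= WHOLESALE_MINIMUM
        and not order["is_clearance"]  # the exception the business mentioned
    )
    discount = WHOLESALE_DISCOUNT if qualifies else 0.0
    return {**order, "net_amount": round(order["amount"] * (1 - discount), 2)}


def transform(orders):
    """Apply the documented rule to every extracted row before loading."""
    return [apply_wholesale_discount(order) for order in orders]


# Made-up rows standing in for the "extract" step.
extracted = [
    {"order_id": 1, "customer_type": "wholesale", "amount": 2000.00, "is_clearance": False},
    {"order_id": 2, "customer_type": "wholesale", "amount": 2000.00, "is_clearance": True},
    {"order_id": 3, "customer_type": "retail", "amount": 2000.00, "is_clearance": False},
]
loaded = transform(extracted)
```

The point is not the discount. The point is that the rule is written down in one place, with its exception spelled out, where the next developer can find it.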

Having a solid data framework is not enough. Having documented business processes to extract the domain knowledge is key, in addition to data governance, a data dictionary, data stewards, master data management, change management, and documented code.

Documented code is also key. Have you ever opened a stored procedure 6000 lines long, calling views that call views, with temp tables calling temp tables, with cursors that loop through tables, without a single line of comments to let the next developer know what's happening? As you may be aware, SQL developers have an upward trajectory and can find new jobs fairly quickly, so you may be the 4th developer to touch the code. Everyone else is gone, and you have no idea where to begin.

When I started my first coding job, the senior developer would sit at my desk, open the code, and walk through it line by line, picking out flaws. Why are you doing this, when over here you did it like this? Why are you trying to be fancy with the code? Better to write a function in 25 lines that's easy to understand than 5 lines that nobody can interpret. Comment everything, trap all errors and flow them upstream, and make it easy for the next maintenance coder to maintain this thing.
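Here's roughly what that advice looks like, as a small Python sketch. The row layout and the `RowError` name are my own assumptions for the example, not anything from that first job.

```python
class RowError(Exception):
    """Raised when a row cannot be parsed; the message says which row and why."""


def parse_amount(row):
    """Return the row's amount as a float, or raise RowError with context."""
    raw = row.get("amount")
    if raw is None:
        raise RowError(f"row {row.get('id')}: amount is missing")
    try:
        # Strip the currency formatting the source file is assumed to contain.
        return float(str(raw).replace("$", "").replace(",", ""))
    except ValueError as exc:
        # Trap the low-level error and pass it upstream with context attached.
        raise RowError(f"row {row.get('id')}: bad amount {raw!r}") from exc


def total_amount(rows):
    """Sum the amounts; any RowError flows up to the caller untouched."""
    total = 0.0
    for row in rows:
        total += parse_amount(row)
    return total
```

It's longer than a one-liner with a nested comprehension, but the next maintenance coder can read it top to bottom and knows exactly which row broke.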

Domain knowledge is an underrated asset held by internal employees, infrequently documented and littered throughout the code across various departments. Getting to that information is the task of a data consultant, and then translating it into business rules that get documented and applied in code.

If you can do that, you can solve the client's problems by translating business rules into code, to obtain nuggets of gold information, make decisions, and grow the business. That's the way I see it.

And here's my dog Sammie, eating a shoe~!