For the past few days I've been resurrecting a project I worked on in the fourth quarter of 2012.
The project was to determine ROI.
However, in order to do so, we must have an audit trail from the initial click on a Google AdWords ad to money deposited in the bank.
And because of our current process, in which we pass leads off to resellers and distributors, it's possible for the chain to get broken along the way.
So my code looks for matches among records that would otherwise go unmatched.
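Here is a minimal sketch of that matching idea, in Python rather than the T-SQL the real logic lives in. The field names (`click_id`, `email`) and the email fallback are assumptions made up for illustration, not the actual rules in my scripts: try the direct key first, and only when the chain is broken fall back to a weaker key.

```python
# Toy illustration only: field names and the fallback rule are hypothetical.
def match_leads(leads, deposits):
    """Pair each lead with a deposit, by click_id first, then by email."""
    # Index deposits by each key, skipping records where the key is missing.
    by_click = {d["click_id"]: d for d in deposits if d.get("click_id")}
    by_email = {d["email"].lower(): d for d in deposits if d.get("email")}

    matched, unmatched = [], []
    for lead in leads:
        # Direct link: the click id survived the whole chain.
        deposit = by_click.get(lead.get("click_id"))
        # Broken chain: fall back to a case-insensitive email match.
        if deposit is None and lead.get("email"):
            deposit = by_email.get(lead["email"].lower())
        if deposit is None:
            unmatched.append(lead)
        else:
            matched.append((lead, deposit))
    return matched, unmatched
```

The percentage matched is then just `len(matched)` divided by the total number of leads.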
Last year I figured we could match only 32%.
And the project fizzled out.
This time around, however, I'm looking at a specific subset of products.
And the number shot up to 47%.
Breaking it out by year, the number climbed to 63%.
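Breaking a match rate out by year is simple arithmetic: count the matched records and the total records per year, then divide. A small sketch in Python, assuming each record has been reduced to a hypothetical `(year, is_matched)` pair:

```python
from collections import defaultdict

def match_rate_by_year(records):
    """records: iterable of (year, is_matched) pairs.

    Returns {year: percent of that year's records that matched}.
    """
    totals = defaultdict(int)
    hits = defaultdict(int)
    for year, is_matched in records:
        totals[year] += 1
        if is_matched:
            hits[year] += 1
    # Every year in totals has at least one record, so no division by zero.
    return {year: 100.0 * hits[year] / totals[year] for year in sorted(totals)}
```

Running it over the full output would show whether the overall figure hides years that match much better or worse than the rest.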
So I had a meeting with our business analyst to have her verify my data.
I output all the records, about 70k of them, placed them in an Excel spreadsheet, and sent it over.
She will log on to SalesForce.com and validate the data against Accounts, Opportunities and Leads.
And if that's good, then we'll have a master data set on which to build.
That will be awesome. The logic to gather this data resides in about 25 T-SQL files.
At some point I'd like to port the process over to SSIS to automate it.
This was my first Data Scientist project.
And so it goes!