12/04/2012

Challenges of Working with Data

While working with data can be fun and can reveal nuggets of insight, it ain't always easy.

For me, pulling data across multiple databases has led to some really inefficient queries.

They are slow, and you really have to think like a chess player before starting an hour-long process: think through all the downstream effects.

And there's no AAA map to guide you from one data set to the next. You're basically on your own.

That goes for both business rules and data structures.

The people who could offer assistance just don't have the time to help you out. They've got their full-time jobs to take care of.

And sometimes they don't tell you what you're really trying to do until you've got something to show them. That gives them an idea of the potential, and then they change the specs.

What I'm talking about here is not your traditional reporting, nor is it common business intelligence practice.

It's basically data science work minus the big data.

It has tons of data, just not related data.

And the goal I've been trying to accomplish is to mash data sets that don't typically mesh.

And in order to do that, I've had to use fuzzy logic.
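To give a feel for what that can look like, here is a minimal sketch of fuzzy string matching using only Python's standard library. It assumes the "fuzzy logic" here means approximate matching of names across two sources; the sample names, the `normalize` and `best_match` helpers, and the 0.85 threshold are all made up for illustration and are not from the actual project.

```python
from difflib import SequenceMatcher


def normalize(name):
    """Lowercase, replace punctuation with spaces, collapse whitespace."""
    cleaned = "".join(
        ch if ch.isalnum() or ch.isspace() else " " for ch in name.lower()
    )
    return " ".join(cleaned.split())


def best_match(name, candidates, threshold=0.85):
    """Return (candidate, score) for the closest candidate.

    If no candidate scores at or above the threshold, the candidate
    is None and the score is the best score that was seen.
    """
    target = normalize(name)
    best, best_score = None, 0.0
    for cand in candidates:
        score = SequenceMatcher(None, target, normalize(cand)).ratio()
        if score > best_score:
            best, best_score = cand, score
    if best_score >= threshold:
        return best, best_score
    return None, best_score


# Hypothetical names from two systems that share no common key.
crm_names = ["Acme Corp.", "Globex, Inc", "Initech LLC"]
billing_names = ["ACME Corp", "Globex Inc.", "Umbrella Group"]

matches = {name: best_match(name, billing_names) for name in crm_names}
for name, (match, score) in matches.items():
    print(name, "->", match, round(score, 2))
```

The threshold is the judgment call: set it too low and unrelated records get mashed together, too high and real matches slip through, so anything near the cutoff is worth eyeballing by hand.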

I've been working on this project for over a month now and I'm zeroing in on some results.

I gave my boss an overview today and he said to narrow down the scope.

I was trying to figure out the entire project and account for every record in the database, which I've just about done.

He wants a subset of the data, so we can QA it, and that's it.

So that's okay. I should have a sample file shortly, as well as counts and percentages of the data he wants.
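A deliverable like that can be sketched in a few lines. This is only an illustration, assuming the records are plain dictionaries; the `status` field, the sample records, and the `summarize` and `sample_records` helpers are hypothetical.

```python
import random
from collections import Counter


def summarize(records, key):
    """Map each value of `key` to (count, percent of all records)."""
    counts = Counter(r[key] for r in records)
    total = sum(counts.values())
    return {
        value: (n, round(100.0 * n / total, 1)) for value, n in counts.items()
    }


def sample_records(records, k, seed=0):
    """Draw a repeatable random sample of up to k records for QA."""
    rng = random.Random(seed)
    return rng.sample(records, min(k, len(records)))


# Hypothetical output of a matching step.
records = [
    {"id": 1, "status": "matched"},
    {"id": 2, "status": "matched"},
    {"id": 3, "status": "unmatched"},
    {"id": 4, "status": "matched"},
]

summary = summarize(records, "status")
qa_sample = sample_records(records, 2)
print(summary)
print(qa_sample)
```

Fixing the seed means the same sample comes back on every run, which helps when the QA file has to be regenerated after the specs change.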

This has been a tough project.

But if we can find insight, it will be well worth the effort.

I've had to overcome lots of challenges along the way, as I assume most data people do.

And there you have it!
