8/12/2013

Data Scientist Job Description Encompasses Everything

To become a Data Scientist, you must be all-encompassing.

In that you must know the industry you're working in.

As well as all the data sources in the organization in which you work.

And third-party data, such as Facebook, Twitter, etc.

Then you must ensure the data is cleansed properly.

And then mash the data together with ETL (Extract, Transform, and Load).

You probably want to ingest all this data into Hadoop.

Then you should probably know MapReduce, Sqoop, Hive, Pig, HDFS, etc.

And then be able to apply statistical analysis using tools such as the R programming language.

And then you must know how to analyze the data to derive insights.

As well as send this data to the user in nice-looking visualizations, cubes, dashboards, and reports.

Of course, you must also create models to predict consumer behavior, be able to read machine-generated logs, manufacturing sensor data, and streaming data, and have the computer learn as it goes by creating neural networks.

Lastly, you must be able to turn this data into information and communicate it in plain English to the senior execs.

Plus have a PhD.

Simple enough.

It's equivalent to playing all nine positions on a baseball team, being the manager, collecting the tickets before the game, selling the hot dogs and popcorn, getting all the gear to and from the game, driving the bus, and cleaning the uniforms.

Simply put, how could whoever created the job description for a true Data Scientist expect a single person to tackle every point along the trail of processing data?

It's a daunting task, to say the least, and a bit intimidating for someone entering the field.

Suffice it to say, a Data Scientist job description encompasses everything.
