What is NOT a Data Scientist Function

Data Science sprang up one day and set all of us Report Writers and Data Warehouse people to the side.

Where is your PhD?  Where's your statistical analysis?  Where are your math skills?  Where are your Big Data skills?

Uh, I took some classes about 20 years ago.  I may have the book in my parents' basement, if they haven't thrown it out.

The Data Scientist entered the arena and people took notice.

Fast forward a few years: where are we now with Data Science, the hottest new profession?

Well, in my opinion, Data Science mostly has to do with Statistical Analysis.  However, it must be mixed with Domain Knowledge to produce insights that would otherwise remain unknown.

Big Data is NOT required for Data Science.  However, digging through Large Data Sets or Unstructured Data could produce some nuggets of insight which were previously ignored due to a lack of technology to interrogate them.

Data Preparation is also NOT a requirement.  When I interviewed at Facebook a few years ago, the position was for a Data Engineer, not a Data Scientist.  The role of the Data Engineer, I believe, was to prep the data into a manageable file or set of files for downstream statistical analysis.

Data Warehousing and Report Writing are NOT requirements.

Storytelling or Data Visualization, or the ability to translate the findings for Executives using basic language, IS part of the Data Science role.  If you can't convert the insight to action, what good is it?

A PhD is NOT a requirement either.  If you learned the craft from an online course or by looking over the shoulder of someone else, and can do the job, that's all you need.  And by doing the job, I mean building models and dissecting them using algorithms such as Clustering, Classification, Regression, Decision Trees, Neural Networks, etc. to produce accurate, repeatable results.  It's rather difficult to pull a PhD out of your pocket after leaving college a decade or two ago.

To summarize, in my opinion, interrogating data using Statistical Analysis and presenting that insight to stakeholders who are authorized to take action are the requirements to be a Data Scientist.  With that said, the role may include Big Data or Unstructured Data or Data Visualization tools and even data preparation.  Those tasks could be performed by someone on the Data Science Team or handed off to individual people within the org.  Or a lone-ranger Data Scientist may do parts or all of them, depending on resources and skill levels.

A Data Science person or team can provide skills and products previously unknown to the traditional Data Professional.  From where I stand, many shops today still depend on Data Professionals to do the heavy lifting for their data needs.  And some shops lack formal Business Intelligence altogether: they are using Access and Excel, emailing files, and downloading data sets off the intranet.

Data used to be an afterthought.  Now data is used for strategic decisions that impact everyday production, sales, marketing and the survival of organizations.  Whatever the role (Data Scientist, Data Professional, Business Intelligence, Data Warehouse Developer, Report Writer, etc.), data is now at the forefront of just about every business.