What is the decomposition of roles for a Data Scientist?
Well, you have to know how to prepare data.
And programming, statistics and domain knowledge.
Which is the most important? Well, you have to know them all.
For data, you have to know relational file structures, SQL, Big Data and perhaps NoSQL.
For programming, you have to know data cleansing, file parsing, string manipulation, merging data sets and scraping data from the web.
For domain knowledge, you have to know the industry, its norms as well as its caveats, its processes, and its key decision makers and stakeholders.
For statistics, you have to know the high-level algorithms, how and when to apply each one, how to interpret the results, and how to work with outliers, sample sets and standard deviations.
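To make "look for outliers" and "standard deviations" a little more concrete, here is a minimal sketch in Python. It assumes a simple rule of thumb (flag anything more than two sample standard deviations from the mean); the function name, the threshold and the sample readings are made up for illustration, and a real analysis may well call for a different method.

```python
import statistics

def find_outliers(values, threshold=2.0):
    """Return the values lying more than `threshold` sample standard
    deviations from the mean (an assumed rule of thumb, not a standard)."""
    mean = statistics.mean(values)
    stdev = statistics.stdev(values)  # sample standard deviation
    if stdev == 0:
        return []  # all values identical, so nothing stands out
    return [v for v in values if abs(v - mean) / stdev > threshold]

# Made-up readings for illustration: seven similar values and one suspect.
readings = [10, 12, 11, 13, 12, 11, 10, 95]
outliers = find_outliers(readings)
print(outliers)
```

One caveat worth knowing: a large outlier inflates the standard deviation itself, so on small samples this rule can miss points that a median-based check would catch.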
And you must be a storyteller, able to describe your findings in easily digestible formats. You should know how to document your solutions: the methods you used to prepare the data, which formulas you applied, and the results, all in a concise document for distribution.
And you may be required to automate your findings into a working solution, such as an exposed web service, so others can query your model and receive the statistical probability of possible outcomes based on the parameters sent.
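As a rough idea of what such a web service could look like, here is a small sketch using only the Python standard library. Everything specific in it is an assumption for illustration: the toy logistic model, its coefficients, the `visits` parameter and the port are all hypothetical, and a production service would load a fitted model and sit behind a proper web framework.

```python
import json
import math
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import urlparse, parse_qs

# Hypothetical coefficients for illustration; a real model would be fitted to data.
INTERCEPT = -3.0
WEIGHT_VISITS = 0.5

def predict_probability(visits):
    """Toy logistic model: probability of an outcome given a visit count."""
    score = INTERCEPT + WEIGHT_VISITS * visits
    return 1.0 / (1.0 + math.exp(-score))

class ModelHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Read the parameter sent by the caller, e.g. GET /?visits=6
        query = parse_qs(urlparse(self.path).query)
        try:
            visits = float(query["visits"][0])
        except (KeyError, ValueError):
            self.send_error(400, "expected a numeric 'visits' parameter")
            return
        payload = {"visits": visits, "probability": predict_probability(visits)}
        body = json.dumps(payload).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

def serve(port=8000):
    """Block and answer requests on localhost until interrupted."""
    HTTPServer(("127.0.0.1", port), ModelHandler).serve_forever()
```

Calling `serve()` and then requesting `http://127.0.0.1:8000/?visits=6` should return a small JSON document with the probability for that input.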
So who has all these skills? Data, Programming, Industry Experience, MBA Process Flow, Statistics, Project Management, Documentation and Public Speaking. Even if you have all these skills, I bet one or more technologies have changed since you began reading this blog.
Perhaps Data Science is a team sport.