I work primarily in the #Microsoft space, which provides a full arsenal of tools.
It starts with the SQL Server database. There I can create databases on servers, assign permissions, create tables, views, stored procedures and indexes, tune queries, and allocate memory, all with the intent of storing data.
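To make the table, index and view part concrete, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for SQL Server, so the syntax is SQLite's rather than T-SQL, and the `orders` table, the `ix_orders_customer` index and the `customer_totals` view are names I made up for illustration.

```python
import sqlite3

# In-memory SQLite database, used here only as a stand-in for SQL Server.
conn = sqlite3.connect(":memory:")

# Create a table, an index on a commonly filtered column, and a view.
# All object names are hypothetical.
conn.executescript("""
    CREATE TABLE orders (
        order_id INTEGER PRIMARY KEY,
        customer TEXT NOT NULL,
        amount   REAL NOT NULL
    );
    CREATE INDEX ix_orders_customer ON orders (customer);
    CREATE VIEW customer_totals AS
        SELECT customer, SUM(amount) AS total
        FROM orders
        GROUP BY customer;
""")

# Load a few made-up rows.
conn.executemany(
    "INSERT INTO orders (customer, amount) VALUES (?, ?)",
    [("acme", 100.0), ("acme", 50.0), ("globex", 75.0)],
)
conn.commit()

# Query the view; each row is a (customer, total) pair.
totals = dict(conn.execute("SELECT customer, total FROM customer_totals"))
print(totals)
```

On a real SQL Server instance the same ideas apply, but permissions, stored procedures and memory settings have no equivalent in this small sketch.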
Next I use SQL Server Integration Services (SSIS) to move data around and apply business rules and transformations, all inside packages that can be scheduled to run throughout the day.
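SSIS packages are built in a visual designer, not written in Python, but the kind of work they do can be sketched in a few lines. The example below is only an illustration of "apply business rules and transformations": the `SourceRow` type, the `transform` function and the two rules (skip non-positive amounts, normalize text fields) are all assumptions of mine.

```python
from dataclasses import dataclass


@dataclass
class SourceRow:
    """A made-up row as it might arrive from a source system."""
    customer: str
    amount_cents: int
    country: str


def transform(rows):
    """Apply two illustrative business rules and return cleaned rows."""
    cleaned = []
    for row in rows:
        # Rule 1 (hypothetical): ignore rows with no positive amount.
        if row.amount_cents <= 0:
            continue
        # Rule 2 (hypothetical): trim and upper-case text, default a blank country.
        cleaned.append({
            "customer": row.customer.strip().upper(),
            "amount": row.amount_cents / 100,
            "country": row.country.strip().upper() or "UNKNOWN",
        })
    return cleaned


source = [
    SourceRow(" acme ", 1250, "us"),
    SourceRow("globex", 0, "de"),
    SourceRow("initech", 999, " "),
]
loaded = transform(source)
print(loaded)
```

In SSIS the equivalent steps would be components in a data flow, with the schedule handled outside the package.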
I also create Data Warehouses, which move the data from the OLTP transaction system into a predefined structure of Dim and Fact tables that streamlines the data for reporting and historical purposes.
From there, the data can be pushed to a Cube in SSAS, or an In-Memory Tabular Model Cube, where you denormalize the data for even faster data access.
These OLAP cubes are queried with a language called MDX, and the results can be surfaced in SSRS reports, Power View, PerformancePoint and even Excel.
SSRS allows for reporting against both OLTP and OLAP systems. Reports are developed locally and deployed to the server, where you can grant permissions and roles for security and schedule reports to run automatically throughout the day.
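The move from OLTP rows to Dim and Fact tables can be sketched as follows. This is a toy example under stated assumptions: it again uses Python's sqlite3 in place of SQL Server, and the `dim_product` and `fact_sales` tables, the sample rows and the dates are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One dimension table and one fact table (hypothetical names).
conn.executescript("""
    CREATE TABLE dim_product (
        product_key  INTEGER PRIMARY KEY,
        product_name TEXT NOT NULL UNIQUE
    );
    CREATE TABLE fact_sales (
        product_key INTEGER NOT NULL REFERENCES dim_product (product_key),
        sale_date   TEXT NOT NULL,
        amount      REAL NOT NULL
    );
""")

# Made-up rows standing in for the OLTP transaction system.
oltp_rows = [
    ("widget", "2014-01-05", 20.0),
    ("gadget", "2014-01-05", 35.0),
    ("widget", "2014-01-06", 20.0),
]

for name, sale_date, amount in oltp_rows:
    # Add the product to the dimension once, then look up its surrogate key.
    conn.execute(
        "INSERT OR IGNORE INTO dim_product (product_name) VALUES (?)", (name,)
    )
    (key,) = conn.execute(
        "SELECT product_key FROM dim_product WHERE product_name = ?", (name,)
    ).fetchone()
    # The fact row stores the key and the measures, not the product name.
    conn.execute("INSERT INTO fact_sales VALUES (?, ?, ?)", (key, sale_date, amount))

# A typical reporting query: join fact to dimension and aggregate.
report = conn.execute("""
    SELECT d.product_name, SUM(f.amount)
    FROM fact_sales AS f
    JOIN dim_product AS d ON d.product_key = f.product_key
    GROUP BY d.product_name
    ORDER BY d.product_name
""").fetchall()
print(report)
```

A real warehouse would have many dimensions (date, customer, geography and so on) and would handle changes to dimension rows over time, which this sketch leaves out.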
Next there's Big Data. Microsoft has partnered with #Hortonworks to provide a Big Data offering in the Cloud called HDInsight. There you can provision an account, store data in Blob Storage, which is very similar to HDFS, and spin up a cluster to query the data. It's designed to store data that's born in the cloud. You can create your scripts locally with their on-premises development utility. Once the job is perfected, you start up your cluster, run the job, and then shut the cluster down, because you are charged for 'up-time' separately from Blob Storage charges. The neat thing about the MS offering is that you can run jobs in C#, Node.js and PowerShell scripts. HDInsight is a fully licensed, Apache-compliant Hadoop distribution.
I also work with Self Service BI, which puts the power of leveraging the data into the hands of the users. It removes the barrier of IT and exposes the data to end users, who can pull in data, assuming they have the correct privileges. They can transform the data and mash it up with data sets indexed on the Web using Power Query, create Models in Power Pivot, and run dashboards in Power View. They can mine the data with the Data Mining plug-in and map it in the interactive mapping plug-in. Lastly, they can upload the data to Office 365, share the document with collaborators, and schedule data refreshes in the Cloud. All this can be done in Excel, which is the spreadsheet most widely used in the business world.
So as you can see, we have three options from Microsoft for working with data:
- Traditional BI (SQL Server, SSIS, SSAS and SSRS)
- Big Data
- Self Service
Each topic listed above also goes down a deep rabbit hole, and the features and options within each subject are extensive. I don't know of anyone who is an expert in every facet of BI.
So there you have it, in a nutshell. Being a Data professional requires a lot of learning, a lot of knowing, and a lot of mental strength to keep up and earn a living in such a vast, diverse and changing field.
I couldn't have chosen a more challenging and satisfying career!