What do I like about the Microsoft BI Stack?
It's a complete ecosystem.
Data can be stored, processed and delivered to customers as a complete end-to-end solution.
Data can be stored in a variety of SQL Server databases, going back to SQL Server 2000, as well as on Azure. There's Blob storage for unstructured binary and text data. There's also MS Access if need be.
Hadoop in the Cloud
Big Data solutions can be found in HDInsight on Azure. Microsoft replaced HDFS with Blob storage, which can be accessed using PowerShell commands or .NET code. You spin up your clusters for processing, get charged only for uptime, then spin them back down. The storage expands dynamically with the size of the data. It's typically used for data born in the cloud.
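To put the "charged only for uptime" point in numbers, here is a rough sketch in Python. The node count, hours and hourly rate are made-up figures for illustration, not actual Azure pricing.

```python
def cluster_cost(nodes: int, hours_up: float, rate_per_node_hour: float) -> float:
    """Cost of a cluster that is billed only while it is running."""
    return nodes * hours_up * rate_per_node_hour


# Hypothetical numbers: 4 nodes, a 3-hour nightly job, 30 nights a month.
RATE = 0.50  # assumed price per node-hour, not a real quote

on_demand = cluster_cost(nodes=4, hours_up=3 * 30, rate_per_node_hour=RATE)
always_on = cluster_cost(nodes=4, hours_up=24 * 30, rate_per_node_hour=RATE)

print(f"spin up / spin down: {on_demand:.2f}")
print(f"left running all month: {always_on:.2f}")
```

With these assumed figures, the cluster that is torn down after each job costs an eighth of the one left running, since it is up 3 hours a day instead of 24.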
Hadoop On Premise
You can run Hadoop on premise with Hortonworks, which offers both Windows and Linux versions. Here you can expose your processed data, typically aggregated, in Hive tables. You can ETL data with Pig and ingest data from SQL Server using Sqoop. You can stream data in as well. You can also run the data through machine learning algorithms with Mahout or the R statistical language.
Extract Transform and Load or Extract Load and Transform
Data can be moved around using SSIS, which reads in files and other source databases. Jobs can be scheduled, with error logs and audit trails, and the business rules are centralized in one place.
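SSIS packages are built in a visual designer, so the following is not SSIS code. It's just a small Python sketch of the same pattern: read a file, apply a business rule in one place, load the good rows and write the bad ones to an error log. The table names, the sample data and the rule itself are all made up for illustration.

```python
import csv
import io
import sqlite3

# Hypothetical source file contents; in practice this would be a file on disk.
SOURCE_CSV = "id,amount\n1,10.50\n2,not-a-number\n3,7.25\n"


def transform(row):
    """Business rule kept in one place: amount must be a non-negative number."""
    amount = float(row["amount"])  # raises ValueError on bad input
    if amount < 0:
        raise ValueError("negative amount")
    return int(row["id"]), amount


def run_etl(conn, source_text):
    """Load valid rows into sales, log rejected rows, return (loaded, rejected)."""
    conn.execute("CREATE TABLE IF NOT EXISTS sales (id INTEGER, amount REAL)")
    conn.execute("CREATE TABLE IF NOT EXISTS error_log (line INTEGER, reason TEXT)")
    loaded = rejected = 0
    reader = csv.DictReader(io.StringIO(source_text))
    # Line 1 is the header, so data rows start at line 2.
    for line_no, row in enumerate(reader, start=2):
        try:
            conn.execute("INSERT INTO sales VALUES (?, ?)", transform(row))
            loaded += 1
        except ValueError as exc:
            conn.execute("INSERT INTO error_log VALUES (?, ?)", (line_no, str(exc)))
            rejected += 1
    conn.commit()
    return loaded, rejected


conn = sqlite3.connect(":memory:")
counts = run_etl(conn, SOURCE_CSV)
print(counts)
```

The bad row doesn't stop the load. It lands in error_log with its line number, which is the kind of audit trail you'd want from a scheduled job.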
Data can be cleansed using Data Quality Services (DQS), with DQS and Fuzzy components available in SSIS. Master versions of data can be stored in Master Data Services. Developers can access that data through views. Changes can be tracked using versions. And you can apply hierarchies as well as business rules to prevent erroneous data from being entered.
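For anyone who hasn't seen fuzzy matching before, here is the general idea in a few lines of Python. This uses the standard library's difflib and is not the algorithm behind the SSIS Fuzzy components or DQS. The master list and the 0.8 threshold are assumptions for the example.

```python
from difflib import SequenceMatcher


def best_match(value, master_values, threshold=0.8):
    """Return the closest master value, or None if nothing is similar enough."""
    scored = [
        (SequenceMatcher(None, value.lower(), m.lower()).ratio(), m)
        for m in master_values
    ]
    score, match = max(scored)
    return match if score >= threshold else None


# Hypothetical master list of clean values.
master = ["Microsoft", "Hortonworks", "Oracle"]

print(best_match("Microsft", master))  # a typo that is close to a master value
print(best_match("Acme", master))      # nothing similar in the master list
```

A misspelled value gets mapped to its clean master version, while a value with no close match comes back as None so it can be reviewed by hand.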
SQL Server Reporting Services (SSRS) is a great product for producing pre-canned reports from SQL Server, Analysis Services (MDX) or a variety of other data sources, including anything reachable over ODBC. SSRS is available through a web browser, exposed on the web or an intranet, or it can be tightly integrated with SharePoint. It has nice integration with Active Directory. It allows multiple data sources in a single report, as well as sub-reports with drill-through capability. You can create subscriptions so the reports run automatically during off-peak hours and are emailed to users or placed as a file on a network share. Really a nice tool.
Through SharePoint, you can create stunning visualizations for web user consumption, with Scorecards and KPIs, drill-through capability, Decomposition Trees and Active Directory level permissions.
Self Service BI
You can find Power View in SharePoint or Excel, which allows users to drag and drop measures and dimension fields based on Models to create visualizations to search for patterns and insight.
Power BI allows users to query data on the web by viewing Excel files, which can be refreshed in the Cloud. You can even pull data from On Premise data sources over a secure connection, which is quite nice. Excel reports can be viewed, downloaded or shared for collaboration with other users.
Q&A - Natural Query Language
Users can simply type a question in plain English, and the report dynamically changes its data content and layout.
Imagine a user connecting to any data source at any time with a few simple mouse clicks. Once the data is pulled in, it can be transformed and built into a Model for immediate consumption. You can merge data with other sources; there really are no limits.
This allows end users to mash data into Models, which can be used in Power View and Pivot Tables and can then be pushed to Tabular Model cubes for Enterprise use.
A fairly new feature allows users to plot data points on map locations across the globe.
So as you can see, Microsoft offers a plethora of options when talking about data. And their new mantra, which I agree with, is a Data Centric world, where every department of every company can mash data to find insights to reduce costs, increase sales and streamline processes.
And there you have it. Microsoft Data Solutions in a quick blog.
Hope you enjoyed~!