Data Mining with Excel Q&A

Thank you to everyone who attended my free webinar on August 29, 2013 where I talked about using the Data Mining Add-in for Excel. If you were unable to attend you can find the recording here. You can also check out all the Pragmatic Works Free Training videos here. I discussed why data mining is useful to organizations, showed how to download and install the plug-in using Excel 2013 and then how to do a few basic things in each of the two tabs added by the installation.
Below are the questions I received during the webinar along with answers to each of them. If you have more questions please let me know and I’d be more than happy to post the answers here along with the others.
When selecting variables, the default has them all selected. Is there any way to deselect them all rather than having to do them one by one?
From everything I have seen the “Analyze” tab we looked at does not have a select/deselect all option. This option is however on the “Data Mining” tab. In the screenshot below on the left is Detect Categories from the Analyze tab without the option and the right is the Forecast Wizard from the Data Mining Tab with the option.
DetectCategories
For the download, from www.sqlserverdatamining.com, do I download DataMiningViewerClient to get started?
You actually want to choose the data mining add-in. The download links can be found on the right side of the page.
SSDMDownload
Can the data mining plug-in for Excel 2010 work with SSAS 2012?
Yes, the Excel 2010 plug-in will allow you to use an SSAS instance that is running 2008 or later. The Excel 2013 plug-in also uses SSAS 2008 or later. Just a side note: I have not tired using the Excel 2013 add-in against a non SQL 2012 SSAS server. The add-in wizard says it requires SQL 2008 or later, but the MSDN page says SQL 2012. I would need to setup a new environment to test other configurations, but I would assume the 2008 requirement is correct.
When we use the “Clean Data” option in Excel does it update the data in database?
No. When doing data mining in Excel the data is stored inside the workbook and isolated from the source. Any data changes only affect the table within the Excel workbook.
Does data mining support tabular models in addition to multidimensional models?
Data mining is currently only supported on an SSAS server running in multidimensional mode. In fact when installing SQL Server 2012 the choices are either “Tabular Mode” or “Multidimensional and Data Mining Mode”.
SSASServerMode
When will data mining on tabular models be available?
At this time there has been no announcement on when data mining will be available on tabular models.
How does the replace outliers with mean value option in the data cleanup cleanup menu work, does the mean include or exclude the outlier values?
Included. When changing my first two values the mean value for the entire data set is 56806. The mean value excluding those two records is 56117.22. The replaced value shown in the far right column of the screenshot is 56806.
ReplaceOutliers
What does the “outliers” result look like on the unmodified data set?
OutliersUnmodifiedDataset
How do I get the sample data you used?
First open Excel and navigate to the Data Mining tab, click the Help icon on the far right side of the ribbon and select the Getting Started option.
GettingStarted
On the first screen of the wizard select the option Use an existing instance of Microsoft SQL Server 2008 (or later) Analysis Services that I administer.
GettingStarted1
On the next screen click the Finish button.
GettingStarted2
 
Finally ensure the Open an example test table upon closing the wizard option is checked and click close in the bottom left. This will close the wizard and launch the sample workbook.
GettingStarted3
 
 

About the author

Bradley Schacht

Bradley Schacht is a Principal Program Manager on the Microsoft Fabric product team based in Jacksonville, FL. Bradley is a former consultant, trainer, and has authored 5 SQL Server and Power BI books, most recently the Microsoft Power BI Quick Start Guide. As a member of the Microsoft Fabric product team, Bradley works directly with customers to solve some of their most complex data problems and helps shape the future of Microsoft Fabric. He frequently presents at community events around the country, is a contributor to sites such as SQLServerCentral.com, and is a member of the Jacksonville SQL Server User Group (JSSUG).

Add comment

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Bradley Schacht

Bradley Schacht is a Principal Program Manager on the Microsoft Fabric product team based in Jacksonville, FL. Bradley is a former consultant, trainer, and has authored 5 SQL Server and Power BI books, most recently the Microsoft Power BI Quick Start Guide. As a member of the Microsoft Fabric product team, Bradley works directly with customers to solve some of their most complex data problems and helps shape the future of Microsoft Fabric. He frequently presents at community events around the country, is a contributor to sites such as SQLServerCentral.com, and is a member of the Jacksonville SQL Server User Group (JSSUG).

Follow Me