Minitab Vs JMP
Software Review
Minitab 15
One of the "big beasts" of statistical computing capable of much more than basic use.
By Wayne Holland
I recently completed a software review of JMP 6.0.3 for OR/MS Today, so I thought it would be interesting to write a comparative review of Minitab 15: a statistical package versus a statistical package. The focus of my review of JMP was on the user-friendliness of performing basic statistical analyses. I did not investigate the more advanced features because I wanted to consider whether JMP was a good option for the management science/operations research professional who either has some data to get a handle on or needs to perform some basic statistical analysis. I was mainly interested in how easy it is to get something meaningful out. This, therefore, sets the tone of my analysis of Minitab. The software review editor tells me that Minitab is almost synonymous with Six Sigma and is heavily favored by practitioners, whereas JMP is a relative newcomer to Six Sigma; but Six Sigma is not a part of my review. I am sure readers are aware that Minitab is a long-established standard in the statistical analysis business and capable of much more than basic use.
My last encounter with Minitab was 20 years ago as an undergraduate, analyzing Box-Jenkins forecasting problems on a mainframe computer. All those DOS-type commands one had to type, such as COPY C1 C2 and ARIMA (1,1), come back to me like a ghost from a past life when I used to be good at statistics! These days, desktop packages strive for menu-driven smoothness, and I was pleased to see that the new Minitab is no exception. Minitab is one of the big three (along with SAS and SPSS) in the statistical computing business. My question is: Is Minitab best left to the heavyweight statistical user, or does it have something to offer everyone else as well?
Installation
In past software reviews for OR/MS Today (Holland, 2003, 2005), I have encountered installation difficulties. I am pleased to report that Minitab installed simply and cleanly, and I was able to start working within minutes.
First-time usage. When Minitab is run, it opens to the screen shown in Figure 1. It uses a two-window system: one called Worksheet, rather like a spreadsheet, for holding the data, and the other called Session, to which the output of analyses is added. Thus, at the end of the session, all of the output, except graphs and charts, will be contained in linear format in this window. I have never quite understood why, if the package works on the basis of "all output going to one report," some bits of output don't go to that report. It seems inconsistent to me. But I guess that's just me being prissy and neat!
Figure 1: Initial screen of Minitab. The Session Window will display results; the Data Window will contain the data you wish to analyse.
Access to all graphical tools, such as histograms, scatterplots and 3-D surface plots, is via the single menu item "Graph." Similarly, all statistical analyses are stored under the general menu item "Stat." Clicking on this opens up a sub-menu offering the following statistical analyses: basic statistics, regression, ANOVA, DOE, control charts, quality tools, reliability/survival, multivariate, time series, tables, non-parametrics, EDA, power and sample size. It is a neat, logical arrangement that allows self-contained areas of statistics to be explored without needing to understand everything before being able to make sensible progress.
Example data. To investigate using Minitab, I used exactly the same data file that I used for the JMP review. It is a data file I created for student coursework. The file is an Excel spreadsheet containing the Forbes Global 2000 companies as of Sept. 20, 2005. I took the data from www.forbes.com. Figure 2 shows the first 11 rows, showing the top 10 companies and the data collected by Forbes to produce the ranking. There are four quantitative variables (sales, assets, profit and market value) and two categorical variables (country and industry sector/category). For the JMP review, I performed some exploratory data analysis, produced scatterplots and correlations, and then performed a multiple regression with validation and a hypothesis test that required the creation of a new "flag" variable to separate out data stacked in a single column. Finally, I investigated 3-dimensional plotting facilities. The intention is to repeat these exercises here and compare the ease of production and the quality of the final result.
Figure 2: Sample of data from the Excel worksheet imported into Minitab.
The Excel data file appeared to be read in very easily by Minitab. It is displayed in Figure 3. For columns read in as text, "-T" is appended to the column heading. This is a useful confirmation that the data has been read in correctly. However, when I started to attempt analyses involving the Category column, C4-T, error messages were produced saying there were unequal numbers of observations in each column. I scanned down the rows and all columns appeared to stop at row 2,000. However, it finally emerged that there was a stray entry 37 rows below the end of the data set. This was careless on my part, but I was somewhat annoyed that Minitab did not fill in rows 2,001 to 2,036 with "*" to indicate that it thought there were missing values in these rows. This is what Minitab does with missing values in a data set. The fact that Minitab did not fill in these rows indicated to me that it did not consider them part of the data set and hence there should have been no problem! Also, a column of "*" beyond row 2,000 would certainly have helped me flag up this issue in less time than I wasted on it.
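A check of this kind is easy to run on the spreadsheet data before importing it. The following is a minimal Python sketch, not anything Minitab provides: it assumes the worksheet has already been read into a dictionary of columns with None for empty cells, and the function names and the miniature data set are hypothetical.

```python
from typing import Dict, List, Optional


def last_filled_row(column: List[Optional[str]]) -> int:
    """Return the 1-based index of the last non-empty cell, or 0 if none."""
    for index in range(len(column), 0, -1):
        cell = column[index - 1]
        if cell is not None and str(cell).strip() != "":
            return index
    return 0


def find_ragged_columns(columns: Dict[str, List[Optional[str]]]) -> Dict[str, int]:
    """Return the columns whose last filled row differs from the most common one."""
    lengths = {name: last_filled_row(cells) for name, cells in columns.items()}
    counts: Dict[int, int] = {}
    for length in lengths.values():
        counts[length] = counts.get(length, 0) + 1
    typical = max(counts, key=counts.get)  # the length most columns agree on
    return {name: n for name, n in lengths.items() if n != typical}


# Hypothetical miniature of the situation described: three rows of data
# plus one stray cell two rows below the end of the Category column.
sheet = {
    "Company": ["A", "B", "C", None, None],
    "Sales": ["10", "20", "30", None, None],
    "Category": ["Bank", "Oil", "Bank", None, "x"],
}
print(find_ragged_columns(sheet))
```

Any column reported here ends on a different row from the rest, which is exactly the condition that produced the "unequal numbers of observations" messages.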
It is reasonably intuitive to perform basic exploratory data analysis immediately, without reference to the documentation. I created bar charts, scatterplots and summary statistics to get a feel for the data. However, one feature I didn't like was in the production of a bar chart of mean sales and mean market value categorized by industry sector. What I wanted was the sectors listed across the horizontal axis, with two bars at each category to represent the relevant mean sales and mean market value. What I got was Figure 4, which is a bar chart of mean sales by category followed by a bar chart of mean market value. This required a two-stage process displayed in Figure 5 and Figure 6. It may very well be possible to produce the result I was looking for, but it is certainly not easy to find from the options offered, nor by reference to the user guide. [Editor's note: According to Jay Aubuchon, product manager at Minitab, choosing "Graph variables displayed innermost on scale" would produce the desired result in the dialog box shown in Figure 6.]
Figure 4: Bar chart of profits and market value averages by industry sector.
Figure 5: Creation of Figure 4 in Minitab. After selecting Graph ... Bar Chart, the user chooses the statistic to be presented and the type of presentation required.
Figure 6: Creation of Figure 4 in Minitab. Following on from Figure 5, a second form is displayed for selecting the variables to be displayed and the categorization variable.
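The layout I was after, one cluster per sector with a bar for each variable, amounts to a table of means keyed first by sector and then by variable. A minimal Python sketch of that summary follows; the rows are made-up illustrative values rather than Forbes data, and grouped_means is a hypothetical helper, not a Minitab function.

```python
from collections import defaultdict
from statistics import mean


def grouped_means(rows, group_key, value_keys):
    """Return {group: {variable: mean}} so each group holds one value per variable."""
    buckets = defaultdict(lambda: {key: [] for key in value_keys})
    for row in rows:
        for key in value_keys:
            buckets[row[group_key]][key].append(row[key])
    return {
        group: {key: mean(values) for key, values in per_variable.items()}
        for group, per_variable in sorted(buckets.items())
    }


# Illustrative rows only; the real file has 2,000 companies.
rows = [
    {"sector": "Banking", "sales": 10.0, "market_value": 40.0},
    {"sector": "Banking", "sales": 20.0, "market_value": 60.0},
    {"sector": "Oil", "sales": 30.0, "market_value": 90.0},
]

summary = grouped_means(rows, "sector", ["sales", "market_value"])
for sector, means in summary.items():
    # One line per sector, with both means side by side, mirrors a clustered bar chart.
    print(sector, means["sales"], means["market_value"])
```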
At this point, I also came across another feature I didn't like: the lack of interactivity in graph manipulation. I was expecting to be able to grab axes and elongate or shrink them. However, they were entirely fixed. I could reduce the size of the box in which the chart was presented, but I couldn't enlarge it. I could make these changes by calling up the relevant menu items for re-scaling and typing in new values, but this seems very restrictive and old-fashioned in comparison with JMP. It is obviously a relic of Minitab's heritage as a mainframe computer package, but this sort of issue should have been dealt with in the transition to a PC package. [Editor's note: According to Aubuchon, this is a consequence of Minitab's choice to edit graphs like Excel, and has nothing to do with heritage.]
The production of scatterplots and a correlation matrix for the four quantitative variables (sales, assets, profits and market value), shown in Figure 7, also surprised me. Rather than offering me a default option of correlating all variables against all others, Minitab required me to fill in the table shown in the center of Figure 8, identifying which pairs I wanted to view. This seems a rather cumbersome way to proceed. [Editor's note: According to Aubuchon, Graph > Matrix Plot would produce the desired result.] The required correlation matrix, with the associated p-value below each correlation, was added to the Session window. The scatterplots were created in a separate chart window.
Figure 7: Scatterplots and correlation matrix for sales, profits, assets and market value. Note that the correlation coefficients and p-values of significance test are added to the session window, with the scatterplots displayed in a separate window.
Figure 8: Creation of Figure 7. Note that in the central table each pair of variables for which a scatterplot is required has to be entered. There appeared to be no default option for viewing all pairs.
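The "all pairs" default I was looking for is simple to state: for four variables there are six distinct pairs, each with its own Pearson correlation. A minimal Python sketch of that enumeration is given below; the numbers are illustrative rather than Forbes data, the function names are hypothetical, and the p-values that Minitab reports are not computed.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def correlation_pairs(columns):
    """Correlate every distinct pair of columns, with no pairs typed in by hand."""
    return {
        (a, b): pearson(columns[a], columns[b])
        for a, b in combinations(columns, 2)
    }


# Illustrative values only.
data = {
    "sales": [1.0, 2.0, 3.0, 4.0],
    "assets": [2.0, 4.0, 6.0, 8.0],
    "profits": [4.0, 3.0, 2.0, 1.0],
    "market_value": [1.0, 3.0, 2.0, 4.0],
}

for pair, r in correlation_pairs(data).items():
    print(pair, round(r, 3))
```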
Next I wanted to experiment with two specific statistical analyses:
1. A multiple regression explaining market value in terms of sales and assets, testing three of the assumptions of the linear model: no autocorrelation, normality of residuals and homoscedasticity of residuals.
2. A hypothesis test to investigate whether there was a significant difference between average U.S. and non-U.S. company profits.
Multiple regression. There are various regression options easily accessible via the Stat ... Regression menu, such as stepwise, partial least squares and various logistic regression methods. However, for illustration, I directly created the multiple regression model $\text{Market Value}_i = \alpha + \beta_1\,\text{Sales}_i + \beta_2\,\text{Assets}_i + \varepsilon_i$. This is completed very intuitively and with little effort. The result is shown in Figure 9, which gives not only the model but also all the validation information required, such as a test for normality of residuals, the Durbin-Watson statistic to test for autocorrelation, and a scatterplot of residuals against fitted values to check for heteroscedasticity. This is a well-handled, strong aspect of Minitab and better, in my view, than the two-stage process required in JMP.
Figure 9: Multiple regression output for $\text{Market Value}_i = \alpha + \beta_1\,\text{Sales}_i + \beta_2\,\text{Assets}_i + \varepsilon_i$. Note that the model and all the usual validation tests are displayed in one analysis.
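For readers who want to see the arithmetic behind two of the quantities in this output, here is a minimal Python sketch of a two-predictor least-squares fit and the Durbin-Watson statistic. It is not Minitab's implementation: it solves the normal equations directly, which is adequate for a small illustration, and both the function names and the data are hypothetical.

```python
def solve_linear(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ols(y, x1, x2):
    """Return [intercept, slope on x1, slope on x2] from the normal equations."""
    design = [[1.0, u, v] for u, v in zip(x1, x2)]
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * yi for r, yi in zip(design, y)) for i in range(3)]
    return solve_linear(xtx, xty)


def residuals(y, x1, x2, coef):
    """Observed minus fitted values for the fitted coefficients."""
    a, b1, b2 = coef
    return [yi - (a + b1 * u + b2 * v) for yi, u, v in zip(y, x1, x2)]


def durbin_watson(res):
    """Sum of squared successive differences over the sum of squared residuals."""
    num = sum((res[t] - res[t - 1]) ** 2 for t in range(1, len(res)))
    den = sum(e * e for e in res)
    return num / den


# Illustrative values only (not Forbes data).
sales = [1.0, 2.0, 3.0, 4.0, 5.0]
assets = [2.0, 1.0, 4.0, 3.0, 6.0]
market_value = [9.1, 7.8, 19.3, 17.9, 29.2]

coef = fit_ols(market_value, sales, assets)
res = residuals(market_value, sales, assets, coef)
print("coefficients:", [round(c, 3) for c in coef])
print("Durbin-Watson:", round(durbin_watson(res), 3))
```

A Durbin-Watson value near 2 suggests little first-order autocorrelation in the residuals, while values toward 0 or 4 suggest positive or negative autocorrelation respectively.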
Hypothesis test. I was interested to see whether there was a difference in average profitability for U.S. firms compared to non-U.S. firms in the Forbes 2000 ranking. This required setting up a new column with a "flag" variable which contains either "Y" or "N" (or any bi-value pair) to represent "U.S. company" or not. This would allow the data in the Profit column to be divided into the two relevant data sets. This was fairly intuitive to achieve (it didn't require me to look in help anyway!). Via the Editor ... Formula ... Assign Formula to Column options, Figure 10 was produced showing a form to fill in to calculate the new column. The layout of this form makes it fairly obvious how to set up the necessary IF condition. Anyone who has ever used an IF statement in Excel will have no problems with this feature. The required analysis follows easily (Figure 11).
Figure 10: Creation of a conditional statement to set up the new column in C9.
Figure 11: Output for the hypothesis test on whether or not company mean profits are the same for US and non-US companies.
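The two steps described above, building the Y/N flag and then comparing the two groups of profits, can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than a reproduction of Figure 11: it assumes the country label is "United States", uses the unequal-variance (Welch) form of the two-sample t statistic, and stops short of the p-value. All names and numbers are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def us_flag(country):
    """Return "Y" for a U.S. company and "N" otherwise (label text is assumed)."""
    return "Y" if country == "United States" else "N"


def welch_t(a, b):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom) for two samples."""
    va = variance(a) / len(a)  # squared standard error of the first mean
    vb = variance(b) / len(b)  # squared standard error of the second mean
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Illustrative rows only: (country, profit).
companies = [
    ("United States", 2.0),
    ("Japan", 1.0),
    ("United States", 4.0),
    ("Germany", 2.0),
    ("United States", 6.0),
    ("France", 3.0),
]

# The flag plays the role of the new worksheet column used to unstack Profit.
flags = [us_flag(country) for country, _ in companies]
us_profits = [p for (_, p), f in zip(companies, flags) if f == "Y"]
other_profits = [p for (_, p), f in zip(companies, flags) if f == "N"]

t, df = welch_t(us_profits, other_profits)
print("t =", round(t, 3), "df =", round(df, 2))
```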
3-D plots. Finally, I was interested in creating a 3-dimensional plot. Maybe not the most essential example, but given the data I was working with, I decided to create a plot of market value as a function of assets and sales (Figure 12). This allows for direct comparison with JMP. The surface plotting tool is perfectly adequate, but it doesn't have the interactivity of JMP. You can right-click on an axis and get a form to adjust the scale (Figure 13), or right-click on the graph and get a form that allows control of Graph Attributes, Graph Size, Figure Location and Figure Attributes. In JMP, all this is done by pointing and clicking at the figure with the mouse. It doesn't make much material difference to the final product; it's just more fun getting there!
Figure 12: Surface plot of market value against assets and sales.
Figure 13: The menu-driven approach to re-scaling axes and graph size. It is not possible to do this interactively by the "click and drag" approach.
Advanced Features
Anyone familiar with using Minitab regularly will probably consider what I have written a travesty. Minitab is a much bigger, more sophisticated package than I have been able to cover above. It does much, much more with the same menu-driven approach, such as design of experiments, control charts, quality tools and forecasting. Minitab has been one of the three big players in statistical software for a very long time. Its historical standing is clearly a strength in already having a large, devoted following and a reputation for reliability. The danger comes from new products, such as JMP, built specifically for PC operating systems that are more able to exploit interactivity than a more mature package making a transition with considerable baggage.
Quality of Documentation
Minitab comes with a very slim (approximately 150-page) "Meet Minitab" introductory guide. In essence it tells you what you need to know to get started. It is also supported by quite extensive help built into Minitab, including a good set of tutorials to work through and a very nice "Methods and Formulas" page, which is essentially a "how to" of various statistical analytical methods (Figure 14). Personally, I prefer a little more paper-based documentation, but I fully concede I am probably in the minority in that respect these days.
Figure 14: Part of the built-in help facility in Minitab, showing methods and formulas for various statistical procedures. Clicking on a particular link gives instructions on how to perform that analysis in Minitab.
Not having used Minitab since version 4 or 5 in my student days, I am not familiar with what is new since Minitab 14. However, the Minitab Web site claims: "Minitab 15 contains nearly 50 enhancements with minimal changes to the interface, making it simple for current users to access all the new features. Highlights include:
assign formulas to columns in the worksheet
expanded gage R&R capabilities
power (OC) curves for power and sample size
probability distribution plot
new reliability methods for forecasting future warranty claims"
Conclusion
Minitab remains, with SPSS and SAS, one of the big beasts of statistical computing. The already committed user does not need to read anything from me to decide whether or not to use it. For the new user, particularly one with modest statistical needs, Minitab is certainly an accessible option and is not formidable to learn. However, if you are coming totally new to a statistical package, I would give serious consideration to JMP, which beats Minitab on interactivity while matching its functionality. But in the final analysis, it's a question of personal preference rather than a knockout blow. It's your money; you make the choice!
Product Information
Minitab 15 is available from Minitab Inc.
Address: Quality Plaza, 1829 Pine Hall Road, State College, PA 16801
Phone: 1-814-238-3280
Fax: 1-814-238-2035
E-mail: [email protected]
Web site: www.minitab.com
Pricing
Professional version: Single perpetual use license: $1,195. Annual use licenses and volume discounts available. Visit http://minitab.com/products/pricing/ for details.
Academic version: Students, professors and other qualified staff from educational institutions in eligible countries may purchase and download unit copies and upgrades from e-academy, a leading provider of brand name software discounted for education. Options include 6- or 12-month rentals, as well as perpetual use unit copies of Minitab 15. Visit e-academy for more information.
System requirements
Operating systems: Microsoft Windows 2000, XP or Vista
RAM: 512 MB or more
Processor: 1 GHz 32-bit or 64-bit processor
Screen resolution: 1024 x 768 or higher
Hard disk space: 125 MB (minimum) free space available
PDF reader: Acrobat Reader 5.0 or higher required for Meet Minitab
Wayne Holland is an associate professor (senior lecturer) in operations research at Cass Business School, City University, London, U.K. He teaches quantitative methods and management science to undergraduate, MBA and Executive MBA students. His research interests are in the design and analysis of simulation models to investigate risk-related issues, particularly operational risk in banking and supply chain risk.
References
1. Holland, W., 2003, "Software Review: @Risk Version 4.5 Pro," OR/MS Today, Vol. 30, No. 1, pp. 52-55.
2. Holland, W., 2005, "Software Review: Crystal Ball v 7.0.1 Professional," OR/MS Today, Vol. 32, No. 2, pp. 54-57.
3. Holland, W., 2007, "Software Review: JMP 6.0.3," OR/MS Today, February 2007 issue, Vol. 34, No. 1, pp. 66-72.
4. "Meet Minitab 15," 2007, provided with software.
OR/MS Today copyright 2007 by the Institute for Operations Research and the Management Sciences. All rights reserved. Lionheart Publishing, Inc. 506 Roswell Rd., Suite 220, Marietta, GA 30060 USA Phone: 770-431-0867 | Fax: 770-432-6969 E-mail: [email protected] URL: http://www.lionhrtpub.com Web Site Copyright 2007 by Lionheart Publishing, Inc. All rights reserved.