
University of Massachusetts School of Public Health

We used Excel to do some basic data analysis tasks to see whether it is a reasonable alternative to using a statistical package for the same tasks. We concluded that Excel is a poor choice for statistical analysis beyond textbook examples, the simplest descriptive statistics, or for more than a very few columns. The problems we encountered that led to this conclusion are in four general areas:

Missing values are handled inconsistently, and sometimes incorrectly.

However, when you are ready to do the statistical analysis, we recommend the use of a statistical package such as SAS, SPSS, Stata, Systat or Minitab.

Introduction

Excel is probably the most commonly used spreadsheet for PCs. Newly purchased computers often arrive with Excel already loaded. It is easy to use for a variety of calculations, and it includes a collection of statistical functions and a Data Analysis ToolPak. As a result, if you suddenly find you need to do some statistical analysis, you may turn to it as the obvious choice. We decided to do some testing to see how well Excel would serve as a data analysis application.

To present the results, we will use a small example. It was chosen to have two categorical and two continuous variables, so that we could test a variety of basic statistical techniques. Since almost all real data sets have at least a few missing data points, and since the ability to deal with missing data correctly is one of the features that we take for granted in a statistical analysis package, we introduced two empty cells in the data:

Each row of the spreadsheet represents a subject.
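
As an illustration only, the following minimal Python sketch shows what explicit, consistent handling of missing values can look like. It is not part of our Excel tests. The data values, the variable names (`group`, `status`, `x`, `y`), and the helper functions are all hypothetical, chosen to mirror the layout described here: one row per subject, two categorical and two continuous variables, and two empty cells.

```python
from statistics import mean

# Hypothetical toy data, NOT the data set used in this article.
# One dict per subject; None marks an empty cell.
subjects = [
    {"group": "A", "status": "yes", "x": 1.0, "y": 10.0},
    {"group": "A", "status": "no", "x": 2.0, "y": None},
    {"group": "B", "status": "yes", "x": None, "y": 30.0},
    {"group": "B", "status": "no", "x": 4.0, "y": 40.0},
]


def observed(rows, var):
    """Return the non-missing values of one variable."""
    return [r[var] for r in rows if r[var] is not None]


def complete_cases(rows, variables):
    """Listwise deletion: keep rows with no missing value in `variables`."""
    return [r for r in rows if all(r[v] is not None for v in variables)]


def summarize(rows, var):
    """Mean of one variable, reporting how many values were used and skipped."""
    values = observed(rows, var)
    return {
        "n_used": len(values),
        "n_missing": len(rows) - len(values),
        "mean": mean(values),
    }


# Single-variable summary: uses every subject with an observed x.
x_summary = summarize(subjects, "x")

# Two-variable analysis: uses only subjects with both x and y observed.
paired = complete_cases(subjects, ["x", "y"])

print(x_summary)
print(len(paired), "complete cases for x and y")
```

The point of the sketch is that each step states which subjects it used and how many it dropped, so a single-variable summary and a two-variable analysis cannot silently disagree about the sample.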
