How to choose the correct statistical test.
| Goal of analysis | Research question | Paired/unpaired (dependent/independent) | Uni-/bi-/multivariate | # DV | # IV | Type of DV | Type of IV | # CV | Type of CV | Parametric/non-parametric | Name of method | Stata command | Further assumptions | Interpretation (95% level) | Internet | Literature |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| One-sample tests | Compare one group to a hypothetical value | - | Univariate | 1 | 0 | Dichotomous | - | 0 | - | Non-parametric | Binomial test | bitest varname == #p | - | p<.05: Probability is significantly different from #p. | Link | Sheskin (2004), p. 245; Conover (1999), p. 124; Siegel (1956), p. 36 |
| One-sample tests | Compare one group to a hypothetical value | - | Univariate | 1 | 0 | Nominal+ | - | 0 | - | Non-parametric | Chi-square goodness-of-fit test | csgof varname, expperc(expected percentages) (*) | - | p<.05: Observed percentages are significantly different from expected percentages | Link | Sheskin (2004), p. 219; Conover (1999), p. 241; Siegel (1956), p. 42 |
| One-sample tests | Compare one group to a hypothetical value | - | Univariate | 1 | 0 | Ordinal+ | - | 0 | - | Non-parametric | Wilcoxon signed-rank test (Wilcoxon T) | signrank varname = # | - | p<.05: Median is significantly different from # | Link | Sheskin (2004), p. 189; Conover (1999), p. 241 |
| One-sample tests | Compare one group to a hypothetical value | - | Univariate | 1 | 0 | Interval | - | 0 | - | Parametric | One-sample t test | ttest varname = # | - | p<.05: Mean is significantly different from # | Link | Sheskin (2004), p. 135 |
| Significance of group differences | Compare two unpaired groups | Unpaired | Univariate | 1 | 1 | Nominal+ | Dichotomous | 0 | - | Non-parametric | Chi-square test of independence | tabulate varname1 varname2, chi2 | r*c table; expected cell frequency is >4 | p<.05: Variables are not independent | Link | Sheskin (2004), p. 493; Conover (1999), p. 180; Siegel (1956), p. 104 |
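| Significance of group differences | Compare two unpaired groups | Unpaired | Univariate | 1 | 1 | Nominal+ | Dichotomous | 0 | - | Non-parametric | Fisher's exact test | tabulate varname1 varname2, exact | r*c table; cell frequency can be <5 | p<.05: Variables are not independent | Link | Sheskin (2004), p. 505; Conover (1999), p. 188; Siegel (1956), p. 96 |
| Significance of group differences | Compare two unpaired groups | Unpaired | Univariate | 1 | 1 | Ordinal+ | Dichotomous | 0 | - | Non-parametric | Wilcoxon Mann-Whitney U test (Wilcoxon rank sum test) | ranksum varname, by(groupvar) | - | p<.05: Medians are significantly different | Link | Sheskin (2004), p. 423; Conover (1999), p. 272; Siegel (1956), p. 116 |
| Significance of group differences | Compare two unpaired groups | Unpaired | Univariate | 1 | 1 | Interval | Dichotomous | 0 | - | Parametric | Unpaired samples t test | ttest varname, by(groupvar) unequal OR ttest varname1 == varname2, unpaired unequal | Homoscedasticity (not required when the unequal option is used) | p<.05: Means are significantly different | Link | Sheskin (2004), p. 375 |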
| Significance of group differences | Compare three or more unpaired groups | Unpaired | Univariate | 1 | 1 | Nominal+ | Nominal | 0 | - | Non-parametric | Chi-square test of independence | tabulate varname1 varname2, chi2 | r*c table; expected cell frequency is >4 | p<.05: Variables are not independent | Link | Sheskin (2004), p. 493; Conover (1999), p. 180; Siegel (1956), p. 104 |
| Significance of group differences | Compare three or more unpaired groups | Unpaired | Univariate | 1 | 1 | Ordinal+ | Nominal | 0 | - | Non-parametric | Kruskal-Wallis H test | kwallis varname, by(groupvar) | - | p<.05: At least two of the sample medians are significantly different | Link | Sheskin (2004), p. 757; Conover (1999), p. 288; Siegel (1956), p. 184 |
| Significance of group differences | Compare three or more unpaired groups | Unpaired | Univariate | 1 | 1 | Interval | Nominal | 0 | - | Parametric | One-way ANOVA | oneway response_var factor_var | Homoscedasticity | p<.05: At least two of the sample means are significantly different | Link | Sheskin (2004), p. 667 |
| Significance of group differences | Compare three or more unpaired groups | Unpaired | Univariate | 1 | >1 | Interval | Nominal | 0 | - | Parametric | Factorial ANOVA | anova varname varlist | Homoscedasticity | p<.05: At least two of the sample means are significantly different | Link | Sheskin (2004), p. 887 |
| Significance of group differences | Compare two paired groups | Paired | Univariate | 1 | 1 | Dichotomous | Dichotomous | 0 | - | Non-parametric | McNemar test | mcc varname1 varname2 | 2*2 table; number of cases on the diagonal at least 10 | p<.05: Difference between samples is significant | Link | Sheskin (2004), p. 633; Conover (1999), p. 166; Siegel (1956), p. 63 |
| Significance of group differences | Compare two paired groups | Paired | Univariate | 1 | 1 | Ordinal+ | Dichotomous | 0 | - | Non-parametric | Sign test | signtest varname1 = varname2 | - | p<.05: Medians are significantly different | Link | Sheskin (2004), p. 621; Conover (1999), p. 157; Siegel (1956), p. 68 |
| Significance of group differences | Compare two paired groups | Paired | Univariate | 1 | 1 | Ordinal+ | Dichotomous | 0 | - | Non-parametric | Wilcoxon signed-rank test (Wilcoxon T) | signrank varname1 = varname2 | - | p<.05: Medians are significantly different | Link | Sheskin (2004), p. 609; Conover (1999), p. 352; Siegel (1956), p. 75 |
| Significance of group differences | Compare two paired groups | Paired | Univariate | 1 | 1 | Interval | Dichotomous | 0 | - | Parametric | Paired samples t test | ttest varname1 == varname2 | - | p<.05: Means are significantly different | Link | Sheskin (2004), p. 575 |
| Significance of group differences | Compare three or more paired groups | Paired | Univariate | 1 | 1 | Dichotomous | Nominal | 0 | - | Non-parametric | Cochran's Q test | cochran varlist (*) | - | p<.05: Difference of the proportion of subjects having low (or high) values on a set of dichotomous items is significant across items | Link | Sheskin (2004), p. 867; Conover (1999), p. 251; Siegel (1956), p. 161 |
| Significance of group differences | Compare three or more paired groups | Paired | Univariate | 1 | 1 | Ordinal+ | Nominal | 0 | - | Non-parametric | Friedman two-way analysis of variance | friedman varlist (*) | - | p<.05: Groups differ significantly on the criterion variable | Link | Sheskin (2004), p. 845; Conover (1999), p. 369; Siegel (1956), p. 166 |
| Significance of group differences | Compare three or more paired groups | Paired | Univariate | 1 | 1 | Interval | Nominal | 0 | - | Parametric | One-way repeated-measures ANOVA | anova varname1 varname2, repeated(varlist) | Homoscedasticity | p<.05: At least two of the sample means are significantly different | Link | Sheskin (2004), p. 797 |
| Significance of group differences | Compare three or more paired groups | Paired | Univariate | 1 | >1 | Interval | Nominal | 0 | - | Parametric | Factorial repeated-measures ANOVA | anova varname varlist, repeated(varlist) | Homoscedasticity | p<.05: At least two of the sample means are significantly different | Link | Sheskin (2004), p. 927 |
| Degree of relationship | Quantify association between two variables | - | Bivariate | 2 | 0 | Ordinal+ | - | 0 | - | Non-parametric | Kendall rank correlation | ktau varlist | Square tables (tau-a); non-square tables (tau-b) | p<.05: Correlation coefficient is significantly different from zero | Link | Sheskin (2004), p. 845; Conover (1999), p. 321; Siegel (1956), p. 213 |
| Degree of relationship | Quantify association between two variables | - | Bivariate | 2 | 0 | Ordinal+ | - | 0 | - | Non-parametric | Spearman rank correlation | spearman varlist | - | p<.05: Correlation coefficient is significantly different from zero | Link | Sheskin (2004), p. 1061; Conover (1999), p. 316; Siegel (1956), p. 202 |
| Degree of relationship | Quantify association between two variables | - | Bivariate | 2 | 0 | Interval | - | 0 | - | Parametric | Pearson correlation | correlate varlist OR pwcorr varlist, sig (only pwcorr with the sig option reports p-values) | Linear relations; homoscedasticity | p<.05: Correlation coefficient is significantly different from zero | Link | Sheskin (2004), p. 945 |
| Degree of relationship | Quantify association between two variables | - | Bivariate | 2 | 0 | Interval | - | 1+ | Interval | Parametric | Partial correlation | pcorr varname1 varlist | Linear relations; homoscedasticity | p<.05: Correlation coefficient is significantly different from zero | Link | Sheskin (2004), p. 1000 |

(*) Not part of Stata 11. The command can be located with: findit command.

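
The selection logic of the table for one-sample tests and single-factor group comparisons can be sketched as a small lookup. This is a minimal Python sketch, not part of the original table: the names `Test`, `RANK`, `TESTS` and `candidate_tests` are hypothetical, the factorial ANOVA and correlation rows are left out, and two readings are assumptions of the sketch: "+" is taken to mean "this measurement level or higher", and a dichotomous variable is treated as a two-category nominal variable. Test names and Stata commands are copied from the table.

```python
from typing import NamedTuple


class Test(NamedTuple):
    design: str      # comparison design, following the table's research questions
    min_level: str   # measurement level of the dependent variable (DV)
    or_higher: bool  # True where the table writes "+", e.g. "Ordinal+"
    name: str        # name of the method as given in the table
    stata: str       # Stata command as given in the table


# Assumption of this sketch: dichotomous counts as two-category nominal.
RANK = {"dichotomous": 0, "nominal": 0, "ordinal": 1, "interval": 2}

TESTS = [
    Test("one sample", "dichotomous", False, "Binomial test", "bitest varname == #p"),
    Test("one sample", "nominal", True, "Chi-square goodness-of-fit test",
         "csgof varname, expperc(expected percentages)"),
    Test("one sample", "ordinal", True, "Wilcoxon signed-rank test", "signrank varname = #"),
    Test("one sample", "interval", False, "One-sample t test", "ttest varname = #"),
    Test("two unpaired", "nominal", True, "Chi-square test of independence",
         "tabulate varname1 varname2, chi2"),
    Test("two unpaired", "nominal", True, "Fisher's exact test",
         "tabulate varname1 varname2, exact"),
    Test("two unpaired", "ordinal", True, "Wilcoxon Mann-Whitney U test",
         "ranksum varname, by(groupvar)"),
    Test("two unpaired", "interval", False, "Unpaired samples t test",
         "ttest varname, by(groupvar) unequal"),
    Test("3+ unpaired", "nominal", True, "Chi-square test of independence",
         "tabulate varname1 varname2, chi2"),
    Test("3+ unpaired", "ordinal", True, "Kruskal-Wallis H test",
         "kwallis varname, by(groupvar)"),
    Test("3+ unpaired", "interval", False, "One-way ANOVA", "oneway response_var factor_var"),
    Test("two paired", "dichotomous", False, "McNemar test", "mcc varname1 varname2"),
    Test("two paired", "ordinal", True, "Sign test", "signtest varname1 = varname2"),
    Test("two paired", "ordinal", True, "Wilcoxon signed-rank test",
         "signrank varname1 = varname2"),
    Test("two paired", "interval", False, "Paired samples t test",
         "ttest varname1 == varname2"),
    Test("3+ paired", "dichotomous", False, "Cochran's Q test", "cochran varlist"),
    Test("3+ paired", "ordinal", True, "Friedman two-way analysis of variance",
         "friedman varlist"),
    Test("3+ paired", "interval", False, "One-way repeated-measures ANOVA",
         "anova varname1 varname2, repeated(varlist)"),
]


def candidate_tests(design, dv_level):
    """Return every listed test whose DV requirement is met by dv_level."""
    rank = RANK[dv_level]
    matches = []
    for test in TESTS:
        if test.design != design:
            continue
        exact = test.min_level == dv_level
        higher = test.or_higher and rank >= RANK[test.min_level]
        if exact or higher:
            matches.append(test)
    return matches


# Example: two independent groups, ordinal dependent variable.
for test in candidate_tests("two unpaired", "ordinal"):
    print(f"{test.name}: {test.stata}")
```

When several tests qualify, the "Further assumptions" column of the table decides among them; for instance, an interval-scaled variable can always fall back to the rank-based tests if the assumptions of the parametric test are doubtful.
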
Last update: September 28, 2009. For errors and suggestions please write to tobias.pfaff@uni-muenster.de.
I would also like to add a column with R commands for the tests. If you have suggestions for this, please email me as well.
© Tobias Pfaff, 2009
Institute for Economic Education, University of Münster