Statistical significance testing

# Statistical analysis of numeric data

Undertake statistical analysis of numeric data, grouped by a category variable. Statistical tests include t-test, one-way analysis of variance (ANOVA), Wilcoxon rank sum test and Kruskal-Wallis test. Choice of test depends on user input (parametric or non-parametric) and the number of groups identified in the data.
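The selection rule described above can be sketched as a small lookup. This is an illustrative Python sketch, not the program's own code: the function name `choose_test` and its arguments are hypothetical, and raising an error for fewer than two groups is an assumption, since the behaviour in that case is not described here.

```python
def choose_test(parametric: bool, n_groups: int) -> str:
    """Return the name of the test implied by the analysis type and group count."""
    if n_groups < 2:
        # Assumption: a comparison between groups needs at least two groups.
        raise ValueError("at least two groups are needed to compare groups")
    if parametric:
        # Parametric option: t-test for 2 groups, one-way ANOVA for more.
        return "t-test" if n_groups == 2 else "one-way ANOVA"
    # Non-parametric option: Wilcoxon rank sum for 2 groups, Kruskal-Wallis for more.
    return "Wilcoxon rank sum test" if n_groups == 2 else "Kruskal-Wallis test"


print(choose_test(parametric=True, n_groups=2))
print(choose_test(parametric=False, n_groups=3))
```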

Inputs are:

- the type of analysis to be done (parametric or non-parametric);
- whether the data is continuous or discrete;
- desired level of confidence for confidence interval estimation;
- the desired number of digits after the decimal point in summary results; and
- two columns of data (including a header row with labels for each column). The first column is the category variable and the second is the numeric variable.

**Note:** missing values in the data are omitted from calculations.
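As a rough illustration of the two-column input and the handling of missing values, the sketch below reads a header row followed by category/value pairs and drops any row with a missing entry. The comma-separated format, the `MISSING` markers and the function name `read_groups` are all assumptions made for the example, not details of the program itself.

```python
import csv
import io
from collections import defaultdict

# Assumed markers for a missing value; the real program may use others.
MISSING = {"", "NA", "NaN", "nan"}


def read_groups(text: str) -> dict[str, list[float]]:
    """Group the numeric column by the category column, skipping missing values."""
    reader = csv.reader(io.StringIO(text))
    next(reader, None)  # skip the header row with the column labels
    groups = defaultdict(list)
    for row in reader:
        if len(row) < 2:
            continue  # blank or incomplete line
        category, value = row[0].strip(), row[1].strip()
        if category in MISSING or value in MISSING:
            continue  # rows with a missing entry are omitted
        groups[category].append(float(value))
    return dict(groups)


sample = "group,value\nA,1.5\nA,NA\nB,2.0\n,3.0\nB,4.5\n"
print(read_groups(sample))
```

The number of keys in the returned dictionary is the number of groups, which together with the parametric or non-parametric option determines the test that is run.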

The program undertakes statistical tests depending on the options selected and also outputs a tabular summary and graphs of the data.

Outputs include:

- a summary of statistical test results: t-test (2 groups) or ANOVA (>2 groups) if the parametric option is selected, or Wilcoxon rank sum test (2 groups) or Kruskal-Wallis test (>2 groups) if the non-parametric option is selected;
- Shapiro-Wilk test for normality on all groups and Bartlett's test for homogeneity of variance if the parametric option is selected;

  **Note:** for groups with < 4 or > 5000 observations it is not possible to calculate the Shapiro-Wilk W statistic to test for normality of distribution;
- a numeric summary of the data for each group and overall, including histograms and normal plots;

  **Note:** for groups with only one observation or with standard deviation = 0, histograms and quantile plots are not shown;
- a table and bar chart of frequency counts for each cell if the data is identified as being discrete; and
- a bar chart of the sample size per group, boxplots of the data by group and a plot of confidence intervals about the mean value for each group.
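
The per-group rules in the notes above can be expressed as simple checks. The following is a minimal sketch using only the Python standard library; the function names and the shape of the summary dictionary are hypothetical, and it does not run the statistical tests or draw the plots themselves.

```python
import statistics
from collections import Counter


def shapiro_wilk_possible(n: int) -> bool:
    """The W statistic is not calculated for groups with < 4 or > 5000 observations."""
    return 4 <= n <= 5000


def plots_shown(values: list[float]) -> bool:
    """Histograms and quantile plots are skipped for single-observation or zero-spread groups."""
    if len(values) <= 1:
        return False
    return statistics.stdev(values) > 0


def group_summary(values: list[float], digits: int) -> dict:
    """Numeric summary of one group, rounded to the requested number of decimal places."""
    n = len(values)
    sd = statistics.stdev(values) if n > 1 else None  # sample SD needs at least 2 values
    return {
        "n": n,
        "mean": round(statistics.mean(values), digits),
        "sd": None if sd is None else round(sd, digits),
        "shapiro_wilk": shapiro_wilk_possible(n),
        "plots": plots_shown(values),
    }


def frequency_table(groups: dict[str, list[float]]) -> dict[str, Counter]:
    """Frequency count of each distinct value within each group (for discrete data)."""
    return {name: Counter(values) for name, values in groups.items()}


print(group_summary([1.0, 2.0, 3.0, 4.0], digits=2))
print(group_summary([5.0], digits=2))
print(frequency_table({"A": [1, 1, 2], "B": [2, 2, 2]}))
```

A group such as `[5.0]` is still summarised, but both flags come back `False`, which mirrors the cases where the normality test and the plots are left out.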