Crimes of Statistics

Crimes of Statistics: Power

To consider the POWER of your statistical analysis, we need to take a step back and talk briefly about Hypothesis tests and their relationship with POWER.

Remember how you start your research? With a hypothesis. For our little example, our null hypothesis says that the mean height of cats is equal to the mean height of dogs. The alternative hypothesis then says that the mean height of cats is not equal to the mean height of dogs.

Ho: µcats = µdogs
Ha: µcats ≠ µdogs

We are using an alpha value of 5%, so we will reject the null hypothesis if the p-value from our test is less than 0.05. We went out and measured 4 cats and 4 dogs, and their height measurements (inches) are:
Cats: 11, 13, 11, 14
Dogs: 24, 21, 18, 28

The mean height for cats is 12.25 with a standard deviation of 1.5
The mean height for dogs is 22.8 with a standard deviation of 4.3

I can conduct a t-test and it provides me with a p-value of 0.02. With data such as this I can also calculate the variation around the mean (the mean ± 1 standard deviation), such that I have 10.75-13.75 (12.25 ± 1.5) for the cats and 18.5-27.1 (22.8 ± 4.3) for the dogs. Do the ranges overlap? No.
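
If you would like to check these numbers by hand, here is a minimal sketch in Python using only the standard library. It assumes the equal-variance (pooled) two-sample t-test; the post does not say which variant was run, so the p-value printed here may not match the one quoted above (a Welch test, for example, gives a different value). The helper names `t_pdf`, `two_sided_p` and `pooled_t_test` are made up for this illustration.

```python
import math
from statistics import mean, stdev

cats = [11, 13, 11, 14]
dogs = [24, 21, 18, 28]

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def two_sided_p(t, df, steps=20000):
    """Two-sided p-value: 1 minus the area under the t density between -|t| and |t|.
    The area from 0 to |t| is found with Simpson's rule (steps must be even)."""
    a = abs(t)
    if a == 0:
        return 1.0
    h = a / steps
    total = t_pdf(0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_0_to_a = total * h / 3
    return max(0.0, 1 - 2 * area_0_to_a)

def pooled_t_test(x, y):
    """Equal-variance two-sample t-test. Returns (t statistic, df, two-sided p)."""
    nx, ny = len(x), len(y)
    df = nx + ny - 2
    pooled_var = ((nx - 1) * stdev(x) ** 2 + (ny - 1) * stdev(y) ** 2) / df
    se = math.sqrt(pooled_var * (1 / nx + 1 / ny))
    t = (mean(x) - mean(y)) / se
    return t, df, two_sided_p(t, df)

print("cats: mean", mean(cats), "sd", round(stdev(cats), 2))
print("dogs: mean", mean(dogs), "sd", round(stdev(dogs), 2))
t, df, p = pooled_t_test(cats, dogs)
print("t =", round(t, 3), "df =", df, "p =", round(p, 4))
```

Swapping in a different set of measurements for `cats` and `dogs` reruns the same check on new data.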

What conclusion do we draw?
That we will reject the Null hypothesis and state that dogs are significantly taller than cats, by an average of about 10″.

Sounds great, right? We did expect that the dogs would be taller than the cats. So right from the beginning, in this example, our experience and knowledge of cats and dogs told us that the Null hypothesis was false, and our little sample agreed with us!

Let’s review this table – in our case we were working with a Ho that we knew to be false and we rejected the Ho – so we have NO ERROR.

	Ho is TRUE	Ho is FALSE
REJECT the NULL Hypothesis	Type I error (ALPHA)	No error (POWER = 1-BETA)
FAIL TO REJECT the NULL Hypothesis	No error (1-ALPHA)	Type II error (BETA)
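
To see the cells of this table as long-run frequencies, here is a small simulation sketch in Python (standard library only). Everything about the populations is an assumption made up for the illustration: heights are drawn from normal distributions with a standard deviation of 4 inches, and the "Ho is FALSE" case uses a true difference of 4 inches. The critical value 2.447 is the standard two-sided 5% table value for a t distribution with 6 degrees of freedom, which is what 4 cats and 4 dogs give us.

```python
import math
import random

N = 4           # animals per group, as in our example
T_CRIT = 2.447  # two-sided 5% critical value for t with 2*N - 2 = 6 df (only valid for N = 4)

def rejects(x, y):
    """Pooled two-sample t-test for equal group sizes: True if Ho is rejected at 5%."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    vx = sum((v - mx) ** 2 for v in x) / (n - 1)
    vy = sum((v - my) ** 2 for v in y) / (n - 1)
    se = math.sqrt((vx + vy) / n)  # pooled standard error when both groups have n animals
    return abs(mx - my) / se > T_CRIT

def rejection_rate(true_diff, sd=4.0, sims=10000, seed=1):
    """Share of simulated experiments that reject Ho, given a hypothetical true difference."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(sims):
        cats = [rng.gauss(15.0, sd) for _ in range(N)]
        dogs = [rng.gauss(15.0 + true_diff, sd) for _ in range(N)]
        hits += rejects(cats, dogs)
    return hits / sims

type1_rate = rejection_rate(true_diff=0.0)  # Ho is TRUE: rejections are Type I errors
power = rejection_rate(true_diff=4.0)       # Ho is FALSE: rejections are correct decisions
print("Ho true:  rejected in", type1_rate, "of runs (should be close to ALPHA)")
print("Ho false: rejected in", power, "of runs (POWER); missed in", round(1 - power, 4), "(BETA)")
```

The first rate estimates the top-left cell of the table, and the second estimates the top-right cell for that one assumed difference; a different assumed difference or standard deviation gives a different POWER.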

We’re going to repeat this experiment and measure another 8 animals – 4 cats and 4 dogs.

Ho: µcats = µdogs
Ha: µcats ≠ µdogs

We are again using an alpha value of 5%, so we will reject the null hypothesis if the p-value from our test is less than 0.05. We have height measurements (inches) of 4 cats and 4 dogs:
Cats: 21, 13, 11, 14
Dogs: 23, 21, 18, 14

The mean height for cats is 14.8 with a standard deviation of 4.3
The mean height for dogs is 19.0 with a standard deviation of 3.9

I can conduct a t-test and it provides me with a p-value of 0.19. With data such as this I can calculate the variation around the mean, such that I have 10.5-19.1 (14.8 ± 4.3) for the cats and 15.1-22.9 (19.0 ± 3.9) for the dogs. Do the ranges overlap? Yes.

What conclusion do we draw?
That we will NOT reject the Null hypothesis: with this sample we have no evidence that the average heights of cats and dogs differ.

Are we comfortable with this? If you review the table presented above, we still have a FALSE Ho, and this time around we did NOT reject the Null hypothesis, which means we have committed a Type II error (the error whose probability is BETA).

A Type II error is directly related to the POWER of the test. By definition, the power of a statistical test is the probability that the test will correctly reject the null hypothesis when it is false, that is, POWER = 1-BETA.

POWER is related to a number of factors:

sample size
effect size – or the size of the difference between treatment groups
variation of our outcome variable
level of significance – alpha
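
To get a feel for how each of these four factors moves POWER, here is a rough sketch in Python (standard library only). It uses the normal approximation to a two-sided, two-sample test with equal group sizes, which is an assumption of this sketch and will overstate power for very small samples such as ours. The function name `approx_power` and the baseline values (a difference of 4 inches, a standard deviation of 4 inches, 4 animals per group) are made up for the illustration.

```python
import math
from statistics import NormalDist

def approx_power(diff, sd, n_per_group, alpha=0.05):
    """Approximate power of a two-sided two-sample test (normal approximation).

    diff        : true difference between the group means (effect size)
    sd          : standard deviation of the outcome within each group
    n_per_group : sample size in each group
    alpha       : level of significance
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    # The true difference expressed in standard errors of the difference in means
    shift = abs(diff) / (sd * math.sqrt(2 / n_per_group))
    # Probability that the test statistic lands beyond either critical value
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)

print("baseline            :", round(approx_power(4, 4, 4), 3))
print("larger sample (n=16):", round(approx_power(4, 4, 16), 3))
print("larger effect (8 in):", round(approx_power(8, 4, 4), 3))
print("less variation (sd 2):", round(approx_power(4, 2, 4), 3))
print("alpha = 0.10        :", round(approx_power(4, 4, 4, alpha=0.10), 3))
```

Each line changes one factor at a time from the baseline, so you can see that a larger sample, a larger effect, less variation, or a less strict alpha all push POWER up.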

Consider our example above: which factor could we change to increase the POWER of our test and make results like those from our second round of data collection less likely?

Sample size

There are several ways to calculate the POWER of a statistical test. SAS has 2 PROCs – Proc POWER and Proc GLMPOWER. Review the SASsy Fridays post on these. There are many online calculators as well. Please choose one that is defensible.
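
As a rough hand check on what such tools report, here is a sketch in Python (standard library only) of the usual normal-approximation formula for the sample size per group in a two-sided, two-sample comparison. The inputs are assumptions: it treats the difference (4.25 inches) and the pooled standard deviation (roughly 4.1 inches) from our second sample as if they were the true population values, and it targets 80% power. The function name `required_n_per_group` is made up for this illustration. Because the formula ignores the extra uncertainty of the t distribution, dedicated software will typically ask for a somewhat larger sample.

```python
import math
from statistics import NormalDist

def required_n_per_group(diff, sd, alpha=0.05, power=0.80):
    """Approximate animals needed per group (normal approximation, equal group sizes)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the two-sided test
    z_power = z.inv_cdf(power)          # quantile matching the target power
    n = 2 * ((z_alpha + z_power) * sd / abs(diff)) ** 2
    return math.ceil(n)

# Assumed inputs taken from the second sample of cats and dogs
print("animals per group:", required_n_per_group(diff=4.25, sd=4.1))
```

Try a smaller difference or a larger standard deviation and watch how quickly the required sample size grows.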

Crimes of Statistics: Replication or Sub-samples?

Now that we are all comfortable and confident about what the Experimental Unit is in our research – we now need to think about replication. Are we currently taking proper replicates? Or are we taking sub-samples or pseudoreplicates?

Join us on October 3rd to discuss replicates, sub-sampling, and more. To prepare for this session, please take a few minutes and review the following papers:

Lee, C. and Rawlings, J.O. 1982. Design of experiments in growth chambers – Uniformity trials in the North Carolina State University Phytotron. Crop Science 22: 551-558. doi:10.2135/cropsci1982.0011183X002200030028x
Hurlbert, S.H. 1984. Pseudoreplication and the design of ecological field experiments. Ecological Monographs 54(2): 187-211. doi:10.2307/1942661

During this 50-minute session, we will review the definitions of an experimental unit and a subsampling unit, and discuss the papers above.

Date: Tuesday, October 3
Time: 10am – 10:50am
Location: OAC Boardroom (Rm 104 Johnston Hall)

Presentation Notes – Sampling Unit vs Experimental Unit

Crimes of Statistics: The Experimental Unit

One of the building blocks of any experimental design and subsequent statistical analysis is the “Experimental Unit”. What is it? Are you sure that you know what your experimental unit is in your own research project?

Join us on September 5th to discuss the experimental unit and see how it relates to your experimental design and subsequent analysis. To prepare for this session, please take a few minutes and read a paper written by Dwight S. Fisher, “Defining the experimental unit in grazing trials.” (Journal of Animal Science 2000 77: E-Suppl: 1-5 doi:10.2527/jas2000.00218812007700ES0006x). During this 50-minute session, we will review the definition of an experimental unit, discuss the paper above, and determine how this relates to your own research.

Date: Tuesday, September 5
Time: 10am – 10:50am
Location: OAC Boardroom (Rm 104 Johnston Hall)

Presentation Notes – Experimental Unit

Crimes of Statistics: Info Session

The first session of the Crimes of Statistics COP will be held on August 22, 2017 at 10am in the OAC Boardroom. The goal is to create a list of topics to be discussed this semester. I have a couple already but am keen to hear your thoughts and feedback. If you are unable to attend this session, please add your topics as a comment to this post.

ARCHIVE: Communities of Practice: Coming Fall 2017

“Communities of practice are groups of people who share a concern or a passion for something they do and learn how to do it better as they interact regularly.” – wenger-trayner.com

The OAC Stats Support Service will facilitate Communities of Practice (COP) to engage the OAC research community and assist with statistical analyses and statistical software. Our researchers use a variety of statistical approaches and statistical software packages to conduct their research. By meeting, sharing perspectives, and learning new aspects of our software and/or statistical approaches as a community, we can create enriched learning environments for all.

Fall 2017 will see the creation and revitalization of four COPs:

SASsy Fridays
Crimes of Statistics
OAC R Users Group
OAC Data Visualization

SASsy Fridays

SASsy Fridays started as a COP in W14 in response to the growing interest in SAS-specific topics beyond what was being taught in the workshops. If you use SAS and are interested in learning and sharing new approaches to using the software or new statistical approaches in SAS, this is the COP for you! For past topics please review the SASsy Fridays blog. If you have a topic you would like to present or would like more information about, please email oacstats@uoguelph.ca. SASsy Fridays sessions will take place in the Crop Science Lab Rm 121A on the following dates and times:

Friday, October 13 from 12:30-1:20 p.m.
Friday, October 27 from 12:30-1:20 p.m.
Friday, November 10 from 12:30-1:20 p.m.
Friday, November 24 from 12:30-1:20 p.m.
Friday, December 8 from 12:30-1:20 p.m.

Crimes of Statistics

Many of us conduct experiments and run the appropriate statistical analysis, but sometimes we can get caught up in questioning the basics of the theoretical background: topics such as replication, sampling, power, p-values, and many more. This COP will meet to discuss these and other topics. A short presentation on the topic du jour will be followed by a discussion of situations you may have encountered. The Crimes of Statistics COP will meet in the OAC Boardroom (Johnston Hall) on the following dates and times:

Tuesday, August 22 from 10:00-10:50 a.m.
Tuesday, September 5 from 10:00-10:50 a.m.
Tuesday, October 3 from 10:00-10:50 a.m.
Thursday, November 2 from 10:00-10:50 a.m.
Thursday, November 30 from 10:00-10:50 a.m.
Tuesday, December 12 from 10:00-10:50 a.m.

The first meeting on August 22 will be an information gathering session. Please bring any topics you would like to see discussed to this session.

OAC R Users Group

R is growing in popularity and is gaining international acceptance in the research community. The goal of this group will be to exchange knowledge about the R packages and libraries that your research field or your lab uses. A short presentation or demonstration of a practical application of an R package or library will be followed by questions and exploration of other uses for the presented material. The OAC R Users Group meetings will take place in Crop Science Lab Rm 121A on the following dates and times:

Friday, October 20 from 12:30-1:20 p.m.
Friday, November 3 from 12:30-1:20 p.m.
Friday, November 17 from 12:30-1:20 p.m.
Friday, December 1 from 12:30-1:20 p.m.
Friday, December 15 from 12:30-1:20 p.m.

Data Visualization

You have been collecting data for a project and now it’s time to do something with it! What do you do? How do you present it? Should it be a table? A graph? A chart? This COP will discuss different ways of presenting data, the pros and cons of different formats, and will encourage the community to demonstrate their favourite data visualization formats. The Data Visualization COP will meet in the OAC Boardroom (Johnston Hall) on the following dates and times:

Tuesday, October 17 from 12:00-12:50 p.m.
Tuesday, October 31 from 12:00-12:50 p.m.
Tuesday, November 14 from 12:00-12:50 p.m.
Tuesday, November 28 from 12:00-12:50 p.m.
Tuesday, December 12 from 12:00-12:50 p.m.