Download Excel File: Dice experiment confidence limits 2024.xlsm

Objectives

The objective of this workshop is to explore the relationship between the size of a random sample and the uncertainty in estimates of probability made from that sample.

In this first task, we'll look at the uncertainty in estimating the probability of each side of a 6-sided die coming up on a roll, as captured in confidence intervals. We'll explore an existing spreadsheet that generates random samples of die rolls (of fair dice!!) for various sample sizes, estimates the probability of a roll showing each side, and compares each estimate to the known probability of 1/6.

Consider why I say that probability is known in these experiments!


Steps:

Open the spreadsheet titled “dice experiment confidence limits.xlsm”. This is the same spreadsheet used in the exercise on Monday morning, with information about confidence intervals added. You should find yourself on the tab labeled “12”. (If not, go to tab "12"!!) Look at the orange area, starting in cell C26. This shows the computation of a 90% confidence interval based on the Normal Distribution. For the Normal Distribution to be an adequate representation of the uncertainty, N*p > 5 must be true, where N = sample size and p = probability of success (here, the probability of rolling a 4).


Question 1: Is the requirement met?

No, for N=12, N*p = 2.  Therefore, the Normal approximation is not adequate here.


Look at the lower and upper edge of the 90% confidence interval estimated from the Normal Distribution for sample size N=12, noted in cells E31:E32.


Question 2: What problems do you see?

The lower edge of the 90% interval is negative, which is not a possible probability. The Normal distribution is symmetric, but with this sample size the error in the estimate is not: more positive error than negative error is possible. This asymmetry of error results because, while 2 fours is the expected value (mean) from 12 rolls, many more than 2 fours can occur, but fewer than 0 fours cannot.
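The negative lower edge can be reproduced outside the spreadsheet. The following is a minimal sketch, assuming the interval is centered on the known p = 1/6 and uses the usual standard error sqrt(p(1-p)/N); the spreadsheet's cells may be laid out differently, and the function name `normal_ci` is made up for this illustration.

```python
import math
from statistics import NormalDist

def normal_ci(p, n, level=0.90):
    """Normal-approximation confidence interval for a proportion p with n trials."""
    # Two-sided critical value of the standard Normal (about 1.645 for 90%).
    z = NormalDist().inv_cdf(0.5 + level / 2)
    # Half-width = critical value times the standard error of the estimate.
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width

# Assumption: the interval is centered on the known probability 1/6.
lower, upper = normal_ci(1 / 6, 12)
print(f"N=12: lower = {lower:.4f}, upper = {upper:.4f}")
```

With N = 12 the half-width is slightly larger than 1/6 itself, which is why the lower edge dips below zero.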


The confidence interval is plotted on the relative frequency figure with dashed orange lines. Hit F9 several times to draw new samples and see how often the estimate of the probability of rolling a 4 (or the probability of rolling any value) falls outside the interval. The confidence interval is also plotted with the results of the 20 repeats of the experiment. Consider the results and the interval.
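For readers without Excel, the effect of pressing F9 can be imitated with a short simulation. This is only a sketch under stated assumptions: the seed, the helper `estimate_p_four`, and the choice to center the interval on the known p = 1/6 are illustrative and are not taken from the spreadsheet.

```python
import math
import random

P_FOUR = 1 / 6   # known probability of rolling a 4 with a fair die
N = 12           # rolls per sample
REPEATS = 20     # number of repeated experiments, as on the tab
Z_90 = 1.645     # two-sided 90% critical value of the standard Normal

# Normal-approximation 90% limits (assumed centered on the known p).
half_width = Z_90 * math.sqrt(P_FOUR * (1 - P_FOUR) / N)
lower, upper = P_FOUR - half_width, P_FOUR + half_width

def estimate_p_four(n, rng):
    """Roll a fair die n times and return the relative frequency of fours."""
    fours = sum(1 for _ in range(n) if rng.randint(1, 6) == 4)
    return fours / n

rng = random.Random(2024)  # hypothetical seed, fixed so the run is repeatable
estimates = [estimate_p_four(N, rng) for _ in range(REPEATS)]
outside = [e for e in estimates if e < lower or e > upper]
print(f"{len(outside)} of {REPEATS} estimates fall outside [{lower:.3f}, {upper:.3f}]")
```

Changing the seed plays the role of another F9 press. Because the lower limit is negative for N = 12, an estimate can only ever fall outside the interval on the high side.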

Inspect the confidence interval computation for the other tabs for sample size N = 100, 1000, and 10000 rolls.


Question 3: Is the Normal approximation of the confidence interval adequate for the larger sample sizes?

N*p > 5 for all larger sample sizes (about 16.7, 167, and 1667 for N = 100, 1000, and 10000), meaning the Normal approximation is adequate.
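The N*p > 5 check is simple arithmetic and can be tabulated for every sample size in the workshop. A small sketch follows; the function name `normal_ok` is hypothetical.

```python
P_FOUR = 1 / 6  # known probability of rolling a 4

def normal_ok(n, p=P_FOUR, threshold=5):
    """Rule-of-thumb check from the text: is N*p greater than the threshold?"""
    return n * p > threshold

for n in (12, 100, 1000, 10000):
    print(f"N = {n:>5}: N*p = {n * P_FOUR:8.1f}  adequate: {normal_ok(n)}")
```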


The tabs contain the result of only 20 estimates of the probability of rolling a 4 for each sample size. This is a small sample of outcomes. (NOTE, we're talking about a sample of samples, here!) The tab “experiment” contains the results of 1000 estimates (from 1000 samples) of each sample size of interest. The 1000 estimates are in 1000 rows, with one column for each sample size: 12, 100, 1000 and 10000. Near the TOP of the tab, to the right of these columns, is the histogram computation and plot of all the estimates.


Question 4: What is special about the estimates for sample size N=12?  What causes this result?

Estimated values of p are widely spaced (coarse). This is because only a limited number of estimates are possible: 0/12, 1/12, 2/12, …, 12/12. Larger sample sizes have less coarse estimates.
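The coarseness can be made concrete by listing every value the estimate can take. This is an illustrative sketch; `possible_estimates` is a made-up helper, and `Fraction` reduces values automatically, so 2/12 prints as 1/6.

```python
from fractions import Fraction

def possible_estimates(n):
    """All relative frequencies k/n that a sample of n rolls can produce."""
    return [Fraction(k, n) for k in range(n + 1)]

values = possible_estimates(12)
print(len(values), "possible estimates, spaced", float(Fraction(1, 12)), "apart")
print([str(v) for v in values])
```

For N = 100 the same function returns 101 values spaced 0.01 apart, which is why the histograms for the larger sample sizes look much smoother.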


Note that the estimates of the probability of rolling a 4 get tighter around the correct value of 1/6 (or, 0.1667) with the larger sample size.

Scroll to the bottom of the “experiment” tab to see the statistics resulting from the 1000 samples of each size, as well as the expected statistics, based on the parameters of the sampling distribution. Sample statistics of mean, standard deviation and skew of the 1000 samples are noted, as well as the estimates of the 90% confidence intervals. Mean, standard deviation and skew are plotted. Note that the mean is correct (i.e., mean of sampling distribution = actual p of 1/6) for all sample sizes.


Question 5: What do the means say about relative frequency as an estimator of the probability of rolling a 4?

The means of the estimates are correct, even for N=12.  This means the estimator is UNBIASED.


Also note that standard deviation decreases as sample size increases.


Question 6: What does the decreasing standard deviation with increasing sample size say about relative frequency as an estimator of the probability of rolling a 4? 

The standard deviations shrink toward zero as N increases, so the (unbiased) estimates concentrate ever more tightly around the true value.  This means the estimator is CONSISTENT.
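Both properties, a mean that stays at 1/6 and a standard deviation that shrinks with N, can be checked with a simulation in the spirit of the “experiment” tab. The sketch below is an assumption-laden stand-in, not the spreadsheet's method: it uses a hypothetical seed, draws a success with probability 1/6 instead of rolling an explicit die, and omits N = 10000 only to keep the pure-Python run quick. The comparison value sqrt(p(1-p)/N) is the standard deviation of a binomial proportion.

```python
import math
import random
import statistics

P_FOUR = 1 / 6   # known probability of rolling a 4
REPEATS = 1000   # number of samples per sample size

def estimate_p_four(n, rng):
    """Relative frequency of successes in n trials, each with probability 1/6."""
    fours = sum(1 for _ in range(n) if rng.random() < P_FOUR)
    return fours / n

rng = random.Random(1)  # hypothetical seed, fixed so the run is repeatable
results = {}
for n in (12, 100, 1000):
    estimates = [estimate_p_four(n, rng) for _ in range(REPEATS)]
    mean = statistics.fmean(estimates)
    sd = statistics.stdev(estimates)
    theory_sd = math.sqrt(P_FOUR * (1 - P_FOUR) / n)  # binomial-proportion SD
    results[n] = (mean, sd, theory_sd)
    print(f"N = {n:>4}: mean = {mean:.4f}, sd = {sd:.4f}, theoretical sd = {theory_sd:.4f}")
```

The theoretical standard deviation falls by a factor of sqrt(10) for each tenfold increase in N, and the simulated values should track it closely.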


Finally, note that the skew of the samples decreases with sample size.


Question 7: What is the likely cause of the decrease in skew?

The skews decrease because the restriction of not being able to roll fewer than 0 fours becomes less relevant for a larger sample, even though it is very relevant with N=12.  (i.e., for N=12, correct = 2/12, but 6/12 is possible while less than 0/12 is not possible.  For N=100, correct ≈ 17/100, and values range from 8/100 to 28/100 in the 1000 samples.  So, less than 0 is well outside the results and its impossibility doesn't have an impact.)
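The decrease in skew can also be seen from the theoretical skewness of a binomial count, (1 - 2p) / sqrt(N p (1 - p)), which is unchanged when the count is divided by N to give a proportion. A short sketch follows; whether the spreadsheet's "expected" skew uses this formula is an assumption, and the function name is made up.

```python
import math

def binomial_skew(n, p):
    """Theoretical skewness of a binomial count (and of the proportion k/n)."""
    return (1 - 2 * p) / math.sqrt(n * p * (1 - p))

for n in (12, 100, 1000, 10000):
    print(f"N = {n:>5}: theoretical skew = {binomial_skew(n, 1 / 6):.4f}")
```

For p = 1/6 this simplifies to 4 / sqrt(5N), so the skew falls in proportion to 1/sqrt(N).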


To the right, in green, are the sampling distribution parameters of mean and standard deviation of the estimate of probability, as well as the 90% confidence intervals.


Question 8: How well do the experimental sample estimates match the distribution parameters?  In which case might you prefer the experimental sample estimates?

The means and standard deviations match well.  The confidence intervals match well except for N=12.  For N=12, the interval developed from the 1000 estimates is better, because it is not based on the Normal approximation, and reflects estimates that can actually be made (i.e., no negatives!).
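To see why an interval built from actual outcomes behaves better for N=12, the 5th and 95th percentiles of the estimate can be computed from the exact binomial distribution. This is a sketch under assumptions: it uses exact probabilities in place of the 1000 simulated estimates, and the percentile rule (smallest estimate whose cumulative probability reaches the target) may differ from what the spreadsheet does. Both helper names are hypothetical.

```python
import math

def binomial_cdf_table(n, p):
    """Cumulative probabilities P(K <= k) for k = 0..n."""
    cdf, total = [], 0.0
    for k in range(n + 1):
        total += math.comb(n, k) * p**k * (1 - p) ** (n - k)
        cdf.append(total)
    return cdf

def percentile_estimate(n, p, q):
    """Smallest estimate k/n whose cumulative probability reaches q."""
    for k, c in enumerate(binomial_cdf_table(n, p)):
        if c >= q:
            return k / n
    return 1.0  # guard against floating-point round-off in the last sum

lower = percentile_estimate(12, 1 / 6, 0.05)
upper = percentile_estimate(12, 1 / 6, 0.95)
print(f"N=12 percentile-based 90% interval: [{lower:.4f}, {upper:.4f}]")
```

Both limits are values that 12 rolls can actually produce, and the lower limit cannot go below zero.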

Close the spreadsheet and move on to task 2.