## Lesson abstract

The task is to predict how often selected letters occur in the first 5000 letters of “Alice in Wonderland”. Students take samples from the text and count the selected letters in each sample. From the observed frequencies, they calculate relative frequencies as fractions, decimals and percentages. They use the percent frequency to predict the actual frequency in all 5000 letters. They stop sampling when they think the sample is large enough for the prediction to be reliable, and then compare their prediction with the actual distribution.
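The arithmetic described above can be sketched in a few lines of Python. This is only an illustration: the function name `predict_count`, its parameters and the toy sample string are assumptions made for this example, not part of the lesson materials, and the toy sample is far too small (and not random) to give a reliable prediction.

```python
from fractions import Fraction


def predict_count(sample, targets, population_size=5000):
    """Scale the relative frequency of `targets` in `sample` up to the population.

    All names here are hypothetical, chosen for illustration only.
    """
    # Keep letters only, ignoring spaces, punctuation and case.
    letters = [ch for ch in sample.lower() if ch.isalpha()]
    if not letters:
        raise ValueError("the sample contains no letters")

    observed = sum(1 for ch in letters if ch in targets)

    fraction = Fraction(observed, len(letters))  # relative frequency as a fraction
    decimal = float(fraction)                    # the same value as a decimal
    percent = 100 * decimal                      # the same value as a percentage

    return {
        "observed": observed,
        "sample_size": len(letters),
        "fraction": fraction,
        "decimal": decimal,
        "percent": percent,
        # Predicted count in the whole text, rounded to a whole number of letters.
        "prediction": round(decimal * population_size),
    }


# Toy sample (hypothetical): 17 letters, 6 of them vowels.
result = predict_count("down the rabbit hole", "aeiou")
print(result)
```

With this toy sample the relative frequency of vowels is 6/17, or about 35%, so the predicted number of vowels in 5000 letters is 1765.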

## Mathematical purpose (for students)

To make good predictions from a sample, the sample has to be random and sufficiently large.

## Mathematical purpose (for teachers)

This lesson builds on the important ideas from Lesson 2 on probability 30 300 3000. That lesson demonstrated that the more times a die is thrown, the closer the percent frequency calculated from the experiment is likely to be to the real probability (in that case 1/6). This important phenomenon, part of the ‘law of large numbers’, is the basis of surveys. In this lesson the idea is used in reverse: students find the percent frequency of vowels (for example) in a sample from the text, and if the sample is random and large enough, they can assume that this sample frequency is close to the real probability. Students calculate the percent frequencies progressively (as fractions, decimals and percentages) as their sample size grows, and use the results to predict the number of vowels in the entire text. Finally, students compare their predictions against the actual number and reflect on their method.
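The effect of a growing random sample can be simulated with a short Python sketch. Everything in it is an assumption made for illustration: the function name `running_percent`, the checkpoint sizes, and the stand-in population (a pangram repeated 20 times, giving 700 letters of which 220 are vowels) are hypothetical and are not the lesson's text.

```python
import random


def running_percent(text, targets, checkpoints, seed=0):
    """Percent frequency of `targets` after each sample size in `checkpoints`.

    Letters are drawn at random without replacement, so the sample at each
    checkpoint extends the previous one, as a student's growing sample would.
    """
    letters = [ch for ch in text.lower() if ch.isalpha()]
    if not checkpoints or min(checkpoints) < 1 or max(checkpoints) > len(letters):
        raise ValueError("checkpoints must lie between 1 and the number of letters")

    rng = random.Random(seed)                 # fixed seed so the run is repeatable
    order = rng.sample(letters, len(letters))  # all letters in a random order

    results = {}
    for n in checkpoints:
        hits = sum(1 for ch in order[:n] if ch in targets)
        results[n] = 100 * hits / n
    return results


# Hypothetical stand-in population: 700 letters, 220 of them vowels.
population = "the quick brown fox jumps over the lazy dog " * 20
percents = running_percent(population, "aeiou", checkpoints=[30, 300, 700])
for size, pct in percents.items():
    print(f"sample of {size:>3} letters: {pct:.1f}% vowels")
```

The smaller samples can land some distance from the true value, while the final checkpoint, which uses every letter, must give exactly 220/700 (about 31.4%). Changing `seed` shows how much the small-sample estimates vary from one run to the next.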

## At the end of this lesson, students will be able to:

- Collect a random data sample.
- Make predictions about a population based on their sample.
