Transum Software

Comparing Data

Compare two distributions using appropriate measures of central tendency and spread.


This is level 1: Summary Statistics. If you are unfamiliar with quartiles try this exercise first.

For each statement click on either True (T), False (F), or Cannot be Determined (C).

1

In the village of Hillview there are two housing estates, Eastville and Westland. Data is collected about the ages of the people living in each estate.

For the people living in Eastville, the statistics are: Lower quartile 18, Median 45, Upper quartile 57.

For the people living in Westland, the statistics are: Minimum 5, Lower quartile 15, Median 43, Upper quartile 54, Maximum 67.


Click the help tab above if you do not understand any of the words used in this description.

The median age in Eastville is lower than the median age in Westland.

The interquartile range in both estates is the same.

All of the people in Westland are older than the youngest person in Eastville.

The mean age of a person in Eastville is more than that of a person in Westland.

More people live in Westland than in Eastville.

2

A wildlife researcher measured the lengths, in centimetres, of marsh snakes found in two different swamps: Reedmere and Willowfen.

For the snakes found in Reedmere, the statistics are: Minimum 42, Lower quartile 76, Median 91, Upper quartile 118, Maximum 190.

For the snakes found in Willowfen, the statistics are: Minimum 50, Lower quartile 72, Median 97, Upper quartile 112, Maximum 136.

For this question, an outlier is a value that is more than 1.5 times the interquartile range above the upper quartile or below the lower quartile.


The median length of the snakes found in Willowfen is greater than the median length of the snakes found in Reedmere.

The range of the snake lengths in Reedmere is smaller than the range of the snake lengths in Willowfen.

The interquartile range of the snake lengths in Reedmere is greater than the interquartile range of the snake lengths in Willowfen.

Using the rule given above, the longest snake found in Reedmere is an outlier.

More snakes were measured in Willowfen than in Reedmere.


Can you correct your mistakes in order to get full marks?

The check button will only indicate if all five statements have been correctly identified. There is no indication whether individual choices of the statements' truthfulness are correct or not.

Instructions

Try your best to answer the questions above. Type your answers into the boxes provided leaving no spaces. As you work through the exercise regularly click the "check" button. If you have any wrong answers, do your best to do corrections but if there is anything you don't understand, please ask your teacher for help.

When you have got all of the questions correct you may want to print out this page and paste it into your exercise book. If you keep your work in an ePortfolio you could take a screen shot of your answers and paste that into your Maths file.

Why am I learning this?

Mathematicians are not the people who find Maths easy; they are the people who enjoy how mystifying, puzzling and hard it is. Are you a mathematician?


© Transum Mathematics 1997-2026

https://www.Transum.org/go/?Num=1179

Description of Levels


Level 1 - Summary Statistics

Level 2 - Raw Data

Level 3 - Frequency Charts

Level 4 - Stem-and-Leaf

Level 5 - Box Plots

Level 6 - Cumulative Frequency

Level 7 - Histograms

Level 8 - Mixed Representations

Exam Style Questions - A collection of problems in the style of GCSE or IB/A-level exam paper questions (worked solutions are available for Transum subscribers).

More on this topic including lesson Starters, visual aids, investigations and self-marking exercises.

Answers to this exercise are available lower down this page when you are logged in to your Transum account. If you don’t yet have a Transum subscription one can be very quickly set up if you are a teacher, tutor or parent.


Comparing Two Distributions

This exercise asks you to choose which of the given statements are true, false, or cannot be determined. In exams you are often asked to come up with your own statements comparing two distributions.

Whatever diagram the data is presented in, a good comparison does two things:

  1. Compares a measure of average (usually the median) — which set of values is typically higher?
  2. Compares a measure of spread (usually the interquartile range, or the range) — which set is more consistent?

Always write your comparison in context and comparatively. For example: “The median journey time on the west route was longer than on the east route, but the east route times were more spread out.” A single sentence that only describes one distribution on its own is not a comparison.

It is just as important to know what a diagram cannot tell you. Summary diagrams often hide the individual data values, the exact number of items, or how the data is distributed within a class. If a question asks for something the diagram does not show, the honest answer is “cannot be determined”.

A note on method: the counting method described under Raw data below is for discrete data where you have every value. For grouped or continuous data (cumulative frequency graphs and histograms) we instead estimate the quartiles at the \(\frac{n}{2}\), \(\frac{n}{4}\) and \(\frac{3n}{4}\) positions.

Raw data

For discrete distributions, there is no universal agreement on selecting the quartile values, but for the purpose of this exercise, we'll use the most popular method:

  1. The median is the value in the middle of the data set when it is arranged in order of size.
  2. Use the median to divide the ordered data set into two halves. The median becomes the second quartile.
  3. If there is an odd number of data points in the original ordered data set, do not include the median (the central value in the ordered list) in either half.
  4. If there is an even number of data points in the original ordered data set, split this data set exactly in half.
  5. The lower quartile value is the median of the lower half of the data. The upper quartile value is the median of the upper half of the data.
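The five steps above can be sketched in Python. This is only an illustration: the function name `quartiles` and the list `ages` are made up for the example, and the sketch assumes the data set has at least two values.

```python
from statistics import median

def quartiles(values):
    """Return (lower quartile, median, upper quartile) using the
    median-of-halves method described in the steps above."""
    data = sorted(values)          # arrange in order of size
    n = len(data)
    half = n // 2
    lower = data[:half]            # the lower half of the ordered data
    # When n is odd the central value is left out of both halves.
    upper = data[half + 1:] if n % 2 else data[half:]
    return median(lower), median(data), median(upper)

# Made-up data set with 9 values (an odd number).
ages = [3, 5, 7, 8, 12, 13, 14, 18, 21]
q1, q2, q3 = quartiles(ages)
print(q1, q2, q3)   # 6.0 12 16.0
```

Here the median is the 5th value (12). The lower half is 3, 5, 7, 8 with median 6, and the upper half is 13, 14, 18, 21 with median 16.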

Read more about the methods for finding quartiles of discrete data in the December 2024 Transum Newsletter.

The interquartile range is the difference between the upper and lower quartiles: \( IQR = Q_3 - Q_1 \).

Outliers are the points lying above the upper boundary \(Q_3 + 1.5 \times IQR\) or below the lower boundary \(Q_1 - 1.5 \times IQR\).
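As a rough illustration of the outlier rule, the sketch below works out the two boundaries and lists any values beyond them. The function names and the data are invented for the example, and the quartiles are passed in directly (for this made-up data the median-of-halves method gives 10.5 and 15.5).

```python
def outlier_boundaries(q1, q3):
    """Return (lower boundary, upper boundary) for the 1.5 x IQR rule."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

def find_outliers(values, q1, q3):
    """List the values lying below the lower or above the upper boundary."""
    low, high = outlier_boundaries(q1, q3)
    return [v for v in values if v < low or v > high]

# Made-up data: lower quartile 10.5, upper quartile 15.5, so IQR = 5.
values = [2, 10, 11, 12, 13, 14, 15, 16, 30]
print(outlier_boundaries(10.5, 15.5))       # (3.0, 23.0)
print(find_outliers(values, 10.5, 15.5))    # [2, 30]
```

A value exactly on a boundary is not counted as an outlier in this sketch, because the rule asks for values beyond the boundary.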

Frequency charts

A frequency chart or table records how many times each value (or category) occurs. Find the total \(n\) by adding the frequencies. The mode is the value with the highest frequency. To find the median, count along the frequencies to the middle position; the quartiles are found the same way using the method above.

You can read: mode, median, quartiles, range and the exact value of \(n\). You cannot read: anything not recorded in the chart.
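Counting along the frequencies can be sketched as follows. The table `goals` is made-up data (goals scored in 15 matches), and the helper names are hypothetical; the sketch assumes the table is a list of (value, frequency) pairs sorted by value.

```python
def value_at_position(table, position):
    """Return the value at a 1-based position in the ordered data
    by counting along the running total of the frequencies."""
    running_total = 0
    for value, frequency in table:
        running_total += frequency
        if position <= running_total:
            return value
    raise ValueError("position is beyond the end of the data")

def median_from_table(table):
    n = sum(frequency for _, frequency in table)   # total number of items
    if n % 2:                                      # odd n: one middle value
        return value_at_position(table, (n + 1) // 2)
    # even n: mean of the two middle values
    middle_pair = (value_at_position(table, n // 2),
                   value_at_position(table, n // 2 + 1))
    return sum(middle_pair) / 2

def mode_from_table(table):
    return max(table, key=lambda pair: pair[1])[0]

# Made-up table: (goals scored, number of matches).
goals = [(0, 3), (1, 5), (2, 4), (3, 2), (4, 1)]
print(median_from_table(goals))   # 1  (the 8th of 15 values)
print(mode_from_table(goals))     # 1  (highest frequency, 5)
```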

Stem and leaf diagrams

A stem and leaf diagram keeps every original data value, already arranged in order, so it is one of the most informative displays. Count the leaves to find \(n\), then count in from the ends to locate the median and quartiles exactly. Remember to read the key so you interpret each value correctly.

You can read: every individual value, the exact median, quartiles, range, mode and any outliers. Nothing is hidden: this diagram shows the full data set. A back-to-back stem and leaf diagram lets you compare two sets directly.

Box plots (box and whisker diagrams)

A box plot displays the five-number summary: minimum, lower quartile, median, upper quartile and maximum. The length of the box is the interquartile range and the distance between the whisker ends is the range. The position of the median line inside the box, and the relative whisker lengths, indicate the skew of the data.

You can read: median, quartiles, IQR, range, and the fact that roughly a quarter of the data lies in each section. You cannot read: the number of data values \(n\), the mean, or any individual value. You cannot tell how many items lie in a particular interval — only that 25% fall between each pair of marks.
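To show how much a five-number summary is enough for, here is a small sketch comparing two box plots. The class name `FiveNumberSummary` and the journey times for the two routes are invented for illustration and are not taken from the exercise.

```python
from typing import NamedTuple

class FiveNumberSummary(NamedTuple):
    minimum: float
    lower_quartile: float
    median: float
    upper_quartile: float
    maximum: float

    @property
    def iqr(self):
        # length of the box
        return self.upper_quartile - self.lower_quartile

    @property
    def full_range(self):
        # distance between the whisker ends
        return self.maximum - self.minimum

# Made-up journey times in minutes for two routes.
west = FiveNumberSummary(12, 20, 27, 33, 45)
east = FiveNumberSummary(8, 15, 22, 34, 52)

print(west.median, east.median)           # 27 22
print(west.iqr, east.iqr)                 # 13 19
print(west.full_range, east.full_range)   # 33 44
```

With these figures a comparison in context could read: the median journey time on the west route was longer (27 minutes against 22), but the east route times were more spread out (interquartile range 19 against 13). Nothing in either summary says how many journeys were recorded.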

Cumulative frequency graphs

A cumulative frequency graph (ogive) plots a running total against the upper boundary of each class. Read the median across from \(\frac{n}{2}\), the lower quartile from \(\frac{n}{4}\) and the upper quartile from \(\frac{3n}{4}\); you can also estimate how many values fall below (or above) any chosen amount. When two curves share the same axes, the curve further to the right represents larger values, and a steeper curve means the data is more tightly concentrated (smaller spread). Where two curves cross, the same number of items from each set lies below that value.

You can read: estimates of the median, quartiles, percentiles, IQR and the number of values below a given amount. You cannot read: individual values, the exact mean, or how the data is distributed within a single class.
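Reading across from \(\frac{n}{4}\), \(\frac{n}{2}\) and \(\frac{3n}{4}\) can be imitated with linear interpolation. This is a sketch under an assumption: the plotted points are joined with straight lines, whereas a hand-drawn smooth curve would give slightly different estimates. The function name and the data are made up.

```python
def estimate_value(boundaries, cumulative, target):
    """Estimate the value at which the cumulative frequency reaches target,
    assuming straight lines between the plotted points."""
    points = list(zip(boundaries, cumulative))
    for (x0, c0), (x1, c1) in zip(points, points[1:]):
        if c1 > c0 and c0 <= target <= c1:
            return x0 + (target - c0) / (c1 - c0) * (x1 - x0)
    raise ValueError("target lies outside the cumulative frequency range")

# Made-up graph: class boundaries and the running total at each boundary.
boundaries = [0, 10, 20, 30, 40]
cumulative = [0, 4, 20, 36, 40]
n = cumulative[-1]                                          # 40 items

lower_q = estimate_value(boundaries, cumulative, n / 4)     # 13.75
middle = estimate_value(boundaries, cumulative, n / 2)      # 20.0
upper_q = estimate_value(boundaries, cumulative, 3 * n / 4) # 26.25
print(lower_q, middle, upper_q, upper_q - lower_q)          # 13.75 20.0 26.25 12.5
```

All four results are estimates, because the graph does not show how the values are distributed within each class.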

Histograms

When the class widths are unequal, the vertical axis must be frequency density, not frequency:

\[ \text{frequency density} = \frac{\text{frequency}}{\text{class width}} \qquad \text{so} \qquad \text{frequency} = \text{frequency density} \times \text{class width} \]

This means it is the area of each bar, not its height, that represents the frequency. The tallest bar is not necessarily the one containing the most data — always multiply by the class width. The modal class is the bar with the greatest frequency density. Quartiles can be estimated by finding the values that divide the total area into halves and quarters.

You can read: the frequency in each class (frequency density × width), the total \(n\), the modal class, and estimates of the median and quartiles. You cannot read: individual values, or how the data is spread within a class.
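The area rule can be checked with a short calculation. The bars below are made-up data, each given as (lower boundary, upper boundary, frequency density), and the function name is hypothetical.

```python
def class_frequencies(bars):
    """Frequency of each class = frequency density x class width."""
    return [(upper - lower) * density for lower, upper, density in bars]

# Made-up histogram with unequal class widths.
bars = [(0, 10, 1.5), (10, 15, 3.0), (15, 20, 4.0), (20, 40, 1.25)]

frequencies = class_frequencies(bars)
total = sum(frequencies)
tallest = max(bars, key=lambda bar: bar[2])                         # greatest density
fullest = max(zip(bars, frequencies), key=lambda pair: pair[1])[0]  # greatest area

print(frequencies)   # [15.0, 15.0, 20.0, 25.0]
print(total)         # 75.0
print(tallest[:2])   # (15, 20)
print(fullest[:2])   # (20, 40)
```

In this example the tallest bar covers 15 to 20 and holds 20 items, while the much lower bar covering 20 to 40 holds 25 items because it is four times as wide.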

Scatter graphs

A scatter graph plots paired data, with each point showing two measurements for the same item (for example, one student’s statistics mark and physics mark). To compare the two distributions, treat each axis separately: read all the horizontal values as one data set and all the vertical values as another, then compare their medians and quartiles in the usual way.

Be careful to keep two ideas apart. Comparing the distributions asks which subject’s marks were higher and more spread out. Correlation and a line of best fit describe the relationship between the two variables — a different question. A strong upward trend does not by itself tell you which set of marks was higher; for that you must compare the two sets of values directly.

You can read: every individual pair of values, so the median, quartiles and range of each variable can be found exactly. Nothing is hidden, but remember that the line of best fit answers a question about the relationship between the variables, not about comparing the two distributions.
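Treating each axis separately might look like this in code. The marks are invented for the example, with each pair read as (statistics mark, physics mark) for one student; only the median and the range are compared here to keep the sketch short.

```python
from statistics import median

# Made-up paired data: (statistics mark, physics mark) for seven students.
pairs = [(40, 55), (45, 58), (50, 62), (55, 60), (60, 70), (65, 72), (70, 75)]

statistics_marks = [x for x, _ in pairs]   # all the horizontal values
physics_marks = [y for _, y in pairs]      # all the vertical values

def spread(values):
    """Range: largest value minus smallest value."""
    return max(values) - min(values)

print(median(statistics_marks), median(physics_marks))   # 55 62
print(spread(statistics_marks), spread(physics_marks))   # 30 20
```

These points show a strong upward trend, yet that alone does not say which subject had the higher marks. Comparing the two lists directly shows the physics marks were typically higher (median 62 against 55) and the statistics marks were more spread out (range 30 against 20).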


Some questions have a hint that can help you find a way to solve the problem. If you see a hint button, click it to reveal the hint.
