Box and Whiskers Plots are some of the most used types of graphs in statistics - for seasoned data analysts. For everyone else, they are little known and little used. Which is a shame, because they can convey a lot of information about your data - information that you need to know to make important decisions!

And they're really not that difficult to create or understand, either.

If you want to produce well-crafted Box and Whiskers Plots that engage and inspire your audience, there are only a few simple data visualisation best practices that you need to follow.

And they are all here!

First, I’ll show you what a Box and Whiskers Plot is and which data goes where.

We’ll move on to how to style your Box and Whiskers Plots and review them with a critical eye to decide what should go on your graph, and – more importantly – what you should take off.

By the time you’ve read this dataviz guide, you’ll know more about plotting charts than pretty much everyone around you!

## Box and Whiskers Plot

### What is a Box and Whiskers Plot?

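A Box and Whiskers Plot is a way of describing a set of numerical data. The Box is a rectangle drawn from the first quartile to the third quartile, so it covers the middle half of the data (the second and third quarters). The Whiskers extend from the Box towards the lowest and highest values, covering the lower and upper quarters of the data. A common convention stops each Whisker at the most extreme value within 1.5 times the length of the Box and plots anything beyond that as an outlier.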
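To make those pieces concrete, here is a minimal Python sketch of the numbers a Box and Whiskers Plot is drawn from. It assumes the common 1.5 × IQR rule for the whiskers (other conventions exist, such as running the whiskers all the way to the minimum and maximum). The function name `box_plot_summary` and the sample values are made up purely for illustration.

```python
from statistics import quantiles

def box_plot_summary(values, whisker_factor=1.5):
    """Return the numbers a box and whiskers plot is drawn from.

    Assumes the common 1.5 x IQR whisker rule; this is a convention,
    not the only option.
    """
    data = sorted(values)
    # Three cut points that split the data into four quarters
    q1, q2, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1  # length of the box
    low_fence = q1 - whisker_factor * iqr
    high_fence = q3 + whisker_factor * iqr
    # Whiskers stop at the most extreme values still inside the fences
    inside = [x for x in data if low_fence <= x <= high_fence]
    return {
        "q1": q1,
        "median": q2,
        "q3": q3,
        "whisker_low": inside[0],
        "whisker_high": inside[-1],
        "outliers": [x for x in data if x < low_fence or x > high_fence],
    }

# Made-up illustrative values, not real survey data
children = [1, 2, 2, 3, 3, 4, 4, 5, 6, 14]
summary = box_plot_summary(children)
for name, value in summary.items():
    print(f"{name:>12}: {value}")
```

Different tools calculate quartiles in slightly different ways, so the edges of the box may shift a little from one package to another.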

### What does a Box and Whiskers Plot show?

A Box and Whiskers Plot is a way of summarising a set of numerical data or comparing sets of numerical data. The Box and Whiskers Plot is used to show the shape of the distribution, the central value (usually as a line within the Box to represent the median), the variability and outliers.

### Types of Box and Whiskers Plots

There are 2 main types of Box and Whiskers Plot:

**Simple Box and Whiskers Plot**.

A simple Box and Whiskers Plot contains the descriptive statistics of a single numerical variable. It is useful to understand the data distribution of a single variable.

**Multiple Box and Whiskers Plot**.

A Multiple Box and Whiskers Plot contains the descriptive statistics of more than one numerical variable. It is also useful to understand the data distribution of each variable, but the greater strength is in the comparison of variables – it can be used to show variables that are distributed differently (or the same).

### Box and Whiskers Plot Example

For example, if you wanted to investigate the effect of education on family size, you could use a Box and Whiskers Plot, like this:

### Box and Whiskers Plot Example – Interpretation

Interpreting a Box and Whiskers Plot is both simple and difficult at the same time. It is easy to interpret the data you can *see*, but difficult to spot what’s not there.

Considering a single Box and Whiskers Plot, you can identify the median (the horizontal line inside the box) and the lower and upper quartiles (the bottom and top edges of the box), which between them contain the middle half of all the data in your variable. You can also identify the mean (the cross within the box), so you can compare it with the median, as well as the limits of ‘reasonable’ data (the whiskers) and the outliers (the crosses outwith the whiskers).

Comparing Box and Whiskers plots, you can identify whether the data are similar (they occupy similar parts of the y-axis) or are different.

If you look at the Box and Whiskers plot example above, you can see that the less educated mothers tended to have more children (median of 10) than those with more education (median of 4). Half of the less educated mothers had between 3 and 13 children (the limits of the box), while half of the more educated mothers had between 2 and 6 children.
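That comparison boils down to a few lines of arithmetic. The sketch below takes the box limits and medians quoted above and checks how much the two boxes overlap. The dictionary layout and the helper names are hypothetical choices for this example, not part of any plotting library.

```python
# Box limits and medians as quoted in the text above
less_educated = {"q1": 3, "median": 10, "q3": 13}
more_educated = {"q1": 2, "median": 4, "q3": 6}

def box_length(box):
    """Interquartile range: the spread of the middle half of the data."""
    return box["q3"] - box["q1"]

def box_overlap(a, b):
    """Return the (low, high) range shared by two boxes, or None."""
    low = max(a["q1"], b["q1"])
    high = min(a["q3"], b["q3"])
    return (low, high) if low <= high else None

def median_inside(box, other):
    """True if the median of `box` falls inside the box of `other`."""
    return other["q1"] <= box["median"] <= other["q3"]

print(box_length(less_educated), box_length(more_educated))  # 10 4
print(box_overlap(less_educated, more_educated))             # (3, 6)
print(median_inside(less_educated, more_educated))           # False
print(median_inside(more_educated, less_educated))           # True
```

A median that falls outside the other group's box is a handy visual hint that the groups differ, but it is only a hint, not proof.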

Note that the Less Educated Box and Whiskers plot is bunched towards having more children (the median is above centre, and the upper whisker is shorter than the lower whisker), with a tail stretching towards fewer children. Statisticians call this negative skew. The More Educated plot appears to be bunched towards having fewer children (the upper whisker is longer than the lower, and there are a few large outliers), which is positive skew. Neither set of data looks normally distributed.
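If you want to put a number on 'the median is above centre', one option is the quartile skewness coefficient (often credited to Bowley), which compares the median with the midpoint of the box. The sketch below applies it to the box limits quoted earlier. The function name is a made-up label for this example, and the measure only looks at the box, so it says nothing about whiskers or outliers.

```python
def quartile_skewness(q1, median, q3):
    """Quartile (Bowley) skewness: ranges from -1 to +1.

    0 means the median sits in the centre of the box; negative means the
    median sits nearer the top of the box, positive nearer the bottom.
    """
    return (q3 + q1 - 2 * median) / (q3 - q1)

# Box limits and medians as quoted in the text above
less = quartile_skewness(q1=3, median=10, q3=13)
more = quartile_skewness(q1=2, median=4, q3=6)

print(f"Less Educated: {less}")  # -0.4
print(f"More Educated: {more}")  # 0.0
```

The More Educated box comes out as perfectly centred, which is why the longer upper whisker and the large outliers matter: that skew lives outside the box.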

So, is this difference statistically significant? Although the medians appear quite different, and the median for the Less Educated data is at the limits of the Whiskers of the More Educated data, it is difficult to be certain without running a formal statistical analysis. More work is required, but a genuine difference is certainly possible!
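As a rough idea of what that extra work could look like, here is a minimal sketch of a permutation test on the difference between two medians. This is just one option among several (a Mann-Whitney U test would be another), and the two lists are made-up values chosen to have medians of 10 and 4. They are not the data behind the plot. The function name and its parameters are also hypothetical.

```python
import random
from statistics import median

def permutation_test_medians(group_a, group_b, n_permutations=5000, seed=0):
    """Estimate how often a random relabelling of the pooled data gives a
    median difference at least as large as the one observed."""
    rng = random.Random(seed)  # fixed seed so the result is repeatable
    observed = abs(median(group_a) - median(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = abs(median(pooled[:n_a]) - median(pooled[n_a:]))
        if diff >= observed:
            at_least_as_extreme += 1
    # Add-one correction so the estimate is never exactly zero
    return (at_least_as_extreme + 1) / (n_permutations + 1)

# Made-up illustrative samples (medians 10 and 4)
less_educated = [3, 5, 8, 9, 10, 10, 11, 12, 13, 14]
more_educated = [1, 2, 2, 3, 4, 4, 5, 6, 7, 12]

p_value = permutation_test_medians(less_educated, more_educated)
print(f"Estimated p-value: {p_value:.4f}")
```

A small p-value suggests the gap between the medians would be unusual if education made no difference, but the right test depends on how your data were collected.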

### When To Use a Box and Whiskers Plot

Use Box and Whiskers Plots when you want to summarise a numerical data set or, more often, compare several numerical data sets that are related to each other. They provide high-level information at-a-glance and make it easy to compare distributions.

### When To Avoid a Box and Whiskers Plot

Box and Whiskers Plots are great when the data are ‘well behaved’, such as when the data are normally distributed or even highly skewed. They cannot show the details of a distribution’s shape, though, so they are not very good at representing ‘lumpy’ data with more than one peak (e.g. a bi-modal distribution).
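Here is a quick sketch of the problem, using made-up values with two clear clusters. The box summary looks perfectly ordinary, yet the median line lands in a gap where there are no data at all. A simple text histogram shows what the box hides.

```python
from collections import Counter
from statistics import median, quantiles

# Made-up 'lumpy' data: one cluster around 2 and another around 10
data = [1, 1, 2, 2, 2, 3, 3, 9, 9, 10, 10, 10, 11, 11]

q1, _, q3 = quantiles(data, n=4, method="inclusive")
mid = median(data)
print(f"Box from {q1} to {q3}, median line at {mid}")

# How many values actually sit near the median line?
near_median = [x for x in data if mid - 2 <= x <= mid + 2]
print(f"Values within 2 of the median: {len(near_median)}")

# A text histogram reveals the two peaks
counts = Counter(data)
for value in range(min(data), max(data) + 1):
    print(f"{value:>2} | {'#' * counts[value]}")
```

If your data look like this, a histogram or similar chart is likely to serve your audience better than a box.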

### Box and Whiskers Plots - Best Practice

Whilst Box and Whiskers Plots aren't very well known outside data analyst circles, they are among the most used types of graphs in statistics.

Hopefully, you now have a much better understanding of Box and Whiskers Plots, when to use them, what type of chart to use to compare data, and how to present your chart to inspire those around you to change the world for the better.

