4 Types of Data in Statistics. What Are They?


Did you know that there are only 4 types of data in statistics? Do you know what they are and what you can do with them?

Have you ever looked at your data and wondered how and where to get started? If you don't know the difference between quantitative data and qualitative data then you're in the right place. Here is our guide to statistical data types and how to deal with them...

Data Types 101 – A Guide to Quantitative Data, Qualitative Data and How to Distinguish Between Them

4 Types of Data in Statistics

Ultimately, there are just 2 classes of data in statistics that can be further sub-divided into 4 statistical data types. You may have heard phrases such as 'ordinal data', 'nominal data', 'discrete data' and so on. But have you ever wondered just what they are and what they've got to do with your research and your data?

Surely they are all just fancy words made up by mathematicians and statisticians to make them sound important, aren't they?

Well actually, they are pretty important, because if you know what types of data you have, then you know ​which maths and stats operations you're allowed to use on your data.

Get that wrong and you're skating on pretty thin ice - sooner or later you're going to make your boss rather unhappy, and nobody wants that, do they?

So take a deep breath and let's go.

I promise this will all be quite painless...

​Types of Data in Statistics: Quantitative Data and Qualitative Data

​When it all boils down to it, all data ​are either measured ​or ​are an observed feature of interest​​.

​Measured data are measured with some kind of measuring implement - ruler, jug, weighing scales, stop-watch, thermometer and so on. Observed data are placed into categories - gender (male, female), health (healthy, sick), opinion (agree, neutral, disagree).

​So to put it in simple terms:

  • Quantitative data ​are measured
  • Qualitative data ​are categorised
A stopwatch - used for measuring quantitative data

Tools for measuring quantitative data

Pig - qualitative data
Sheep - qualitative data
Cow - qualitative data

Qualitative data are categorised

​The following infographic might help you to understand the 4 types of data in statistics and to visualise what we're discussing. We'll refer to it throughout...

Infographic- 4 Types of Data in Statistics

Infographic: Quantitative Data is measured, Qualitative Data is categorised

No doubt you've noticed that quantitative data and qualitative data can be sub-divided into 4 further classes of statistical data types: Ratio Data, Interval Data, Ordinal Data and Nominal Data.

You can tell them apart by asking 3 questions:

  • Ordered - Can some sort of progress be detected between adjacent data points or categories? Can the data be ordered meaningfully?
  • Equidistant - Is the distance between adjacent data points or categories consistent?
  • Meaningful Zero - Does the scale of measurement include a unique, non-arbitrary zero value?
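
As a rough sketch, the three questions above can be written as a tiny decision function in Python. The function name `measurement_scale` and its boolean arguments are made up for illustration, and the sketch assumes the questions are asked in order, with each property building on the previous one:

```python
def measurement_scale(ordered: bool, equidistant: bool, meaningful_zero: bool) -> str:
    """Map the answers to the 3 questions onto one of the 4 data types.

    Hypothetical helper for illustration only. It assumes each property
    builds on the previous one, so the questions are checked in order.
    """
    if not ordered:
        return "nominal"
    if not equidistant:
        return "ordinal"
    if not meaningful_zero:
        return "interval"
    return "ratio"


# Favourite colour: no order, no equal spacing, no zero
colour_scale = measurement_scale(False, False, False)
# Temperature in °C: ordered and equidistant, but the zero is arbitrary
celsius_scale = measurement_scale(True, True, False)
# Age in years: ordered, equidistant and with a real zero
age_scale = measurement_scale(True, True, True)

print(colour_scale, celsius_scale, age_scale)
```

For example, temperature in °C answers yes, yes, no, so it lands on the interval scale.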

​If we can answer these 3 questions for each of our types of data then we can correctly determine its class.


Types of Data in Statistics: Nominal Data

Nominal data have the following properties:

Nominal Data are arranged in unordered categories

Nominal Data ​are observed, not measured, ​are unordered, non-equidistant and ​have no meaningful zero

We can differentiate between categories based only on their names, hence the title 'nominal' (from the Latin nomen, meaning 'name').

Examples of nominal data include:

  • ​Gender (male, female)
  • ​Nationality (British, American, Spanish,...)
  • ​Genre/Style (Rock, Hip-Hop, Jazz, Classical,...)
  • Favourite colour (red, green, blue,...)
  • Favourite animal (aardvark, koala, sloth,...)
  • Favourite spelling of 'favourite' (favourite, favorite)

The only mathematical or logical operation we can perform on nominal data is to say that an observation is (or is not) the same as another (equality or inequality). From that we can count how often each category occurs and find the most common item, the mode (do you remember this from High School classes?).

Other ways of finding the middle of the class, such as the median or mean, make no sense because ranking is meaningless for nominal data.
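
Here is a minimal Python sketch of what that looks like in practice. The list of favourite colours is invented for illustration, and `collections.Counter` is just one convenient way of counting categories:

```python
from collections import Counter

# Invented survey answers - nominal data, so only the names matter
favourite_colours = ["red", "blue", "green", "blue", "red", "blue"]

# Equality / inequality is the only comparison that makes sense
first_matches_second = favourite_colours[0] == favourite_colours[1]

# Counting equal items gives us the mode (the most common category)
counts = Counter(favourite_colours)
mode_colour, mode_count = counts.most_common(1)[0]

print(first_matches_second)
print(mode_colour, mode_count)
```

Notice that nothing here sorts or averages the colours, because there is no meaningful way to do either.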

Types of Data in Statistics: Ordinal Data

Ordinal data have the following properties:

Ordinal Data ​are observed, not measured, ​are ordered but non-equidistant and ​have no meaningful zero

Their categories can be ordered (1st, 2nd, 3rd, etc. - hence the name 'ordinal'), but there is no consistency in the relative distances between adjacent categories.

Examples of ordinal data include:

  • ​Health (healthy, sick)
  • Opinion (agree, mostly agree, neutral, mostly disagree, disagree)
  • Tumour Grade (1, 2, 3)
  • Tumour Stage (I, IIA, IIB, IIIA, IIIB, etc.)
  • Time of day (morning, noon, night)

Mathematically, we can make simple comparisons between the categories, such as more or less healthy/severe, agree more or less, etc. Since there is an order to the data we can rank them and compute the median (or mode, but not the mean) to find the central value.

It is interesting to note that in practice some ordinal data are treated as interval data. Tumour Grade is a classic example in healthcare. The reason is that the statistical tests that can be used on interval data (which assume equal intervals) are much more powerful than those used on ordinal data. This is OK as long as your data collection methods ensure that the equidistant rule isn't bent too much.
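
To make the ordinal median concrete, here is a small Python sketch. The survey responses are invented, and the choice of `statistics.median_low` is an assumption on my part: it always returns a value that actually occurs in the data, so the answer is a real category even when there is an even number of responses.

```python
from statistics import median_low

# The categories in their natural order, lowest to highest
OPINION_ORDER = ["disagree", "mostly disagree", "neutral", "mostly agree", "agree"]
rank = {label: position for position, label in enumerate(OPINION_ORDER)}

# Invented survey responses - ordinal data
responses = ["agree", "neutral", "mostly agree", "disagree",
             "mostly agree", "neutral", "agree"]

# Rank each response, then take the middle rank
ranks = [rank[r] for r in responses]
median_label = OPINION_ORDER[median_low(ranks)]

print(median_label)
```

The ranks are only used to put the responses in order. Averaging them would quietly assume that the gaps between categories are equal, which is exactly the interval-data shortcut described above.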

Types of Data in Statistics: Interval Data

Interval data have the following properties:

Interval Data ​are measured and ordered with equidistant items, but ​have no meaningful zero

Interval data are ordered, can be continuous (able to take any value within a range) or discrete (restricted to distinct, separate values), and the degree of difference between items is meaningful (their intervals are equal), but not their ratio.

Examples of interval data include:

  • Temperature (°C or °F, but not Kelvin)
  • Dates (1066, 1492, 1776, etc.)
  • Time of day on a 12-hour clock (6am, 6pm)

​Although interval data can appear very similar to ratio data, the difference is in their defined zero-points. If the zero-point of the scale has been chosen arbitrarily (such as the melting point of water or from an arbitrary epoch such as AD) then the data cannot be on the ratio scale and must be interval.

Mathematically we may compare the degrees of the data (equality/inequality, more/less) and we may add/subtract the values, such as '20°C is 10 degrees hotter than 10°C' or '6pm is 3 hours later than 3pm'. However, we cannot multiply or divide the numbers because of the arbitrary zero, so we can't say '20°C is twice as hot as 10°C' or '6pm is twice as late as 3pm'.
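
One way to convince yourself of this is to convert the same two temperatures to another unit and see what survives. The following Python sketch uses the standard °C to °F formula; the helper name `c_to_f` is my own:

```python
def c_to_f(celsius: float) -> float:
    """Convert degrees Celsius to degrees Fahrenheit."""
    return celsius * 9 / 5 + 32


cool_c, warm_c = 10.0, 20.0
cool_f, warm_f = c_to_f(cool_c), c_to_f(warm_c)

# Differences survive a change of unit: they are only rescaled by 9/5
diff_c = warm_c - cool_c
diff_f = warm_f - cool_f

# Ratios do not survive: the answer depends on where zero was placed
ratio_c = warm_c / cool_c
ratio_f = warm_f / cool_f

print(diff_c, diff_f)
print(ratio_c, ratio_f)
```

The difference is 10 degrees in °C and 18 degrees in °F, which is the same gap expressed in different units. The ratio is 2 in °C but only 1.36 in °F, so 'twice as hot' is an artefact of the scale rather than a property of the weather.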

The central value of interval data is typically the mean (but could be the median or mode). We can also express the spread or variability of the data using measures such as the range, standard deviation, variance and/or confidence intervals​.
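
As a quick illustration, these summaries can be computed with Python's built-in `statistics` module. The temperature readings are invented, and note that `stdev` and `variance` here are the sample versions (dividing by n - 1), which is an assumption about how the data were collected:

```python
from statistics import mean, median, stdev, variance

# Invented daily temperatures in °C - interval data
temps_c = [18.0, 21.0, 19.0, 22.0, 20.0]

centre_mean = mean(temps_c)
centre_median = median(temps_c)

# Spread: range, sample variance and sample standard deviation
spread_range = max(temps_c) - min(temps_c)
spread_variance = variance(temps_c)
spread_sd = stdev(temps_c)

print(centre_mean, centre_median)
print(spread_range, spread_variance, round(spread_sd, 2))
```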

Types of Data in Statistics: Ratio Data

Ratio data have the following properties:

Ratio Data ​are measured and ordered with equidistant items and a meaningful zero

As with interval data, ratio data can be continuous or discrete, and differs from interval data in that there is a non-arbitrary zero-point to the data. Examples include:

  • ​Age (from 0 years to 100+)
  • Temperature (in Kelvin, but not °C or °F)
  • Distance (measured with a ruler or other such measuring device)
  • Time interval (measured with a stop-watch or similar)

For each of these examples there is a real, meaningful zero-point. The age of a person, temperature measured from absolute zero, distance measured from a pre-determined point and elapsed time all have real zeros. With real zero-points we can multiply and divide the numbers - a 6 year old is half the age of a 12 year old, and an ideal gas at 100K has half the thermal energy of the same gas at 200K. Similarly, the distance from Barcelona to Berlin is half the distance from Barcelona to Moscow, and it takes me twice as long as Usain Bolt to run the 100m, but only half as long as my Grandad.

Ratio data are the best to deal with mathematically (note that I didn't say easiest...) because all possibilities are on the table. We can find the central point of the data by using any of the mode, median or mean (arithmetic, geometric or harmonic) and use all of the most powerful statistical methods to analyse the data. As long as we choose correctly we can be really confident that we are not being misled by the data and our interpretations are likely to have merit.
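
Here is a short Python sketch of the three means, plus a ratio, using the built-in `statistics` module (`geometric_mean` needs Python 3.8 or later). The stopwatch times are invented for illustration:

```python
from statistics import geometric_mean, harmonic_mean, mean

# Invented stopwatch times in seconds - ratio data with a real zero
times_s = [2.0, 4.0, 8.0]

arithmetic = mean(times_s)
geometric = geometric_mean(times_s)
harmonic = harmonic_mean(times_s)

# Ratios are meaningful because zero really means 'no time elapsed'
slowest_vs_fastest = max(times_s) / min(times_s)

print(round(arithmetic, 3), round(geometric, 3), round(harmonic, 3))
print(slowest_vs_fastest)
```

Which mean is appropriate depends on the question being asked, so treat this as a demonstration that all three are available for ratio data rather than a recommendation to report all of them.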

​Your Next Step

​I hope you've enjoyed this blog post and have found something useful in it.

If you're interested in learning how to ​prepare your data for analysis with the minimum of fuss, ​we've created a series of video courses dedicated to data collection, data cleaning and data preparation to get your data analysis-ready in double quick time.

​Check out the data preparation course right here, where there will be lessons on exactly how to identify and use each of the 4 types of data in statistics:
