The Hive - Learn, Help, Connect
Dirty Data Dojo - Data Validation

Excel

Python

R

Even a perfectly clean, analysis-ready dataset is not guaranteed to be sensible, fit-for-purpose and consistent with real-life rules.

In this course you'll learn how to make these checks and run preliminary analyses on your data, so you understand the underlying 'story' of the data even before your 'real' analyses have begun.

I’ll teach you how to check that your data follow these rules.
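As a rough illustration of what checking data against real-life rules can look like, here is a minimal Python sketch. The records, column names and allowed ranges are all hypothetical, chosen only for the example; they are not taken from the course materials.

```python
# Hypothetical records; the column names and limits are illustrative assumptions.
records = [
    {"id": 1, "age": 34, "height_cm": 172.0},
    {"id": 2, "age": -5, "height_cm": 168.0},   # a negative age breaks a real-life rule
    {"id": 3, "age": 51, "height_cm": 1720.0},  # possibly a misplaced decimal point
]

# Each rule maps a column to the range of values that makes sense in real life.
RULES = {
    "age": (0, 120),
    "height_cm": (40.0, 250.0),
}


def find_violations(rows, rules):
    """Return (id, column, value) for every value outside its allowed range."""
    violations = []
    for row in rows:
        for column, (low, high) in rules.items():
            value = row[column]
            if not (low <= value <= high):
                violations.append((row["id"], column, value))
    return violations


violations = find_violations(records, RULES)
for violation in violations:
    print(violation)
```

Flagging the offending values rather than deleting them straight away leaves you free to decide whether each one is a typo to correct or a record to drop.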

I'll also teach you how to identify and remove outliers automatically, so you know that the results you get are real, and not skewed by rogue datapoints.
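One common way to flag outliers automatically is the interquartile-range (IQR) rule, sketched below in Python. This is an assumption made for illustration: the course may well teach a different method, and the sample readings and the multiplier `k=1.5` are hypothetical.

```python
import statistics


def remove_outliers_iqr(values, k=1.5):
    """Split values into (kept, removed) using the IQR rule.

    A value counts as an outlier if it lies more than k * IQR
    below the first quartile or above the third quartile.
    """
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    kept = [v for v in values if low <= v <= high]
    removed = [v for v in values if not (low <= v <= high)]
    return kept, removed


# Hypothetical measurements with one rogue datapoint.
readings = [10.1, 9.8, 10.3, 10.0, 9.9, 10.2, 55.0]
kept, removed = remove_outliers_iqr(readings)
print("kept:", kept)
print("removed:", removed)
```

Returning the removed values alongside the kept ones makes it easy to review what was dropped before trusting the result.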

The steps you’ll learn in this course are simple to follow but extremely effective, so you’ll know that you’re getting to the true story of the data, saving you weeks of misery!


Video lessons

Articles

Downloadable resources

Certificate of completion

Interactive experience

Perfect for beginners



This course teaches you how to check that your data are valid, sensible and fit-for-purpose by making sure that they meet real-life rules.

Here’s a quick rundown of what you’re going to learn:

1. You’ll learn how to remove unwanted text data from your dataset

2. You’ll learn how to check that your data are sensible and fit-for-purpose

3. You’ll learn how to remove outliers with one super-slick ninja move
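To give a flavour of the first item, here is a small Python sketch of stripping unwanted text entries from a column that should be numeric. The sample column is hypothetical, and treating anything that `float()` cannot parse as unwanted text is a simplifying assumption rather than the course's own recipe.

```python
# Hypothetical column that should be numeric but is contaminated with text.
raw_column = ["12.5", "n/a", " 7 ", "unknown", "3.0", ""]


def to_number(cell):
    """Return the cell as a float, or None if it cannot be parsed as a number."""
    try:
        return float(cell.strip())
    except ValueError:
        return None


parsed = [to_number(cell) for cell in raw_column]

# Keep only the cells that parsed successfully.
clean = [value for value in parsed if value is not None]
print(clean)
```

Note that `float()` also accepts strings such as "nan" and "inf", so a real dataset may need an extra check for those.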

Your Curriculum

1

Introduction to Data Integrity

An introduction to this course and to the 14 Day Data Cleaning Challenge!

Open Access

CHAPTER HIGHLIGHTS

Introduction to Data Validation

2

Removing Unwanted Text Entries

In this chapter you'll learn how to remove unwanted text entries that contaminate your data

Free Plan

Premium

CHAPTER HIGHLIGHTS

How to Remove Unwanted Text Entries

3

Check that Your Numerical Data are Sensible

The focus in this chapter is on how to check that your data are sensible and fit real-life rules

Premium

CHAPTER HIGHLIGHTS

Check that Your Numerical Data are Sensible

4

How to Remove Statistical Outliers

The focus in this chapter is on automatically identifying and removing outliers that will skew the underlying 'story' of your data

Premium

CHAPTER HIGHLIGHTS

How to Remove Statistical Outliers

5

Course Recap

In this chapter you'll recap all the techniques you've learned

Premium

CHAPTER HIGHLIGHTS

A Recap of Everything in This Course

Get Started

It's free to get started, with no obligation to buy or register.

Try it now, decide later!

Just click the link on the right to go to the first lesson...

Chi-Squared Innovations