Dirty Data Dojo - Data Cleaning

Excel

Python

R

Data cleaning is a serious business – it can take up as much as 80% of your time!

In this course you’re going to learn how to clean your data - in Excel, Python and R - in hours rather than days or weeks!

I’ll teach you a method that took me several years to perfect, and as a result you’ll become much more productive, get your results faster and make your boss happy in the process.

The steps you’ll learn in this course are simple to follow yet extremely effective, so you’ll get the best possible start and save yourself weeks of misery!


Video lessons

Articles

Downloadable resources

Certificate of completion

Interactive experience

Perfect for beginners



I won't formally teach the concepts in Python or R, but I will point you in the right direction with strong code hints.

Why?

I could just hand you all the code to use, but that wouldn't aid your learning path and would only hinder your progress in the long run.

The best way for you to make the transition from Excel to Python/R is for me to:

  1. teach you the concepts (which are all simple)
  2. point you in the right direction (code hints)
  3. encourage you to create your own code (with access to a forum where you can learn from others)

So for every exercise you will be encouraged to write your own code to perform precisely the same data cleaning tasks in Python/R as you do in Excel.

Don't worry - there are already pre-built functions in Python and R that we can use. It's all rather straightforward!

Your Curriculum

1

Introduction to Data Cleaning

An introduction to this course and to the 14 Day Data Cleaning Challenge!

Open Access

CHAPTER HIGHLIGHTS

Introduction to Data Cleaning

2

Removing Unwanted Spaces

In this chapter you'll learn how to remove all unwanted spaces from your data in one amazing move!

Open Access

Free Plan

CHAPTER HIGHLIGHTS

Anatomy of a Good Workbook

Remove Trailing & Leading Spaces Part 1

Remove Trailing & Leading Spaces Part 2

Standardising the Case of Text Entries
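As a rough illustration of the kind of code hint this chapter points towards, here is a minimal Python sketch of trimming spaces and standardising case. It uses only built-in string methods. The function name `clean_text` and the sample names are made up for illustration, and the choice of title case is an assumption (your data may call for upper or lower case instead).

```python
def clean_text(value: str) -> str:
    """Trim leading/trailing spaces, collapse repeated spaces, standardise case."""
    # split() with no arguments drops leading/trailing whitespace and
    # splits on any run of whitespace, so joining with a single space
    # leaves exactly one space between words.
    collapsed = " ".join(value.split())
    # title() capitalises the first letter of each word and lowercases the rest.
    return collapsed.title()


# Illustrative sample data (hypothetical, not from the course files).
raw = ["  alice  smith", "BOB   JONES  ", "carol White"]
cleaned = [clean_text(v) for v in raw]
print(cleaned)
```

Note that `split()` treats tabs and line breaks as spaces too, so this may be slightly more aggressive than you want if those characters are meaningful in your data.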

3

Cleaning Text Data

The focus in this chapter is on how to clean your text data in Excel accurately and efficiently.

Premium

CHAPTER HIGHLIGHTS

Cleaning Text Data Using Remove Duplicates and Find & Replace

Cleaning Text Data Using Remove Duplicates and VLOOKUP

Cleaning Text Data Using Filter and Find & Replace
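For the Python side of these lessons, one possible starting point is sketched below: list the distinct entries to spot inconsistent spellings, then standardise them with a lookup table. This is only a sketch under assumptions, not the course's own code. The country names and the `lookup` dictionary are invented for illustration, and a plain dictionary stands in for the lookup sheet you would build for VLOOKUP.

```python
# Illustrative sample data (hypothetical, not from the course files).
raw = ["UK", "U.K.", "United Kingdom", "USA", "U.S.A.", "UK", "France"]

# Step 1: list each distinct entry once, so the variants are easy to spot.
distinct = sorted(set(raw))
print(distinct)

# Step 2: build a lookup table mapping each unwanted variant to the
# standard spelling you have chosen.
lookup = {
    "U.K.": "UK",
    "United Kingdom": "UK",
    "U.S.A.": "USA",
}

# Step 3: replace variants with the standard spelling; entries that are
# not in the lookup table are left unchanged.
cleaned = [lookup.get(value, value) for value in raw]
print(cleaned)
```

Leaving unmatched entries unchanged (rather than blanking them) is a design choice here: it means new variants stay visible the next time you list the distinct values.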

4

Cleaning Numerical Data

The focus in this chapter is on how to clean your numerical data in Excel accurately and efficiently.

Premium

CHAPTER HIGHLIGHTS

Identifying Text in Your Numerical Data
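A hedged Python hint for this lesson: one simple way to find text hiding in a numerical column is to try converting each entry to a number and record the ones that fail. The helper name `is_number` and the sample values are assumptions made for illustration. The lesson's own Excel technique may differ.

```python
def is_number(value: str) -> bool:
    """Return True if the entry can be read as a number."""
    try:
        float(value)  # float() tolerates surrounding spaces, e.g. " 42 "
        return True
    except ValueError:
        return False


# Illustrative sample data (hypothetical): note the letter O in "3O".
raw = ["12", "7.5", "n/a", "3O", " 42 ", ""]

# Keep the position and the offending entry so each one can be fixed by hand.
problems = [(i, v) for i, v in enumerate(raw) if not is_number(v)]
print(problems)
```

One caveat: `float()` also accepts strings such as "nan", "inf" and "1e3", so you may want extra checks if those could appear in your data.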

5

Order, Order, We Must Have Order...

In this chapter you'll learn the precise order in which to apply all the techniques you've learned

Premium

CHAPTER HIGHLIGHTS

Order, Order, We Must Have Order...

6

Data Cleaning Recap

In this chapter you'll recap everything you've learnt and practise it all again!

Premium

CHAPTER HIGHLIGHTS

Data Cleaning Recap

Chi-Squared Innovations