September 15

Data Collection Techniques – 11 Essential Tips

Discover Data, Blog


This blog post will deal with data collection techniques and good data management practices in Excel, and will give you some great tips on how to get organised and avoid making the kind of mistakes that will make you prematurely grey!

Over the next couple of weeks I will deal with other important issues such as data cleaning and data coding and classification, but for the moment let’s focus on the most important – your data collection techniques. Mess this one up and there’s no point moving on to the others, so focus guys!


11 Tips for Data Collection

It doesn’t matter whether you’re a scientist or an entrepreneur, in academia or in business, if you’re collecting data to try to answer some questions then you need to understand the fundamentals.

You’ll likely spend a lot of time observing, measuring, counting, classifying and quantifying what you see, and once you’ve collected your data you’re going to have to analyse it.

But let’s not get too far ahead of ourselves.

Before you can get any answers you’re going to have to:

  • 1
  • 2
    Record & Store
  • 3
    Clean & Classify

The textbooks tend not to dwell on the practical issues of these steps because, well, to be honest, it can get quite messy, but these are vitally important steps and you really do need to know how to do them properly if you’re going to get the most out of your data.

So let’s rewind to the beginning and see what we can do to get you off to a good start...

Here are 3 rules to start off with. Repeat after me:

  • 1
    Don’t Panic !!!
  • 2
    Start thinking about your data before you start collecting
  • 3
    Make a personal vow to understand the basics of data


I've put together an infographic on the most important 11 data collection techniques that might help you a little:

11 Essential Tips for Effective Data Collection

Data Collection Technique #1: Record Data on Paper First…

So you’ve got your hypothesis (theory, idea or hunch). Once you’ve decided what data you need to collect, the first thing you should do is design a paper-based form to store all your data (assuming that at least some of your data is going to be recorded by hand).

Keep it simple, print it out, then manually record your data with pen and paper. One form per case/patient/customer/test-tube, etc..

Data Collection Technique #1: Record Data on Paper First

Data Collection Technique #2: Then Transfer it to an Electronic Medium

We may be living in an electronic world, but ultimately you need a system where you (or anyone else) can follow the data trail from beginning to end and – more crucially – from end to beginning.

From time to time you WILL make a mistake with the data, so it is vitally important that you design a method that will let you spot and rectify the mistake by going back through all the steps until you find the error.

So now you have your data recorded on paper you need to transfer it into an electronic system. More than likely this will be either Microsoft Excel or Access.

In general, Excel is more common and easier to use, and has the added advantage that you can manipulate the data and do some simple analyses right there without having to export your data.

Most data is stored in Excel (in 7 years as a medical statistician I was only once given data in Access – all the other times it was in Excel), so we’ll go with that from here on in…

Data Collection Technique #2: Then Transfer it to an Electronic Medium

Data Collection Technique #3: Enter Your Data on a Single Worksheet Whenever Possible

Trying to sort your data when it is spread across multiple worksheets can lead to all sorts of problems, so try to avoid it whenever you can – keep your data on a single worksheet.

Excel 2003 has limits of 65,536 rows by 256 columns. That’s large enough for most datasets, but if you need higher limits you can use Excel 2010 or 2013 (1,048,576 rows by 16,384 columns).

Practical Data Cleaning

Practical Data Cleaning

Practical Data Cleaning explains the 19 most important tips about data cleaning to get your data analysis-ready in double quick time.

Discover how to clean your data quickly and effectively. Get this book, TODAY!

Data Collection Technique #4: Use a Unique ID column

You’ll likely have to sort your data many times and by different columns, so you’re going to need a way of restoring the original order.

Use column A as a unique identifier to insert consecutive numbers starting from 1. It may be simple, but it’s very effective.

When you’ve put your Unique IDs into column A, go back to your original paper sheets and write the Unique ID there as well. Trust me – you’ll thank me for this tip later…

Data Collection Technique #4: Use a Unique ID column

Data Collection Technique #5: One Column per Variable

Each variable should have… oh, hold on a minute, what’s a variable?

Well, simply put, these are the things that can change or can be changed as part of your study. In short, these are all the pieces of information that you are observing, measuring, counting and collecting, like age, gender, distance, temperature, etc..

You can find more information in my blog on different data types.

Where were we? Ah yes…

Each variable should have its own column, and each variable should correspond to just one piece of information.

Data Collection Technique #5: One Column per Variable

If you’re entering the age of a patient, then just enter their age, don’t enter their date of birth in the same column or cell. If you want to record their age and DOB, then use 2 separate columns.

If you’re recording a composite variable made up of 2 or more constituent parts, like Body Mass Index – made up of Height and Weight – then record them in separate columns. You can always combine them into a single variable later.

Data Collection "mess this up and your dataset is going to the great Data Graveyard In The Sky!" @eelrekab #gooddata

Click to Tweet

Data Collection Technique #6: Row 1 is the Variable Name

Eventually you’ll need to analyse your data and you may need to export it to a statistical program.

The standard for pretty much all commercial stats programs is for the first row to be reserved for the name of the variable and all other rows for the data. So don’t be tempted to use rows 2, 3 and 4 as well as row 1 for the variable name. It might keep everything looking nice and tidy in Excel, but it will only create more work for you later.

Data Collection Technique #6: Row 1 is the Variable Name

Data Collection Technique #7: Every Cell Should Have Something In It

What do empty cells tell you?

  • waiting for more information?
  • data not recorded?
  • original data incorrect?

An empty cell is just a great big question mark and tells you nothing.

Worse still, incomplete datasets give reviewers a reason to whack you about the head with a metaphorical stick (and believe me they will – I’ve been there many times…).

So make sure that something is entered in every cell.

It is quite common to use ‘illegal’ numbers as codes to give you information, so where the entries for a variable can only be positive values (like age or height), we can use codes such as:

Data Collection Technique #7: Every Cell Should Have Something In It

If negative numbers aren’t useful, then use letters a, b, c, etc..

If you’re not comfortable entering something in cells that strictly shouldn’t be there (after all, you are going to have to clean them up later before you can analyse your data), then use Excel’s Comment feature. I tend to use this sparingly, but that’s just me.

Data Collection Technique #8: Keep Great Notes

When using codes you’ll need to keep notes to tell you what the codes mean. Keep the codes and notes in a different spreadsheet.

Data Collection Technique #8: Keep Great Codes

While we’re on the subject, it’s really important to


You’re likely not the only person that will ever work with this dataset, so get used to writing stuff down.

Explain what the project is all about, the question you’re trying to answer, why you’re collecting this data and how you’re going to get the answers you’re looking for. Explain how you measured things and under what conditions. If more than one person is collecting data, then explain who, what, where, when, why and how.

Data Collection Technique #8: Keep Great Notes

This will be the document that explains all the important stuff about your dataset, so write it down.

If there’s too much information to comfortably put into an Excel spreadsheet, then a Microsoft Word document will be just fine – and keep it in the same folder as the dataset.

Data Collection Technique #9: Be Consistent With Data Entry

There’s nothing worse than getting a dataset that takes a fortnight to clean because data entry has not been consistent.

By that I mean make sure that if the entry for a variable should be ‘Positive’, then make it ‘Positive’ and not some other variation:

Data Collection Technique #9: Be Consistent With Data Entry

It’s hard enough correcting speeling missteakes and typos without also having to correct things that were deliberately entered differently.

Restrict the number of people that can enter data to cut down on this kind of issue, and make it clear what your data entry standards are.

Data Collection Technique #10: Don’t Guess

Data should be entered as accurately as possible.

Don’t guess, approximate, round up or down.

Enter the value exactly as registered on paper.

If you need the data to be rounded up or down you can use Excel’s functions to achieve this, but if you’re doing calculations in your head, on paper or in a calculator you’ll make mistakes which can be difficult – if not impossible – to spot later.

Data Collection Technique #10: Don’t Guess

Data Collection Technique #11: Zero is a Real Number

Don’t enter the number Zero into a cell unless what has been measured, counted or calculated results in the answer Zero.

I’ve often received datasets with lots of zeros and when I asked, the zeros meant ‘I don’t have data for this’.

The problem is that if you want to calculate something, like the mean, then all the zeros will be used in the calculation and you will get an inaccurate answer – or one that is just plain wrong!

Data Collection Technique #11: Zero is a Real Number

Your Next Step

Let's face it - you didn't visit this blog post because you're passionate about learning the latest trending data collection techniques, did you?

You didn't leave college or University saying "now that I'm free to forge my own path, I'm going to be a ... a data collectionist!"

That sounds like a line straight out of a Monty Python sketch!

No, you're here because you need to collect data, or you've collected data and want to know what the next steps are. Well, we're here to help.

I hope you've found this blog post useful, and to help you take your career to the next level we've created a series of video courses dedicated to teaching you the best data collection techniques, effective data cleaning and great data preparation to get your data analysis-ready in double quick time.

You can get the first of these courses right here, which includes all of the essential data collection techniques you're going to need:


​Beginner’s Guide To Effective Data Collection

This blog post will deal with data collection methods and data management and will give you some great tips on how to get organised, how to get started and how to avoid making the kind of mistakes that will make you prematurely grey! #datacollection #dtamanagement #datatips

The textbooks tend not to dwell on the practical issue of data collection. Here you have the 11 vitally important steps you need to take for effective data collection. #datatips #datacollection #data #datamanagement


data collection, data tips

You may also like

3 Types of Neural Networks Explained

How to Learn Python for Data Science

The Best Statistics Courses Online – Our Picks

The Best Data Science Courses Online

Awesome Gifts for Data Scientists

Computational Statistics is the New Holy Grail – Experts

Data Ethics – Top 3 Books for Data Scientists

3 Essential Python Books for Data Science

Top 3 Books on Statistics for Data Science

3 Crucial Tips for Data Processing and Analysis

Free Must-Read Statistics Books for Aspiring Data Scientists

Correlation Is Not Causation – Pirates Prove It!

Fantastic Free Data Science Books for Aspiring Data Scientists

21 Inspiring Books to Get Started in Data Science

The Top 3 Data Plotting Courses at Udemy

{"email":"Email address invalid","url":"Website address invalid","required":"Required field missing"}
__CONFIG_local_colors__{"colors":{"4ceba":"Royal Blue"},"gradients":{}}__CONFIG_local_colors__

Data Cleaning - The Big Picture

FREE Ultra HD pdf

Download your FREE mind map to learn the secrets to effortless data cleaning.

__CONFIG_colors_palette__{"active_palette":0,"config":{"colors":{"b8c9c":{"name":"Main Accent","parent":-1}},"gradients":[]},"palettes":[{"name":"Default Palette","value":{"colors":{"b8c9c":{"val":"rgb(255, 255, 255)"}},"gradients":[]},"original":{"colors":{"b8c9c":{"val":"rgb(19, 114, 211)","hsl":{"h":210,"s":0.83,"l":0.45}}},"gradients":[]}}]}__CONFIG_colors_palette__
Get it Now !!!
Do NOT follow this link or you will be banned from the site!