If you're just getting started in Data Science and you're looking for a little guidance on how to go about it, we've put together a list of the best books to learn Data Science.
There are loads of great books to get started in Data Science, but where should you start and which ones should you read first?
The post will help you make those decisions, and you can think of it as a Data Science starter kit.
To start with, I'm going to lay out the 7 steps you'll need to become a Data Scientist.
You won't want or need to follow all of those steps right from the beginning, because there are too many of them. Instead, I'll give you a getting-started roadmap: the 3 steps to get started in Data Science.
Then I'll give you my pick of the 3 best books to learn Data Science to follow that roadmap.
By the time you've got a working knowledge of each of these areas, your Data Science skills will be the envy of your colleagues!
Disclosure: we may earn an affiliate commission for purchases you make when using the links to products on this page. As an Amazon Affiliate we earn from qualifying purchases.
Getting Started in Data Science
Want to be an expert statistician? Great - enjoy the next 10 years of hard work!
What about machine learning? Do you want to be an expert in that? Or what about programming or data visualisation? That'll be 10 years (or more) for each.
Whether you're just getting started in Data Science or you're already on the path to becoming an expert, you're going to need a strong grounding in all of these disciplines. It's safe to say that it's going to take you 20 years or more before you can consider yourself experienced, proficient and an authority in Data Science.
It can be daunting, especially if you're just getting started in Data Science.
You're going to have a lot of questions, including:
In this blog post we're going to introduce you to the 7 steps on your journey to becoming an expert Data Scientist - this is your Data Science starter kit.
This post is the 1st in a series of 8 in which we bring you 21 of the most inspiring books to help you get started in Data Science - our top 3 picks of the best books to learn Data Science for each of the 7 steps on your roadmap.
These books are categorised by general Data Science, statistics, Python, R, Machine Learning, data visualisation and data ethics so that you can choose books from each category and you don't miss a thing!
7 Steps in Your Data Science Starter Kit
Over the past few years I've been asked all of the questions above, and I've always tried to steer the questioner in the right direction.
This has led me to discover what I consider to be the best books to learn Data Science, so I've decided to pull them together into one place for you to check out.
There are 7 steps to a successful career in Data Science:
1. General Data Science
2. Statistics
3. Python Programming
4. R Programming
5. Machine Learning
6. Data Visualisation
7. Data Ethics
In this series of posts I've selected my top 3 books in each of these categories, and while you would benefit from reading all of them, one from each category would be great.
Having said that, getting through a list of 7 books is a tall order in itself, so where should you begin when you're just getting started in Data Science?
3 Steps to Getting Started in Data Science
When it all comes down to it, you need lots of skills to be a Data Scientist.
But which ones should you acquire first?
Above all, there are 3 skills that will set you up for a life as a Data Scientist. You need to be able to:
1. Analyse data to find patterns and trends that describe what the world was like when the data were collected
2. Create models that spot patterns and trends and predict what the world will be like in the future
3. Tell the story of the data using inspiring graphs and charts
So which branches of Data Science are these?
1. Statistics
2. Machine Learning
3. Data Visualisation
These are where you should get started in Data Science.
Top 3: Best Data Science Books for Beginners
So now that we've whittled down our Data Science starter kit to the first 3 areas, let's have a look at my pick of the top Data Science books in each of these 3 categories.
Best Books to Learn Data Science: Statistics
Our top recommendation, Practical Statistics for Data Scientists by Peter and Andrew Bruce, is one of the best-selling Data Science books online.
The second edition has been updated to include practical examples in Python as well as in R.
This is great, because you get to learn how to program at the same time as learning stats.
There's nothing wrong with learning to program in R, but I recommend that you choose Python. R is mainly used within academia, while Python is mostly used everywhere else.
Besides, choosing Python will be obvious when you see my next recommendation of the best data science books for beginners...
Practical Statistics for Data Scientists
50+ Essential Concepts Using R and Python
Peter Bruce and Andrew Bruce
Statistical methods are a key part of data science, yet few data scientists have formal statistical training. Courses and books on basic statistics rarely cover the topic from a data science perspective. The second edition of this popular guide adds comprehensive examples in Python, provides practical guidance on applying statistical methods to data science, tells you how to avoid their misuse, and gives you advice on what’s important and what’s not.
Many data science resources incorporate statistical methods but lack a deeper statistical perspective. If you’re familiar with the R or Python programming languages and have some exposure to statistics, this quick reference bridges the gap in an accessible, readable format.
With this book, you’ll learn:
Best Books to Learn Data Science: Machine Learning
Our next recommendation, Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow by Aurélien Géron, is also one of the best-selling Data Science books online.
Now in its second edition, this book has been updated to include TensorFlow 2, an open-source Python library developed by the Google Brain team for Machine Learning and Deep Learning.
Hands-On Machine Learning includes Scikit-Learn, a Machine Learning library for Python featuring classification, regression and clustering algorithms including support vector machines, random forests, gradient boosting, k-means and, oh, lots more great stuff.
Keras, also for Python, is an artificial neural network library.
This is one of the top data science books and will help you get to grips with lots of different Machine Learning techniques.
Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow
Concepts, Tools, and Techniques to Build Intelligent Systems
Through a series of recent breakthroughs, deep learning has boosted the entire field of machine learning. Now, even programmers who know close to nothing about this technology can use simple, efficient tools to implement programs capable of learning from data. This practical book shows you how.
By using concrete examples, minimal theory, and two production-ready Python frameworks – scikit-learn and TensorFlow – author Aurélien Géron helps you gain an intuitive understanding of the concepts and tools for building intelligent systems. You’ll learn a range of techniques, starting with simple linear regression and progressing to deep neural networks. With exercises in each chapter to help you apply what you’ve learned, all you need is programming experience to get started.
Best Books to Learn Data Science: Data Visualisation
Our next recommendation, Storytelling with Data by Cole Nussbaumer Knaflic, is one of those must-read Data Science books.
As the title suggests, the focus of this book is on using graphs and plots to tell the story of your data.
This is one of the best-selling Data Science books online, and I heartily recommend you add it to your bookshelf.
Don’t simply show your data – tell a story with it!
Storytelling with Data teaches you the fundamentals of data visualization and how to communicate effectively with data. You’ll discover the power of storytelling and the way to make data a pivotal point in your story. The lessons in this illuminative text are grounded in theory, but made accessible through numerous real-world examples – ready for immediate application to your next graph or presentation.
Storytelling is not an inherent skill, especially when it comes to data visualization, and the tools at our disposal don’t make it any easier. This book demonstrates how to go beyond conventional tools to reach the root of your data, and how to use your data to create an engaging, informative, compelling story.
Specifically, you’ll learn how to:
Together, the lessons in this book will help you turn your data into high impact visual stories that stick with your audience. Rid your world of ineffective graphs, one exploding 3D pie chart at a time. There is a story in your data – Storytelling with Data will give you the skills and power to tell it!
Best Data Science Books for Beginners - Summary
I hope our top 3 Data Science books for beginners will help you get started in Data Science.
This is only the beginning, though.
I have chosen the top Data Science books for each of the 7 steps in your Data Science starter kit, and written a blog post about each of them.
Ready to see the rest of them? Then choose your medicine from the navigation element below.