His courses are comprehensive, and he puts a lot of work into keeping them up to date.
Simply put, I think that the contents of the course make it worthy of the title: it is a true master-class.
It will give you solid programming fundamentals that will pay off in the long term, so I strongly recommend this course to anyone who thinks that it is right for them.

R for Data Science (Free)

Hadley Wickham is very prominent in the data science community and is the author of “R for Data Science”, which you can find for free.
I think it provides a great starting point because it shows how to use the tidyverse within R from the very beginning.
(You can find an outline in the tidy tools manifesto.) The tidyverse is a collection of packages for R that work together to aid the data exploration process, with functions that can perform complex transformations in relatively few lines of code.
If you aim to become an R master like my friend Karthikeyan P. R on LinkedIn, then you would do well to look through this book.
If I ever need to really sharpen my R skills beyond the basics, this is where I would look! I should note again that I haven’t used this resource myself.
I learned R through Kirill Eremenko’s course on Udemy in addition to running through some of the code of An Introduction to Statistical Learning, which I will mention again in the future.
(Spoiler alert!)

Visualisation

Visualisation is the art of communicating insights derived from data in a way that makes important trends clear and easy to understand.
After learning some programming, visualisation with the ggplot2 package in R or the matplotlib/seaborn packages in Python is where most people go next.
Exploratory data analysis is a fun activity that helps reveal the patterns hidden within your data, and it is definitely something a data scientist should be capable of doing.
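
As a rough illustration of what a first exploratory pass can look like before any plotting, here is a minimal sketch in plain Python. The data, the region names and the `summarise_by_group` helper are all hypothetical and made up for illustration; in practice you would more likely reach for ggplot2 in R or matplotlib/seaborn in Python.

```python
from collections import defaultdict
from statistics import mean, median

# Hypothetical toy data: (store region, daily sales). Not from any real source.
sales = [
    ("north", 120), ("north", 135), ("north", 128),
    ("south", 80), ("south", 310), ("south", 95),
]

def summarise_by_group(rows):
    """Return {group: (count, mean, median)} for (group, value) pairs."""
    groups = defaultdict(list)
    for group, value in rows:
        groups[group].append(value)
    return {
        group: (len(values), mean(values), median(values))
        for group, values in groups.items()
    }

summary = summarise_by_group(sales)
for region, (count, avg, mid) in sorted(summary.items()):
    print(f"{region}: n={count}, mean={avg:.1f}, median={mid}")
```

In this made-up data the "south" mean sits far above its median, which is the kind of hint (a possible outlier) that a plot would then make obvious.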
It is also worth noting that outside of the R and Python visualisation packages, there are other suites such as Tableau, Microsoft Power BI and QlikView, which are often used for business intelligence, reporting and dashboard creation.
It can be well worth getting to grips with these programs as well if you want to specialize further in this area.

Storytelling with Data (Book)

This book honestly needs no introduction.
If you are active in any sort of data science community then you will come across this book very quickly, and I have cited it multiple times on this blog.
I have written my own review of this book here if you are interested.
In short, Knaflic uses many of the concepts that she outlines to teach you how to design your visualisations so that each one conveys only one specific piece of information at a time. They should be constructed with the aim of outlining a problem and using data to conclude with a call to action.
The techniques that you learn can also be used to design communication in any visual medium (resumes, for example). I cannot recommend this book enough.

Practical Tableau (Book)

Of course, you can’t have a section on data visualisation without giving some kind of introduction to Tableau.
Tableau is the main tool used for data visualisation and, as of 2019-06-01, it is something of a daily driver for me at work.
It’s a skill that is sought after and is underestimated in its own right, so it is absolutely worth picking up.
Ryan Sleeper is incredibly skilled in Tableau and his book Practical Tableau is not only a fantastic introduction, but also covers more advanced techniques that can help you to use Tableau to its fullest potential.

Mathematics

An analyst who does not learn how to transform numbers using mathematics is not making the most of the data that is given to them.
Plain and simple.
Although visualisation is a fun and easy place to start with data science, eventually you will need to take on some “heavier lifting”.
A lot of people are very apprehensive about math because they had bad experiences in high school and university.
When I went to university, my positive experience with mathematics soured substantially: I was placed in a math course that was more advanced and faster paced than I was ready for, with teachers who had zero idea about how to communicate their subject to students.
I expect that this happens to many students during their academic careers.
However, I found that I could change my fate by finding amazing teachers online who allowed me to explore the topic at my own pace.
My curiosity and love for mathematics were reignited.

[Figure: The Mandelbrot Set, an example of a fractal]

During that brief period, I would have described math as a confusing barrage of equations.
After learning online, I have realised that there is a fractal-like pattern to mathematics: delving deep into one topic not only reveals a finer structure but also shows how completely connected everything is.
If you think that math is “just not for you”, then give these teachers a chance to help turn things around.
Some examples of mathematics within data science are as follows:
- Vectorisation through linear algebra will allow you to transform lots of data at once.
- Gradient descent and the other optimisation methods that many machine learning techniques are based on require an understanding of calculus.
- Machine learning has strong roots within statistics, meaning that an understanding of how statistics works and the assumptions it makes will lead you to make better generalisations and to know when they do not apply.
This means that a strong mathematical foundation is required to become a good data scientist.
There’s simply no way around this and it is a bridge you will have to cross eventually in order to become successful.
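
To make the calculus point concrete, here is a minimal sketch of gradient descent fitting a straight line by least squares, written in plain Python. The data points, the learning rate, the number of steps and the `fit_line` name are illustrative assumptions, not taken from any of the courses mentioned in this post.

```python
def fit_line(xs, ys, learning_rate=0.05, steps=2000):
    """Fit y ≈ w * x + b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Residuals of the current line on every data point.
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        # Partial derivatives of mean((w*x + b - y)^2) with respect to w and b.
        grad_w = 2 / n * sum(e * x for e, x in zip(errors, xs))
        grad_b = 2 / n * sum(errors)
        # Step against the gradient to reduce the error.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b

# Made-up points lying exactly on y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
w, b = fit_line(xs, ys)
print(f"w = {w:.3f}, b = {b:.3f}")  # should approach w = 2, b = 1
```

The two gradient lines are where the calculus comes in. Numerical libraries such as NumPy would express the same sums as matrix operations, which is the vectorisation idea from linear algebra.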
(Note that statistics also has roots in linear algebra and calculus.)

Khan Academy (Website, Free)

Sal Khan also needs no introduction.
I remember when Khan Academy was just a random YouTube channel, back when I was doing my high school exams.
The channel has since blossomed into a dedicated website where these lectures are structured for your viewing pleasure.
They are also working to make sure that each section has its own associated problems, which I value very highly, as you likely know by now.
Honestly, all the math you will likely need will be found here but there are a couple of extra resources that I would like to add.

Linear Algebra (MIT OpenCourseWare, Free)

Gilbert Strang is nothing short of a saint.
Although Khan Academy does offer linear algebra lectures, I find that they are not nearly as well taught as Gilbert Strang’s 18.06 MIT course on the subject.
Although this is merely a string of YouTube videos, if you take handwritten notes and try your best to work out the problems as Strang approaches them, then you can still learn quite a lot!
Afterwards, you will be much more comfortable with linear algebra, which is useful if you want to take machine learning and computer science courses.
It is worth noting that other classes are given by MIT-OCW as well such as multivariate calculus (18.
I recommend taking courses from either MIT-OCW or Khan Academy as needed, depending on your applications.
Try not to get lost on these websites.
Although it is good to have strong mathematical foundations, 80% of the benefits will be attained from 20% of the math concepts.

Machine Learning

Finally, we get to the meat of data science: machine learning.
Hopefully you already have a good understanding of programming in both R and Python so that this is no longer an obstacle for you.
The same goes for the math that I have covered above.
Predictive analytics has become all the rage these days as our computational abilities have skyrocketed.
We use predictive analytics to help automate processes, saving on resources and increasing efficiency.
There are often times where machine learning can make predictions better than humans can when the right data is available.
This is certainly an interesting field.

Machine Learning A-Z (Udemy)

We’re still not done with good old Kirill! This was the second machine learning course I took online and it has been one of my favorites so far.
The good part about Machine Learning A-Z is that Eremenko teaches the concepts behind machine learning without requiring you to work through all of the complicated mathematics behind it, favoring a more heuristic approach.
The other courses on this list require more mathematical understanding but I think that knowing what the algorithms are doing on a conceptual level before digging into the math can help clarify things a lot.
The course is taught in both Python and R and gives you some “cookbook code recipes” to implement if you ever want to do your own machine learning projects.
You won’t be a master after this course, but it is a really good first step in my best estimation if you’re willing to spend some money.

An Introduction to Statistical Learning (Free book and free online course)

“An Introduction to Statistical Learning with Applications in R” (ISLR) and “The Elements of Statistical Learning” (ESL) are the most cited textbooks when it comes to machine learning, bar none.
It is worth noting that ESL requires a firm understanding of statistics, linear algebra and calculus whereas a high school level of understanding is probably enough to understand ISLR.
Note that reading statistical notation can quickly become overwhelming.
I found that taking the associated lecture course on Stanford Lagunita helped me a lot in digesting the information presented in the book.
After gaining a conceptual understanding from Kirill, this book will help a lot in understanding the mathematical underpinnings of the algorithms without needing too much linear algebra.

Machine Learning (Coursera)

Andrew Ng is also a household name when it comes to machine learning and deep learning.
His associated courses on Coursera are a little more complicated, since you are required to implement machine learning concepts from scratch (often using vectorisation) in Matlab/Octave.
This means that you should first have a good idea of how machine learning works, as well as some linear algebra.
Make sure you apply for financial aid, as Coursera might allow you to take courses for free if you have the right reasons and the dedication to see a course through to the end.

Udacity Nanodegrees

Finally, I’d also like to say a word about the Machine Learning, Deep Learning, Artificial Intelligence and Data Science nanodegrees offered by Udacity.
I did the Data Analyst Nanodegree and I am taking the Data Scientist Nanodegree now because I enjoyed the former one so much.
(If you want to read about my experiences with the Data Analyst Nanodegree, see here.) Basically, what I really like about the nanodegrees is the opportunity to apply what you learn, both in the classroom and in the projects that you need to do.
Personally, I’d recommend doing the machine learning and deep learning nanodegrees with Udacity and then learning about visualisation, programming and database management elsewhere.
This means that you’ll be able to get the most out of the nanodegrees.
The AI Nanodegree is a little too focused on AI in general.
Nanodegrees can be a little expensive at €1000 a pop.
Of course, there are plenty of other resources that I cited in this article.
Most of the value you get from Udacity is from projects, which you can do on your own anyway with Kaggle.com or something similar.
Don’t feel compelled to do it, but if you are interested and committed then I would strongly consider it.

SQL and Data Engineering

SQL & Database Design A-Z (Udemy)

SQL is super important when it comes to learning data science.
I wrote a LinkedIn post some time ago about how it is important to know your SQL and it got many upvotes/likes.
Why? Because most of the time, the relational database you will be working with will be queried in SQL.
Even if you are using NoSQL (i.e. Not Only SQL), tools such as Hive will mean that you will be using SQL anyway.
Languages like Pig, whose structures closely resemble SQL, will also be much easier for you to learn and understand.
Once again, Kirill Eremenko has been very prolific and created a SQL course with his younger brother.
It will not only teach you how to code with SQL using Postgres and MS SQL Server, but it will also teach you about normal forms, joins and the logic behind relational databases.
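
As a small illustration of normalised tables and a join, here is a sketch using Python’s built-in `sqlite3` module with an in-memory database. The `customers` and `orders` tables and their contents are hypothetical, and SQLite stands in for Postgres or MS SQL Server purely because it needs no setup; it is not what the course itself uses.

```python
import sqlite3

# In-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Customer details live in one table only (no repeated names in orders).
    CREATE TABLE customers (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    CREATE TABLE orders (
        id          INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id),
        amount      REAL NOT NULL
    );
    INSERT INTO customers (id, name) VALUES (1, 'Ada'), (2, 'Grace'), (3, 'Linus');
    INSERT INTO orders (customer_id, amount) VALUES (1, 20.0), (1, 35.5), (2, 12.0);
""")

# LEFT JOIN keeps customers without orders; COALESCE turns their NULL total into 0.
rows = conn.execute("""
    SELECT c.name, COUNT(o.id) AS n_orders, COALESCE(SUM(o.amount), 0) AS total
    FROM customers AS c
    LEFT JOIN orders AS o ON o.customer_id = c.id
    GROUP BY c.id, c.name
    ORDER BY c.name
""").fetchall()

for name, n_orders, total in rows:
    print(name, n_orders, total)
conn.close()
```

Because SQLite ships with Python, a throwaway database like this is also one low-friction way to practise queries on your own machine.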
The problem with SQL is that it is not easy to get practice with it outside of an office setting.
However, Jonathan Ma of Joma Tech on YouTube outlines some other ways to get comfortable with SQL to prepare for interviews.
(Source 1, Source 2, Source 3.)

Hadoop/Spark/etc. (Books)

A word of warning: most companies don’t need big data, so learning this kind of skill set is usually not required.
I learned using Udemy’s “ultimate hands on” Hadoop and Spark courses from Frank Kane, but I don’t recommend that you learn this way because these courses don’t give you enough practice to build any credible skills in this domain.
Kane says that he bases his courses on the O’Reilly books for each associated tool.
So it might be worth going through these books instead if you ever need to learn these tools:
- Hadoop: The Definitive Guide, 4th Ed.
- Learning Spark

Podcasts

One of the most important things to do as a data scientist is to seek the expertise of those who are already well established in the field.
Podcasts are an amazing way to get up to date with all sorts of interesting topics and to understand the paths of data scientists who came before you.
Thus, listening to podcasts will be a crucial part of your education.
I did not grasp data science fully until I started listening in this way and doing so has allowed me to know “enough to be dangerous” on a broad range of subjects.
- DataFramed
- SuperDataScience
- Data Skeptic
- Becoming a Data Scientist

Conclusion

The advantage of taking the time to write a blog post like this is that I can put all of the relevant courses that I have come across in one place.
I anticipate that I will want to update this post over time as the learning landscape changes but this should be a good enough start for now.
As I said at the start, this is a lot to take in and it will take you a significant amount of time to work through the resources on this list, especially if you are working full time.
Just focus on learning a little bit every day.
If you are working a 40-hour work week then 1–1.5 hours a day during the week and maybe 3 hours on the weekend will probably be fine.
You don’t need to work on this for long hours, just make sure that you keep returning and you will make progress.
I wish you the best of luck on your data science journey.
If you have any questions then do not hesitate to reach out to me on LinkedIn or by email for further information.