This week, we launched a new video series: Machine Learning Q&A! Our first question: what is the best programming language for machine learning?
As he’ll do in every Q&A video, our founder Nick shares his point of view—based on real-world experience and practical application for beginners. Watch the full video above, or keep reading for a few key highlights.
What is the best programming language for machine learning?
Spoiler alert—it’s Python! Python is the Swiss Army knife of programming languages, useful for wrangling data, modeling, and actually productionalizing machine learning projects.
Languages like PHP aren’t meant for wrangling data, and languages like R aren’t friendly for hosting a productionalized model. They each have their strengths, but when it comes to the best language for machine learning, Python is a better fit for accomplishing the start-to-finish work.
That said, if you have a professor who wants to teach you R, learn from them! Using the perfect language from day one isn’t all that crucial, because most machine learning algorithms are nearly identical from one language to the next. Spend your time studying the fundamental concepts of machine learning. You can always learn the syntax of a new language later, but that foundational understanding of the discipline should really come first.
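To make the start-to-finish idea a little more concrete, here is a minimal sketch of the three stages mentioned above (wrangling, modeling, and serving a prediction) in plain Python. It is not code from the video. The data is made up for illustration, the function names (wrangle, fit_line, predict) are hypothetical, and it uses only the standard library so it runs anywhere. A real project would more likely lean on the open-source libraries Nick mentions in the video.

```python
import csv
import io

# Made-up rows for illustration only. Note the stray space and the missing rate.
RAW_CSV = """bedrooms,nightly_rate
1, 100
2,150
3,
4,250
"""


def wrangle(raw_text):
    """Parse CSV text into (bedrooms, rate) pairs, skipping incomplete rows."""
    rows = []
    for record in csv.DictReader(io.StringIO(raw_text)):
        bedrooms = record["bedrooms"].strip()
        rate = record["nightly_rate"].strip()
        if not bedrooms or not rate:
            continue  # drop rows with a missing value
        rows.append((float(bedrooms), float(rate)))
    return rows


def fit_line(rows):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(rows)
    mean_x = sum(x for x, _ in rows) / n
    mean_y = sum(y for _, y in rows) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    sxx = sum((x - mean_x) ** 2 for x, _ in rows)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predict(model, bedrooms):
    """The piece a web app would call to get a prediction from the fitted model."""
    slope, intercept = model
    return slope * bedrooms + intercept


rows = wrangle(RAW_CSV)        # 1. wrangle the data
model = fit_line(rows)         # 2. fit a model
print(predict(model, 3))       # 3. serve a prediction
```

With this toy data, the prediction for a three-bedroom rental comes out at roughly 200. The same least-squares math would look almost the same in R or Scala, which is the point about concepts mattering more than syntax.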
Ask your machine learning questions!
What machine learning questions are on your mind? Let us know, and we’ll address them in a future video! Shoot an email to email@example.com, send us a message on Twitter, or join the discussion in the comments on YouTube.
Full video transcript
0:01 Hi! Today’s question is: what’s the best language to use for machine learning? I’m sure you’ve heard a lot of languages thrown out there, such as R, Python, Scala. If you give me a few minutes, I’ll tell you why I believe Python is the best language for you to start with.
0:24 I started my machine learning career in 2015, trying to solve rate optimization problems for vacation rentals. I came from a traditional PHP web development background, and to be honest, I did try to use that to start with. There was a lot of promise in 2015 about being able to outsource machine learning to Amazon’s APIs that will do all the magic work for you and spit out predictions.
0:51 And I followed that path for a couple of months until I realized that machine learning is so much more difficult than something that you can outsource to a company to have just automated for any type of problem.
1:05 And what I also realized was a lot of machine learning has to do with wrangling the data before you try to fit any type of modeling on top of it. And some languages, like PHP, just really aren’t meant for managing data like that.
1:23 The cool thing that I ran into was the open-source community in Python. There’s a lot of very smart people—engineers and scientists—that have spent thousands or tens of thousands of hours putting together some open-source libraries for you to take advantage of.
1:42 There’s a lot of other languages out there that you might run into. I hear R a ton. I’ve had a lot of coworkers with R experience that are really good at modeling. I think the problem that I’ve seen in my professional career is R is not necessarily the most friendly language for hosting a productionalized model.
2:08 It might be a really good language as a test bench to play with different models or data, but when you need something behind a web app, let’s say, that’s scalable, or you wanted to deploy onto a Raspberry Pi or some microcontroller, you’re going to find it much, much easier to use a language like Python, just because it’s meant for that.
2:38 It’s meant to do a lot of things. It’s basically the Swiss Army knife of programming languages. And so I’d definitely recommend it as a starting point.
2:47 If someone’s going to teach you R, like your professor—all day long, go learn from them. A lot of the machine learning algorithms are going to be almost identical between the languages, and so if you can learn it in one, it’s going to apply well to other languages.
3:02 So it’s more important, in my opinion, to get down the machine learning concepts and less what language you use to do that in. Because you can always pick up another language and learn its syntax, but it’s really hard to get those machine learning concepts to start with. So that’s what you should be spending most of your time with.