How to Set Up Your Python Environment

Update 25. November 2021: This guide is outdated. Nonetheless, the described way to set up a Python environment is still a perfectly valid way to configure your Python environment. You won’t do anything wrong if you follow this guide. It’s just that it no longer reflects how I myself set up my Python environment. You can expect a substantial revision of this article. Consider following me on Twitter to get notified when I publish the revision. The gist of the coming revision is that I stopped using conda and now use pyenv together with pyenv-virtualenv and poetry. The crucial detail is to run poetry config virtualenvs.create false after installing Poetry so that I’ll use it only to define the package but not to manage the virtual environments, which shall be done by pyenv-virtualenv instead.

In this post I’m going to explain how to set up a working Python environment for scientific computing. That includes NumPy, SciPy, and Matplotlib but also some more advanced tools and even a framework for Machine Intelligence and Deep Learning called TensorFlow. This is a long article, but you don’t need to read the entire thing. Most of it is optional: historical background and the reasons why we do what we do. If you’re not interested in all that and only want to know what to do to get Python up and running, then the relevant section is quite short. Let’s begin.

Update 02. February 2019: This guide should now be applicable for Windows users too, thanks to the release of Homebrew 2.0.0, which added support for Linux and Windows 10 (via the Windows Subsystem for Linux). However, this article was written with macOS in mind, so I won’t be able to help you out with problems regarding Windows, since I do not use Windows.

Before we get started, I want to briefly give you some background so that you actually understand what you are doing and why you are doing it. Personally, I always prefer explanations that shed light on the Why over articles that don’t provide any context at all and merely expect me to copy some code snippets from that random website I found on the internet without explaining what the code actually does. (What makes it even worse is that, in many cases, the code is wrong or suboptimal). I enjoy being able to fix the problem in case something doesn’t work on my specific machine (because I then understand what’s causing it) rather than having to look for a different solution. If you don’t care about the additional information and only want to know how to get Python up and running as quickly as possible, you can skip the following section and jump directly to the relevant part.

Many Roads Lead to Rome

There are multiple different ways to set up a working Python environment. The most convenient one is to install a Python distribution such as Anaconda or Enthought Canopy. This will install not only the Python interpreter itself but also over 150 of the most popular packages for scientific computing. With this setup you are very well positioned for your upcoming Data Science projects. One point of criticism though is that, while this approach is very beginner-friendly, you might never need most of these packages, especially if you’re only dipping your toes in the water. Installing this plethora of packages you’ll never use will only (unnecessarily) occupy disk space. However, we’re talking about only 300 MB, which is negligible in my eyes. (Update 2020: Anaconda now installs 1500 rather than just 150 packages and thus takes up 3 GB instead of a mere 300 MB. If you can’t spare that much disk space, I recommend installing Miniconda instead of Anaconda.) For a beginner, a set-it-and-forget-it approach where you sacrifice ~~300 MB~~ 3 GB is much more sensible than cheaping out because of ~~300 MB~~ 3 GB and then running into problems because you don’t have certain packages installed. If that “waste” of disk space bothers you nevertheless and you’d rather have more control over what will be installed and what not, there are alternative approaches.

Since Anaconda has been around only since 2012 and Enthought Canopy only since 2013, some folks are more familiar with (and still endorse) the traditional approach. The traditional approach is to use pip (an installer for Python packages) to individually install exactly those packages you actually need (and nothing more). To get the Python interpreter and pip in the first place, you would also use a tool called Homebrew in combination with pip. While this approach was the way to go before those newer distributions existed, I don’t recommend following the pip route. Here’s why. The creator of Anaconda, Travis Oliphant, has been involved in the Python world since 1998. He is also the creator of NumPy, one of the original authors of SciPy, and a contributor to many more Python packages for scientific computing. At some point he became unhappy with pip and wrote his own package manager named conda. If you read his account of why he did it, you will come to accept that conda is so much better than pip. But humans are creatures of habit. pip is what they’ve always used, what they’re familiar with … because up until 2012 anything else simply didn’t exist. I guess that’s why so many people still recommend the old-fashioned pip approach even though there are better alternatives now. Old habits die hard.

Advantages of Anaconda

Using Anaconda, or rather conda, has many many advantages over the pip approach. Aside from its beginner-friendly installation, one of its major advantages is its easy code reproducibility. Anaconda lets you easily share your environment with others, for example, your co-workers or an advisor. This means that you can make your exact same environment available to other researchers, i.e., they don’t have to install the required dependencies and configure the environment themselves—they just adopt yours! Never encounter the awkward “Well, it worked on my machine …” again. Imagine reading a paper or attending some conference or meetup and being able to instantaneously try out the code on your own machine, simply by adopting the author’s/speaker’s environment rather than having to investigate yourself which dependencies you need to install before you can get started.

To manage said dependencies, Anaconda comes with a package manager called conda. In contrast to pip this package manager is language-agnostic, meaning that it works not only for Python packages but for all sorts of languages, for example, R, Go, Lua, Julia, or Scala. This can be useful if a colleague is using a different language than you are. Even if you’re working alone but want to use multiple languages within the same project, conda can help you.

Anaconda’s package manager is also better at packaging applications along with their required libraries. This might not be relevant to you right now, but it will be as soon as you become more experienced and want to publish your data together with your code so that others may verify your findings and build upon them.

In case you really want to know why you should use conda rather than pip, listen to the CEO and co-founder of Anaconda, Peter Wang, explain it:

Introducing Miniconda

In case you’re arguing that you don’t need all the packages that come with Anaconda and don’t have 3 GB of disk space to spare, there’s a lightweight variant of Anaconda called Miniconda which installs only the essential packages but not the slew of packages that come with the full distribution of Anaconda. Check out this comparison of Anaconda and Miniconda if you are unsure which one is better suited to you. Don’t be fooled by its name though. Miniconda is not a “little brother” in the sense of being less powerful than Anaconda. It has all the benefits of Anaconda yet installs only the most fundamental packages (hence the name), namely conda, its dependencies and the Python interpreter itself. Miniconda leaves you the choice which packages you actually want to install and thus combines the best of the two different approaches.

The downside of installing your packages manually is that you need to educate yourself which packages you’re going to need to install for your project. Imagine a package manager like an app store without a graphical user interface. Unlike the app store on your smartphone, there are no curated categories or “Best Of” lists you can browse to explore new apps. To find and download a package, you first need to have heard about it and know its name before you can—just like on your smartphone—type the name into the search bar of the app store. For beginners who don’t even know where to start this can be quite intimidating if they have no idea what to search for. That’s why we’re sticking to Anaconda rather than Miniconda.

Installing Homebrew

Either way, the very first thing you need to do is to install Homebrew if you haven’t done so already. In case you don’t know what Homebrew is or how to install it, I covered Homebrew in-depth in a previous blog post. It would be really helpful if you read that article first, since Homebrew comes in very handy for almost any software you’ll need in your developer career and you can achieve many cool things with it if you understand how to use it.

The Preferred Approach Using Anaconda

If you want to set up your Python environment via Anaconda like I did, you can get Anaconda as follows (after you’ve installed Homebrew):

brew install --cask anaconda
echo 'export PATH="/usr/local/anaconda3/bin:$PATH"' >> ~/.bash_profile
source ~/.bash_profile
conda update --all

That’s it! Now you have a working Python environment. It really is that simple.
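To confirm which interpreter your shell actually picked up after these commands, a small script like the following can help. It is only a sketch: the check for a conda-meta folder inside the installation prefix is a heuristic I am assuming here, not an official conda API.

```python
import os
import sys


def describe_interpreter():
    """Return basic facts about the Python interpreter running this script."""
    prefix = sys.prefix
    return {
        "executable": sys.executable,
        "version": "{}.{}.{}".format(*sys.version_info[:3]),
        "prefix": prefix,
        # Assumption: conda-managed installations keep a "conda-meta" folder
        # directly inside their prefix. Treat this as a heuristic only.
        "looks_like_conda": os.path.isdir(os.path.join(prefix, "conda-meta")),
    }


if __name__ == "__main__":
    for key, value in describe_interpreter().items():
        print(f"{key}: {value}")
```

If the executable path does not point into your Anaconda folder, the PATH line in your ~/.bash_profile most likely has not been picked up yet.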

Note: Bear in mind that executing these commands will take a while. You’re installing about 150 different packages, so give it some time.

Troubleshooting

In case the installation didn’t work on your machine or you’re just curious about these ~/.bash_profile, ~/.profile, and ~/.bashrc files, I highly recommend checking out my other blog post covering these so-called dotfiles. In that post I explain which of these files is the best to use in the code snippet above. You should read that post in case you’re having trouble installing Python, since ~/.profile will not be read if there is a ~/.bash_profile file (unless you explicitly source it).
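When troubleshooting, it often helps to see the order of the folders in your PATH and which program each command name resolves to. The following sketch does that with the standard library only; the command names it looks up (python3, conda, pip3) are simply the ones used in this article.

```python
import os
import shutil


def split_path(path_value):
    """Split a PATH-style string into its non-empty directory entries."""
    return [entry for entry in path_value.split(os.pathsep) if entry]


def resolve(command, path_value=None):
    """Return the full path that would be picked for `command`, or None."""
    return shutil.which(command, path=path_value)


if __name__ == "__main__":
    entries = split_path(os.environ.get("PATH", ""))
    for position, entry in enumerate(entries, 1):
        print(f"{position:2d}. {entry}")
    for command in ("python3", "conda", "pip3"):
        print(f"{command} -> {resolve(command)}")
```

Folders earlier in the list win, so the Anaconda folder has to appear before any other folder that also contains a python3 executable.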

Installing Additional Packages

The command to install more packages, for example, TensorFlow, looks like this:

conda install tensorflow

TensorFlow is a library for Machine Intelligence. With TensorFlow you can implement popular machine learning algorithms, specifically deep learning algorithms. It was developed by the Google Brain Team (which was co-founded by the highly renowned Andrew Ng) and then open-sourced. TensorFlow has become the leader in the space and quickly overtook Scikit-learn, another tool collection for machine learning, in popularity. One of the primary reasons is that Scikit-learn lacks GPU support.

Installing Remaining Anaconda Packages

In case you installed Miniconda but changed your mind and now want the full Anaconda distribution, you can retroactively install all the packages that come with Anaconda:

conda update conda
conda install anaconda
conda update --all

First, we update the package manager itself. You shouldn’t use an outdated version of conda to install packages. Then, we install all the packages that make up Anaconda. Finally, it always is a good idea to update your installed packages, especially after you’ve made big changes (such as installing about 150 packages). That third command to keep your environment up to date is also a command you should run regularly.

You can check out this cheat sheet to see all other conda commands.

Channels

The packages on Anaconda’s default channel are not always as up to date as possible, so it’s a good idea to tell Anaconda to also search in other places by adding additional channels. You have to do this only once. Note that conda config --add puts the new channel at the top of the channel list, so by default conda will from then on prefer packages from that channel and fall back to the default channel for packages it can’t get there. Add the conda-forge channel, an entirely community-led channel, like so:

conda config --add channels conda-forge

If you still can’t find the package you’re looking for (i.e., it’s not available on any channel), you can always fall back to using pip. Remember, pip is the Python-specific package manager I was talking about earlier. Since pip is included in the Python distribution installed by Miniconda/Anaconda, you can always fall back on pip to install packages. We haven’t used it so far, since we prefer conda, but in case a package is not (yet) available via conda, you may use pip to install it.

Conclusion of the Anaconda Approach

By now, you’ve set up a full-fledged Python environment and learned the very basics of conda. Still, I recommend that you stay tuned and read the alternative approach using Homebrew and pip. As you saw, you can use pip in conjunction with conda, and a lot of pip’s principles apply to conda too (for example, virtual environments). In case you’re not interested in getting to know your tools better, you may jump directly to the final words of this article.

The Traditional Approach Using Pip and Homebrew

In case you don’t want to use conda (i.e., neither Anaconda nor Miniconda) but pip exclusively, the tutorial for you begins here. If you already installed Python following the Anaconda approach, you do not need to execute the code in the subsequent sections. In that case, the rest of this guide would be purely informational.

Installing Python and Pip

With this approach, we do not simply accept what some distribution serves us. In exchange, we ourselves need to specify which version of Python we want to install. There are two different versions of Python: Python 2 and Python 3. You should definitely use Python 3.

The only reason Python 2 is still around is compatibility. It really shouldn’t be used anymore, because it’s that outdated. Its end of life was originally scheduled for 2015 and developers were given enough time to make their software compatible with Python 3 by that date. Most developers did, but some didn’t. So, because of a few old projects, support had to be extended until 2020. You really shouldn’t contribute to creating new software that is incompatible with Python 3 by using the outdated Python 2 for new projects. If you genuinely have the need for it, you can of course install both versions side by side. But I guess all relevant software is compatible with Python 3 by now, so there’s no reason for Python 2 anymore.

To extend the capabilities of Python and to add some cool new features to the core functionality of Python, you can install additional so-called packages. Not all packages are available in Homebrew, so we need another package manager specifically for Python packages. There are several of such package managers available (you already know conda and pip, but there are more, even older ones such as easy_install). The default installer for Python packages is pip. It has been around since 2008. Since 2014, more specifically since Python 2.7.9 and Python 3.4.0, respectively, pip conveniently comes bundled with the standard Python distribution. This means that if you install Python as we will in a minute, you’ll already have pip installed and don’t need to install it additionally.

As you may know, Apple pre-installed Python on your system by default. Now, you might be tempted to just use their version and start hacking right away. If so, let me stop you right here. Definitely do NOT use Apple’s pre-installed version of Python. Instead, install your own version of Python. This has several reasons:

Apple’s Python distribution (which differs from the standard distribution) does not include pip.
Apple’s Python distribution is outdated.
As of macOS Catalina, Apple has deprecated its pre-installed Python distribution (and other scripting language runtimes) and announced that future versions of macOS won’t include them.
Apple itself recommends installing your own version, since System Integrity Protection which was introduced in macOS El Capitan now makes it difficult to work with that system-provided version of Python.
Upgrading macOS, say from El Capitan to Sierra, can wipe the packages we’re going to install, forcing you to re-install everything after the next macOS update.
This site and similar ones give even more reasons. But they seem to just copy from one another and no one ever provides sources, so I didn’t fact-check their arguments.

Hopefully, these arguments convinced you to not use the system-provided distribution of Python but rather install the standard distribution. Install it like so:

brew install python3

That’s it! Now you have Python 3 as well as pip installed.
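To double-check that you are now running your own Python 3 rather than the system-provided one, you could use a sketch like this. The list of path prefixes is my own assumption about where macOS keeps its Python and is illustrative rather than exhaustive.

```python
import sys

# Assumption: these prefixes are where macOS keeps its own Python.
# The list is illustrative, not exhaustive.
SYSTEM_PREFIXES = ("/usr/bin/", "/System/Library/")


def is_system_python(executable):
    """Heuristically decide whether a path points at the system Python."""
    return executable.startswith(SYSTEM_PREFIXES)


def is_python3(version_info=sys.version_info):
    """True if the given version tuple belongs to Python 3 or newer."""
    return version_info[0] >= 3


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    print("system Python:", is_system_python(sys.executable))
    print("Python 3:", is_python3())
```

Run it with python3; if it reports a system Python, Homebrew’s folder probably comes too late in your PATH.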

Important Notes

Should you ever see something like sudo easy_install pip: don’t do it! This command is a leftover from earlier versions of Python. As I mentioned, even though pip has been around since 2008, it was integrated in the Python distribution only in 2014. So before pip came bundled with Python, you had to install it yourself. Until pip took over, the package manager du jour (pardon my French) was called easy_install. However, pip simply was the better package manager and quickly became much more popular than its predecessor easy_install. Thus people used their current package manager to install the newer package manager. Integrating pip in the Python distribution made easy_install completely obsolete. You don’t need this command anymore.

Another thing you still may stumble upon quite often is the PYTHONPATH variable. People keep telling you to add it to your ~/.bash_profile file (see my blog post about these so-called dotfiles). This is definitely not necessary anymore either, and you shouldn’t follow their “advice”. If you did set the PYTHONPATH variable already, you should remove it. The PYTHONPATH is a relic of the past when it was used for switching between different Python installations as well as for importing Python modules. But that’s exactly what Homebrew is doing for us, so there’s no reason left at all to set PYTHONPATH.¹
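If you are unsure what PYTHONPATH actually does, the following sketch makes its effect visible: it starts a fresh interpreter with the variable set and prints that interpreter’s module search path. The helper name child_sys_path is made up for this illustration.

```python
import os
import subprocess
import sys
import tempfile


def child_sys_path(extra_env):
    """Start a fresh interpreter with extra environment variables set
    and return the entries of its sys.path."""
    env = dict(os.environ, **extra_env)
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print('\\n'.join(sys.path))"],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.splitlines()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as folder:
        for entry in child_sys_path({"PYTHONPATH": folder}):
            print(entry)
```

The temporary folder shows up in the search path of the child process, which is all the variable does: it silently changes where every Python program on your machine looks for modules.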

Furthermore, you should NEVER use pip in combination with sudo. This has always been a bad practice for several reasons:

The packages pre-installed by Apple with the old Python interpreter are located at /System/Library/Frameworks/Python.framework/Versions/2.7/. The packages installed by the user are located at /Library/Python/2.7/site-packages. Under certain special circumstances, pip messed up and confused these two locations when used with Apple’s distribution and sudo.
When using the Python distribution from Homebrew as you should, pip installs its packages to /usr/local/.../ (for example, /usr/local/lib/python3.5/site-packages) which is a safe place to write into and therefore doesn’t need sudo permission. So when sudo permission isn’t even necessary, what’s the point of using it anyway? Installing packages with root permission isn’t a particularly good idea, even if you downloaded them from a trustworthy source. They could accidentally or, even worse, purposely mess up your system, thus leading to a defective or unreliable Python environment. By using sudo in this scenario, you only risk accidentally causing problems that could have been avoided entirely.
In case you did use Apple’s Python distribution, your packages were installed to /Library/Python/2.7/site-packages. With /Library being a system folder, you actually would have needed sudo permission to write into it. However, you never should’ve used Apple’s Python distribution in the first place. And if you don’t use Apple’s distribution but your own, you—again—don’t need sudo.
Let’s ignore the fact that you never should’ve used Apple’s distribution in the first place. Even if you insisted on using Apple’s distribution, instead of installing the packages globally to /Library and messing up the system’s Python framework, you should’ve rather installed your packages in a virtual environment. This wouldn’t have required sudo either.
Since macOS 10.11 (aka El Capitan) introduced a new security feature called System Integrity Protection it is no longer possible to write into the system folder /System/Library/.../ anyway, not even with sudo.² In this scenario, using sudo simply wouldn’t have any effect, since it is not allowed to write into this very special folder. Thus, you—again—don’t need sudo.
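To see for yourself whether sudo would even be needed, you can ask Python where packages go and whether your user may write there. This is a minimal sketch; pip’s own decision logic is more involved (for example, it can fall back to a per-user folder), so treat the result as a hint.

```python
import os
import sysconfig


def site_packages_status():
    """Return the site-packages folder of the running interpreter and
    whether the current user is allowed to write into it."""
    location = sysconfig.get_paths()["purelib"]
    return location, os.access(location, os.W_OK)


if __name__ == "__main__":
    location, writable = site_packages_status()
    print("site-packages:", location)
    print("writable without sudo:", writable)
```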

Testing OpenSSL

Along with Python comes the newest version of OpenSSL so that Python can be compiled against that version of OpenSSL instead of the system-provided one. The pre-installed version of OpenSSL shouldn’t be used for similar reasons as above. To check whether the right version of OpenSSL is being used, perform this command:

python3 -c "import ssl; print(ssl.OPENSSL_VERSION)"

The resulting output on your terminal should show a reasonably current version of OpenSSL, newer than version 1.0.2 (anything else is out of support).
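If you prefer a yes/no answer over reading the version string yourself, a comparison along these lines is possible. The threshold (1, 0, 2) is taken from the sentence above; note that this sketch only compares numbers, so a Python built against LibreSSL reports its own version numbering and needs a closer look.

```python
import ssl

MINIMUM = (1, 0, 2)  # threshold mentioned in the text above


def is_newer_than(version_info, minimum=MINIMUM):
    """Compare the leading (major, minor, fix) fields against a minimum."""
    return tuple(version_info[:3]) > tuple(minimum)


if __name__ == "__main__":
    print(ssl.OPENSSL_VERSION)
    print("newer than 1.0.2:", is_newer_than(ssl.OPENSSL_VERSION_INFO))
```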

Updating Pip

You know that you can keep tools installed via Homebrew (for example, Python 3) up to date with the command brew upgrade. However, although pip comes bundled with Python, it is not part of the standard library and is released on its own schedule, so you need to update pip and some other things (the packaging tools) independently from Python. pip is not covered by Homebrew.

That’s why you need to perform this command regularly:

pip3 install --upgrade pip setuptools wheel

Because we’re using Python 3, you need to type pip3 instead of pip.
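To check which versions of the packaging tools you currently have, before or after upgrading, you could query the installed metadata from Python. This sketch assumes Python 3.8 or newer, where importlib.metadata is part of the standard library.

```python
from importlib import metadata

PACKAGING_TOOLS = ("pip", "setuptools", "wheel")


def installed_versions(distributions=PACKAGING_TOOLS):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in distributions:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not installed'}")
```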

Learning to Work With Virtual Environments (Optional)

If you’re only working on one project at a time—which is what you’re probably doing in the beginning—this step is totally optional. As a beginner, you really don’t need a virtual environment for scientific computing. It’s perfectly fine to skip this step and revisit it once you’re working with multiple projects simultaneously.

Update: With Python 3.3, a subset of virtualenv (the module I’m describing) has been integrated into the standard library under the venv module. What this means is that if this core functionality is all you need, then you don’t have to install anything additionally. Simply use venv. It’s included in Python and you already have it. But if you want some of the features that virtualenv adds on top, you have to install them yourself. To help you decide whether you want to use venv or virtualenv, this site may be useful. My guide describes virtualenv. Not necessarily because I find it better than venv, but because I didn’t know of venv when I was writing this guide and didn’t see the need to rewrite it after I learned of venv when virtualenv is still a valid and popular option.
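For comparison, this is roughly what creating an environment with the built-in venv module looks like when driven from Python. It is a sketch under two assumptions of mine: the environment is created in a throwaway folder, and with_pip=False is used so that nothing is bootstrapped into it.

```python
import os
import tempfile
import venv


def create_environment(parent_directory, name):
    """Create a bare virtual environment and return its path."""
    target = os.path.join(parent_directory, name)
    # with_pip=False keeps this sketch fast and offline; pass with_pip=True
    # if you want pip bootstrapped into the new environment.
    venv.create(target, with_pip=False)
    return target


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as scratch:
        path = create_environment(scratch, "science")
        print(sorted(os.listdir(path)))
```

On the command line, the equivalent is python3 -m venv followed by the folder name.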

With virtualenv you can manage your packages per project rather than globally. For example: one of your web development projects could need the latest version of Django, while another project relies on a very specific, older version of Django for compatibility reasons. By using virtualenv you can separate installed packages and even different versions of packages from each other. Since Python’s own internal package dependency system is very complicated and not particularly easy to understand, virtualenv is a huge simplification and may become very important. With virtualenvwrapper you can make working with virtualenv even easier, since it sets smart defaults and aliases for frequently used commands.

Install the packages like so:

pip3 install virtualenv virtualenvwrapper

Next create a new folder in your home directory where you’re going to store all your Python projects and name it something like “Code” or “Projects” (I’m going with “Projects”). You could create the folder using the terminal:

mkdir ~/Projects

Be sure that you’ve set the PATH variable. Then perform the next commands. Also make sure to use the same name for PROJECT_HOME that you’ve given the folder you just created. Setting PROJECT_HOME is totally optional but will turn out to be a very convenient timesaver.

echo '# needed for virtualenvwrapper' >> ~/.bash_profile
echo 'export WORKON_HOME="$HOME/.virtualenvs"' >> ~/.bash_profile
# replace Projects with the name you gave your folder
echo 'export PROJECT_HOME="$HOME/Projects"' >> ~/.bash_profile
echo 'export VIRTUALENVWRAPPER_PYTHON=/usr/local/bin/python3' >> ~/.bash_profile
echo 'export VIRTUALENVWRAPPER_VIRTUALENV=/usr/local/bin/virtualenv' >> ~/.bash_profile
echo 'export PIP_REQUIRE_VIRTUALENV=true' >> ~/.bash_profile
echo 'source /usr/local/bin/virtualenvwrapper.sh' >> ~/.bash_profile

If you’ve set PIP_REQUIRE_VIRTUALENV to true, this line will prevent you from accidentally installing packages globally. Meaning, from now on you can only install packages when you’re working on a virtual environment. If you don’t like this behavior, simply omit that line or set it to false.
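If you ever wonder whether you are currently inside a virtual environment, Python itself can tell you. The sketch below also reads the PIP_REQUIRE_VIRTUALENV variable; how it interprets the value is my own loose approximation, and pip’s actual parsing may differ.

```python
import os
import sys


def in_virtual_environment():
    """True when the running interpreter belongs to a virtual environment."""
    # venv and recent virtualenv releases make sys.prefix differ from
    # sys.base_prefix; very old virtualenv releases set sys.real_prefix.
    base_prefix = getattr(sys, "base_prefix", sys.prefix)
    return sys.prefix != base_prefix or hasattr(sys, "real_prefix")


def pip_requires_virtualenv(environ=os.environ):
    """Loose reading of PIP_REQUIRE_VIRTUALENV (an approximation)."""
    value = environ.get("PIP_REQUIRE_VIRTUALENV", "")
    return value.strip().lower() in {"1", "true", "yes", "on"}


if __name__ == "__main__":
    print("inside a virtual environment:", in_virtual_environment())
    print("pip restricted to virtual environments:", pip_requires_virtualenv())
```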

Then either close the terminal and re-open it or reload just your ~/.bash_profile file via

source ~/.bash_profile

Now you’re ready to create your first virtual environment.

I find virtualenv much more useful for web development though, where your projects demand different versions like in the mentioned example. For scientific computing, however, you’ll probably want each of your projects to always have the latest versions of their packages. You could, however, create a virtual environment for scientific purposes to which you switch whenever you’re not working on web projects.

To create a virtual environment, use the mkvirtualenv command and give it a meaningful name, for example, “science” for the above mentioned environment for scientific computing:

mkvirtualenv science

You can create many more virtual environments this way and switch between them with the workon command and their respective names, for example, workon science.

If you’re done with a specific virtual environment, you can leave it with the command deactivate.

To delete a virtual environment, simply enter rmvirtualenv followed by the name of the virtual environment you want to delete.

Regarding the timesaver: since we’ve created the ~/Projects folder and set the PROJECT_HOME variable, we’re able to start an entirely new project with a single line of code. Examples:

mkproject my-personal-homepage

and

mkproject client-homepage

This command implicitly creates an associated virtual environment for new projects. We don’t need to perform mkvirtualenv anymore but can immediately switch to the virtual environment with workon my-personal-homepage.
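All environments created this way end up as sub-folders of WORKON_HOME. To get a feeling for that layout, here is a sketch that lists folders which look like virtual environments. Recognizing an environment by its bin/activate script is an assumption of mine (it matches the macOS and Linux layout), and the demo runs on a throwaway folder rather than your real ~/.virtualenvs.

```python
import os
import tempfile


def list_environments(workon_home):
    """Return the names of sub-folders that look like virtual environments."""
    if not os.path.isdir(workon_home):
        return []
    names = []
    for name in sorted(os.listdir(workon_home)):
        # Assumption: an environment has an activate script under bin/
        # (the POSIX layout; Windows uses Scripts/ instead).
        if os.path.isfile(os.path.join(workon_home, name, "bin", "activate")):
            names.append(name)
    return names


if __name__ == "__main__":
    # Demonstrate on a throwaway folder so the sketch runs anywhere.
    with tempfile.TemporaryDirectory() as scratch:
        os.makedirs(os.path.join(scratch, "science", "bin"))
        open(os.path.join(scratch, "science", "bin", "activate"), "w").close()
        os.makedirs(os.path.join(scratch, "not-an-environment"))
        print(list_environments(scratch))
```

To inspect your real environments, pass the value of WORKON_HOME instead of the scratch folder.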

Now let’s say you’re ready to make the switch and to work exclusively with virtual environments. So far, however, you’ve installed each and every package globally. How do you move all your existing packages into your newly created but empty science environment? There are several options:

One possibility is to grant your science environment access to your globally installed packages. You would do this by entering toggleglobalsitepackages into your shell while you’re working on the virtual environment of choice. Then you would be able to leave the restricting cage of your virtual environment, somewhat defeating the whole purpose of a virtual environment. Additionally, this method can become quite messy once you’ve installed a few more packages, because from then on the packages will be located in two different folders.
The second possibility is to uninstall everything pip-related and start fresh. You would, however, have to install all packages again, which is inconvenient. To uninstall the existing packages, you could use one of these commands, both of which do the same thing:
- pip3 freeze | xargs pip3 uninstall -y
- pip3 list --format=freeze | awk -F'==' '{print $1}' | xargs pip3 uninstall -y
A third option would be to first copy your existing packages into your science environment before you uninstall all packages, except for virtualenv and virtualenvwrapper of course. Otherwise, how would you create new virtual environments without these packages? 😜

Moving Globally Installed Packages Into a Virtual Environment

If you chose the third possibility, read on. Otherwise continue to the next section.

First deactivate the science environment. Then use the following command to create a text file which contains a list with the names of each of your installed packages, together with their respective version number.

pip3 freeze > ~/Desktop/requirements.txt

You’ll find this newly created text file on your desktop. Open it with any text editor or use this command to open it for you:

open -e ~/Desktop/requirements.txt

Remove virtualenv and virtualenvwrapper from that list and hit save, since those two packages will be the only ones that remain installed globally.
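If you’d rather not edit the list by hand, the filtering step can be sketched in a few lines. The function names are made up for this illustration, and the parsing only handles simple lines of the form name==version as produced by pip3 freeze.

```python
import re

KEEP_GLOBAL = {"virtualenv", "virtualenvwrapper"}


def requirement_name(line):
    """Extract the project name from a simple 'name==version' line."""
    return re.split(r"[=<>!~\[; ]", line.strip(), maxsplit=1)[0].lower()


def drop_global_tools(lines, keep_global=KEEP_GLOBAL):
    """Return the requirement lines that should move into the virtual
    environment, skipping blank lines, comments, and the global tools."""
    kept = []
    for line in lines:
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue
        if requirement_name(stripped) in keep_global:
            continue
        kept.append(stripped)
    return kept


if __name__ == "__main__":
    sample = ["numpy==1.19.0", "virtualenv==20.0.0", "scipy==1.5.0"]
    print(drop_global_tools(sample))
```

The version numbers in the sample are placeholders; feed the function the lines of your own requirements.txt instead.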

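Next, uninstall every package except virtualenv and virtualenvwrapper like so (if you set PIP_REQUIRE_VIRTUALENV to true earlier, pip will refuse to uninstall anything outside a virtual environment, so prefix the command with PIP_REQUIRE_VIRTUALENV=false for this one call):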

pip3 uninstall -r ~/Desktop/requirements.txt

If there’s any package you don’t want to get installed again, open the text file again and remove those packages.

Now activate your science environment via workon science and install the remaining packages from your list into your virtual environment:

pip3 install -r ~/Desktop/requirements.txt

When you’re done and you don’t need the text file for further virtual environments, delete it—either manually or like so:

rm ~/Desktop/requirements.txt

Installing Qt and PyQt (Prerequisite)

Later on, we’re going to install IPython. IPython is a significant enhancement to the Python console. But before that, we first need to install some prerequisites.

The Qt framework is a popular toolkit typically used for GUIs in C++ applications. With an additional Python binding called PyQt, we can make use of Qt for Python applications such as IPython, too. SIP is just another required dependency in order to use IPython.

brew install qt5
brew install sip
brew install pyqt5

Installing Qt Creator for GUI Development (Optional)

So far, what we’ve done is only install the libraries that make GUI development possible (and the usage of applications built on the Qt framework, of course). If you’d actually like to create a GUI for your Python app, you would additionally need to install the Qt Designer, which is now integrated into the Qt Creator (a popular C++ IDE similar to JetBrains’ CLion).

brew install --cask qt-creator

In case you intend to develop a GUI for your application, you need a full installation of Xcode; the Xcode Command Line Tools alone won’t be sufficient to install Qt Creator, the IDE necessary to develop applications with a Qt GUI. Simply download Xcode from the Mac App Store and you’re good to go.

Installing the SciPy Stack

It has been a long road, but now that we know the best practices (like using virtual environments) and have installed all the prerequisites, we’re finally ready to do what you’re actually here for: to install the SciPy Stack. The SciPy Stack consists of NumPy, SciPy, Matplotlib, IPython, pandas, SymPy, and nose.

In order to compile matplotlib, you first need to install pkg-config, libpng and freetype if they aren’t installed already. They are required dependencies for configuring Matplotlib when it is being compiled as well as for manipulating PNG image files and rendering fonts (i.e., for displaying text in your plots). The command to install these programs is self-explanatory.

brew install pkg-config libpng freetype

Now we can install the rest of the SciPy Stack (we’ll leave out IPython for now; I’ll come back to this one in more detail later on). NumPy comes first in the list, since the rest builds on it.

pip3 install numpy scipy matplotlib pandas sympy nose

Notice that we first used the brew command and then pip. Best practice is to prefer pip over brew for Python-specific packages. In most cases, pip will give you a newer version than brew and works better with virtualenv. Use Homebrew only if the desired package can’t be installed via pip, for example, because it is not Python-specific. Generally, a package should be installed either via Homebrew or via pip but never via both.
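To verify that the installation worked, you can ask Python whether each package can be found on its import path without actually importing it. A minimal sketch, using the module names from the pip3 command above:

```python
import importlib.util

SCIPY_STACK = ("numpy", "scipy", "matplotlib", "pandas", "sympy", "nose")


def missing_modules(module_names):
    """Return the module names that cannot be found on the import path."""
    return [name for name in module_names
            if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(SCIPY_STACK)
    if missing:
        print("not installed:", ", ".join(missing))
    else:
        print("all packages of the SciPy Stack were found")
```

Remember to run it with the same interpreter (and inside the same virtual environment) that you used for installing.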

Troubleshooting

In case you get an error message saying that scipy couldn’t be installed because gfortran was missing, you would need to install the “real” gcc:

brew install gcc

The reason for the error is the following: gfortran is part of GCC and GCC in turn is part of the Xcode Command Line Tools. However, the version of gcc contained in the Xcode Command Line Tools isn’t actually GCC but just a disguised Clang.³ Usually, this isn’t a problem but in some cases, it can happen that you need the “real” gcc. Depending on the configuration of gcc you are about to install (i.e., which parameters you set etc.), installing gcc can either be very quick or take a very long time.
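If you’d like to see for yourself what your gcc command really is, the following sketch may help. It is an illustration under assumptions, not an official diagnostic: the function names are made up, and it simply assumes that a disguised Clang mentions “clang” somewhere in the output of gcc --version.

```python
import shutil
import subprocess


def looks_like_clang(version_output):
    """Return True if `gcc --version` output identifies the compiler as Clang."""
    return "clang" in version_output.lower()


def describe_toolchain():
    """Report where gfortran and gcc live and whether gcc is really Clang."""
    report = {"gfortran": shutil.which("gfortran"), "gcc": shutil.which("gcc")}
    if report["gcc"]:
        result = subprocess.run(
            [report["gcc"], "--version"], capture_output=True, text=True
        )
        report["gcc_is_clang"] = looks_like_clang(result.stdout + result.stderr)
    return report


if __name__ == "__main__":
    for key, value in describe_toolchain().items():
        print(f"{key}: {value}")
```

If gfortran shows up as None, the scipy error above is to be expected. Keep in mind that, as far as I know, Homebrew installs its compiler under a versioned name, so the plain gcc command may keep pointing to Clang even after brew install gcc; what matters for scipy is that gfortran is found.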

As a little aside for the interested: compiling gcc yourself takes a tremendous amount of CPU power. Your MacBook will very likely get very hot and very loud in the process (because all the fans will be spinning at maximum speed). This is normal and there’s no need to worry. In case you accidentally started the installation process or are too worried about the heat, you can abort the process at any time without any problems. But unless you need some special configuration, you should go with the so-called “bottled” (i.e., pre-compiled) version of gcc. In case you haven’t already noticed, Homebrew uses the analogy of brewing beer a lot; instead of downloading the source code of an application and then building that tool locally on your computer from source (i.e., brewing the beer yourself), Homebrew just “pours” you a pre-made “bottle” (of beer). In other words: someone else (who has a much faster computer than you) already did the work of compiling for you and you can just download the finished result, which is very quick. However, this person cannot take every single possible configuration into account, so if you want any special configuration, you are left again with compiling the source code yourself.

On IPython And Jupyter

So far, I’ve intentionally left out IPython. You may be wondering why that is. Let me explain.

IPython is a very powerful interactive shell for Python and the de facto standard for scientific computing. It is much better than the interactive mode of Python, i.e., what you see when you type python3 into the terminal. You would invoke IPython by entering ipython in your Terminal instead of python3.

Now, you could of course just use JetBrains’ PyCharm for your projects, but being a full-fledged and sometimes sluggish development environment, PyCharm can be absolute overkill. IPython on the other hand, in combination with yet another enhancement called Qt Console, becomes so powerful you don’t even need an IDE anymore.

What Qt Console does is add a GUI to ipython. It thereby provides even more features that wouldn’t be possible without a GUI, for example, inline figures, proper multiline editing with syntax highlighting, graphical calltips, and much more. That’s why you need to have Qt/PyQt installed by now. If you skipped that part earlier, go back and catch up on it.

In 2014, IPython and all of its subprojects became Project Jupyter. Thus the command to install IPython changed. Now, you can install everything you need using just the following command:

pip3 install jupyter

To test whether the installation was successful, you can either try to open the Qt Console

jupyter qtconsole

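or run IPython’s test suite (note that recent IPython releases no longer ship the iptest command, so this only works with older versions):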

iptest
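If jupyter qtconsole refuses to start, a missing Qt binding is a likely culprit. The following sketch checks for that from Python without importing anything heavy. The helper name first_available is made up, and the list of binding names is my assumption about what Qt Console accepts, not something taken from its documentation.

```python
import importlib.util


def first_available(candidates):
    """Return the first module name that can be found, or None."""
    for name in candidates:
        if importlib.util.find_spec(name) is not None:
            return name
    return None


if __name__ == "__main__":
    print("qtconsole: ", "found" if first_available(["qtconsole"]) else "missing")
    # Assumed list of Qt bindings; adjust it to whatever you installed earlier.
    bindings = ["PyQt5", "PySide2", "PyQt6", "PySide6"]
    print("Qt binding:", first_available(bindings) or "missing")
```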

Before 2014, you had to install IPython with a different command which is now deprecated. As with easy_install and PYTHONPATH, I’ll mention it anyway so that you know what not to use:

pip3 install ipython[all]

The [all] parameter automatically installed all the main optional dependencies like PyZMQ, Pygments, Jinja, Tornado or MathJax (needed for Qt Console and Jupyter Notebook).⁴ Without the [all] parameter, you would’ve needed to install them separately. However, as I mentioned above, this command was replaced by pip3 install jupyter and is not used anymore.

Customizing Jupyter and Qt Console (optional)

You can even customize the Qt Console if you want to and use a better font like the popular Source Code Pro from Adobe after you’ve installed it via Homebrew Cask:

brew tap homebrew/cask-fonts
brew install --cask font-source-code-pro
jupyter qtconsole --ConsoleWidget.font_family="Source Code Pro" --ConsoleWidget.font_size=14
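Passing these flags on every launch gets old quickly. Jupyter applications also read their settings from a config file, so the same two options can be stored once. The sketch below writes such a file; it assumes that Qt Console looks for a file named jupyter_qtconsole_config.py in the default config directory ~/.jupyter, and the function name is made up for illustration. It refuses to overwrite an existing file so that you don’t lose settings you already have.

```python
from pathlib import Path


def write_qtconsole_config(config_dir, font_family, font_size):
    """Write the font settings to a Qt Console config file and return its path."""
    config_dir = Path(config_dir).expanduser()
    config_file = config_dir / "jupyter_qtconsole_config.py"
    if config_file.exists():
        raise FileExistsError(f"{config_file} already exists; edit it by hand instead")
    config_dir.mkdir(parents=True, exist_ok=True)
    config_file.write_text(
        "c = get_config()\n"
        f"c.ConsoleWidget.font_family = {font_family!r}\n"
        f"c.ConsoleWidget.font_size = {font_size}\n"
    )
    return config_file


# Example (assumes the default config directory ~/.jupyter):
# write_qtconsole_config("~/.jupyter", "Source Code Pro", 14)
```

Afterwards, a plain jupyter qtconsole should pick up the font without any extra flags.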

Jupyter Notebook

Jupyter Notebook is also part of the Jupyter Project and was formerly known as IPython Notebook. It combines your data analysis tool with a word processor. Jupyter Notebook lets you create text files that contain embedded executable Python code. This way, you can easily annotate your Python code with LaTeX code and present your results to other people. Gone are the days when you had to copy your results from Matlab, Maple, Excel etc. to Microsoft Word and constantly switch between applications.
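To make the “text files that contain embedded executable Python code” part a bit more tangible: a notebook file is plain JSON under the hood. The sketch below builds a minimal notebook with one annotated text cell and one code cell. The field names reflect the notebook format (version 4) as far as I understand it, and the function name minimal_notebook is made up for illustration; in practice you would let Jupyter create these files for you.

```python
import json


def minimal_notebook(markdown_text, code_text):
    """Build a minimal notebook (format version 4) with one text cell and one code cell."""
    return {
        "nbformat": 4,
        "nbformat_minor": 4,
        "metadata": {},
        "cells": [
            {"cell_type": "markdown", "metadata": {}, "source": markdown_text},
            {
                "cell_type": "code",
                "metadata": {},
                "execution_count": None,
                "outputs": [],
                "source": code_text,
            },
        ],
    }


if __name__ == "__main__":
    notebook = minimal_notebook(
        "The area of a circle is $A = \\pi r^2$.",
        "import math\nmath.pi * 2 ** 2",
    )
    print(json.dumps(notebook, indent=1))
```

Saved with the .ipynb extension, the printed JSON is what Jupyter Notebook reads and writes when you open or save a notebook.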

First you cd (change directory) to the directory where you want to store your text files or open them from, and then simply open Jupyter Notebook like this:

jupyter notebook

This will start a local web server. To view the Notebook Dashboard, open http://localhost:8888 in your web browser (it’s a web application).

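In order to be able to convert notebooks to various formats other than HTML (for example, PDF), you’ll need to install Pandoc (a dependency of nbconvert). Pandoc is not a Python package but a standalone program, so install it via Homebrew rather than pip:

brew install pandoc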

Installing TensorFlow

There are two ways to install TensorFlow if you followed the pip approach instead of the conda approach. For the longest time, TensorFlow wasn’t yet available via pip, which is why many (outdated) guides still describe the old way. That isn’t necessary anymore, however.

The New Way

You should be able to install TensorFlow like so:

pip3 install tensorflow

I say “should” because TensorFlow was added to PyPI (the “App Store” for pip) comparatively recently. For whatever reason the installation through pip does not always work. If that’s the case for you, don’t worry. You can still install TensorFlow “the old way”.

The Old Way

First check out TensorFlow’s Installing TensorFlow on Mac OS X page to find out the correct download link. There’s a “CPU only” version and a “GPU enabled” version available. To make use of GPU support in TensorFlow you’ll need a graphics card with CUDA support, which only Nvidia cards offer. So unless you have a discrete graphics card with an Nvidia GPU, go for the “CPU only” version.

You could copy the export-command from their page and manually paste it into your ~/.bash_profile file—or do it much quicker like so:

# please check the website first if there's a newer URL available
echo 'export TF_BINARY_URL=https://storage.googleapis.com/tensorflow/mac/cpu/tensorflow-1.0.1-py3-none-any.whl' >> ~/.bash_profile
# reload the profile so that the variable is set in the current shell
source ~/.bash_profile
pip3 install --upgrade "$TF_BINARY_URL"

Whichever method you choose, remember not to use sudo (contrary to what the TensorFlow website says). It’s not necessary for the reasons mentioned above. On the contrary, using sudo will cause an error message. The reason for this message is that pip uses caching by default (just like web browsers cache the sites you’ve already visited). In this particular case, caching means writing into ~/Library/Caches/pip and ~/Library/Caches/pip/http. Both directories live in your home folder and belong to you, not to the root user that pip runs as under sudo, so pip won’t use them. The caching of the package will fail, thus you’ll get a message informing you about that. Omitting sudo (and therefore installing TensorFlow as a regular user) will solve the caching problem. If you’re still having trouble, retry with pip3 install --user instead of sudo pip3 install. Should you not need caching, use the --no-cache-dir option rather than misusing sudo.
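If you did use sudo at some point, it can be worth checking who owns the cache directories now. The sketch below does that for the two paths mentioned above; the function name owned_by_current_user is made up, and the paths are the macOS locations from this article, so adjust them if your system differs.

```python
import os
from pathlib import Path


def owned_by_current_user(path):
    """Return True if `path` exists and belongs to the user running this script."""
    path = Path(path).expanduser()
    return path.exists() and path.stat().st_uid == os.getuid()


if __name__ == "__main__":
    for cache_dir in ("~/Library/Caches/pip", "~/Library/Caches/pip/http"):
        if owned_by_current_user(cache_dir):
            status = "ok"
        else:
            status = "missing or owned by someone else"
        print(f"{cache_dir}: {status}")
```

Run it as your regular user, without sudo. A directory that is merely missing is nothing to worry about, since it may simply not have been created yet.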

Updating Packages

Updating Python packages with pip is still tiresome because you have to update each package individually. You can’t update all packages at once like you can with conda. Nevertheless, you should of course keep your packages up to date and perform the following procedure regularly to always work with the latest versions. First, you need to find out which packages are outdated:

pip3 list --outdated

Then you can update them by listing their names as parameters behind the --upgrade flag (or its shortcut -U), just like when you updated pip itself earlier in this guide:

pip3 install -U package1 package2 package3 ...

Since manually updating every package is cumbersome, you’ll want to use the following command instead to automatically update all outdated packages without having to list their individual names:

pip3 list --outdated --format=freeze | grep -v '^\-e' | cut -d = -f 1 | xargs -n1 pip3 install --upgrade
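As far as I’m aware, newer pip releases reject the freeze format in combination with --outdated, in which case the one-liner above stops working. A variation that relies on pip’s JSON output instead could look like the following sketch. The function names are made up for illustration, and the only assumption about the JSON is that every entry carries a "name" field.

```python
import json
import subprocess
import sys


def outdated_names(pip_json_output):
    """Extract package names from `pip list --outdated --format=json` output."""
    return [entry["name"] for entry in json.loads(pip_json_output)]


def upgrade_all_outdated():
    """Upgrade every outdated package, one at a time, using this interpreter's pip."""
    listing = subprocess.run(
        [sys.executable, "-m", "pip", "list", "--outdated", "--format=json"],
        capture_output=True,
        text=True,
        check=True,
    )
    for name in outdated_names(listing.stdout):
        # check=False: one failing package should not stop the remaining upgrades.
        subprocess.run(
            [sys.executable, "-m", "pip", "install", "--upgrade", name],
            check=False,
        )


# Call upgrade_all_outdated() from within the environment you want to update.
```

Calling pip through sys.executable ensures that the packages of the Python interpreter that runs the script are the ones being upgraded, which is handy when you work with several virtual environments.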

Interview With the Creator of Anaconda (And Other Things)

Anaconda was created by Travis Oliphant. Travis also created NumPy, SciPy, and other tools that are indispensable today. Here is a fantastic and insightful interview with him.

Final Words

I did my best to explain everything in as beginner-friendly a way as possible, since the authors of all the other articles I found when I was learning this stuff assumed I already knew what the PATH variable was or that I had a deep understanding of the UNIX file system. Thus I spent a long time reading on Stack Overflow et cetera and had to learn all this on my own the hard way. It would make me really happy if I could spare you this effort.

If you found this article helpful or still have any questions, why don’t you leave a comment down below? It would be greatly appreciated 😊

¹ Even if so, PATH would be a far more appropriate place than PYTHONPATH. But since PATH already includes the location of the Python interpreter, /usr/local/bin, setting PYTHONPATH is absolutely unnecessary.
² For further information see http://apple.stackexchange.com/a/223163
³ Meanwhile, many companies like Apple or Google prefer Clang to GCC as their compiler front end. To make the transition smoother, Apple symlinks Clang executables with GCC-like names. Under very specific circumstances (for example, certain GCC parameters Clang doesn’t know), this can lead to errors.
⁴ With PyZMQ being only a Python binding for ZeroMQ (the messaging library behind it), you’d also need zmq for PyZMQ to work. However, the setup routine for PyZMQ is intelligent enough to install zmq by itself. Thus there’s no need for you to first install zmq via Homebrew.