source / license: CC0

How to set up a Python environment and install pip modules?

02/06/2018

Finding the proper way to install, upgrade and use python-pip can seem daunting, at least at the beginning. The Python 2 / 3 schism complicates it further, as many packages are not backward-compatible. Moreover, installing or even upgrading packages globally (via sudo / root) may cause unexpected errors. Below I present common practices for setting up a Python environment and mention old, deprecated solutions that you may still hear about.

What is PIP?

PIP is a recursive acronym that stands for “PIP Installs Packages”; it is also expanded as “Preferred Installer Program”. It’s a command-line utility that lets you install, reinstall, or uninstall PyPI packages with one simple command: pip. Pip is to Python what apt / yum / dnf / pacman are to your system.

You can find pip packages on the Python Package Index (PyPI), a repository of software for the Python programming language. Think of it as a GitHub for Python packages. It’s similar to RubyGems in the Ruby world, PHP’s Packagist, CPAN for Perl, and npm for Node.js.

How to install pip

If you use Linux or macOS, chances are you already have it. If not, search your package manager for python-pip or similar. For example, on Ubuntu 16.04 the packages are:

  • python-pip - for Python 2
  • python3-pip - for Python 3

To upgrade, following the official installation guide: if you already have pip installed:

pip install --upgrade --user pip

See also the “To sudo pip or not to sudo” section below. If pip is not installed yet, go with:

curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py
python get-pip.py
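To check which pip a given interpreter would use, one option is to run pip as a module of that interpreter instead of relying on whichever pip comes first on PATH. The following is a minimal sketch, assuming Python 3; the helper names pip_available and pip_version are made up for illustration.

```python
import importlib.util
import subprocess
import sys


def pip_available():
    """Return True if the running interpreter can import the pip module."""
    return importlib.util.find_spec("pip") is not None


def pip_version():
    """Return the output of `python -m pip --version`, or None if pip is missing."""
    if not pip_available():
        return None
    result = subprocess.run(
        [sys.executable, "-m", "pip", "--version"],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    return result.stdout.strip() if result.returncode == 0 else None


print(pip_version() or "pip is not installed for this interpreter")
```

Calling pip through sys.executable ties the answer to one specific Python, which helps when Python 2 and Python 3 live side by side.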

pip vs easy_install

One of the old yet recurring methods of managing Python modules is easy_install. As the official documentation states:

easy_install was released in 2004, as part of setuptools. It was notable at the time for installing packages from PyPI using requirement specifiers, and automatically installing dependencies.

pip came later in 2008, as alternative to easy_install, although still largely built on top of setuptools components. It was notable at the time for not installing packages as Eggs or from Eggs (but rather simply as ‘flat’ packages from sdists), and introducing the idea of Requirements Files, which gave users the power to easily replicate environments.

So, to sum up: pip is the newer, more modern way of installing Python packages. It is officially the preferred way to install Python modules. Unless stated otherwise, go with pip. For a more detailed comparison, go to this explanation or this blogpost.
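The idea behind Requirements Files can be illustrated with a short sketch that lists installed distributions as pinned name==version lines, the format such files usually contain. It assumes Python 3.8 or newer (for importlib.metadata) and is only a rough approximation of what pip freeze produces; the function name pinned_requirements is hypothetical.

```python
from importlib import metadata  # standard library since Python 3.8


def pinned_requirements():
    """Return sorted 'name==version' lines for the installed distributions."""
    lines = set()
    for dist in metadata.distributions():
        meta = dist.metadata
        name, version = meta.get("Name"), meta.get("Version")
        if name and version:  # skip entries with incomplete metadata
            lines.add("{}=={}".format(name, version))
    return sorted(lines, key=str.lower)


for line in pinned_requirements():
    print(line)
```

Saving that output to a file such as requirements.txt and running pip install -r requirements.txt elsewhere is the usual way to replicate an environment. The real pip freeze also handles cases this sketch ignores, such as editable and VCS installs.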

To sudo pip or not to sudo

Whether one should use sudo pip install package is a frequent topic in Stack Overflow comments. As Piotr Dobrogost states:

Besides obvious security risks (which I think are in fact low when you install software you know) brought in other answers there is another reason. Python that comes with the system is part of this system and when you want to manage system you use tools designated for system maintenance like package manager in case of installing/upgrading/uninstalling software. When you start to modify system’s software with third party tools (pip in this instance) then you have no guarantee about the state of your system.

Jamie Matthews, in his blogpost, muses on why it is a bad idea to use sudo pip install. TL;DR - it breaks dependencies in your projects.

Fedora gives even bolder advice (repetition unmodified from the original text):

Never ever ever ever use pip or pip3 with sudo. Use pip --user or Python virtual environments instead.

Nevertheless, there are still many well-known projects, like ansible, youtube-dl or even the virtualenv module itself, that encourage the use of sudo pip install.

So in the end - as always - it depends. If you know what you’re doing and want to install a package globally, with the potential risk of breaking dependencies in the long run, go with a sudo / root install. Otherwise, go with --user.
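A script can detect the risky combination discussed here (running as root, outside any virtual environment) before it installs anything. This is a minimal sketch, assuming Python 3 on a Unix-like system; all function names are invented for illustration, and the root check simply returns False where os.geteuid does not exist (for example on Windows).

```python
import os
import sys


def in_virtual_environment():
    """True when the interpreter runs inside a venv or virtualenv environment."""
    # venv (and recent virtualenv) make sys.prefix differ from sys.base_prefix;
    # older virtualenv releases set sys.real_prefix instead.
    base = getattr(sys, "base_prefix", sys.prefix)
    return hasattr(sys, "real_prefix") or sys.prefix != base


def running_as_root():
    """True when the effective user is root (False where geteuid is absent)."""
    return hasattr(os, "geteuid") and os.geteuid() == 0


def global_install_warning():
    """Return a warning for a root install outside a virtual environment."""
    if running_as_root() and not in_virtual_environment():
        return ("Installing as root into the system Python; "
                "consider --user or a virtual environment.")
    return None


print(global_install_warning() or "No system-wide install risk detected.")
```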

Local installation

Virtual environment

A virtual environment is a semi-isolated Python environment that allows packages to be installed for use by a particular application, rather than being installed system wide. Quoting docs:

The basic problem being addressed is one of dependencies and versions, and indirectly permissions. Imagine you have an application that needs version 1 of LibFoo, but another application requires version 2. How can you use both these applications? If you install everything into /usr/lib/python2.7/site-packages (or whatever your platform’s standard location is), it’s easy to end up in a situation where you unintentionally upgrade an application that shouldn’t be upgraded.

There are many packages and script wrappers for creating virtual environments. The three most common are:

  • venv is the standard tool for creating virtual environments, and has been part of Python since Python 3.3. Starting with Python 3.4, it defaults to installing pip into all created virtual environments. At the time of writing, it is the officially recommended way to isolate application-specific dependencies, although virtualenv is still more commonly used thanks to its cross-version consistency.
  • virtualenv is not part of Python’s standard library, but it is officially blessed by the PyPA (Python Packaging Authority). It’s much older than venv and, more importantly, supports Python 2. It seems to be the most widely supported package, and in 2018 we’re in a tug of war between venv and virtualenv.
  • pyvenv - a wrapper script around venv that shipped with Python 3. Deprecated since Python 3.6 in favor of python3 -m venv.
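Since venv is part of the standard library, an environment can also be created from Python code instead of the command line. Below is a minimal sketch, assuming Python 3.4 or newer; create_environment is a hypothetical helper name, and the returned interpreter path follows the usual layout (bin/python on Unix-like systems, Scripts\python.exe on Windows).

```python
import os
import venv


def create_environment(env_dir, with_pip=True):
    """Create a virtual environment in env_dir and return its interpreter path."""
    venv.create(env_dir, with_pip=with_pip)
    # Scripts\python.exe on Windows, bin/python elsewhere.
    if os.name == "nt":
        return os.path.join(env_dir, "Scripts", "python.exe")
    return os.path.join(env_dir, "bin", "python")
```

Calling create_environment("env") does roughly what python3 -m venv env does on the command line. Note that with_pip=True relies on the ensurepip module, which some distributions package separately (python3-venv on Debian and Ubuntu).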

venv vs virtualenv?

From reddit:

venv by nature of being part of Python itself has access to the internals of Python which means it can do things the right way with far fewer hacks. For example, virtualenv has to copy the Python interpreter binary into the virtual environment to trick it into thinking it’s isolated, whereas venv can just use a configuration file that is read by the Python binary in its normal location for it to know it’s supposed to act like it’s in a virtual environment. So venv can be thought of virtualenv done right, with the blessing and support of the Python developers.

TL;DR: while venv is the more modern and recommended way, most developers still use virtualenv.
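The configuration file mentioned in the quote is pyvenv.cfg, which venv writes to the root of each environment it creates. A minimal sketch for inspecting it, assuming Python 3 and the simple key = value layout that venv produces; read_pyvenv_cfg is a name made up for illustration.

```python
import os
import sys


def read_pyvenv_cfg(env_dir):
    """Parse env_dir/pyvenv.cfg into a dict of its 'key = value' lines."""
    settings = {}
    with open(os.path.join(env_dir, "pyvenv.cfg"), encoding="utf-8") as cfg:
        for line in cfg:
            key, sep, value = line.partition("=")
            if sep:  # ignore lines without an '=' sign
                settings[key.strip().lower()] = value.strip()
    return settings


# Inside a venv, sys.prefix points at the environment directory.
if os.path.isfile(os.path.join(sys.prefix, "pyvenv.cfg")):
    print(read_pyvenv_cfg(sys.prefix))
else:
    print("No pyvenv.cfg under sys.prefix; probably not inside a venv.")
```

Typical keys include home (the directory of the base interpreter) and include-system-site-packages.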

pip install --user package_name

If you do not want to create a virtual environment and just want to grab some program via pip, you can use the --user flag directly from the command line and install it locally, for your user only.
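To see where --user installs end up, the standard site module reports the per-user locations for the running interpreter. A minimal sketch, assuming Python 3; user_install_locations is a hypothetical helper name.

```python
import site
import sys


def user_install_locations():
    """Return the per-user install locations for the running interpreter."""
    user_site = site.getusersitepackages()
    return {
        "user_base": site.getuserbase(),           # e.g. ~/.local on Linux
        "user_site": user_site,                    # where --user packages go
        "enabled": bool(site.ENABLE_USER_SITE),    # False inside most venvs
        "on_sys_path": user_site in sys.path,
    }


for key, value in user_install_locations().items():
    print("{}: {}".format(key, value))
```

Scripts installed with --user go into a bin (or Scripts) directory under the user base, which may need to be added to PATH. Inside a virtual environment that does not see system site-packages, pip refuses --user installs.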

Pipenv anyone?

Pipenv is a project that aims to bring the best of all packaging worlds to the Python world. It harnesses Pipfile, pip, and virtualenv into one single toolchain. In addition to addressing some common issues, it consolidates and simplifies the development process to a single command line tool.

It helps you manage all those pesky dependencies just like Bundler does for Ruby or Carton does for Perl.

And most importantly, it’s officially recommended by the Python Packaging User Guide.

Closing thoughts

We’ve gone through the most common solutions for setting up Python. Some of them are no longer used, as Python is a lively language under constant development, which makes it a moving target. And Python 4 is around the corner! But remember: start low, go slow. There is no need to know everything at once. Happy coding!