"…but it would perhaps be one of the greatest missed opportunities of our time if free software liberated nothing but code…"

Python package management: criticism of setuptools and praise for pip and virtualenv

Posted by patrick on December 16, 2008

Understanding Python package management has become essential with the arrival of two new versions of Python: Python 2.6 and Python 3000 (or Python 3k).

Here are a few recent articles:

  • http://blog.ianbicking.org/2008/12/14/a-few-corrections-to-on-packaging/ (‘James Bennett recently wrote an article on Python packaging and installation, and Setuptools. There’s a lot of issues, and writing up my thoughts could take a long time, but I thought at least I should correct some errors, specifically category errors. Figuring out where all the pieces in Setuptools (and pip and virtualenv) fit is difficult…

pip:
This is an alternative to easy_install. It works somewhat differently than easy_install, but not much. Mostly it is better than easy_install, in that it has some extra features and is easier to use. Unlike easy_install, it downloads all distributions up-front, and generates the metadata to read distribution and version requirements. It uses Setuptools to generate this metadata from a setup.py file, and uses pkg_resources to parse this metadata. It then installs packages with the setuptools monkeypatches applied. It just happens to use an option python setup.py --single-version-externally-managed, which gets Setuptools to install packages in a more flat manner, with Distro.egg-info/ directories alongside the package. Pip installs eggs! I’ve heard the many complaints about easy_install (and I’ve had many myself), but ultimately I think pip does well by just fixing a few small issues. Pip is not a repudiation of Setuptools or the basic mechanisms that easy_install uses.
virtualenv:
This is a little hack that creates isolated Python environments. It’s based on virtual-python.py, which is something I wrote based on some documentation notes PJE wrote for Setuptools. Basically virtualenv just creates a bin/python interpreter that has its own value of sys.prefix, but uses the system Python and standard library. It also installs Setuptools to make it easier to bootstrap the environment (because bootstrapping Setuptools is itself a bit tedious). I’ll add pip to it too sometime. Using virtualenv you don’t have to worry about different library versions, because for any one environment you will probably only need one version of a library. On any one machine you probably need different versions, which is why installing packages system-wide is problematic for most libraries. (I’ve been meaning to write a post on why I think using system packaging for libraries is counter-productive, but that’ll wait for another time.)
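
As an aside to the quoted description, the "own value of sys.prefix" can be observed from inside the interpreter. The following is only a minimal sketch: the function name `in_isolated_environment` is invented for illustration, and the check assumes that classic virtualenv exposes `sys.real_prefix` while the standard-library venv module (and recent virtualenv releases) expose `sys.base_prefix`.

```python
import sys


def in_isolated_environment():
    """Return True when this interpreter appears to run inside a virtualenv or venv.

    Hypothetical helper for illustration; not part of virtualenv itself.
    """
    # Classic virtualenv kept the original prefix in sys.real_prefix.
    if hasattr(sys, "real_prefix"):
        return True
    # venv and newer virtualenv keep the system prefix in sys.base_prefix,
    # so the two values differ only inside an isolated environment.
    return getattr(sys, "base_prefix", sys.prefix) != sys.prefix


print("sys.prefix:", sys.prefix)
print("isolated:", in_isolated_environment())
```

Run with the system Python and then with an environment's bin/python, the first line should show two different prefixes even though both interpreters share the same standard library.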

…I don’t think zipping libraries up is all that useful, and while it should work, it doesn’t always, and it makes code harder to inspect and understand. So since it’s not that useful, I’ve disabled it when pip installs packages. I also have had it disabled on my own system for years now, by creating a distutils.cfg file with [easy_install] zip_ok = False in it. Sadly App Engine is forcing me to use zip files again, because of its absurdly small file limits… but that’s a different topic. (There is an experimental pip zip command mostly intended for App Engine.)
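
For readers unfamiliar with the file, distutils.cfg uses the ordinary INI layout, so the setting quoted above is a section header followed by one option. The sketch below only parses such a fragment with the standard configparser module to show how it reads; the helper name `zip_allowed` and the fallback value are assumptions made for this example, not a statement about easy_install's real default.

```python
import configparser

# The fragment from the quote, laid out the way it would sit in distutils.cfg.
DISTUTILS_CFG = """\
[easy_install]
zip_ok = False
"""


def zip_allowed(cfg_text):
    """Read the zip_ok flag from INI text (hypothetical helper for illustration)."""
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    # fallback=True is an arbitrary choice for this sketch when the option is absent.
    return parser.getboolean("easy_install", "zip_ok", fallback=True)


print(zip_allowed(DISTUTILS_CFG))
```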

…Another pain point is version management with setup.py and Setuptools. Indeed it is easy to get things messed up, and it is easy to piss people off by overspecifying, and sometimes things can get in a weird state for no good reason (often because of easy_install’s rather naive leap-before-you-look installation order). Pip fixes that last point, but it also tries to suggest more constructive and less painful ways to manage other pieces.

Pip requirement files are an assertion of versions that work together. setup.py requirements (the Setuptools requirements) should contain two things: 1: all the libraries used by the distribution (without which there’s no way it’ll work) and 2: exclusions of the versions of those libraries that are known not to work. setup.py requirements should not be viewed as an assertion that by satisfying those requirements everything will work, just that it might work. Only the end developer, testing the system together, can figure out if it really works. Then pip gives you a way to record that working set (using pip freeze), separate from any single distribution or library….
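
The distinction drawn above, loose setup.py requirements with exclusions versus a pinned working set recorded with pip freeze, can be illustrated with a small check. This is a rough sketch and not how pip or Setuptools resolve anything: the package names `libfoo` and `libbar` are made up, the helpers `parse_version`, `parse_requirement` and `check_pins` are hypothetical, and versions are assumed to be plain dotted integers.

```python
import re

# Comparison operators understood by this sketch.
OPS = {
    "==": lambda a, b: a == b,
    "!=": lambda a, b: a != b,
    ">=": lambda a, b: a >= b,
    "<=": lambda a, b: a <= b,
    "<": lambda a, b: a < b,
    ">": lambda a, b: a > b,
}


def parse_version(text):
    """Turn '1.7.0' into (1, 7); trailing zeros are dropped so 1.7 == 1.7.0."""
    parts = [int(part) for part in text.strip().split(".")]
    while len(parts) > 1 and parts[-1] == 0:
        parts.pop()
    return tuple(parts)


def parse_requirement(line):
    """Split 'Name>=1.0,!=1.2' into ('name', [('>=', (1,)), ('!=', (1, 2))])."""
    match = re.match(r"\s*([A-Za-z0-9_.-]+)\s*(.*)$", line)
    name, rest = match.group(1).lower(), match.group(2)
    clauses = []
    for clause in filter(None, (c.strip() for c in rest.split(","))):
        # Raises AttributeError on an operator this sketch does not know.
        op = re.match(r"(==|!=|>=|<=|<|>)", clause).group(1)
        clauses.append((op, parse_version(clause[len(op):])))
    return name, clauses


def check_pins(loose, pinned):
    """Report pins that are missing or that violate a loose requirement."""
    pins = {}
    for line in pinned:
        if not line.strip() or line.lstrip().startswith("#"):
            continue  # skip blanks and comments
        name, clauses = parse_requirement(line)
        pins[name] = clauses[0][1]  # assumes the 'name==version' form
    problems = []
    for line in loose:
        name, clauses = parse_requirement(line)
        if name not in pins:
            problems.append((name, "not pinned"))
            continue
        for op, wanted in clauses:
            if not OPS[op](pins[name], wanted):
                problems.append((name, op))
    return problems


# Loose requirements, as a setup.py might declare them (hypothetical packages).
loose = ["libfoo>=1.7", "libbar>=0.5,!=0.5.1"]
# A pinned working set, in the style of a frozen requirements file.
pinned = ["libfoo==1.7.2", "libbar==0.5.1"]
print(check_pins(loose, pinned))  # → [('libbar', '!=')]
```

The loose list says what might work and what is known to be broken; the pinned list is one concrete combination, and here it is rejected because it includes an excluded version.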

  1. I am new to pip, so I hope I don’t write silly things ;) It’s a good idea to have the requirements outside the package, instead of what setuptools does. So you can share them amongst several packages.

    So basically, pip’s requirements files are what Zope calls the “Known Good Set” and what Turbogears 2 does by maintaining its own PyPI server to distribute TG2 packages: a list of versions that are known to interact well in the same environment, right?

    But it’s not really different from setuptools there, except that you change the requirements in a different place if something goes wrong.

    So to simplify the problem, couldn’t we have just ONE requirement file in the whole Python?

    This could be the clue for OS packagers: they would be able to tweak this file, while developers would be able to try out their package over different requirements files (“the debian etch python requirement file”, “the debian unstable python requirement file”, etc.). And for specific, isolated stuff, using virtualenv would allow developers to have their own custom requirement files….

  1. As Brett points out, apt and other package managers are inappropriate because they don’t allow multiple versions of packages to be installed, local environments, and ad hoc packages.

  1. Hey Ian,

    This is an excellent and unbiased summary. More or less the missing manual of the current state of packaging under Python.

    I need to add that I really like pip. It doesn’t change much to easy_install, but it is day and night to me. Thanks for making it available.)

  • http://www.b-list.org/weblog/2008/dec/15/pip/ (‘Why I like pip. So yesterday I explained some of the reasons why I don’t like setuptools. In essence, my objections boil down to one idea: application packaging and application development should be orthogonal concerns. The way setuptools works, however, seems to tend, inevitably, toward coupling them to each other…But toward the end of yesterday’s article I suggested pip as an alternative to the setuptools/pkg_resources/easy_install toolchain, and today I’d like to explain a bit more about why I prefer pip and some of the concrete benefits it offers… in his response to my article yesterday Ian clarified that pip just uses the setuptools APIs to do this. Which is saying something: glossing over setuptools’ warts to the point where you don’t even realize it’s being used is a pretty big deal… pip looks before it leaps, can bail out early if it’s not going to be able to install your package and will leave behind a useful log file explaining what went wrong…The point where pip really shines, though, is in the ease of specifying and creating reproducible builds. If you’ve ever dealt with having to deploy the same code base across multiple machines, you know what a headache this can be, since a huge number of factors (operating system and version, pre-installed packages and versions, system package managers and configuration, etc., etc.) can change the results of your deployment process, sometimes in subtle and difficult-to-debug ways.
With pip, this is not (so much of) a problem…I mentioned pip requirements files yesterday as an alternative to the way setuptools specifies dependencies directly in setup.py, and that’s certainly one useful application of the feature, but you can take it much further: once you know which packages (and, just as important, which versions of which packages) you need, you can write them down in a simple, plain-text file, point pip at it, and it’ll install them…The last piece of the puzzle, for me, is virtualenv; virtualenv is a tool for creating and working with isolated Python environments, and is basically the only way I work with Python these days…pip integrates quite nicely with virtualenv; normally, when working in an active virtualenv, Python packaging/installation tools (pip included) will install into that virtualenv, but pip also lets you:
  • Specify a virtualenv to install into (using the -E flag), and
  • Create a new virtualenv and install into it.

The second one is really the killer feature, because it means you can set up a requirements file specifying a list of packages, and get pip to create a virtualenv for you and install the packages into it… well, there’s a heck of a lot more I could write here about pip (and about virtualenv, and about some other interesting tools), but I think this is a good start and hopefully I’ve at least got you interested enough to explore a bit on your own. And I hope I’ve managed to communicate some of the practical reasons why I’ve ditched easy_install for package installation; compared to what pip can do right now (not even considering what it might be able to do in the future), easy_install just doesn’t measure up enough to justify the headaches it can create…’)
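
The workflow praised above, a requirements file plus a fresh isolated environment, can be approximated today with the standard library. This is a sketch under stated assumptions, not the command the article describes: it swaps in the stdlib venv module for virtualenv, it does not use the -E flag (which belongs to the 2008-era pip and may not exist in the version you run), and the helper names `env_python`, `install_command` and `build_env` as well as the paths in the usage note are hypothetical.

```python
import subprocess
import sys
import venv
from pathlib import Path


def env_python(env_dir):
    """Return the path of the interpreter inside an environment directory."""
    if sys.platform == "win32":
        return Path(env_dir) / "Scripts" / "python.exe"
    return Path(env_dir) / "bin" / "python"


def install_command(env_dir, requirements_file):
    """Build the pip invocation that installs a requirements file into env_dir."""
    return [
        str(env_python(env_dir)),
        "-m", "pip", "install",
        "-r", str(requirements_file),
    ]


def build_env(env_dir, requirements_file):
    """Create the environment, then install the listed packages into it.

    This needs network access, so it is defined here but not called.
    """
    venv.create(env_dir, with_pip=True)
    subprocess.check_call(install_command(env_dir, requirements_file))
```

Calling `build_env("demo-env", "requirements.txt")` would create the environment and install into it in one step; running the same call on another machine with the same requirements file is what makes the build reproducible in the sense the article describes.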

  • http://tarekziade.wordpress.com/2008/12/15/python-isolated-environment-pie/ (‘Right now, when Python is loaded, it uses the site module to browse the site-packages directory to populate the path with packages it finds there. .pth files are also parsed to provide extra paths. Python 2.6 has introduced a per-user site-packages directory, where you can define an extra directory, which is added in the path like the central one. But both will append new paths to the environment without any rule of exclusion or version checking…A few workarounds exist to be able to express what packages (and version) an application needs to run, or to set up an isolated environment for it:
  • setuptools provides the install_requires mechanism where you can define dependencies directly inside the package, as a new metadata. It also provides a way to install two different versions of one package and let you pick by code or when the program starts, which one you want to activate.
  • virtualenv will let you create an isolated Python environment, where you can define your own site-packages. This allows you to make sure you are not conflicting with an incompatible version of a given package.
  • zc.buildout relies on setuptools and provides an isolated environment a bit similar in some aspects to virtualenv.
  • pip provides a way to describe requirements in a file, which can be used to define bundles, which are very similar to what zc.buildout provides.

But they all aim at the same goal: define a specific execution context for a specific application, and declare dependencies with no respect to other applications or to the OS environment…This proposal describes a solution that can be added to Python to provide that feature. A new file called a Python Isolated Environment file (PIE file) can be provided by any application to define the list of dependencies and their versions…’)
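
The path mechanics described at the start of this excerpt (the site module, the central site-packages directory, and the per-user directory added in Python 2.6) can be inspected directly. The sketch below is a minimal illustration with a hypothetical helper name, `describe_paths`; it assumes a Python 3 interpreter and tolerates environments where `site.getsitepackages` is not available, as was the case inside some older virtualenv installations.

```python
import site
import sys


def describe_paths():
    """Collect the directories the site machinery contributes to the import path."""
    info = {
        # Per-user site-packages directory, introduced with Python 2.6.
        "user_site": site.getusersitepackages(),
        # May be None or False when user site is disabled for this interpreter.
        "user_site_enabled": bool(site.ENABLE_USER_SITE),
        # Final search path, including anything added through .pth files.
        "sys_path": list(sys.path),
    }
    getter = getattr(site, "getsitepackages", None)
    info["site_packages"] = getter() if getter else []
    return info


for key, value in describe_paths().items():
    print(key, "=", value)
```

Nothing in this output carries version information, which is the gap the excerpt points at: directories are appended to the path, with no rule saying which version of a package an application expects.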
