Python package management: critiques de setuptools et louanges de pip et virtualenv
Posted by patrick sur décembre 16, 2008
La compréhension de la gestion des packages Python est devenue essentielle avec l’arrivée de 2 nouvelles versions de Python: python 2.6 et Python 3000 (ou Python 3k)
Voici quelques articles récents:
– http://blog.ianbicking.org/2008/12/14/a-few-corrections-to-on-packaging/ (‘James Bennett recently wrote an article on Python packaging and installation, and Setuptools. There’s a lot of issues, and writing up my thoughts could take a long time, but I thought at least I should correct some errors, specifically category errors. Figuring out where all the pieces in Setuptools (and pip and virtualenv) fit is difficult…
- This is an alternative to easy_install. It works somewhat differently than easy_install, but not much. Mostly it is better than easy_install, in that it has some extra features and is easier to use. Unlike easy_install, it downloads all distributions up-front, and generates the metadata to read distribution and version requirements. It uses Setuptools to generate this metadata from a setup.py file, and uses pkg_resources to parse this metadata. It then installs packages with the setuptools monkeypatches applied. It just happens to use an option python setup.py –single-version-externally-managed, which gets Setuptools to install packages in a more flat manner, with Distro.egg-info/ directories alongside the package. Pip installs eggs! I’ve heard the many complaints about easy_install (and I’ve had many myself), but ultimately I think pip does well by just fixing a few small issues. Pip is not a repudiation of Setuptools or the basic mechanisms that easy_install uses.
- This is a little hack that creates isolated Python environments. It’s based on virtual-python.py, which is something I wrote based on some documentation notes PJE wrote for Setuptools. Basically virtualenv just creates a bin/python interpreter that has its own value of sys.prefix, but uses the system Python and standard library. It also installs Setuptools to make it easier to bootstrap the environment (because bootstrapping Setuptools is itself a bit tedious). I’ll add pip to it too sometime. Using virtualenv you don’t have to worry about different library versions, because for any one environment you will probably only need one version of a library. On any one machine you probably need different versions, which is why installing packages system-wide is problematic for most libraries. (I’ve been meaning to write a post on why I think using system packaging for libraries is counter-productive, but that’ll wait for another time.
…I don’t think zipping libraries up is all that useful, and while it should work, it doesn’t always, and it makes code harder to inspect and understand. So since it’s not that useful, I’ve disabled it when pip installs packages. I also have had it disabled on my own system for years now, by creating a distutils.cfg file with [easy_install] zip_ok = False in it. Sadly App Engine is forcing me to use zip files again, because of its absurdly small file limits… but that’s a different topic. (There is an experimental pip zip command mostly intended for App Engine.
…Another pain point is version management with setup.py and Setuptools. Indeed it is easy to get things messed up, and it is easy to piss people off by overspecifying, and sometimes things can get in a weird state for no good reason (often because of easy_install’s rather naive leap-before-you-look installation order). Pip fixes that last point, but it also tries to suggest more constructive and less painful ways to manage other pieces.
Pip requirement files are an assertion of versions that work together. setup.py requirements (the Setuptools requirements) should contain two things: 1: all the libraries used by the distribution (without which there’s no way it’ll work) and 2: exclusions of the versions of those libraries that are known not to work. setup.py requirements should not be viewed as an assertion that by satisfying those requirements everything will work, just that it might work. Only the end developer, testing the system together, can figure out if it really works. Then pip gives you a way to record that working set (using pip freeze), separate from any single distribution or library….
- http://www.b-list.org/weblog/2008/dec/15/pip/ (‘Why I like pip. So yesterday I explained some of the reasons why I don’t like setuptools. In essence, my objections boil down to one idea: application packagingdevelopment should be orthogonal concerns. The way setuptools works, however, seems to tend, inevitably, toward coupling them to each other…But toward the end of yesterday’s article I suggested pip
as an alternative to pkg_resources/
easy_installtoolchain, and today I’d like to explain a bit more about why I prefer
pipand some of the concrete benefits it offers and application as an alternative to the setuptools… in his response to my article yesterday Ian clarified that
pipjust uses the setuptools APIs to do this. Which is saying something: glossing over setuptools’ warts to the point where you don’t even realize it’s being used is a pretty big deal…
piplooks before it leaps, can bail out early if it’s not going to be able to install your package and will leave behind a useful log file explaining what went wrong…The point where
pipreally shines, though, is in the ease of specifying and creating reproducible builds. If you’ve ever dealt with having to deploy the same code base across multiple machines, you know what a headache this can be, since a huge number of factors (operating system and version, pre-installed packages and versions, system package managers and configuration, etc., etc.) can change the results of your deployment process, sometimes in subtle and difficult-to-debug ways. With
pip, this is not (so much of) a problem…I mentioned pip requirements files yesterday as an alternative to the way setuptools specifies dependencies directly in
setup.py, and that’s certainly one useful application of the feature, but you can take it much further: once you know which packages (and, just as important, which versions of which packages) you need, you can write them down in a simple, plain-text file, point
pipat it, and it’ll install them…The last piece of the puzzle, for me, is virtualenv;
virtualenvis a tool for creating and working with isolated Python environments, and is basically the only way I work with Python these days…
pipintegrates quite nicely with
virtualenv; normally, when working in an active
virtualenv, Python packaging/installation tools (
pipincluded) will install into that
pipalso lets you:
- Specify a
virtualenvto install into (using the
- Create a new virtualenv and install into it.
The second one is really the killer feature, because it means you can set up a requirements file specifying a list of packages, and get
pip to create a virtualenv for you and install the packages into it…… well, there’s a heck of a lot more I could write here about
pip (and about
virtualenv, and about some other interesting tools), but I think this is a good start and hopefully I’ve at least got you interested enough to explore a bit on your own. And I hope I’ve managed to communicate some of the practical reasons why I’ve ditched
easy_install for package installation; compared to what
pip can do right now (not even considering what it might be able to do in the future),
easy_install just doesn’t measure up enough to justify the headaches it can create… »)
- http://tarekziade.wordpress.com/2008/12/15/python-isolated-environment-pie/ (‘Right now, when Python is loaded, it uses the site module to browse the site-packages directory to populate the path with packages it find there. .pth files are also parsed to provide extra paths. Python 2.6 has introduced per-user site-packages directory, where you can define an extra directory, which is added in the path like the central one. But both will append new paths to the environment without any rule of exclusion or version checking…A few workarounds exist to be able to express what packages (and version) an application needs to run, or to set up an isolated environment for it:
- setuptools provides the install_requires mechanism where you can define dependencies directly inside the package, as a new metadata. It also provides a way to install two different versions of one package and let you pick by code or when the program starts, which one you want to activate.
- virtualenv will let you create an isolated Python environment, where you can define your own site-packages. This allows you to make sure you are not conflicting with a incompatible version of a given package.
- zc.buildout relies on setuptools and provides an isolated environment a bit similar in some aspects to virtualenv.
- pip provides a way to describe requirements in a file, which can be used to define bundles, which are very similar to what zc.buildout provides.
But they all aim at the same goal : define a specific execution context for a specific application, and declare dependencies with no respect to other applications or to the OS environment…This proposal describes a solution that can be added to Python to provide that feature. A new file called a Python Isolated Environment file (PIE file) can be provided by any application to define the list of dependencies and their versions…’)