So I just finished my internship with Continuum. For the internship, I primarily worked on Anaconda, their free Python distribution, and conda, its free (BSD open source) package manager. I might write a blog post about conda later, but suffice it to say that I’m convinced that it is doing package management the right way. One of the major developments this summer that I helped out with was the ability for anybody to build a conda package, and a site called Binstar where people can upload them (the beta code is “binstar in beta” with no quotes).
Another thing that happened over the summer is that Almar Klein made conda Python 3 compatible, so that it can be used with the Pyzo project, which is Python 3 only. The way this was done was by using a single code base for Python 2 and Python 3. Thus, this became the first time I have done any heavy development on Python source that had to be Python 3 compatible from a single codebase (as opposed to using the 2to3 tool).
Another development this summer was that SymPy was released (0.7.3). This marked the last release to support Python 2.5. Around the same time, we discussed our Python 3 situation, and how annoying it is to run use2to3 all the time. The result was this pull request, which made SymPy use a single code base for Python 2 and Python 3. Now, that pull request is hard to mull through, but the important part to look at is the compatibility file. Everything in that file has to be imported and used, because it represents things that are different between Python 2 and Python 3. Ondřej has written more about this on his blog.
In all, I think that supporting Python 2.6-3.3 (not including 3.0 or 3.1) is not that bad. The compatibility file has a few things, but thinking back, it was just that bad or worse supporting Python 2.4-2.7 (heck, back then, we couldn’t even use the
all function without importing it). The situation is much better today now that we use Travis too, since any mistake is caught before the pull request is merged. The worst of course is the
__future__, I will be warned about it pretty fast, since
Of course, there are a lot of nice Python 3 only features that we cannot use, but this was the case for supporting Python 2.4-2.7 too (e.g., the with statement and the ternary statement were both introduced in Python 2.5). So this is really nothing new. There is always a stick to drop the oldest Python version we support, and a lag on what features we can use. Now that we have dropped Python 2.5 support in SymPy, we can finally start using new-style string formatting, abstract base classes, relative imports, and keyword arguments after
So as a result of this, I’ve come to the conclusion that Python 3 is not another language. It’s just another version of the same language. Supporting Python 2.6-3.3 is no different from supporting Python 2.4-2.7. You have to have some compatibility imports, you can’t use new language features, and you have to have good test coverage. I think that the core Python folks made a mistake by presenting Python 3 as a new language. It has made people antagonistic against Python 3 (well, that and the
In the past, I have always developed against the latest version of Python: 2.6 was the best when I learned Python, and then 2.7. Even though I have had to support back to 2.4, I only used 2.4 explicitly when testing.
Well, given what I said above, the only logical thing to do is to use Python 3.3 as my main development Python. If you use Anaconda, there are basically two ways you can do this. The first is to just create a Python 3 environment (
conda create -n python3 python=3), and put that first in your
PATH (you also will need to add
source activate python3 to your bash profile if you go this route, so that
conda install will install into that environment by default). For me, though, I plan to use a Python 3 version of Anaconda, which has Python 3 as the default. The main difference here is that
conda itself is written in Python 3. Aside from purity, and the fact that I plan to fix any occasional conda bugs that I come across, the other difference here is that conda itself will default to Python 3 in this case (i.e., when creating a new environment with Python like
conda create -n envname python, the Python will be Python 3, not Python 2, and also it will build against Python 3 by default with
conda build). Continuum does not yet make Python 3 versions of Anaconda, but there are Python 3 versions of Miniconda (Miniconda3), which is a stripped down version of Anaconda with just Python, the conda package manager, and its dependencies. You can easily install Anaconda into it though with
conda install anaconda. I personally prefer to install only what I need to keep the disk usage low (on an SSD, disk space is sparse), so this is perfect for me anyway.
My recommendation is to put a Python 2 installation second in your PATH, so that you can easily call
python2 if you want to use Python 2. The easiest way to do this is to create a conda environment for it (
conda create -n python2 python=2) and add
~/anaconda/envs/python2 to your PATH.
So far, I have run into a few issues:
- Some packages aren’t build for Python 3 yet in Anaconda, or they don’t support it at all. The biggest blocker in Anaconda is PySide (at least on Mac OS X), though it should be coming soon.
- Some packages only install entry points with a “3″ suffix, which is annoying. The biggest offender here is IPython. I brought up this issue on their mailing list,
so hopefully they will see the light and fix this before the next release, but it hasn’t been implemented yet. I also plan to make sure that the Anaconda package for IPython installs an
ipythonentry point into Python 3 environments. Even so, one has to remember this when installing old versions of IPython in environments.
- There are some bugs in conda in Python 3. Actually, I suspect that there are bugs in a lot of packages in Python 3, because people don’t develop against it, unless they have excellent test coverage. Even SymPy missed a few print statements.
- You can’t
setup.py developagainst anything that uses 2to3 (like IPython).
- It’s a little annoying working against old versions of SymPy (e.g., when digging through the git history to track something down), because I have to explicitly use Python 2. Conda makes this easier because I can just create a Python 2 environment and do
source activate python2when I am using Python 2. Or, for a one-off, I can just use
python2, and keep a Python 2 environment second in my PATH. But this issue is not really new. For example, really old versions of SymPy only work with Python 2.5, because they used
asas a variable name.
- Everyone else isn’t using Python 3 yet, so if I write a script that only needs to support “the latest version of Python,” it probably needs to support Python 2.7, or else I should explicitly put
/usr/bin/env python3in the shebang line. But for SymPy, I have to be aware of how to support 2.6-3.3, so I have to know all the features that are only in some versions anyway. On the other side of things, if I run some random Python script with a shebang line, it probably is going to expect Python 2 and not Python 3, so I either have to explicitly add
python2to the command or activate a Python 2 environment
- Some packages just don’t support Python 3 yet. Fabric (and its main dependency, Paramiko) is the one example I have come across so far in my own work. So I have to fall back to Python 2 if I want to use them. The best thing to do here is to pitch in and help these libraries port themselves.
- People always give code examples with
I’ll update this list as I come across more issues.
In all, so far, it’s nothing too bad. Conda makes switching back to Python 2 easy enough, and dealing with these issues are hardly the worst thing I have to deal with when developing with Python. And if anything, seeing Python 2-3 bugs and issues makes me more aware of the differences between the two versions of the language, which is a good things since I have to develop against code that has to support both.