## How to make attributes un-inheritable in Python using descriptors

April 6, 2013

For https://github.com/sympy/sympy/pull/1969, and previous work at https://github.com/sympy/sympy/pull/1901, we added the ability for the SymPy doctester to run or not run doctests conditionally depending on whether or not required external dependencies are installed. This means that for example we can doctest all the plotting examples without them failing when matplotlib is not installed.

For functions, this is as easy as decorating the function with @doctest_depends, which adds the attribute _doctest_depends_on to the function with a list of the dependencies that the doctest needs. The doctester will then not run the doctest unless those dependencies are installed.

For classes, this is not so easy. Ideally, one could just define _doctest_depends_on as an attribute of the class. However, the issue is that with classes, we have inheritance. But if class A has a docstring with a doctest that depends on some modules, it doesn’t mean that a subclass B of A will have a doctest that does.

Really, what we need to do is to decorate the docstring itself, not the class. Unfortunately, Python does not allow adding attributes to strings:

>>> a = ""
>>> a.x = 1
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'str' object has no attribute 'x'


So what we have to do is to create an attribute that doesn’t inherit.

I had for some time wanted to give descriptors in Python a try, since they are a cool feature, but also the second most complicated feature in Python (the first is metaclasses). If you don’t know what a descriptor is, I recommend reading this blog post by Guido van Rossum, the creator of Python. It’s the best explanation of the feature there is.

Basically, Python lets attributes define what happens when they are accessed (like a.x). You may already know that objects can define how their attributes are accessed via __getattr__. This is different. With descriptors, the attributes themselves define what happens. This may sound less useful, but in fact, it’s a very core feature of the language.

If you’ve ever wondered how property, classmethod, or staticmethod work in Python, the answer is descriptors. Basically, if you have something like

class A(object):
    def f(self):
        return 1
    f = property(f)


Then A().f magically calls what would normally be A().f(). The way it works is that property defines the __get__ method, which returns f(obj), where obj is the calling object, here A() (remember that in Python the first argument of a method, usually called self, is the object that calls the method).
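To make the mechanism concrete, here is a minimal sketch of a read-only, property-like descriptor written in pure Python. The name simple_property is hypothetical (it is not part of Python or SymPy), and the real property also supports setting and deleting; this only illustrates the __get__ hook described above.

```python
class simple_property(object):
    """Hypothetical, read-only stand-in for the built-in property."""
    def __init__(self, fget):
        self.fget = fget

    def __get__(self, obj, type=None):
        if obj is None:
            # Accessed on the class itself (A.f): return the descriptor.
            return self
        # Accessed on an instance (A().f): call the wrapped function on it.
        return self.fget(obj)


class A(object):
    def f(self):
        return 1
    f = simple_property(f)


print(A().f)
```

Here A().f goes through simple_property.__get__, which calls the original f with the instance as its argument.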

Descriptors allow attributes to define arbitrary behavior when they are accessed, set, or deleted. To make an attribute inaccessible to subclasses, then, you just need to define a descriptor that prevents the attribute from being accessed if the class of the calling object is not the original class. Here is some code:

class nosubclasses(object):
    def __init__(self, f, cls):
        self.f = f
        self.cls = cls
    def __get__(self, obj, type=None):
        if type == self.cls:
            if hasattr(self.f, '__get__'):
                return self.f.__get__(obj, type)
            return self.f
        raise AttributeError


It works like this:

In [2]: class MyClass(object):
...:     x = 1
...:

In [3]: MyClass.x = nosubclasses(MyClass.x, MyClass)

In [4]: class MySubclass(MyClass):
...:     pass
...:

In [5]: MyClass.x
Out[5]: 1

In [6]: MyClass().x
Out[6]: 1

In [80]: MySubclass.x
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-80-2b2f456dd101> in <module>()
----> 1 MySubclass.x

<ipython-input-51-7fe1b5063367> in __get__(self, obj, type)
8                 return self.f.__get__(obj, type)
9             return self.f
---> 10         raise AttributeError

AttributeError:

In [81]: MySubclass().x
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-81-93764eeb9948> in <module>()
----> 1 MySubclass().x

<ipython-input-51-7fe1b5063367> in __get__(self, obj, type)
8                 return self.f.__get__(obj, type)
9             return self.f
---> 10         raise AttributeError

AttributeError:


Note that by using the third argument to __get__, this works regardless of whether the attribute is accessed from the class or the object. I have to call __get__ on self.f again if it has it, to ensure that the right thing happens if the attribute has other descriptor logic defined (and note that regular methods have descriptor logic defined; that’s how they convert the first argument self to implicitly be the calling object).
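As a sketch of why that delegation to self.f.__get__ matters, the example below wraps a regular method rather than a plain value. The classes Base and Derived and the method greet are hypothetical names used only for illustration, and the descriptor is repeated so the block runs on its own.

```python
class nosubclasses(object):
    def __init__(self, f, cls):
        self.f = f
        self.cls = cls

    def __get__(self, obj, type=None):
        if type == self.cls:
            if hasattr(self.f, '__get__'):
                # Let the wrapped object (here a function) bind itself to obj.
                return self.f.__get__(obj, type)
            return self.f
        raise AttributeError


class Base(object):
    def greet(self):
        return "hello from %s" % self.__class__.__name__

# Take the raw function from the class __dict__, so that we wrap the
# function itself rather than an already-processed method object.
Base.greet = nosubclasses(Base.__dict__['greet'], Base)


class Derived(Base):
    pass


print(Base().greet())
try:
    Derived().greet()
except AttributeError:
    print("greet is not available on Derived")
```

Because the function's own __get__ is called, Base().greet() still receives the instance as self, while the subclass lookup fails.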

One could easily make a class decorator that automatically adds the attribute to the class in a non-inheritable way:

def nosubclass_x(args):
    def _wrapper(cls):
        cls.x = nosubclasses(args, cls)
        return cls
    return _wrapper


This automatically adds the property x to the decorated class with the value given in the decorator, and it won’t be accessible to subclasses:

In [87]: @nosubclass_x(1)
....: class MyClass(object):
....:     pass
....:

In [88]: MyClass().x
Out[88]: 1

In [89]: MySubclass().x
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-89-93764eeb9948> in <module>()
----> 1 MySubclass().x

<ipython-input-51-7fe1b5063367> in __get__(self, obj, type)
8                 return self.f.__get__(obj, type)
9             return self.f
---> 10         raise AttributeError

AttributeError:


For SymPy, we can’t use class decorators because we still support Python 2.5, and they were introduced in Python 2.6. The best workaround is to just call Class.attribute = nosubclasses(Class.attribute, Class) after the class definition. Unfortunately, you can’t access a class inside its definition like you can with functions, so this has to go at the end.

Name Mangling

After coming up with all this, I remembered that Python already has a pretty standard way to define attributes in such a way that subclasses won’t have access to them. All you have to do is use two underscores before the name, like __x, and it will be name mangled. This means that the name will be renamed to _classname__x outside the class definition. A subclass will not find the attribute under its own mangled name, because a lookup of __x for the subclass looks for _subclassname__x instead. There are some subtleties with this, particularly for strange class names (names that are too long, or names that begin with an underscore). I asked about this on StackOverflow. The best answer is that there was a function in the standard library, but it was removed in Python 3. My tests reveal that the behavior is different in CPython than in PyPy, so getting it right for every possible class is nontrivial. The descriptor thing should work everywhere, though. On the other hand, getattr(obj, '_' + obj.__class__.__name__ + attributename) will work 99% of the time, and is much easier both to write and to understand than the descriptor.
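The getattr approach can be sketched as follows. The helper get_private and the classes A and B are hypothetical names for illustration, and the sketch ignores the subtleties mentioned above (very long class names, or class names that begin with an underscore). Note that the mangled attribute _A__x itself is still inherited; the lookup fails for B only because it is keyed on B's own class name.

```python
class A(object):
    __x = 1  # stored on the class as _A__x

    def get_x(self):
        # Written inside A, so __x here is mangled to _A__x.
        return self.__x


class B(A):
    pass


def get_private(obj, attributename):
    """Hypothetical helper: look up the mangled name for obj's own class."""
    return getattr(obj, '_' + obj.__class__.__name__ + attributename)


print(get_private(A(), '__x'))   # A defines _A__x, so this is found
try:
    get_private(B(), '__x')      # looks for _B__x, which does not exist
except AttributeError:
    print("B has no __x of its own")
```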

## When does x^log(y) = y^log(x)?

March 3, 2013

In this blog post, when I write $\log(x)$, I mean the natural logarithm, or log base $e$, i.e., $\ln(x)$.

A discussion on a pull request got me thinking about this question: what are the solutions to the complex equation $x^{\log{(y)}} = y^{\log(x)}$?  At the outset, they look like different expressions.  But clearly there are some solutions. For example, if $x = y$, then obviously the two expressions will be the same.  We probably should exclude $x = y = 0$, though note that even if $0^{\log(0)}$ is well-defined (probably if it is it is either 0 or complex $\infty$), it will be the same well-defined value. But for the remainder of this blog post, I’ll assume that $x$ and $y$ are nonzero.

Now, observe that if we apply $\log$ to both sides of the equation, we get $\log{\left(x^{\log(y)}\right )} = \log {\left (y^{\log(x)}\right )}$.  Now, supposing that we can apply the famous logarithm exponent rule, we would get $\log(x)\log(y) = \log(y)\log(x)$, which means that if additionally $\log$ is one-to-one, we would have that the original expressions must be equal.

The second question, that of injectivity, is easier to answer than the first, so I’ll address it first.  Note that the complex exponential is not one-to-one, because for example $e^0 = e^{2\pi i} = 1$.  But we still define the complex logarithm as the “inverse” of the complex exponential.  What this really means is that the complex logarithm is strictly speaking not a function, because it is not well-defined. Recall that the definition of one-to-one means that $f(x) = f(y)$ implies $x = y$, and that the definition of well-defined is that $x = y$ implies $f(x) = f(y)$.  It is clear to see here that $f$ being one-to-one is the same as $f^{-1}$ being well-defined and vice versa ($f^{-1}$ here is the same loose definition of an inverse as saying that the complex logarithm is the inverse of the complex exponential).

So note that the complex logarithm is not well-defined exactly because the complex exponential is not one-to-one.  We of course fix this problem by making it well-defined, i.e., it normally is multivalued, but we pick a single value consistently (i.e., we pick a branch), so that it is well-defined.  For the remainder of this blog post, I will assume the standard choice of branch cut for the complex logarithm, i.e., the branch cut is along the negative real axis, and we choose the branch where, for $x > 0$, $\log(x)$ is real and $\log(-x) = \log(x) + i\pi$.

My point here is that we automatically know that the complex logarithm is one-to-one because we know that the complex exponential is well-defined.

So our question boils down to, when does the identity $\log{\left (z^a\right)} = a \log(z)$ hold?  In SymPy, this identity is only applied by expand_log() or logcombine() when $a$ is real and $z$ is positive, so let us assume that we know that it holds under those conditions. Note that it also holds for some other values too.  For example, by our definition $\log{\left (e^{i\pi}\right)} = \log(-1) = \log(1) + i\pi = i\pi = i\pi\log(e)$.  For our example, this means that $x = e$, $y = -1$ is a non-trivial solution (non-trivial meaning $x \neq y$).   Actually, the way that the complex logarithm being the “inverse” of the complex exponential works is that $e^{\log(x)} = x$ for all $x$ (on the other hand $\log{\left(e^x\right)} \neq x$ in general), so that if $x = e$, then $x^{\log(y)} = e^{\log(y)} = y$ and $y^{\log(x)} = y^{\log(e)} = y^1 = y$.  In other words, $x = e$ is always a solution, for any $y\, (\neq 0)$ (and similarly $y = e$ for all $x$).  In terms of our question of when $\log{\left(z^a\right)} = a\log(z)$, this just says that this is always true for $a = \log(e) = 1$, regardless of $z$, which is obvious.  We can also notice that this identity always holds for $a = 0$, regardless of $z$. In terms of our original equation, this means that $x = e^0 = 1$ is a solution for all $y$ (and as before, $y = 1$ for all $x$).
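These claims can be checked numerically with Python's cmath module, whose log uses the same branch as above. This is only a sketch: the helper power is a hypothetical name, and it assumes that $z^a$ means $e^{a\log(z)}$ on the principal branch, which is the usual convention but is not spelled out in this post.

```python
import cmath

def power(base, exponent):
    """base**exponent taken as exp(exponent * log(base)) (assumed convention)."""
    return cmath.exp(exponent * cmath.log(base))

# x = e, y = -1 is a solution of x**log(y) == y**log(x).
x, y = cmath.e, -1
lhs = power(x, cmath.log(y))   # e**log(-1)
rhs = power(y, cmath.log(x))   # (-1)**log(e) = (-1)**1
print(lhs, rhs)

# exp(log(w)) gives back w, but log(exp(w)) need not.
w = 7j
print(cmath.exp(cmath.log(-2.0)))
print(cmath.log(cmath.exp(w)), "vs", w)
```

The last line prints a value that differs from w by $2\pi i$, because the logarithm always returns an imaginary part in $(-\pi, \pi]$.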

Note that $z > 0$ and $a$ real corresponds to $x, y > 0$ and $\log(x), \log(y)$ real, respectively, (which are the same condition).  So we have so far that the following are solutions to $x^{\log(y)} = y^{\log(x)}$:

• $x, y > 0$
• $x = y$
• $x = e$, $y$ arbitrary
• $y = e$, $x$ arbitrary
• $x = 1$, $y$ arbitrary
• $y = 1$, $x$ arbitrary

Now let’s look at some cases where $\log{\left (z^a\right)} \neq a\log(z)$.  If $z < 0$ and $a$ is a nonzero even integer, then $z^a > 0$ so $\log{\left (z^a \right)} = \log{\left (\left (-z\right )^a \right )} = a\log(-z)$, whereas $a\log(z) = a(\log(-z) + i\pi)$, which are different by our assumption that $a \neq 0$.  If $a$ is an odd integer not equal to 1, then $z^a < 0$, so $\log{\left (z^a \right)} = \log{\left (-z^a \right )} + i\pi = \log{\left (\left(- z\right)^{a} \right )} + i\pi = a\log(-z) + i\pi$, whereas $a\log(z) = a(\log(-z) + i\pi)$ again, which is not the same because $a \neq 1$. This means that if we let $x < 0$ and $y = e^a$, where $a \neq 0, 1$, we get a non-solution (and the same if we swap $x$ and $y$).
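The behavior of the logarithm identity for negative $z$ and integer $a$ can be seen numerically. This sketch only compares $\log{\left(z^a\right)}$ with $a\log(z)$; the helper log_of_power is a hypothetical name, and it again assumes that $z^a$ is computed as $e^{a\log(z)}$ on the principal branch.

```python
import cmath

def log_of_power(z, a):
    """log(z**a), with z**a taken as exp(a * log(z)) (assumed convention)."""
    return cmath.log(cmath.exp(a * cmath.log(z)))

z = -2.0
holds = []
for a in (0, 1, 2, 3):
    left = log_of_power(z, a)    # log(z**a)
    right = a * cmath.log(z)     # a*log(z)
    holds.append(cmath.isclose(left, right, abs_tol=1e-9))
    print(a, left, right, holds[-1])
```

For $a = 0$ and $a = 1$ the two values agree; for $a = 2$ and $a = 3$ they differ by a multiple of $2\pi i$, matching the even and odd cases worked out above.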

This is as far as I got tonight. WordPress is arbitrarily not rendering that LaTeX for no good reason. That and the very ugly LaTeX images is pissing me off (why wordpress.com hasn't switched to MathJaX yet is beyond me). The next time I get some free time, I am going to seriously consider switching my blog to something hosted on GitHub, probably using the IPython notebook. I welcome any hints people can give me on that, especially concerning migrating pages from this blog.

Here is some work on finding the rest of the solutions: the general definition of $\log(x)$ is $\log(|x|) + i\arg(x)$, where $\arg(x)$ is chosen in $(-\pi, \pi]$. Therefore, if $\log{\left(z^a\right )} = a\log(z)$, we must have $\arg(z^a) = a\arg(z)$. I believe a description of all such complex $z$ and $a$ will give all solutions $x = z$, $y = e^a$ (and $y = z$, $x = e^a$) to $x^{\log(y)} = y^{\log(x)}$. I need to verify that, though, and I also need to think about how to describe such $z$ and $a$. I will (hopefully) continue this post later, either by editing this one or writing a new one (depending on how much more I come up with).

Any comments to this post are welcome. I know you can't preview comments, but if you want to use math, just write it as $latex math$ (like $latex \log(x)$ for $\log(x)$). If you mess something up, I’ll edit your comment and fix it.

## Tip for debugging SymPy with PuDB

January 28, 2013

Usually, when I debug SymPy code with PuDB, I create a script that calls the code, then I put a

import pudb; pudb.set_trace()


in the SymPy library code where I want to start debugging. But this is annoying, first because I have to create the script, and second, because I have to modify the library code, and there’s always the risk of accidentally committing that. Also, if I want to start debugging somewhere else, I have to edit the files and change it.

Well, I just figured out a better way. First, if you haven’t already, add an alias like this in your bash config file (~/.profile or ~/.bashrc): alias pudb='python -m pudb.run'. (As of this pull request, this is no longer necessary: a pudb script is installed automatically with PuDB.)

This will let you run pudb script.py to debug script.py. Next, start PuDB. It doesn’t matter with what. You can just run touch test.py, and then pudb test.py. (It occurred to me that you can just set the breakpoint when starting isympy with PuDB.)

Now, press m, and navigate to where in the library code you want to start debugging. It also helps to use / to search the current file and L to jump to a specific line. When you get to the line where you want to start debugging, press b to set a breakpoint. You can do this in multiple places if you want.

Now, you just have to start isympy from within PuDB. Just run pudb bin/isympy, and immediately press c to jump to the interactive prompt. Now, run whatever code you want to debug. When it gets to the breakpoint, PuDB will open, and you can start debugging. If you type c to continue, it will go back to isympy. But the next time you run something that hits the breakpoint, it will open PuDB again.

This trick works because breakpoints are saved to file (at ~/.config/pudb/saved-breakpoints). In fact, if you want, you can just modify that file in the first step. You can edit your saved breakpoints in the bottom right pane of PuDB.

When you are done and you type Ctrl-D, PuDB will pop up again, asking if you want to quit. That’s because it was running the whole time, underneath isympy. Just press q. Note that you should avoid pressing q while debugging, or else PuDB will quit, and you will be left with just normal isympy (it won’t break at your breakpoints any more). Actually, if you do this, but doing Ctrl-D still opens the PuDB prompt, you can just press “Restart”, and it should start working again. Note that “Restart” will not actually reset isympy: all your saved variables will still be the same, and any changes to the library code will not be reloaded. To do that, you have to completely exit and start over again.

Of course, there is nothing SymPy-specific about this trick. As long as you have a script that acts as an entry point to an interactive console for your application, you can use it. If you just use IPython, you can use something like pudb /bin/ipython (replace /bin/ipython with the output of which ipython).

## Emacs: One year later

January 1, 2013

As readers of this blog may remember, back in 2011, I decided to move to a command-line based editor. For roughly two weeks in December, 2011, I exclusively used Vim, and for the same amount of time in January, 2012, I used exclusively Emacs. I had used a little of each editor in the past, but this was my first time using them to do true editing work. My experiences are chronicled in my blog posts (parts 1, 2, 3, and 7 months later follow up).

To summarize, I decided to use Emacs, as I found it to be much more intuitive, and much more user-friendly. Today, January 1, marks the one-year point of my using Emacs as my sole text editor, with some exceptions (notably, I’m currently writing this blog post in the browser). So I’d like to make some observations:

• Either one of these editors (Vim or Emacs) is going to really suck unless you are willing to make a serious investment in customizing them and installing nice addons. For the second point, Emacs has an advantage, because the philosophy of Vim is to be barebones whereas the philosophy of Emacs is to be featureful, so that in particular many things that were once addons of Emacs are now included in the standard installation. For customization, on the one hand, Emacs is easier, because it has a nice interface (M-x customize), but on the other hand, Vim’s scripting language is much easier to hack on than Emacs lisp (I still can’t code in Lisp to save my life; it’s a very challenging programming language).

But my point here is that neither has really great defaults. For example, in Emacs, M-space is bound to just-one-space, which is great for programming. What it does is remove all spaces around the cursor, except for one. But to be really useful, it also should include newlines. It doesn’t do this by default. Rather, you have to call it with a negative argument. So to be really useful, you have to add

(defun just-one-space-with-newline ()
  "Call just-one-space with a negative argument"
  (interactive)
  (just-one-space -1))

(global-set-key (kbd "M-SPC") 'just-one-space-with-newline)


to your .emacs file.

• Emacs has great features, but I always have to look them up. Or rather, I have to look up the keyboard shortcuts for them. I only have the keyboard shortcuts memorized for the things I do every day. I even ended up forgetting really important ones, like M-w (Emacs version of copy). And if a feature involves several keystrokes to access, forget about it (for example, rectangular selection, or any features of special modes). If I use a new mode, e.g., for some file type that I rarely edit (like HTML), I might as well not have any of the features, other than the syntax highlighting, because I either don’t know what they are, or even if I know that they should exist (like automatic tag completion for html), I have no idea how to access them.

There’s really something to be said about GUI editors, which give these things to users in a way that they don’t have to memorize anything. Perhaps I should try to use the menu more. Or maybe authors of addons should aim to make features require as little cognitive user interaction as possible (such as the excellent auto-complete-mode I mentioned in part 3).

I mention this because it is one of the things I complained about with Vim, that the keybindings were too hard to memorize. Of course, the difference with Vim is that one has to memorize keybindings to do even the most basic of editing tasks, whereas with Emacs one can always fall back to more natural things like Shift-Arrow Key to select text or Delete to delete the character under the cursor (and yes, I know you can rebind this stuff in Vim; I refer you to the previous bullet point).

• I mentioned at the end of part 3 that Vim might still be useful to learn, as vi is available literally anywhere that you have POSIX. I honestly don’t think I would be able to use vi or vim if I had to, customization or no, unless I had my keyboard cheat sheet and a decent amount of time. If I’m stuck on a barebones system and I can’t do anything about it, I’ll use nano/pico before I use vi. It’s not that I hate vi. I just can’t do anything with it. It is the same to me now as it was before I used it in-depth. I have forgotten all the keyboard shortcuts, except for ESC and i.
• I don’t use emacsclient any more. Ever since I got my new retina MacBook Pro, I don’t need it any more, because with the solid state drive starting Emacs from scratch is instantaneous. I’m glad to get rid of it, because it had some seriously annoying glitches.
• Add alias e=emacs to your Bash config file (.profile or .bashrc). It makes life much easier. “emacs” is not an easy word to type, at least on QWERTY keyboards.
• I still feel like I am not nearly as efficient in Emacs as I could be. On the one hand, I know there are built-in features (like rectangular selection) that I do not take advantage of enough. I have been a bit lazy with customization: there are a handful of things that I do often that require several keystrokes, but I still haven’t created custom keyboard shortcuts for (off the top of my head: copying and pasting to/from the Mac OS X clipboard and rigidly indenting/dedenting a block of text (C-u 4 C-x TAB, actually C-c u 4 C-x TAB, since I did the sensible thing and rebound C-u to clear to the previous newline, and bound universal-argument to C-c u) come to mind).

I feel that if I were to watch someone who has used Emacs for a long time, I would learn a lot of tricks.

• I really should learn Emacs lisp. There are a lot of little customizations that I would like to make, but they are really niche, and can only be done programmatically. But who has the time to learn a completely new programming language (plus a whole library, as just knowing Lisp is useless if you don’t know the proper Emacs functions and variables and coding styles)?
• I’ve still not found a good visual browser for jumping to function definitions in a file (mostly Python function definitions, but also other kinds of headers for other kinds of files). The best I’ve found is imenu. If you know of anything, please let me know. One thing I really liked about Vim was the tag list extension, which did this perfectly (thanks to commenter Scott for pointing it out to me). I’ve been told that Cedet has something like this, but every time I try to install it, I run into some issues that just seem like way too much work (I don’t remember what they are, it won’t compile or something, or maybe it just wants to do just way too much and I can’t figure out how to disable everything except for the parts I want).
• If you ever code in C, add the following to your Makefile
check-syntax:
	$(CC) -o nul $(FLAGS) -S $(CHK_SOURCES)


(and if you don’t use a Makefile, start using one now). This is assuming you have CC and FLAGS defined at the top (generally to something like cc and -Wall, respectively). Also, add the following to your .emacs

;; ===== Turn on flymake-mode ====

(defun turn-on-flymake ()
  "Force flymake-mode on. For use in hooks."
  (interactive)
  (flymake-mode 1))

(defun flymake-keyboard-shortcuts ()
  "Add keyboard shortcuts for flymake goto next/prev error."
  (interactive)
  (local-set-key "\M-n" 'flymake-goto-next-error)
  (local-set-key "\M-p" 'flymake-goto-prev-error))


The last part adds the useful keyboard shortcuts M-n and M-p to move between errors. Now, errors in your C code will show up automatically as you type. If you use the command line version of emacs like I do, and not the GUI version, you’ll also need to install the flymake-cursor module, which makes the errors show up in the mode line, since otherwise it tries to use mouse popups. You can change the colors using M-x customize-face (search for “flymake”).

• I never got flymake to work with LaTeX. Does anyone know how to do it? It seems it is hardcoded to use MikTeX, the Windows version of LaTeX. I found some stuff, but none of it worked.

Actually, what I really would like is not syntax checking (I rarely make syntax mistakes in LaTeX any more), but rather something that automatically builds the PDF constantly as I type. That way, I can just look over at the PDF as I am writing (I use an external monitor for this. I highly recommend it if you use LaTeX, especially one of those monitors that swivels to portrait mode).

• If you use Mac OS X, you can use the very excellent KeyRemap4MacBook program to make regular Mac OS X programs act more like Emacs. Mac OS X already has many Emacs shortcuts built in (like C-a, C-e, etc.), but that only works in Cocoa apps, and it doesn’t include any meta key shortcuts. This lets you use additional shortcuts literally everywhere (don’t worry, it automatically doesn’t use them in the Terminal), including an emulator for C-space and some C-x commands (like C-x C-s to Command-s). It doesn’t work on context sensitive shortcuts, unfortunately, unless the operating system already supports it with another keyboard shortcut (e.g., it can map M-f to Option-right arrow). For example, it can’t enable moving between paragraphs with C-S-{ and C-S-}. If anyone knows how to do that, let me know.
• For about a month this summer, I had to use a Linux laptop, because my Mac broke and my new Mac took a month to arrive (the downside to ordering a new computer immediately after it is announced by Apple). At this point, my saving of all my customizations to GitHub really helped a lot. I created a new branch for the Linux computer (because several things in my customizations were Mac specific), and just symlinked the files I wanted. A hint I can give to people using Linux is to use Konsole. The Gnome terminal sucks. One thing I never figured out is how to make Konsole (or any other Terminal for that matter) to send Control-Shift shortcuts to Emacs (see http://superuser.com/q/439961/39697). I don’t use Linux any more at the moment, but if anyone knows what was going on there, add an answer to that question.
• In part 3 I mentioned that predictive mode was cool, but not very useful. What it does is basically add tab completion for every word in the English language. Actually, I’ve found using auto-complete-mode even when editing text (or LaTeX) to be very useful. Unlike predictive mode, it only guesses words that you’ve already typed (it turns out that you tend to type the same words over and over, and doubly so if those words are LaTeX math commands). Also, predictive mode has a set order of words, which supposedly helps to use it with muscle memory, whereas auto-complete-mode tries to learn what words you are more likely to use based on some basic statistical machine-learning. Also, auto-complete-mode has a much better visual UI and smarter defaults than predictive mode. The result is that it’s actually quite useful and makes typing plain text, as well as LaTeX (actually, pretty much anything, as long as you tend to use the same words repeatedly) much faster. I recommend enabling auto-complete-mode almost everywhere using hooks, like
(add-hook 'latex-mode-hook 'auto-complete-mode)
;; etc.

• At the end of the day, I’m pretty happy with Emacs. I’ve managed to fix most of the things that make it annoying, and it is orders of magnitude more powerful than any GUI editor or IDE I’ve ever seen, especially at just basic text editing, which is the most important thing (I can always use another program for other things, like debugging or whatever). The editor uses the basic shortcuts that I am used to, and is quite efficient to write in. Extensions like auto-complete-mode make using it much faster, though I could use some more extensions to make it even better (namely, a better isearch and a better imenu). Regarding Vim vs. Emacs, I’d like to quote something I said back in my first blog post about Vim over a year ago:

Vim is great for text editing, but not so hot for text writing (unless you always write text perfectly, so that you never need to leave insert mode until you are done typing). Just the simple act of deleting a mistyped word (yes, word, that happens a lot when you are decently fast touch typist) takes several keystrokes, when it should in my opinion only take one (two if you count the meta-key).

Needless to say, I find Emacs to be great for both text editing and text writing.

## 2012 in review

December 30, 2012

The WordPress.com stats helper monkeys prepared a 2012 annual report for this blog.

Here’s an excerpt:

4,329 films were submitted to the 2012 Cannes Film Festival. This blog had 20,000 views in 2012. If each view were a film, this blog would power 5 Film Festivals

## Infinitely nested lists in Python

September 19, 2012

Readers of this blog know that I sometimes like to write about some strange, unexpected, and unusual things in Python that I stumble across. This post is another one of those.

First, look at this

>>> a = []
>>> a.append(a)
>>> a
[[...]]


What am I doing here? I’m creating a list, a, and I’m adding it to itself. What you end up with is an infinitely nested list. The first interesting thing about this is that Python is smart enough to not explode when printing this list. The following should convince you that a does indeed contain itself.

>>> a[0] is a
True
>>> a[0] == a
True


Now, if you have programmed in C, or a similar language that uses pointers, this should not come as a surprise to you. Lists in Python, like most things, do not actually contain the items inside them. Rather, they contain references (in C terminology, “pointers”) to the items inside them. From this perspective, there is no issue at all with a containing a pointer to itself.

The first thing I wondered when I saw this was just how clever the printer was at noticing that the list was infinitely nested. What if we make the cycle a little more complex?

>>> a = []
>>> b = []
>>> a.append(b)
>>> b.append(a)
>>> a
[[[...]]]
>>> b
[[[...]]]
>>> a[0] is b
True
>>> b[0] is a
True


So it still works. I had thought that maybe repr just catches RuntimeError and falls back to printing ... when the list is nested too deeply, but it turns out that is not true:

>>> a = []
>>> for i in range(10000):
...     a = [a]
...
>>> a
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
RuntimeError: maximum recursion depth exceeded while getting the repr of a list


And by the way, in case you were wondering, it is possible to catch a RuntimeError (using the same a as the previous code block)

>>> try:
...     print(a)
... except RuntimeError:
...     print("no way")
...
no way


(and you also may notice that this is Python 3. Things behave the same way in Python 2)
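As for how a printer can notice a cycle without relying on the recursion limit: the usual idea is to keep track of which objects are currently being printed and to emit a placeholder when one of them is reached again. The standard library exposes this idea for user-defined classes as reprlib.recursive_repr. Below is a minimal sketch; the Box class is a hypothetical container used only for illustration, not how list itself is implemented.

```python
import reprlib


class Box(object):
    """Hypothetical container, used only to illustrate a cycle-safe repr."""
    def __init__(self):
        self.items = []

    @reprlib.recursive_repr(fillvalue="Box(...)")
    def __repr__(self):
        # If this method is re-entered for the same object while it is
        # already running, the decorator returns the fillvalue instead.
        return "Box(%r)" % (self.items,)


box = Box()
box.items.append(box)   # the box now contains itself
print(repr(box))
```

The nested occurrence of box is replaced by the fill value, much like the ... in the list output above.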

Back to infinitely nested lists, we saw that printing works, but there are some things that don’t work.

>>> a[0] == b
True
>>> a[0] == a
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
RuntimeError: maximum recursion depth exceeded in comparison


a[0] is b holds (i.e., they are exactly the same object in memory), so == is able to short-circuit on them. But to test a[0] == a it has to recursively compare the elements of a and a[0]. Since it is infinitely nested, this leads to a recursion error. Now an interesting question: why does this happen? Is it because == on lists uses a depth first search? If it were somehow possible to compare these two objects, would they be equal?

One is reminded of Russell’s paradox, and the reason why in axiomatic set theory, sets are not allowed to contain themselves.

Thinking of this brought me to my final question. Is it possible to make a Python set that contains itself? The answer is obviously no, because set objects can only contain hashable objects, and set is not hashable. But frozenset, set’s counterpart, is hashable. So can you create a frozenset that contains itself? The same goes for tuple. The method I used for a above won’t work, because a must be mutable to append it to itself.
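The easy half of this can be checked directly. The snippet below is only a sketch: it confirms that a set refuses to contain itself, and shows that an immutable tuple can still take part in a cycle indirectly, by holding a mutable list that points back at it. It does not answer the question of a tuple or frozenset that contains itself directly.

```python
s = set()
try:
    s.add(s)
    set_error = None
except TypeError as err:
    set_error = err          # sets are unhashable, so this branch runs

t = ([],)                    # a tuple holding one (mutable) list
t[0].append(t)               # the list now points back at the tuple
cycle_through_list = t[0][0] is t
print(type(set_error).__name__, cycle_through_list)   # TypeError True
```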

## isympy -I: A saner interactive environment

August 31, 2012

As promised, here is another post describing a new feature in the upcoming SymPy 0.7.2.

## Automatic Symbol Definition

While not as ground breaking as the feature I described in my last post, this feature is still quite useful. As you may know, SymPy is inherently a Python library, meaning that it lives by the rules of Python. If you want to use any name, whether it be a Symbol or a function (like cos), you need to define it (in the case of Symbols), or import it (in the case of functions that come with SymPy). We provide the script isympy with SymPy to assist with this. This script automatically runs IPython (if it’s installed), imports all names from sympy (from sympy import *), and defines common symbol names (like x, y, and z).

But if you want to use a Symbol that is not one of the ones predefined by isympy, you will get something like

In [1]: r*x
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
in ()
----> 1 r*x

NameError: name 'r' is not defined


The best solution for this has been either to type var('r'), which will create the Symbol r and inject it into the namespace, or to wrap your text in a string and pass it to sympify(), like sympify("r*x"). Neither of these is very friendly in interactive mode.

In SymPy 0.7.2, isympy has a new command line option, isympy -a, which will enable a mechanism that will automatically define all undefined names as Symbols for you:

In [1]: r*x
Out[1]: r⋅x


There are some caveats to be aware of when using this feature:

• Names must be undefined for isympy -a to work. If you type something like S*x, you’ll get:
In [3]: S*x
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-3-6656a97ea7b0> in <module>()
----> 1 S*x

TypeError: unsupported operand type(s) for *: 'SingletonRegistry' and 'Symbol'


That’s because S is already defined (it’s the SingletonRegistry, and also a shortcut to sympify()). To use a name that’s already defined, either create it manually with var() or delete it using del.

• This only works on the top level namespace. If you define a function with an undefined name, it will not automatically define that symbol when run.
• This works by catching NameError, defining the name, and then re-running the expression. If you have a multiline statement, any lines before the undefined name will be run before the NameError will be caught. This usually won’t happen, but it’s a potential side-effect to be aware of. We plan to rewrite it using either ast or tokenize to avoid this issue.
• Obviously, this is intended for interactive use only. If you copy code and put it in a script, or in some other place where someone might be expected to run it, but not necessarily from isympy -a, you should include symbol definitions.
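To make the "catch NameError, define the name, re-run" idea concrete, here is a minimal sketch. It is not the code isympy uses. Sym is a hypothetical stand-in for Symbol so that the example runs without SymPy, and eval_with_auto_symbols is a made-up name.

```python
import re


class Sym:
    """Hypothetical stand-in for sympy.Symbol, so the sketch runs without SymPy."""

    def __init__(self, name):
        self.name = name

    def __repr__(self):
        return self.name


def eval_with_auto_symbols(source, namespace):
    """Evaluate source, defining each undefined name as a Sym and retrying."""
    while True:
        try:
            return eval(source, namespace)
        except NameError as err:
            match = re.search(r"name '(\w+)' is not defined", str(err))
            if match is None or match.group(1) in namespace:
                raise  # not a plain undefined name, so do not loop forever
            namespace[match.group(1)] = Sym(match.group(1))


namespace = {}
result = eval_with_auto_symbols("[r, x, r]", namespace)
print(result)   # [r, x, r]
```

The expression is evaluated three times here: once failing on r, once failing on x, and once successfully. That repeated evaluation is exactly where the side-effect caveat above comes from.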

## Automatic int to Integer Conversion

A second thing that is annoying with Python and SymPy is that something like 1/2 will be interpreted completely by Python, without any SymPy. This means that something like 1/2 + x will give either 0 + x or 0.5 + x, depending on whether or not __future__.division has been imported. isympy has always run from __future__ import division, so that you’ll get the latter, but we usually would prefer to get Rational(1, 2). Previously, the best way to do this was again to either run it through sympify() as a string, or to sympify at least one of the numbers (here the S() shortcut to sympify() is useful, because you can type just S(1)/2).

With SymPy 0.7.2, you can run isympy -i, and it will automatically wrap all integer literals with Integer(). The result is that 1/2 produces Rational(1, 2):

In [1]: 1/2 + x
Out[1]: x + 1/2


Again, there are a couple of caveats:

• If you want to get Python style division, you just need to wrap both arguments in int():
In [2]: int(1)/int(2)
Out[2]: 0.5


Of course, if you just want a floating point number, you can just use N() or .evalf()

• This works by parsing the text and wrapping all integer literals with Integer(). This means that if you have a variable set to a Python int, it will still act like a Python int:
In [6]: a = int(1)

In [7]: b = int(2)

In [8]: a/b
Out[8]: 0.5


Note that to even do that example, I had to manually make a and b Python ints by wrapping them in int(). If I had just done a = 1, it would have been parsed as a = Integer(1), and I would have gotten a SymPy Integer. But this can be an issue if you use the result of some function that returns an int (again, note that most functions in SymPy that return integers return Integer, not int).

• The same as before: this will only work interactively. If you want to reuse your code outside of isympy -i, you should take care of any int/int by rewriting it as S(int)/int.
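The "parse the text and wrap the literals" step can be sketched with the standard tokenize module. This is an illustration under my own assumptions, not necessarily how isympy -i does it: wrap_int_literals is a made-up name, and fractions.Fraction plays the role of Integer so that the example runs without SymPy.

```python
import io
import tokenize
from fractions import Fraction


def wrap_int_literals(source, wrapper="Integer"):
    """Return source with every decimal integer literal wrapped in wrapper(...)."""
    result = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.NUMBER and tok.string.isdigit():
            result.extend([
                (tokenize.NAME, wrapper),
                (tokenize.OP, "("),
                (tokenize.NUMBER, tok.string),
                (tokenize.OP, ")"),
            ])
        else:
            result.append((tok.type, tok.string))
    return tokenize.untokenize(result)


# Fraction stands in for Integer here, purely so the sketch runs without SymPy.
env = {"Integer": Fraction, "a": 1, "b": 2}
half = eval(wrap_int_literals("1/2"), env)     # an exact one half
plain = eval(wrap_int_literals("a/b"), env)    # 0.5, since a and b are plain ints
print(half, plain)
```

Because only the literals in the text are rewritten, the variables a and b keep behaving like Python ints, which is the second caveat above.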

Since these are both useful features, we’ve added a way that you can get them both at once: by doing isympy -I (the “I” stands for “Interactive”). If we add similar features in the future, we will also add them to the -I shortcut (for example, we may add an option to allow ^ to automatically be replaced with **).

## SymPy Live Sphinx Extension

August 21, 2012

I didn’t blog about SymPy all summer, so I thought I would write a post about my favorite feature of the upcoming SymPy 0.7.2 release.  In fact, this feature has got me more excited than any other feature from any version of SymPy.  Yeah, it’s that good.

The feature is the SymPy Live Sphinx extension.  To start, if you don’t know about it, check out SymPy Live.  This is a console that runs on the App Engine.  We’ve actually had this for quite some time, but this winter, it got a huge upgrade thanks to the contribution of some GCI students.  Basically, SymPy Live lets you try out SymPy in your browser completely for free, because it runs all the code on the App Engine.  Actually, the console is a full Python console, so you can actually run any valid Python command on it.  This past winter, GCI students upgraded the look of the site, added a mobile version (visit live.sympy.org on your phone), and added other neat features like search history and autocompletion.

Now, Sphinx is the documentation system that we use to generate SymPy’s html documentation. Last year, when I was at the SciPy Conference, Mateusz had an idea at the sprints to create an extension linking SymPy Live and Sphinx, so that the examples in Sphinx could be easily run in SymPy Live.  He didn’t finish the extension, but I’m happy to report that thanks to David Li, who was also one of the aforementioned GCI students, the extension is now complete, and is running live on our development docs.  When SymPy 0.7.2 is released (soon, I promise), it will be part of the official documentation.

The best way to see how awesome this is is to visit the website and check it out.  You will need a modern browser (the latest version of Firefox, Safari, or Chrome will work, IE might work too).  Go to a page in the development docs with documentation examples, for example, http://docs.sympy.org/dev/tutorial.html#algebra, and click on one of the examples (or click on one of the green “Run code block in SymPy Live” buttons). You should see a console pop up from the bottom-right of the screen, and run your code.  For example:

[Screenshot: the SymPy Live Sphinx extension at http://docs.sympy.org/dev/tutorial.html#algebra]

You can access or hide the console at any time by clicking on the green box at the bottom-right of the page.  If you click on “Settings”, you will see that you can change all the same settings as the regular SymPy Live console, such as the printer type, and the keys for execution and autocompletion.  Additionally, there is a new setting, “Evaluation Mode”, which changes how the Sphinx examples are evaluated.  The default is “Evaluate”.  In this mode, if you click on an example, it is executed immediately.  The other option is “Copy”.  In this mode, if you click an example, it is copied to the console, but not executed right away. This way, you can edit the code to try something different.  Remember, this is a full fledged Python console running SymPy, so you can try literally anything.

So play with this and let us know what you think.  We would love to hear ways that we can improve the experience even further.  In particular, I think we should think about ways to make the “Copy” mode more user-friendly.  Suggestions welcome!  Also, please report any bugs.

And one word of warning:  even though these are the development docs, SymPy Live is still running SymPy 0.7.1.  So some examples may not work until 0.7.2 is released, at which point we will update SymPy Live.

I believe that this extension represents the future of interactive documentation. I hope you enjoy.

## Emacs: 7 months later

July 9, 2012

In my final post about my switching to Emacs, a commenter, Scott, asked me, “It has been a while since you started using Emacs. I’m just curious. How is your experience so far now that you have more experience and a more complete configuration?” My reply was getting quite long, so I figured it would be best suited as a new post.

The short answer is, mostly the same as when I wrote Vim vs. Emacs (part 3). Once you use something a lot, you notice all kinds of things that could use improvements. Some of them are just minor annoyances. For example, many interactive commands in Emacs (but not all!) require you to type out “yes” instead of just “y” as a confirmation. Others are more serious, like the need for a real replacement of SuperTab from vim.

I actually didn’t have much free time to work on configuring Emacs during the school year, and once the summer started, my computer died, and I’ve been working off an old laptop running Linux until I can get a new one. Fortunately, I had the foresight to put all my Emacs configuration online on GitHub, so it was easy to get my configuration again. I’ve noticed that in Linux, the Alt key (i.e., Meta) is used for other things, so it doesn’t work so well in Emacs (e.g., pressing Alt without any other keys sometimes activates a menu that removes the keyboard focus, and also C-M shortcuts don’t seem to work at all).

I’ve memorized very few keyboard shortcuts, even ones that might be useful to me (e.g., I don’t remember the shortcut to jump to a matching parenthesis). Usually, if I am using some mode or something and I want to know how to do something, I just Google it, and generally find the answer within a few seconds.

There are several major configuration issues that I’ve yet to address, either due to lack of time or because I couldn’t find a suitable solution. A SuperTab replacement is one. This is actually a big one, because scrolling through a file just to see what’s there is getting older and older, as is searching just to jump to a function definition. If anyone knows of a good way to do this, please let me know. I mainly need it for Python files, but having it in other modes as well would be nice. Basically, I just want something that shows me all the class and function definitions in the file, in order, so that I can easily select one and jump to it.

Related to searching, searching in Emacs sucks. I’m using isearch+, which is an improvement, but it still bugs me that search does not wrap around by default. Also, for some reason, pressing delete doesn’t delete the last character you typed, but the last character that it matched. That may sound minor, but I use it a lot, so it’s really gotten on my nerves.

Regular expression searching in Emacs is useless. I can never get it to work (usually because of differences between () and \(\)). What I really want is an interactive, user friendly, regular expression search/search and replace tool. There’s regexp-builder, but that’s useless because once you build the regular expression, you have to manually copy it and paste it into the real regular expression search function to actually use it. And it doesn’t work with search and replace.

This last semester I had a semester long project in C. For that, flymake-mode was a godsend. It requires a bit of manual configuration (you have to add something to your Makefile, and you have to add some stuff to .emacs as always to enable it by default), but once you do that, it just works. If you don’t know what this is, basically, it highlights the compiler errors in your source in real time, as you type it. So instead of doing something stupid twenty times, and then compiling and finding them all, you do something stupid once, see the error, and don’t make the mistake any more. It’s also nice to close your editor and know that your code will compile.

The Python mode I am mixed about. On the one hand, it’s really awesome how smart it is about indentation. On the other hand, the syntax highlighting is just shy of what I want (granted, it’s pretty good, but I want better than that). For example, I want to be able to color docstrings, single quoted strings, and double quoted strings differently. It would also be awesome to get some coloring in docstrings themselves. I’m thinking markdown mode for any text that’s in a docstring, except for doctests, which are colored in Python mode (or some variant).

Some things I’ve not really cared much about yet because I haven’t used that type of file yet. For example, I’m currently writing this post in Emacs, and just now noticing the deficiencies in html-mode (e.g., I want an easy way to select text and turn it into a link, just like in the WordPress editor).

Finally, I’ve been trying to write my own theme. That process has been slow and slightly painful. Emacs is currently in the process of moving to themes, though, so this is to be expected. When Emacs 24 is actually released I think it will be fair to judge how well this feature works.

That’s my wishlist (or most of it anyway). But there are positive things too. auto-complete-mode, which I mentioned at the top of my previous blog post, is absolutely awesome. I think this extension alone has made me more productive.

Some things I take for granted, like automatic spell checking of strings and comments in Python (not enabled by default, but not hard to configure either). Thanks to someone on an Emacs mailing list, I have the perfect automatic clearing of trailing whitespace, that automatically leaves your whitespace before the cursor in the buffer, but still writes the clear to the file (see my .emacs file from my dotfiles repo linked to above for details).

I’ve been hoping to learn Emacs lisp, so that I could remedy many of these problems on my own, but so far I haven’t really had the free time. Lisp is a very confusing language, so it’s not easy to jump into (compared to the language vim uses, which I found easy enough to hack on without knowing at all).

Ultimately, I’m quite pleased with how user friendly Emacs is, and with how easy it is to find out how to do almost anything I want just by Googling it. Configuration is an uphill battle. Emacs has a ton of great packages, many of which are included, but almost none are enabled by default. Just today I discovered Ido mode, thanks to David Li. I feel that in the long term, as I learn Emacs Lisp, I can make it do whatever I want. It provides a good baseline editing experience, and a good framework for configuring it to do whatever you want, and also enough people use it that 99% of the things you want are already done by somebody.

## How to install the development version of IPython Qtconsole and Notebook in Ubuntu

June 14, 2012

Both the awesome IPython notebook and Qtconsole are in the Ubuntu repositories, so if you just want to use the stable released versions, you can just do

sudo apt-get install ipython-notebook ipython-qtconsole


and be on your way. But the git development version has a lot of cool new features, and you may not want to wait for 0.13 to be released and make its way to the Ubuntu repos. But you may be thinking that to use those you will have to figure out all the dependencies yourself. Actually, it’s pretty easy:

# First install git, if you don't already have it
sudo apt-get install git
# Then, clone the IPython repo, if you haven't already.
git clone git://github.com/ipython/ipython.git
cd ipython
# Now just install IPython with apt, then uninstall it.  The dependencies will remain
sudo apt-get install ipython-notebook ipython-qtconsole
sudo apt-get remove ipython-notebook ipython-qtconsole ipython
# Now install the IPython git version in such a way that will keep up to date when you pull
sudo python setup.py develop


To update, just cd into that ipython directory and type git pull. That’s it. Now type ipython notebook or ipython qtconsole to get the magic.
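If you want to double check that Python is picking up the git clone and not a leftover system copy, one way is to ask where the module would be imported from. This is just a sketch using the standard library. The helper name module_location is made up, and I am assuming that the develop install points imports at your clone, so the printed path should sit inside the ipython directory you cloned.

```python
import importlib.util


def module_location(name="IPython"):
    """Return the file a module would be imported from, or None if it is missing."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


print(module_location())
```

If this prints None, IPython is not importable at all from the Python you ran it with.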

EDIT: After you do this, apt-get will start bugging you every time that you use it that a bunch of packages are no longer needed. These are the ones that you do need for the qtconsole and the notebook, so you should not autoremove them as it says. Rather, set them as manually installed by copying the list of packages that it tells you about and sudo apt-get installing them.