sqrt(x) now prints as "sqrt(x)"

August 18, 2011

Just a few moments ago, a branch was pushed in that fixed one of my biggest grievances in SymPy, if not the biggest. Previously we had this behavior:

In [1]: sqrt(x)
Out[1]: x**(1/2)

In [2]: solve(x**2 - 2, x)
Out[2]: [-2**(1/2), 2**(1/2)]

Now suppose you took the output of those expressions and pasted them into isympy:

In [3]: x**(1/2)
Out[3]: x**0.5

In [4]: [-2**(1/2), 2**(1/2)]
Out[4]: [-1.41421356237, 1.41421356237]

That’s with __future__.division. Here’s what would happen with old division:

In [2]: x**(1/2)
Out[2]: 1

In [3]: [-2**(1/2), 2**(1/2)]
Out[3]: [-1, 1]

This is because with old division, 1/2 evaluates to 0.

The problem is that Python evaluates 1/2 to 0.5 (or 0) before SymPy has a chance to convert it to a Rational. There were several ways that people got around this. If you copied an expression with number division in it and wanted to paste it into a SymPy session, the easiest way to do this was to pass it as a string to sympify():

In [1]: sympify("x**(1/2)")
Out[1]: x**(1/2)

In [2]: sympify("[-2**(1/2), 2**(1/2)]")
Out[2]: [-2**(1/2), 2**(1/2)]

If that was too much typing for you, you could use the S() shortcut to sympify():

In [3]: S("x**(1/2)")
Out[3]: x**(1/2)

In [4]: S("[-2**(1/2), 2**(1/2)]")
Out[4]: [-2**(1/2), 2**(1/2)]

This solution is fine if you want to paste an expression into a SymPy session, but it’s not a very clean one if you want to paste code into a script. For that, you need to modify the code so that it no longer contains Python int/Python int. The easiest way to do this is to sympify one of the ints. So you would do something like

In [5]: x**(S(1)/2)
Out[5]: x**(1/2)

In [6]: [-2**(S(1)/2), 2**(S(1)/2)]
Out[6]: [-2**(1/2), 2**(1/2)]

This wasn’t terribly readable, though. The best way to fix the problem when you had a power of one half was to use sqrt(), which is a shortcut to Pow(…, Rational(1, 2)).
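To make the workarounds above concrete, here is a small sketch that compares the different spellings of a square root. It assumes a reasonably recent SymPy under Python 3, where 1/2 is true division; the variable names are only for illustration.

```python
from sympy import Float, Pow, Rational, S, Symbol, sqrt, sympify

x = Symbol("x")

# Python computes 1/2 first, so SymPy only ever receives the float 0.5.
float_power = x**(1/2)
got_float_exponent = isinstance(float_power.exp, Float)

# Each of these hands SymPy an exact exponent instead.
exact_forms = [
    x**(S(1)/2),              # sympify one of the ints
    x**Rational(1, 2),        # build the Rational directly
    Pow(x, Rational(1, 2)),   # what sqrt() is a shortcut for
    sympify("x**(1/2)"),      # let SymPy parse the string itself
    sqrt(x),
]
all_exact = all(form == sqrt(x) for form in exact_forms)

print(got_float_exponent, all_exact)
```

Only the first form loses exactness, because the division happens in Python before SymPy is involved at all.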

Well, this last item should make you think. If sqrt(x) is more readable than x**(S(1)/2) or even x**(1/2), why not print it like that in the first place? Well, I thought so, so I changed the string printer, and now this is the way that SymPy works. So 90% of the time, you can just paste the result of str() or print, and it will just work, because there won’t be any **(1/2), which was by far the most common problem of “Python evaluating the expression to something before we can.” In the git master, SymPy now behaves like

In [1]: sqrt(x)
Out[1]: sqrt(x)

In [2]: solve(x**2 - 2, x)
Out[2]: [-sqrt(2), sqrt(2)]

You can obviously just copy and paste these results, and you get the exact same thing back. Not only does this make expressions more copy-and-pastable, but the output is much nicer in terms of readability. Here are some before and afters that come from actual SymPy doctests that I had to change after fixing the printer:

Before:
>>> e = ((2+2*sqrt(2))*x+(2+sqrt(8))*y)/(2+sqrt(2))
>>> radsimp(e)
2**(1/2)*x + 2**(1/2)*y

After:
>>> radsimp(e)
sqrt(2)*x + sqrt(2)*y

Before:
>>> b = besselj(n, z)
>>> b.rewrite(jn)
2**(1/2)*z**(1/2)*jn(n - 1/2, z)/pi**(1/2)

After:
>>> b.rewrite(jn)
sqrt(2)*sqrt(z)*jn(n - 1/2, z)/sqrt(pi)

Before:
>>> x = sympify('-1/(-3/2+(1/2)*sqrt(5))*sqrt(3/2-1/2*sqrt(5))')
>>> x
(3/2 - 5**(1/2)/2)**(-1/2)

After:
>>> x
1/sqrt(3/2 - sqrt(5)/2)

And not only is sqrt(x) easier to read than x**(1/2) but it’s fewer characters.
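The copy-and-paste round trip described above can be checked with a short script. This is only a sketch, assuming a SymPy version that includes the new printer; the eval() call stands in for pasting the printed text back into a Python session where sqrt and x are already defined.

```python
from sympy import Symbol, solve, sqrt

x = Symbol("x")

roots = solve(x**2 - 2, x)
printed = str(roots)

# Mimic pasting the printed text back into a session that has sqrt and x.
pasted = eval(printed, {"sqrt": sqrt, "x": x})

round_trip_ok = pasted == roots
print(printed, round_trip_ok)
```

Because the printed form contains no int/int division, Python has nothing to evaluate early, and the pasted result compares equal to the original.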

In the course of changing this, I went ahead and did some greps of the repository to get rid of all **(S(1)/2), **Rational(1, 2) and similar throughout the code base (not just in the output of doctests where the change had to be made), replacing them with just sqrt. Big thanks to Chris Smith for helping me catch all instances of this. Now the code should be a little easier to read and maintain.

Future Work

This is a big change, and I believe it will fix the copy-paste problem for 90% of expressions. But it does not solve it completely. It is still possible to get int/int in the string form of an expression. Only powers of 1/2 and -1/2 are converted to sqrt, so any other rational power will still print as a/b, like

In [1]: x**Rational(3, 2)
Out[1]: x**(3/2)

Also, as you may have noticed in the last example above, a rational number that sits by itself will still be printed as int/int, like

In [2]: (1 + x)/2
Out[2]: x/2 + 1/2

Therefore, I’m leaving the issue for this open to discuss potential future fixes to the string printer. One idea is to create a root() function as a shortcut, so that root(x, a) == x**(1/a). This would work for rational powers where the numerator is 1. For other rational powers, we could then denest these with an integer power. It’s important to do this in the right order, though, as the two orders are not equivalent. You can see that SymPy auto-simplifies the expression when doing so is mathematically correct in all cases, and leaves it alone when it is not:

In [3]: sqrt(x**3)
Out[3]: sqrt(x**3)

In [4]: sqrt(x)**3
Out[4]: x**(3/2)

Thus $\left(\sqrt{x}\right)^3 = x^{\frac{3}{2}}$ but $\sqrt{x^3} \neq x^{\frac{3}{2}}$ (to see this, replace x with -1).
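The x = -1 case can be checked numerically with plain Python complex arithmetic. This is a quick sanity check using the standard cmath module rather than SymPy, and the variable names are illustrative.

```python
import cmath

x = -1

sqrt_of_cube = cmath.sqrt(x**3)   # sqrt(-1), the principal root, is 1j
cube_of_sqrt = cmath.sqrt(x)**3   # (1j)**3 is -1j
three_halves = complex(x)**1.5    # principal value of (-1)**(3/2)

print(sqrt_of_cube, cube_of_sqrt, three_halves)
print(cmath.isclose(cube_of_sqrt, three_halves))  # the two agree
print(cmath.isclose(sqrt_of_cube, three_halves))  # these two do not
```

The cube of the square root matches the 3/2 power, while the square root of the cube lands on the opposite sign, which is why the order of denesting matters.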

So the idea would be to print Pow(expr, Rational(a, b)) as root(expr, b)**a.

The merits of this are debatable, but anyway I think we should have this root() function in any case (see issue 2643).
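To make the proposal easier to picture, here is a toy printer for rational powers. It is not SymPy's string printer: print_power and its use_root flag are hypothetical names invented for this sketch, and the base is just a string assumed to need no extra parentheses.

```python
from fractions import Fraction


def print_power(base, exponent, use_root=False):
    """Format base**exponent as a string (toy version, atomic base only)."""
    exponent = Fraction(exponent)
    if exponent == Fraction(1, 2):
        return "sqrt(%s)" % base
    if exponent == Fraction(-1, 2):
        return "1/sqrt(%s)" % base
    a, b = exponent.numerator, exponent.denominator
    if b == 1:
        # Integer powers contain no division, so they are already safe.
        return "%s**%d" % (base, a)
    if use_root:
        # Proposed form: Pow(expr, Rational(a, b)) -> root(expr, b)**a
        if a == 1:
            return "root(%s, %d)" % (base, b)
        return "root(%s, %d)**%d" % (base, b, a)
    # Current form: still contains int/int, so it is not paste-safe.
    return "%s**(%d/%d)" % (base, a, b)


print(print_power("x", Fraction(3, 2)))                 # x**(3/2)
print(print_power("x", Fraction(3, 2), use_root=True))  # root(x, 2)**3
```

With use_root turned on, none of the printed forms contain a bare int/int, at the cost of a less familiar notation.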

Another idea, which is probably not a good one, is to always print int/int as S(int)/int. So we would get

>>> Rational(1, 2)
S(1)/2
>>> x**Rational(4, 5)
x**(S(4)/5)

This is probably a bad idea because even though expressions would always be copy-pastable, they would be slightly less readable.

By the way, in case you didn’t catch it, all of these changes only affect the string printer. The pretty printer remains unaffected, and would stay that way under any additional changes, as it isn’t copy-pastable anyway and already does a superb job of printing roots.


Hacking PuDB: Now an even better Python debugger

August 8, 2011

Readers of this blog may remember last year when I wrote about this awesome visual console Python debugger called PuDB. I suggest you read that post if you haven’t.

At the end of that post, I noted that Ondřej and I had hacked it to make the colors more livable. Well, a couple of weeks ago, GitHub user jtriley sent me an email asking me to back port my changes.

A lot has changed since I wrote my blog post last year. PuDB now has an official mailing list and an official GitHub repo.

So I deleted my GitHub clone and reforked from the official version.

A lot has also changed in the official code. Andreas had added config support, including a built-in prefs dialog that lets you set a few settings: the ability to turn on or off line numbers and the ability to change themes.

So I took the new code and added my theme as an official theme. This was pretty straightforward to do.

But then, I got a little carried away.

I noticed that it was difficult to choose a theme with the built-in prefs window because you had to close and reopen the window each time you made a change. So I added code to make it auto-update your changes as you made them.

Then I went back and looked at my original blog post and looked at the things that I didn’t like. There were two things. First, the default stringifier for variables is type, which is completely useless. It is the default because type is very fast and stable to compute. I had previously hacked this to be str, but now that there was an official config file with a prefs dialog, I figured it should go there.

So I added support to change this setting. But this wasn’t enough for me. I also added the ability to define your own custom stringifier. You just create a Python file that defines a function called pudb_stringifier(obj), which converts obj into the desired string representation. I included an example file that gives a fancy example that uses signals to compute the string value, but times out after one second and falls back to the type. This alleviates one of the problems of using str, which is that it can be slow for objects with large string expressions, especially SymPy objects, where sometimes the printer can be slow.
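A custom stringifier along the lines described above might look like the following. This is a sketch, not the example file shipped with PuDB: only the function name pudb_stringifier(obj) comes from the post, while the timeout parameter and the helper names are assumptions. It relies on SIGALRM, so it only works on Unix-like systems and in the main thread.

```python
import signal


class _StringifyTimeout(Exception):
    """Raised by the alarm handler when str() takes too long."""


def _on_alarm(signum, frame):
    raise _StringifyTimeout()


def pudb_stringifier(obj, timeout=1.0):
    """Return str(obj), falling back to the type name after `timeout` seconds."""
    old_handler = signal.signal(signal.SIGALRM, _on_alarm)
    signal.setitimer(signal.ITIMER_REAL, timeout)
    try:
        return str(obj)
    except _StringifyTimeout:
        return type(obj).__name__
    finally:
        # Cancel any pending alarm and put the previous handler back.
        signal.setitimer(signal.ITIMER_REAL, 0)
        signal.signal(signal.SIGALRM, old_handler)
```

Restoring the previous handler in the finally block keeps the stringifier from interfering with any other alarm handling in the program being debugged.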

The second thing I didn’t like was that although you can change the width of the right-hand side bar, you could not change the relative heights of the variables, stack, and breakpoints boxes. I never use breakpoints, and rarely use the stack, so I would prefer to have those smaller and the variables larger. So I implemented it so that the [ and ] keys make the selected view smaller or larger. This information is all saved in the config file, so it’s remembered when you close and reopen PuDB.

There was one other thing that I didn’t like, which was a change since my last blog post that reversed the order of the stack variables from what it was. It used to be most recent at the bottom, but it was changed to most recent at the top. This perhaps makes more sense, but the buttons to move around the stack, u and d, were still the same: u moves down the stack (i.e., less recent), and d moves up. These keys were already well established (indeed, they are the same keys used in Python’s built-in debugger pdb), so I added a setting to change the stack order. This was an easy change to make at this point, as I was already well acquainted with the settings code, and only two lines of code needed to be changed when the setting changed. Like all other settings, this uses the cool magic that changes the setting in real time, so you can see the effect without closing the settings window.

Then someone on the mailing list requested a feature that I realized I also wanted, the ability to wrap variables. Previously, any variable that was longer than the variable view would just be cut off. You could make it wider, but that only helped a little bit. Otherwise, if you wanted to see the whole variable, you had to open IPython by pressing ! and view it there.

So, I implemented this. This was definitely the hardest thing to implement. I found out that it’s ironically very difficult to debug PuDB itself. You can’t run PuDB inside of PuDB if PuDB crashes, as both instances will just crash. Also, PuDB eats any print statements. The solution, suggested by PuDB author Andreas Klöckner, was to get the ttys file of another terminal (e.g., /dev/ttys012) and write the output to that.
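The trick of writing debug output to another terminal can be sketched as a small helper. The name debug_print is made up for this example, and /dev/ttys012 is just the sample device from above; run the tty command in the other terminal to find the real path on your machine.

```python
def debug_print(message, tty_path="/dev/ttys012"):
    """Write a debug line to another terminal, bypassing PuDB's screen."""
    with open(tty_path, "w") as tty:
        tty.write(message + "\n")
```

Calling debug_print("reached the wrapping code") from inside PuDB's own source would then show the message in the second terminal instead of having it swallowed by the debugger's display.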

I also made it so that non-wrapped variables show ... at the end, at Andreas’s suggestion. I wanted to use the Unicode … character, but this was not working at all. I discovered how much of a mess Unicode really is in Python 2. The problem has something to do with … being a three byte character, and I think it also has to do with the color codes that urwid uses. I’ll try it again once PuDB is ported to Python 3, but for now, we are going to have to make do with the three ASCII dots.

The wrapping code is waiting for merge, but the rest are already in. Here is a screen shot demonstrating some of the things I did:

[Screenshot: PuDB debugger window]

Things I implemented that you can see here:

– The midnight theme.
– The stack and breakpoints views have been shrunk.
– The variables are wrapped.
– Wrapping for the variable fourhundred has been turned off (you can turn wrapping on or off on a per-variable basis by selecting the variable and pressing w). Notice that there is an ellipsis at the end to note it has been cut off.
– Nested variables now have | before them, to distinguish them from wrapped variables, which are also indented. This change may or may not be accepted by Andreas.

Here’s a screen shot showing the prefs window. I did not implement this, but I did implement all but the first two preferences in the window. I’ve made my window tall so you can see all the options. You really have to get the code and try it to see the auto-update awesomeness. You can open the prefs window by pressing Ctrl-p (this was not at all obvious to me the first time I used it, so I also submitted a patch that makes it open the first time you use PuDB).

[Screenshot: PuDB prefs window]

So if you’re not already using this awesome Python debugger, you should. You can pip install pudb, or fork it at GitHub.

Running it in your code is very easy. Just add

import pudb;pudb.set_trace()

in your code wherever you want to set a break point, or you can do python -m pudb.run script.py.

This awesome tool has increased my productivity tenfold since I discovered it, and has helped me track down bugs that would otherwise have been extremely difficult if not impossible to find. And now, it’s just better.

PuDB uses the urwid library to do all its console GUI magic. This library makes it pretty easy to do a lot of stuff. For example, it automatically does relative sizing of widgets, so when you resize the variables, stack, or breakpoints views, you are actually changing the relative size of each, not the size in characters. This makes it portable across any terminal size. The library also made coding the prefs window auto-update magic very easy.
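To illustrate what relative sizing means, here is a toy calculation that turns weights into row counts for a given terminal height. It is not urwid's actual layout algorithm, and split_rows is a hypothetical helper written only for this sketch.

```python
def split_rows(total_rows, weights):
    """Divide total_rows among boxes in proportion to their weights."""
    total_weight = sum(weights)
    rows = [total_rows * w // total_weight for w in weights]
    # Integer division can leave a few rows over; hand them out in order.
    leftover = total_rows - sum(rows)
    for i in range(leftover):
        rows[i % len(rows)] += 1
    return rows


# Variables box weighted twice as heavily as stack and breakpoints.
print(split_rows(60, [2, 1, 1]))
print(split_rows(24, [2, 1, 1]))
```

The same weights give the same proportions on a tall or a short terminal, which is the property that makes saved sizes portable.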

Also, I just want to note that git and GitHub make collaboration like this very easy. I just forked his project, made some improvements, and submitted them as pull requests. Then it was easy to discuss the changes. If the code had not been on GitHub and especially if it had not been in git, I probably would have never bothered to submit my contributions upstream. I highly recommend that every open source project use git and GitHub.