The Risch Algorithm: Part 1

June 30, 2010

My work this week isn’t very interesting, even insomuch as my work any week is interesting, so this week I have elected to start a series of blog posts about the Risch Algorithm in general. I will start out with the basics in this post.

Anyone who has taken Calculus knows a handful of heuristics to calculate integrals. u-substitution, partial fractions, integration by parts, trigonometric substitution, and table integration are a few of the more popular ones. These are general enough to work for most integrals that are encountered in problems in Physics, Engineering, and so on, as well as most of those generated by solving differential equations from the same fields. But these fall short in a couple of ways. First off, they are just heuristics. If they fail, it does not mean that no integral exists. This means that they are useless for proving that certain functions, such as e^{-x^2}, do not have elementary anti-derivatives, no matter how hard you try to find them. Second, they work for only relatively simple functions. For example, suppose you have a rational function in \log{x} and x. An example would be \frac{(\log{x})^2 + 2\log{x} + x^2 + 1}{x\log{x} + 2x^3}. We are not interested in integrating this function, but rather in finding it back given its derivative, - \frac{1 + 7 x^{2} \log{x} + \log{x} + (\log{x})^2 + 3 x^{2} + 6 x^{2} (\log{x})^2 + (\log{x})^3 + 2 x^{4}}{4 x^{4} \log{x} + x^{2} (\log{x})^2 + 4 x^{6}}. The only method I named above that would come even close to being applicable to this integrand is partial fractions. This requires multivariate partial fraction decomposition (with respect to x and \log{x}), and gives -{\frac {2\,{x}^{2}+1+\log{x} }{{x}^{2}}}+{\frac {-1+8\,{x}^{4}-16\,{x}^{6}-{x}^{2}}{ \left( \log{x} +2\,{x}^{2} \right) ^{2}{x}^{2}}}+{\frac {-3\,{x}^{2}+12\,{x}^{4}-1}{ \left( \log{x} +2\,{x}^{2} \right) {x}^{2}}}, which brings us no closer to a solution.

The reason that I started with a function and then computed its derivative was to show how easy it is to come up with a very complicated function that has an elementary anti-derivative. Therefore, we see that the methods from calculus are not the ones to use if we want an integration algorithm that is complete. The Risch Integration Algorithm is based on a completely different approach. At its core lies Liouville’s Theorem, which gives us the form of any elementary anti-derivative. (I should point out that heuristics like these are still useful in a computer algebra system such as SymPy, as fast preprocessors to the full integration algorithm.)

The Risch Algorithm works by doing polynomial manipulations on the integrand, an approach that is entirely deterministic (non-heuristic) and gives us the power of all the theorems of algebra, allowing us to actually prove that an elementary anti-derivative does not exist when that is the case. To start off, we have to look at derivations. As I said, everything with the Risch Algorithm is looked at algebraically (as opposed to analytically). The first thing to look at is the derivative itself. We define a derivation as any function D on a ring R that satisfies two properties:

1. D(a + b) = Da + Db (Sum Rule),
2. D(ab) = aDb + bDa (Product Rule)

for any a, b \in R. Furthermore, define the set of constant elements as Const_D(R) = \{a \in R\textrm{ such that }Da = 0\}. From just these two rules, you can prove all the rules from calculus, such as the power rule and the quotient rule. Defining things algebraically lets us avoid analytic problems, such as discontinuities and the need to prove convergence all the time. Another problem from analysis is the multivalued nature of certain functions, namely the natural logarithm. We get around this by defining \log{a} as a function satisfying D\log{a} = \frac{Da}{a}, for a \neq 0. From this definition we can prove the famous logarithmic identities \log{ab} = \log{a} + \log{b} and \log{a^n} = n\log{a} at the level of logarithmic derivatives, again using only the two rules for a derivation given above. For example, D\log{ab}=\frac{Dab}{ab}=\frac{aDb + bDa}{ab} = \frac{bDa}{ab} + \frac{aDb}{ab} = \frac{Da}{a} + \frac{Db}{b}=D\log{a} + D\log{b}=D(\log{a} + \log{b}).
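To make this concrete, here is a quick sanity check of the two axioms and the logarithmic identity in SymPy, using the ordinary d/dx on expressions as the derivation D (this is just an illustration for this post, not part of the algorithm):

from sympy import symbols, log, diff, simplify

x = symbols('x', positive=True)
a = x**2 + 1
b = x**3 + x
D = lambda f: diff(f, x)

assert D(a + b) == D(a) + D(b)                               # sum rule
assert simplify(D(a*b) - (a*D(b) + b*D(a))) == 0             # product rule
assert simplify(D(log(a*b)) - (D(log(a)) + D(log(b)))) == 0  # log identity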

The above definition for the natural logarithm gives the first insight into how the integration algorithm works. We define transcendental functions in terms of their derivatives. So if t = e^x, then Dt/t = 1. We can define all of the trigonometric functions in terms of e^x and \log{x} if we use \sqrt{-1}, but we can also avoid this. For example, if t = \tan{x}, then Dt = 1 + t^2 because \frac{d}{dx}\tan{x} = \sec^2{x} = 1 + \tan^2{x}.

We say that t\in K is a monomial over the field k with respect to a derivation D if it satisfies

1. t is transcendental over k,
2. Dt\in k[t].

The first condition is necessary because we are only going to deal with the transcendental version of the Risch Algorithm (the algebraic case is solved too, but the solution method is quite different, and I am not implementing it this summer). The second condition just says that the derivative of t is a polynomial in t with coefficients that are rational functions in x. The functions I mentioned above all satisfy these properties over k = \mathbb{Q}(x). Theorems in analysis show that \log{x}, e^x, and \tan{x} are all transcendental over \mathbb{Q}(x). This is actually the only use of analysis that we make in the integration algorithm. Also, we see that if t_1=\log{x}, t_2=e^x, and t_3=\tan{x}, then Dt_1=\frac{1}{x}, Dt_2=t_2, and Dt_3=1 + t_3^2, which are all polynomials in their respective t_i with coefficients that are rational functions in x. In the algorithm, K is actually a tower of monomial extensions of \mathbb{Q}, so t_n is a monomial over \mathbb{Q}(x, t_1, \dots, t_{n-1}). This allows us to work with functions like e^{\tan{x}}. We can’t make t=e^{\tan{x}} directly because \frac{d}{dx}e^{\tan{x}} = (1 + \tan^2{x})e^{\tan{x}} is not a polynomial in t (it also contains \tan{x}). But if we let t_1 be such that Dt_1=1 + t_1^2, i.e., t_1=\tan{x}, then we can let t_2 be such that Dt_2=(1 + t_1^2)t_2, i.e., t_2=e^{\tan{x}}. Remember that the t_i are all “functions” of x, but there is no need to write t=t(x) as long as we remember that Dt\neq 0, i.e., t\not \in Const_D(K). This is another advantage of using algebraic over analytic methods; it allows us to reduce an integral down to a rational function in the “symbols” x and t_1, t_2, \dots, t_n. By convention, we make the first extension t_0 such that Dt_0=1, i.e., t_0=x. I will just call it x here instead of t_0, to avoid confusion.
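Here is a small sketch of such a derivation in SymPy, treating t1 and t2 as ordinary symbols and defining D by its action on the generators, with Dt1 = 1 + t1**2 and Dt2 = (1 + t1**2)*t2 as above (the function D below is written just for this post, not SymPy API):

from sympy import symbols, tan, exp, diff, simplify
from sympy.abc import x

t1, t2 = symbols('t1 t2')

def D(f):
    # D is determined by Dx = 1, Dt1 = 1 + t1**2, Dt2 = (1 + t1**2)*t2,
    # extended to all of Q(x, t1, t2) by the sum and product rules.
    return diff(f, x) + (1 + t1**2)*diff(f, t1) + (1 + t1**2)*t2*diff(f, t2)

# D agrees with d/dx once we substitute back t1 = tan(x), t2 = e**tan(x):
f = x*t2 + t1**2
s = [(t1, tan(x)), (t2, exp(tan(x)))]
assert simplify(D(f).subs(s) - diff(f.subs(s), x)) == 0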

This is the preparsing that I alluded to in an earlier post that I have not implemented yet. The reason that I haven’t implemented it yet is not just because I haven’t gotten around to it. We have to be careful when we build up the extension that each element is indeed transcendental over the already built-up field k. For example, although it appears transcendental, the function e^{\frac{1}{2}\log{(1 + x^2)}} is really algebraic, because it equals \sqrt{1 + x^2}. There are additional requirements, such as that each extension is not the derivative or logarithmic derivative of an element of k (see also the example I gave in the previous post). This is the part that I was talking about in my previous post that is not written out as much as the other algorithms in Bronstein’s book. So this is algorithmically solved, just like the rest of the algorithm, but it is non-trivial and may end up being the hardest part of the algorithm for me to implement, just because it will probably require the most figuring out on my part.
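A quick check of that hidden algebraic relation in SymPy (just an illustration; the real algorithm detects this structurally, not by simplification):

from sympy import symbols, exp, log, sqrt, simplify

x = symbols('x', positive=True)
print(simplify(exp(log(1 + x**2)/2)))  # sqrt(x**2 + 1)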

So we can see that we can convert a transcendental integral, such as the one above, into a rational function in x and monomial extensions t_1, t_2, \dots, t_n. For example, the above integrand would become - \frac{1 + t + t^{2} + 3 x^{2} + 6 t^{2} x^{2} + 7 t x^{2} + t^{3} + 2 x^{4}}{t^{2} x^{2} + 4 t x^{4} + 4 x^{6}}. We then perform certain polynomial manipulations on this integrand, using the fact that Dx=1 and Dt=\frac{1}{x}. For the transcendental case of the Risch Algorithm, this is similar to the rational function integration that I outlined in this post, and has Liouville’s Theorem at its core. This is where I will start off next time.
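As a sanity check, we can verify in SymPy that this rewritten integrand really is the same function, by substituting t = \log{x} back in and comparing against the derivative from the beginning of the post:

from sympy import symbols, log, diff, simplify
from sympy.abc import x

t = symbols('t')
f_t = -(1 + t + t**2 + 3*x**2 + 6*t**2*x**2 + 7*t*x**2 + t**3 + 2*x**4)/ \
      (t**2*x**2 + 4*t*x**4 + 4*x**6)
F = (log(x)**2 + 2*log(x) + x**2 + 1)/(x*log(x) + 2*x**3)
assert simplify(f_t.subs(t, log(x)) - diff(F, x)) == 0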


Quick Update

June 26, 2010

I’ve spent most of this week sitting in a car, so while I have been able to do some work, I haven’t had much time to write up a blog post. So, to comply with Ondrej’s rule, here is a quick update.

I have been working my way through Bronstein’s book. I finished the outer algorithmic layer of the implementation. Basically, the algorithm does polynomial manipulations on the integrand. It first reduces the integral into smaller integrals, until it gets to an integral where a subproblem must be solved to finish it. The subproblem that must be solved differs depending on the type of the integral. The first one that comes up in Bronstein’s text is the Risch Differential Equation, which arises from the integration of exponential functions. (I will explain all of these things in more detail in a future blog post). At this point, the algorithms begin to recursively depend on each other, requiring me to implement more and more algorithms at a time in order for each to work. To make things worse, a very fundamental set of algorithms is only described in the text, not given in pseudo-code, so I have had to figure those things out. These are the algorithms to determine if a differential extension is the derivative or logarithmic derivative of an element of the already built-up extension. Again, I will explain better in a future post, but the idea is that you replace elements in an integrand with dummy variables, where each element has to be transcendental over the previous elements. So if you have \int (e^x + e^{x^2} + e^{x + x^2})dx, and you set t_1 = e^x and t_2 = e^{x^2} (Dt_1 = t_1 and Dt_2 = 2xt_2), then you cannot make t_3 = e^{x + x^2}, because e^{x + x^2} = t_1t_2. The ability to determine if an element is a derivative or a logarithmic derivative of an element of the already built differential extension is important not only for building up the extension for the integrand (basically the preparsing), but also for solving some of the cases of the subproblems, such as the Risch Differential Equation problem.
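As a small illustration of the kind of dependence that has to be detected, SymPy confirms the relation from the example above (again, the real algorithm finds this structurally rather than by simplification):

from sympy import exp, simplify
from sympy.abc import x

# e**(x + x**2) is not a new transcendental over the extension: it is t1*t2.
assert simplify(exp(x + x**2) - exp(x)*exp(x**2)) == 0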

So I am still figuring out some of the details on that one. The description in the book is pretty good (this is probably the best written math textbook I have ever seen), but I still have had to work out some of the mathematical details on paper (which is something I enjoy anyway, though it can be more stressful). Hopefully by next time I will have some code that is working well enough to actually demonstrate solving some complex integrals (with manual preparsing), and, even more excitingly, to prove that some non-elementary integrals, such as the classic \int e^{-x^2}dx, are indeed so. And I also hope to have some more explanations of how the Risch algorithm works in future posts.


Strange Python Behavior (can someone please explain to me what is going on here?)

June 16, 2010

Every once in a while, seemingly really simple Python code does something completely unexpected for me. Look at the following snippet of Python code. This is run straight from the 2.6.5 interpreter, with no other commands executed. Do you notice anything strange?

$python
Python 2.6.5 (r265:79359, Mar 24 2010, 01:32:55) 
[GCC 4.0.1 (Apple Inc. build 5493)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> l = lambda i: a[i]
>>> l
<function <lambda> at 0x39e7f0>
>>> H = [(1, 2), (3, 4)]
>>> [l(0) + l(1) for a in H]
[3, 7]

Did you spot it? Here is a hint. Running a different but similar session:

$python
Python 2.6.5 (r265:79359, Mar 24 2010, 01:32:55) 
[GCC 4.0.1 (Apple Inc. build 5493)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> l = lambda i: a[i]
>>> l
<function at="" 0x39e7f0="">
>>> l(0)
Traceback (most recent call last):
  File "", line 1, in 
  File "", line 1, in 
NameError: global name 'a' is not defined

Do you see it now? I defined the lambda function l in terms of a without first defining a! And furthermore, it just works when a is defined. This is actually independent of the fact that we are working in a list comprehension, as this continuation of the previous session shows:

>>> a = [3, 4, 5]
>>> l(0)
3

But I want to expand on the list comprehension example, because there are even more bizarre things going on here. Starting a new session again:

$python
Python 2.6.5 (r265:79359, Mar 24 2010, 01:32:55) 
[GCC 4.0.1 (Apple Inc. build 5493)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> l = lambda i: a[i]
>>> H = [(1, 2), (3, 4)]
>>> [l(0) + l(1) for a in H]
[3, 7]
>>> (l(0) + l(1) for a in H)
<generator object="" at="" 0x3a4350="">
>>> list((l(0) + l(1) for a in H))
[7, 7]

So, if you are astute and have been using Python for long enough, you should be able to catch what is going on here. If you don’t know, here is a hint (continuation of previous session):

>>> a
(3, 4)

So, as you may know, in Python 2.6 and earlier, list comprehension index variables “leak” into the local namespace. The strange thing here is that although the list comprehension would reset it, the generator version does not. Well, normally, it does do this:

>>> x = 1
>>> [x for x in range(10)]
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> x
9
>>> del x
>>> list((x for x in range(10)))
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> x
Traceback (most recent call last):
  File "", line 1, in 
NameError: name 'x' is not defined
>>> x = 1
>>> list((x for x in range(10)))
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> x
1

So the above bit has something to do with the way the lambda function was defined in terms of a. By the way, here is what happens with the generator comprehension (is that what these are called?) if a is not defined:

>>> del a
>>> list((l(0) + l(1) for a in H))
Traceback (most recent call last):
  File "", line 1, in 
  File "", line 1, in 
  File "", line 1, in 
NameError: global name 'a' is not defined

This is how I discovered this. I had defined a lambda function using a variable that was later used, without my realizing it, as the index variable of a list comprehension. But then I tried converting this into a generator comprehension to see if it would be faster, and got the above error.

Finally, since the “feature” of leaking list comprehension loop variables into the local namespace is going away in Python 3, I expected things to behave at least a little differently in Python 3. I tried the above in a Python 3.1.2 interpreter and got the following:

$python3
Python 3.1.2 (r312:79147, Mar 23 2010, 22:02:05) 
[GCC 4.2.1 (Apple Inc. build 5646) (dot 1)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> l = lambda i: a[i]
>>> l
<function at="" 0x100585a68="">
>>> H = [(1, 2), (3, 4)]
>>> [l(0) + l(1) for a in H]
Traceback (most recent call last):
  File "", line 1, in 
  File "", line 1, in 
  File "", line 1, in 
NameError: global name 'a' is not defined
>>> list((l(0) + l(1) for a in H))
Traceback (most recent call last):
  File "", line 1, in 
  File "", line 1, in 
  File "", line 1, in 
NameError: global name 'a' is not defined

So in Python 3, the list comprehension and the generator comprehension act the same, which is not too surprising. I guess I should recode that piece of code to make it future-proof, although this doesn’t seem easy at the moment, and it may require converting a one-liner into a six-liner. If you are interested, the piece of code is here.

So can anyone provide any insight into what is going on with that lambda function? Running it with the -3 switch to python2.6 didn’t give any warnings related to it.

Update: As I noted in a comment, I figured out how to make this future-proof. I need to convert it from

def residue_reduce_derivation(H, D, x, t, z):
    lambdafunc = lambda i: i*derivation(a[1], D, x, t).as_basic().subs(z, i)/ \
         a[1].as_basic().subs(z, i)
    return S(sum([RootSum(a[0].as_poly(z), lambdafunc) for a in H]))

to

def residue_reduce_derivation(H, D, x, t, z):
    return S(sum((RootSum(a[0].as_poly(z), lambda i: i*derivation(a[1], D, x, t).as_basic().subs(z, i)/ \
        a[1].as_basic().subs(z, i)) for a in H)))

Thanks to all the commenters for the explanations.

Also, you may have noticed that I discovered that if you use [code] instead of <code>, you get these nicer code blocks that actually respect indentation! Now I just need to figure out how to make them syntax highlight Python code.

Update 2: [code='py'] colors it! Sweet!

Update 3: I just discovered that SymPy has a Lambda() object that handles this better. In particular, it pretty prints the code, and is what is already being used for RootSum() in the rational function integrator, at least in Mateusz’s polys9.

>>> integrate(1/(x**5 + 1), x)
log(1 + x)/5 + RootSum(625*_t**4 + 125*_t**3 + 25*_t**2 + 5*_t + 1, Lambda(_t, _t*log(x + 5*_t)))                                                   

Still, this has been a very good learning experience.


A Weeklog

June 15, 2010

These seem to be all the rage these days, so I figured, why not jump on the bandwagon:


Aaron-Meurer:doc aaronmeurer 20100615153531 (integration)$ git weekreport
Aaron Meurer (20):
Fix some bugs in Poly
Make Poly(sin(x)/x*t, t, domain='EX').clear_denoms() work
Fix integrate to work correctly with heurisch.py
Use more efficient gcdexdiophantine() algorithm
Add support for taking the derivation over the coefficient domain in risch.py
Add (but do not yet use) splitfactor_sqf() in risch.py
Add polynomial_reduce() to risch.py
Add tests for algorithms in risch.py in a new test_risch.py file
Only allow coercion to larger domains
Allow coercion from ZZ(a) to ZZ(a, b)
Fix doctest in new heurisch.py file
Add residue_reduce()
Formatting fixes in docstrings in sympy/polys/algebratools.py
Add includePRS option to resultant functions
Add permute method to DMP
Add a test for the includePRS option of resultant()
Have residue_reduce() make S_i monic
Rewrite polynomial_reduce() non-recursively
Add integrate_hypertangent_polynomial()
Add integrate_nonlinear_no_specials()


Integration of rational functions

June 11, 2010

So for this week’s blog post I will try to explain how the general algorithm for integrating rational functions works. Recall that a rational function is the quotient of two polynomials. We know that using common denominators, we can convert the sum of any number of rational functions into a single quotient, \frac{a_nx^n + a_{n-1}x^{n-1} + \cdots + a_2x^2 + a_1x + a_0}{b_mx^m + b_{m-1}x^{m-1} + \cdots + b_2x^2 + b_1x + b_0}. Also, using polynomial division with remainder, we can rewrite any rational function as the sum of a polynomial and a quotient of two polynomials such that the degree of the numerator is less than the degree of the denominator (F(x) = \frac{b(x)}{c(x)} = p(x) + \frac{r(x)}{c(x)}, with deg(r) < deg(c)). Furthermore, we know that the representation of a rational function is not unique. For example, \frac{(x + 1)(x - 1)}{(x + 2)(x - 1)} is the same as \frac{x + 1}{x + 2} except at the point x = 1, and \frac{(x - 1)^2}{x - 1} is the same as x - 1 everywhere. But by using Euclid’s algorithm for finding the GCD of polynomials on the numerator and the denominator, along with polynomial division on each, we can cancel all common factors to get a representation that is unique (assuming we expand all factors into one polynomial).
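These normalizations are all standard polynomial operations. In SymPy, they look something like this (just an illustration of the steps above, not part of the integration code):

from sympy import cancel, div
from sympy.abc import x

# Cancel common factors to get the canonical representation:
assert cancel((x + 1)*(x - 1)/((x + 2)*(x - 1))) == (x + 1)/(x + 2)

# Split off the polynomial part with polynomial division with remainder:
p, r = div(x**3 + x + 1, x**2 + 1, x)
assert (p, r) == (x, 1)  # (x**3 + x + 1)/(x**2 + 1) == x + 1/(x**2 + 1)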

We know from calculus that the integral of any rational function consists of three parts: the polynomial part, the rational part, and the logarithmic part (consider arctangents as complex logarithms). The polynomial part is just the integral of p(x) above. The rational part is another rational function, and the logarithmic part is a sum of logarithms of the form a\log{s(x)}, where a is an algebraic constant and s(x) is a polynomial (note that if s(x) is a rational function, we can split the term into two logarithms of polynomials using the log identities).

To find the rational part, we first need to know about square-free factorizations. An important result in algebra is that any polynomial with rational coefficients can be factored uniquely into irreducible polynomials with rational coefficients, up to multiplication by a non-zero constant and reordering of factors, similar to how any integer can be factored uniquely into primes up to multiplication by 1 and -1 and reordering of factors (technically, it is with coefficients from a unique factorization domain, of which the rationals are a special case, and up to multiplication by a unit, which for the rationals is any non-zero constant). A polynomial is square-free if this unique factorization does not contain any factor with multiplicity greater than 1. Another theorem from algebra tells us that irreducible polynomials over the rationals do not have any repeated roots, and so given this, it is not hard to see that a polynomial being square-free is equivalent to it not having repeated roots.

A square-free factorization of a polynomial is a list of polynomials, P_1P_2^2 \cdots P_n^n, where each P_i is square-free (in other words, P_1 is the product of all the irreducible factors that appear with multiplicity 1, P_2 is the product of all the irreducible factors that appear with multiplicity 2, and so on). There is a relatively simple algorithm to compute the square-free factorization of a polynomial, which is based on the fact that taking gcd(P, \frac{dP}{dx}) reduces the power of each irreducible factor of P by 1. For example:

\gcd(P_1P_2^2 \cdots P_n^n, \frac{d}{dx}(P_1P_2^2 \cdots P_n^n)) = P_2P_3^2 \cdots P_n^{n-1}.

It is not too hard to prove this using the product rule on the factorization of P. So you can see that by computing \frac{P}{gcd(P, \frac{dP}{dx})}, you can obtain P_1P_2\cdots P_n. Then, by recursively computing A_0 = P, A_1 = gcd(A_0, \frac{dA_0}{dx}), A_2 = gcd(A_1, \frac{dA_1}{dx}), …, and taking the quotient each time as above, we can find the square-free factors of P.
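In SymPy, the naive version of this looks something like the following (a sketch written for this post; SymPy’s built-in equivalent is sqf_list()):

from sympy import Poly, gcd, quo
from sympy.abc import x

def squarefree_decomposition(P):
    # Returns [P1, P2, ...] such that P == P1 * P2**2 * P3**3 * ...,
    # with each Pi square-free, by the repeated gcds described above.
    A = [Poly(P, x)]
    while A[-1].degree() > 0:
        A.append(gcd(A[-1], A[-1].diff(x)))
    # Q[i] is the product of the factors of multiplicity > i:
    Q = [quo(A[i], A[i + 1]) for i in range(len(A) - 1)]
    return [quo(Q[i], Q[i + 1]) for i in range(len(Q) - 1)] + [Q[-1]]

print(squarefree_decomposition((x + 1)*(x + 2)**2))
# [Poly(x + 1, x, domain='ZZ'), Poly(x + 2, x, domain='ZZ')]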

OK, so we know from the partial fraction decompositions we learned in calculus that if we have a rational function of the form \frac{Q(x)}{V(x)^n}, where V(x) is square-free, the integral will be a rational function if n > 1 and a logarithm if n = 1. We can use the partial fraction decomposition, which is easy to find once we have the square-free factorization of the denominator, to rewrite the remaining rational function as a sum of terms of the form \frac{Q}{V_k^k}, where each V_k is square-free. Because V is square-free, gcd(V, V')=1, so the Extended Euclidean Algorithm gives us B_0 and C_0 such that B_0V' + C_0V=1 (recall that if g is the gcd of p and q, then there exist a and b such that ap+bq=g; this holds for integers as well as for polynomials). Thus we can find B and C such that BV' + CV= \frac{Q}{1-k}. Multiplying through by \frac{1-k}{V^k}, \frac{Q}{V^k}=-\frac{(k-1)BV'}{V^k} + \frac{(1-k)C}{V^{k-1}}, which is equal to \frac{Q}{V^k} = \left(\frac{B'}{V^{k-1}} - \frac{(k-1)BV'}{V^k}\right) + \frac{(1-k)C-B'}{V^{k-1}}. You may notice that the term in the parentheses is just the derivative of \frac{B}{V^{k-1}}, so we get \int\frac{Q}{V^k}=\frac{B}{V^{k-1}} + \int\frac{(1-k)C - B'}{V^{k-1}}. This is called Hermite Reduction. We can recursively reduce the integral on the right hand side until k=1. Note that there are more efficient ways of doing this that do not actually require us to compute the partial fraction decomposition; there is a linear version due to Mack (this one is quadratic), and an even more efficient method, the Horowitz-Ostrogradsky Algorithm, that doesn’t even require a square-free decomposition.
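Here is the reduction step as a small SymPy sketch (hermite_step is a throwaway helper written for this post, not a SymPy API); the example reproduces the classic reduction of \int\frac{dx}{(x^2+1)^2}:

from sympy import gcdex, div, diff
from sympy.abc import x

def hermite_step(Q, V, k):
    # One Hermite reduction step for Integral(Q/V**k, x), with V square-free
    # and k > 1.  Returns (B, R) such that
    # Integral(Q/V**k) == B/V**(k-1) + Integral(R/V**(k-1)).
    Vp = diff(V, x)
    s, t, g = gcdex(Vp, V, x)      # s*V' + t*V == g == 1, since gcd(V, V') = 1
    q, B = div(s*Q/(1 - k), V, x)  # s*Q/(1-k) == q*V + B, with deg(B) < deg(V)
    C = t*Q/(1 - k) + q*Vp
    return B, (1 - k)*C - diff(B, x)

B, R = hermite_step(1, x**2 + 1, 2)
print(B)  # x/2
print(R)  # 1/2
# So Integral(1/(x**2+1)**2) == (x/2)/(x**2+1) + Integral((1/2)/(x**2+1)).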

So when we have finished the Hermite Reduction, we are left with integrating rational functions with purely square-free denominators. We know from calculus that these will have logarithmic integrals, so this is the logarithmic part.

First, we need to look at resultants and PRSs. The resultant of two polynomials is defined as the product of the differences of their roots, i.e., resultant(A, B) = \prod_{i=1}^n\prod_{j=1}^m (\alpha_i - \beta_j), where A = (x - \alpha_1)\cdots(x - \alpha_n) and B = (x - \beta_1)\cdots(x - \beta_m) are monic polynomials split into linear factors. Clearly, the resultant of two polynomials is 0 if and only if the two polynomials share a root. It is an important result that the resultant of two polynomials can be computed from only their coefficients, by taking the determinant of the Sylvester Matrix of the two polynomials. However, it is more efficiently calculated using a polynomial remainder sequence (PRS) (sorry, there doesn’t seem to be a Wikipedia article), which in addition to giving the resultant of A and B, also gives a sequence of polynomials with some useful properties that I will discuss below. A polynomial remainder sequence is a generalization of the Euclidean algorithm where in each step, the remainder R_i is multiplied by a constant \beta_i. The Fundamental PRS Theorem shows how to compute specific \beta_i such that the resultant can be calculated from the polynomials in the sequence.
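For example, in SymPy:

from sympy import resultant
from sympy.abc import x

# The resultant is zero exactly when the two polynomials share a root:
print(resultant((x - 1)*(x - 2), (x - 2)*(x - 3), x))  # 0  (shared root x = 2)
print(resultant((x - 1)*(x - 2), (x - 3)*(x - 4), x))  # 12 (no shared root)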

Then, if we have \frac{A}{D} left over from the Hermite Reduction (so D is square-free), let R=\mathrm{resultant}_x(A-t\frac{dD}{dx}, D), where t is a new variable, and let \alpha_i be the distinct roots of R. Let p_i=\gcd(A - \alpha_i\frac{dD}{dx}, D). Then it turns out that the logarithmic part of the integral is just \alpha_1\log{p_1} + \alpha_2\log{p_2} + \cdots + \alpha_n\log{p_n}. This is called the Rothstein-Trager Algorithm.
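Here is a compact sketch of Rothstein-Trager in SymPy (rothstein_trager is a throwaway function written for this post, and it assumes the roots of the resultant can be computed explicitly):

from sympy import Symbol, resultant, roots, gcd, log, diff, Add
from sympy.abc import x

def rothstein_trager(A, D):
    # Logarithmic part of Integral(A/D, x), for D square-free,
    # deg(A) < deg(D), and gcd(A, D) == 1.
    t = Symbol('t')
    R = resultant(A - t*diff(D, x), D, x)  # a polynomial in t
    return Add(*[a*log(gcd(A - a*diff(D, x), D)) for a in roots(R, t)])

print(rothstein_trager(1, x**2 - 1))  # log(x - 1)/2 - log(x + 1)/2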

However, this requires finding the prime factorization of the resultant, which can be avoided if a more efficient algorithm called the Lazard-Rioboo-Trager Algorithm is used. I will talk a little bit about it. It works by using subresultant polynomial remainder sequences.

It turns out that the above \gcd(A-\alpha\frac{dD}{dx}, D) will appear in the PRS of D and A-t\frac{dD}{dx}. Furthermore, we can use the PRS to immediately find the resultant R=\mathrm{resultant}_x(A-t\frac{dD}{dx}, D), which, as we saw, is all we need to compute the logarithmic part.

So that’s rational integration. I hope I haven’t bored you too much, and that this made at least a little sense. I also hope that it was all correct. Note that this entire algorithm has already been implemented in SymPy, so if you plug a rational function into integrate(), you should get back a solution. However, I describe it here because the transcendental case of the Risch Algorithm is just a generalization of rational function integration.

As for work updates, I found that the Poly version of the heuristic Risch algorithm was considerably slower than the original version, due to inefficiencies in the way the polynomials are currently represented in SymPy. So I have put that aside, and I have started implementing algorithms from the full algorithm. There’s not much to say on that front. It’s tedious work. I copy the algorithm from Bronstein’s book, then try to make sure that it is correct based on the few examples given and on the mathematical background given, and when I’m satisfied, I move on to the next one. Follow my integration branch if you are interested.

In my next post, I’ll try to define some terms, like “elementary function,” and introduce a little differential algebra, so you can understand a little bit of the nature of the general integration algorithm.


PuDB, a better Python debugger

June 4, 2010

So Christian Muise unwittingly just reminded me on IRC that I forgot to mention the main method that I used to learn how the heurisch function works in my last blog post. I usually only use a debugger when I have a really hard bug I need to figure out, when the print statements aren’t enough. The reason for this is that the debugger that I had been using, winpdb, is, well, a pain to use. There are so many little bugs, at least in Mac OS X, that it is almost not worthwhile to use it unless I have to. For example, restarting a script from the debugger doesn’t work. If I pass a point that I wanted to see, I have to completely close the winpdb window and restart it from the command line, which takes about half a minute. Also, winpdb uses its own variant of pdb, which seems to cause more problems than it solves (like bugging me about sympy importing pdb somewhere every time I start debugging).

But I really wanted to be able to step through the heurisch code to see exactly how it works, because many of the implementation details, such as gathering the components of an expression, will be similar if not exactly the same in the full algorithm. So I started my quest for a better debugger. For me, the ideal debugger is the C debugger in XCode. That debugger has saved me in most of my programming assignments in C. But it is only for C based languages (C, Objective-C, probably C++, …), not Python. So I did a Google search, and it turns out that there is a list of Python debuggers here. So I went through them, and I didn’t have to go far. The very first one, pudb, turns out to be awesome!

You can watch this screencast to get a better idea of the features, or even better install it and check them out. The debugger runs in the console, not in some half-hacked GUI (half-hacked is what any non-Cocoa GUI looks like in Mac OS X). The only down side to this is that you have to use the keyboard to do everything, but it ends up not being too bad. And you can press ‘?’ at any time to see the possible commands.

To install it, just do easy_install pudb. To run it, just create a script of what you want to debug, and do python -m pudb.run my-script.py and it just works! I have a line that says alias pudb='python -m pudb.run' in my .profile, which makes it even easier to run. If you want to set a break point in the code, you can either navigate there from within pudb by pressing ‘m’, or add a line that says from pudb import set_trace; set_trace() to the code (if you add the line to your code, you don’t even need to create a script; just execute the code in IPython, and when it hits that line, it will load the debugger).

Some cool features:

– IPython console. Just press ‘!’ to go to a console where you can manipulate variables from the executing namespace, and you can choose to use an IPython console.

– Very easy to navigate. You just need to know the keys ‘s’, ‘n’, and ‘t’.

– View the code from elsewhere than what is being run. Pressing ‘m’ lets you view all imported modules. You can easily view points on the stack by choosing them.

– If an exception is thrown, it catches it! This may sound obvious for a debugger, but it is one of the things that didn’t work very well in winpdb. You can view the traceback of the exception, and choose to restart without having to close and reopen the debugger. Actually, it asks you if you want to restart every time the script finishes too, which is also a great improvement over winpdb.

This is what it looks like. Click for a bigger picture:

This is where the heurisch algorithm hangs.

Some annoyances (in case Andreas Kloeckner reads this):

– The default display for variables is type, which is completely useless. I have to manually go through and change each to str so I can see what the variable is. Is there a way to change this default?

– It asks me every time if I want to use IPython. I always want to use IPython.

– This might be a Mac OS X Terminal bug, but when I execute a statement that takes a while to run, it doesn’t redraw the pudb window until it finishes. This means that stepping through a program “flashes” black from what is above pudb in the window, and if I run a statement that takes forever, I lose the ability to see where it is unless I keyboard interrupt. Fortunately, it catches keyboard interrupts, so I can still see the traceback.

– There is no way to resize the variables window, or to scroll sideways in it. If I want to see what a long variable expression is, I have to go to the IPython console and type it there.

Some of these might be fixable and I just don’t know it yet. But even with them, this is still an order of magnitude improvement over winpdb. Now I can actually use the debugger all the time in my coding, instead of just when I have a really tough bug and no other choice.

UPDATE:

The first two were trivial to fix in a fork of the repository (isn’t open source awesome?). So if those are bothering you too, check out my branches at http://github.com/asmeurer/PuDB. Maybe if I have some time I will make them global options using environment variables or something and see if Andreas wants to merge them back into the main repo.

As for the third one, I realized that it might be a good thing, because you can see anything that is printed. Still, I would prefer seeing both, if possible (and the black flashes are annoying).

UPDATE 2:

You can resize the side view by pushing +/-, though there doesn’t seem to be a way to, say, make the variables view bigger and the breakpoints view smaller.

UPDATE 3:

A while back Ondrej modified the code to have a different color theme, and I followed suit. See this conversation at GitHub. So now, instead of looking like a DOS terminal, PuDB for me looks like this:

PuDB XCode Midnight Theme Colors

These are exactly the same colors as my code in XCode (the editor I use) with the Midnight Theme. It’s pretty easy to change the colors to whatever you want. Right now, you have to edit the source, but Ondrej or I might someday make it so you can have themes.

Also, having used this all summer (it was a life-saver on multiple occasions, and I am sure it at least doubled my development speed in others), I have one additional gripe. It is too difficult to arrow up to the variable that you want to access in the variables view. It would be nice to have a page up/page down feature there.

UPDATE 4: PuDB has since improved a lot, including many fixes by myself. It now supports themes, saved settings, variable name wrapping, and more. See this followup post.


Update for this week

June 4, 2010

So I started writing up a blog post on how rational function integration works, but Ondrej wants a blog post every week by the end of the week, and I don’t think I would do it justice by rushing to finish it now (read: I’m too lazy to do it). So instead, I’ll just give a short post (if that’s possible for me) on what I have been doing this week.

I finished up writing doctests for the polynomials module for now (see issue 1949), so now this week I started looking at the integrator. In particular, I went through each of the 40 issues with the Integration label and added them to a test file that I can monitor throughout the summer to see my progress. It is the test_failing_integrals.py file in my Integration branch, where all my work will be going for the foreseeable future. So if you want to follow my work, follow that branch. Here are some observations from those issues:

– integrate() can’t handle almost all algebraic integrals (functions with square roots, etc.). It can handle the derivative of arcsin and arcsinh because of special code in heurisch.py, but that’s about it. Before I can do any work on the Algebraic Risch Algorithm, I will need to implement the transcendental algorithm, so I think my temporary solution for this may be to add pattern matching heuristics for some of the more common algebraic integrals (anyone know a good integral table?).

– I figured out why integrate hangs forever with some integrals, such as the one in issue 1441. Here is, in a nutshell, how the Heuristic Risch algorithm works: Take the integrand and split it into components. For example, the components of x*cos(x)*sin(x)**2 are [x, cos(x), sin(x)]. Replace each of these components with a dummy variable, so if x = x0, cos(x) = x1, and sin(x) = x2, then the integrand is x0*x1*x2**2. Also, compute the derivative of each component in terms of the dummy variables. So the derivatives of [x0, x1, x2] are [1, -x2, x1]. Then, using these, perform some magic to create some rational functions out of the component dummy variables. Then, create a candidate integral with a bunch of unknowns [A1, A2, …], which will be rational numbers, and a multinomial of the An’s and the xn’s that should equal 0 if the candidate integral is correct. Then, because the xn’s are not 0, and there is also some algebraic independence, you have that the An coefficients of each term must equal 0. So you get a system of linear equations in the An’s. You then solve these equations, and plug the values of the An’s into the candidate integral to give you the solution, or, if the system is inconsistent, then it cannot find a solution, possibly because there is no elementary one.
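Here is the substitution step in SymPy terms (just an illustration of the idea, not heurisch()’s actual internals):

from sympy import symbols, sin, cos
from sympy.abc import x

x0, x1, x2 = symbols('x0:3')

f = x*cos(x)*sin(x)**2
# Replace the components [x, cos(x), sin(x)] with dummy variables,
# substituting the functions before the bare x:
g = f.subs([(cos(x), x1), (sin(x), x2), (x, x0)])
print(g)  # x0*x1*x2**2

# The derivatives of the components, rewritten in the dummy variables:
derivs = {x0: 1, x1: -x2, x2: x1}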

Well, that over simplifies a lot of things, but the point I want to make is that the integral from issue 1441 creates a system of ~600 linear equations in ~450 variables, and solving that system is what causes the integration to hang. Also, as Mateusz, my mentor and the one who wrote the current integration implementation, pointed out, quite a bit of time is spent in the heurisch algorithm doing expansion on large Basic polynomials. When I say Basic polynomials, I mean that they are SymPy expressions, instead of Poly instances. Using Poly should speed things up quite a bit, so my next move will be to convert heurisch() to use Poly wherever applicable.

– There were a few bugs in the rational integration, which I fixed in my branch. The problem was in rational integrals with symbolic coefficients. Because the new polys are able to create polynomials using any expression as a generator, not just symbols, something like Poly(sin(y)*x, x) creates Poly(x*sin(y), x, domain='ZZ[sin(y)]'). But using the polynomial ring or fraction field creates problems with some things like division, whereas we really only want the domain to be EX (the expression domain) in this case. So this was not too difficult to fix, and you can see the fix in my integration branch.
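For example, with the new polys:

from sympy import Poly, sin
from sympy.abc import x, y

print(Poly(sin(y)*x, x))               # Poly(x*sin(y), x, domain='ZZ[sin(y)]')
print(Poly(sin(y)*x, x, domain='EX'))  # Poly(x*sin(y), x, domain='EX')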

– Some integrals will require a good implementation of special functions, such as the hypergeometric function, to work. Sometimes, you don’t want to know what the non-elementary integral looks like, but you just want to calculate a definite integral. The solution here is to use Meijer G-functions, which are on the list of things to possibly do at the end of the summer if I have time.

– Another bug that I plan on fixing (I haven’t done it yet, but I know how to do it, and it will be trivial) is this (issue 1888):

In [18]: print integrate(f(x).diff(x)**2, x)

2*D(f(x), x)*f(x)/3 - 2*x*D(f(x), x, x)*f(x)/3 + x*D(f(x), x)**2/3

The problem is in the step where it computes the derivative of the components: it tries to compute the derivative of f(x).diff(x) in terms of a dummy variable, but it reduces to 0 because diff(x2, x) == 0. Thus, it treats f(x) like something that has a zero third derivative, e.g., x**2 (and indeed, the answer above is correct for f(x) = x**2).

Well that’s it. I knew I couldn’t make a short blog post :). If you want to help, I have three branches that need review (1, 2, 3), and except for the last one, my work is based on top of the other two, so none of my integration work can be pushed in until those two are reviewed positively.