How to get both 32-bit and 64-bit Python in Snow Leopard

November 13, 2009

We had some discussion on one of the Python issues about whether my Python in Snow Leopard should be 32-bit or 64-bit. I originally thought that it was tied to what the kernel was, but I turned out to be wrong.

From what I discovered, the important thing is what the Python was compiled as. You can tell what your Python has been compiled as by running:

>>> import sys
>>> from math import log
>>> log(sys.maxsize)

If this is just under 31, then it is 32 bit. If it returns 63, then it is 64. An easier way to tell it to run:

>>> 2**40

If you get 1099511627776L, then you have 32-bit Python, if you get 1099511627776, you have 64-bit Python (notice that the number is long in 32-bit Python, because it is larger than maxint).

This test won’t work in Python 3 because all integers are “long” by default, but the first part will still work.

So why does this matter, you ask? Well, aside from the fact that much longer numbers are not long (anything less than 2**63 – 1 = 9223372036854775807), there is the issue of hashing.

In 64-bit Python:

>>> hash('a')
12416037344

but in 32-bit Python

>>> hash('a')
-468864544

SymPy uses hash values to order arguments, so often it happens that behavior in one architecture will not show up in the other. These problems are often hard to track and fix, but the worst is when things work fine on the machine you are working on. This actually happened to me with my GSoC project. I was renumbering the arbitrary constants in the printing order in an expression, but it turned out that the printing order of an expression can be dependent on .args order, so I had to modify the tests to canonize the numbering first.

So here comes the crux of the post. It turns out that on Mac OS X, if you install the binary from python.org (Mac Installer Disk Image), this installs a 32-bit Python (for compatibility reasons) in /Library/Frameworks/Python.framework/Versions/2.6/bin/python2.6
. However, if you install Python using 64-bit fink in Snow Leopard, it will compile it from source into 64-bit, and install it into /sw/bin/python2.6.

So now I have an easy way to test both architectures without having to ssh into some other machine, which was what I was doing before.


Google Summer of Code 2009 Wrap Up

September 7, 2009

Sorry about the extreme delay with this. I of course have been busy with classes.

Note that this will just be a summary of the summer, with my comments looking back on it. If you want more details on each individual thing that I implemented, look back on my previous blog posts.

Let me start from the beginning. Around late February to early March of this year, I discovered the existence of Google Summer of Code. I knew that I wanted to do some kind of work this summer, preferably an internship, so it piqued my interest. At that time, the mentoring organizations were still applying for GSoC 2009, so I could only look at the ones from 2008. Most of them were either Linux things or Web things, neither of which I had any experience in or am I much interested in. I took a free course in Python at my University the previous semester, and it was the programming language that I knew best at the time. I had learned some Java in my first semester CS class (did I mention that this was my first year at college?), and I hated it, and I was still learning C for my second semester CS class. So I looked at what the Python Foundation had to offer. I am a double major in math and computer science, so I looked under the math/science heading. That’s when I saw SymPy.

I should not that I have been ahead in Math. It was my second semester, and I was taking Discrete Mathematics, Ordinary Differential Equations, Basic Concepts of Math, and Vector Analysis. So I looked for project ideas on the SymPy page that related to what I knew. The only one that I saw, other than core improvements, was to improve the ODE solving capabilities. I got into contact with the community and looked at the source, finding that it was only capable of solving 1st order linear equations and some special cases of 2nd order linear homogeneous equations with constant coefficients. I already at that point knew several methods from my ODE course, and I knew much of what I would learn.

Application Period
The most difficult part of the Google Summer of Code, in my opinion, is the application period. For starters, you have to do it while you are still in classes, so you pretty much have to do it in your free time. Also, if you have never applied for Google Summer of Code before, you do not really know what a good application should look like. I have long had my application available on the SymPy Wiki, and I will reference it here a few times. First off, it was recommended to me by some of the SymPy developers that I put as many potential things that I could do in the summer in my application as I though I could do. I was still only about half way through my ODEs course when I wrote the application, but I had the syllabus so I knew the methods I would be learning at least by name. So that is exactly what I did: I packed my application with every possible thing that I knew we would be learning about in ODEs.

After I felt that I had a strong application, and Ondrej had proofread it for me, I submitted it. There were actually two identical applications, one for the Python Software Foundation, and one for Portland State University. This is because SymPy was not accepted as a mentoring organization directly, so they had to use those two foundations as proxies. A requirement of acceptance is to submit a patch that passes review. I decided to add a Bernoulli solver, because Bernoulli can be solved in the general case much like the 1st order linear ODE, which was already implemented.

After I applied, there was an acceptance period. I used that period to become aquatinted with the SymPy community and code base. A good way to do this is to try to fix EasyToFix issues. I found issue 694, which is to implement a bunch of tests from a paper by Michael Wester for testing computer algebra systems. The tests cover every possible thing that a full featured CAS could do, so it was a great way to learn SymPy. The issue is still unfinished, so working on it is still a good way to learn how to use SymPy.

Also, it was important to learn git, SymPy’s version control system. The learning curve it pretty steep if you have never used version control system before, but once you can use it, it becomes a great tool at your disposal.

Acceptance
After being accepted, I toned down my work with SymPy to work on finishing up my classes. My classes finished a few weeks before the official start, so I used that period to get a jump start on my project.

The GSoC Period
For the start of the period, I followed my timeline. I implemented 1st order ODEs with homogeneous coefficients and 1st order exact ODEs. These were both pretty simple to do, as I expected.

The next thing I wanted to do was separable. My goal at this point was to get every relevant exercise from my textbook to work with my solvers. One of the exercises from my book (Pg. 56, No. 21) was dy=e^{x + y}dx. I soon discovered that it was impossible for SymPy to separate e^{x + y} \rightarrow e^{x}e^{y}, because the second would be automatically combined in the core. I also discovered that expand(), which should have been the function to split that, expanded using all possible methods indiscriminately. Part of my separatevars() function that I was writing to separate variables in expressions would be to split things like x + yx \rightarrow x(1 + y) and 2 x y + x^{2} + y^{2} \rightarrow (x + y)^{2}, but expand()
as it was currently written would expand those.

So I spent a few weeks hacking on the core to make it not auto-combine exponents. I came up with a rule that exponents would only auto-combine if they had the same term minus the coefficient, the same rule that Add uses to determine what terms should auto combined by addition. So it would combine e^{x}e^{x} \rightarrow e^{2x}, but e^{x}e^{y} would be left alone. It turns out that some of our algorithms, namely the Gruntz limit algorithm, relies on auto-combining. We already had a function that could combine exponents, powsimp(), but it also combined bases, as in x^zy^z \rightarrow (xy)^z, so I had to split the behavior so that it could act only as auto-combining once did (by the way, use powsimp(expr, combine='exp', deep=True) to do this). Then, after some help from Ondrej on pinpointing the exact location of the bugs, I just applied the function there. The last thing I did here was to split the behavior of expand, so that you could do expand(x*(y + 1), mul=False) and it would leave it alone, but expand(exp(x + y), mul=False) would return exp(x)*exp(y). This split behavior turned out to be useful in more than one place in my code.

This was the first non bug fix patch of mine that was pushed in to SymPy, and at the time of this writing, it is the last major one in the latest stable version. It took some major rebasing to get my convoluted commit history ready for submission, and it was during this phase that I git finally clicked for me, especial the git rebase command. This work took a few weeks from my ODEs time, and it became clear that I would not be doing every possible thing from my application. The reason that I included so much in my application was that my project was non-atomic. I could implement a little or a lot and still have a working useful module.

If you look at my timeline on my application, you can see that the first half is symbolic methods, and the second half is other methods, things like series. It turns out that we didn’t really learn much about systems of ODEs in my course and we learned very little about numerical methods (and it would take much more to know how to implement them). We did learn series methods, but they were so annoying to do that I came to hate them with a passion. So I decided to just focus on symbolic methods, which were my favorite anyway. My goal was to implement as many as I could.

After I finished up separable equations, I came up with an idea that I did not have during the application period. dsolve() was becoming cluttered fast with all of my solution methods. The way that it worked was that it took an ODE and it tried to match methods one by one until it found one that worked, which it then used. This had some drawbacks. First, as I mentioned, the code was very cluttered. Second, the ODEs methods would have to be applied in a predetermined order. There are several ODEs that match more than one method. For example, 2xy + (x^2 + y^2)\frac{dy}{dx}=0 has coefficients that are both homogeneous of order 2, and is also exact, so it can be solved by either method. The two solvers return differently formatted solutions for each one. A simpler example is that 1st order ODEs with homogeneous coefficients can be solved in two different ways. My working solution was to try them both and then apply some heuristics to return the simplest one. But sometimes, one way would create an impossible integral that would hand the integration engine. And it made debugging the two solvers more difficult because I had to override my heuristic. This also touches on the third point. Sometimes the solution to an ODE can only be represented in the form of an unevaluatable integral. SymPy’s integrate() function is supposed to return an unevaluated Integral class if it cannot do it, but all too often it will just hang forever.

The solution I came up with was to rewrite dsolve using a hints method. I would create a new function called classify_ode() that would do all of the ODE classification, removing it from the solving code. By default, dsolve would still use a predetermined order of matching methods. But you could override it by passing a “hint” to dsolve for any matching method, and it would apply that method. There would also be options to only return unevaluated integrals when applicable.

I ended up doing this and more (see the docstrings for classify_ode() and dsolve() in the current git master branch), but before I could I needed to clean up some things. I needed to rewrite all of dsolve() and related functions. Before I started the program, there were some special cases in dsolve for second order linear homogeneous ODEs with constant coefficients and one very special case ODE for the expanded form of \frac{d^2}{dx^2}(xe^{-y}) = 0.

So the first thing I did was implement a solver for the general homogeneous linear with constant coefficients case. These are rather simple to do: you just find the roots of the characteristic polynomial built off of the coefficients, and then put the real parts of the roots in front of the argument of an exponential and the imaginary parts in front of the arguments of a sine and cosine (for example, 3 \pm 2i would give C1e^{3x}\sin{2x} + C2e^{3x}\cos{2x}. The thing was, that if the imaginary part is 0, then you only have 1 arbitrary constant on the exponential, but if it is non-zero, you get 2, one for each trig function. The rest falls out nicely if you plug 0 in for b into $e^{ax}(C1\sin{bx} + C2\cos{box})$ because the sine goes to 0 and the cosine becomes 1. But you would end up with C1 + C2 instead of just C1 in that case. I had already planned on doing arbitrary constant simplification as part of my project, so I figured I would put this on hold and do that first. Then, once that was done, the homogeneous case would be reduced to 1 case instead of the usual 2 or 3.

My original plan was to make an arbitrary constant type that automatically simplified itself. So, for example, if you entered C1 + 2 + x with C1 an arbitrary constant, it would reduce to just C1 + x. I worked with Ondrej, including visiting him in Los Alamos, and we build up a class that worked. The problem was that, in order to have auto-simplification, I had to write the simplification directly into the core. Neither of us liked this, so we worked a little bit on a basic core that would allow auto-simplification to be written directly in the classes instead of in the Mul.flatten() and Add.flatten() methods. It turns out that my constant class isn’t the only thing that would benefit from this. Things like the order class (O(x)) and the infinity class (oo) are auto-simplified in the core, and things could be much cleaner if they happened in the classes themselves. Unfortunately, modifying the core like this is not something that can happen overnight or even in a few weeks. For one thing, it needed to wait until we had the new assumptions system, which was another Google Summer of Code project running parallel to my own. So we decided to shelf the idea.

I still wanted constant simplification, so I settled with writing a function that could do it instead. There were some downsides to this. Making the function as general as the classes might have been would have been far too much work, so I settled on making it an internal-only function that only worked on symbols named C1, C2, etc. Also, unlike writing the simplification straight into Mul.flatten() which was as simple as removing any terms that were not dependent on x, writing a function that parsed an expression and simplified it was considerably harder to write. I managed to churn out something that worked, and so I was ready to finish up the solver I had started a few paragraphs ago.

After I finished that, I still needed to maintain the ability to solve that special case ODE. Apparently, it is an ODE that you would get somewhere in deriving something about relativity, because it was in the relativity.py example file. I used Maple’s excellent odeanalyser() function (this is where I go the idea for my classify_ode())to find a simple general case ODE that it fit (Liouville ODE). After I finished this, I was ready to start working on the hints engine.

It took me about a week to move all classification code into classify_ode(), move all solvers into individual functions, separate simplification code into yet other functions, and tie it all together in dsolve(). In the end, the model worked very well. The modularization allowed me to do some other things that I had not considered, such as creating a special “best” hint that used some heuristics that I originally developed for first order homogeneous which always has two possible solutions to try to give the best formatted solution for any ODE that has more than one possible solution method. It also made debugging individual methods much easier, because I could just use the built in hint calls in dsolve() instead of commenting out lines of code in the source.

This was good, because there was one more method that I wanted to implement. I wanted to be able to solve the inhomogeneous case of a nth order linear ode with constant coefficients. This can be done in the general case using the method of variation of parameters. It was quite simple to set up variation of parameters up in the code. You only have to set up a system of integrals using the Wronskian of the general solutions. It would usually be a very poor choice of a method if you were trying to solve an ODE by hand because taking the Wronskian and computing n integrals is a lot of work. But for a CAS, the work is already there. I just have to set up the integrals.

It soon became clear that even though, in theory, the method of variation of parameters can solve any ODE of this type, in practice, it does not always work so well in SymPy. This is because SymPy have very poor simplification, especially trigonometric simplification, so sometimes there would be a trigonometric Wronskian that would be identically equal to some constant, but it could only simplify it to some very large rational function of sines and cosines. When these were passed to the integral engine, it would cause it to fail, because it could not find the integral for such a seemingly complex expression.

In addition, taking Wronskians, simplifying them, and then taking n integrals is a lot of work as I said, and even when SymPy could do it, it took a long time. There is another method for solving these types of equations called undetermined coefficients that does not require integration. It only works on a class of ODEs where the right hand side of the ODE is a simple combination of sines, cosines, exponentials, and polynomials in x. It turns out that these kinds of functions are common anyway, so most ODEs of this type that you would encounter could be solved with this method. Unlike variation of parameters, undetermined coefficients requires considerable setup, including checking for different cases. This would be the method that you would want to use if you had to solve the ODE by hand because, even with all the setup, it only requires solving a system of linear equations vs. solving n integrals with variation of parameters, but for a CAS, it is the setup that matters, so this was a difficult prospect.

I spent the last couple of weeks writing up the necessary algorithms to setup the required system of linear equations and handling the different cases. After I finally worked out all of the bugs, I ran some profiling against my variation of parameters solver. It turned out that for ODEs that had trigonometric solutions (which take longer to simplify), my undetermined coefficients solver was an order of magnitude faster than the variation of parameters solver (and that is just for the ODEs that the variation of parameters engine could even solve at all). For ODEs that only had exponentials, it was still 2-4 times faster.

I finished off the summer by writing extensive documentation for all of my solvers and functions. Hopefully someone who uses SymPy to solve ODEs can learn something about ODE solving methods as well as how to use the function I wrote when they read my documentation.

Post-GSoC
I plan on continuing development with SymPy now that the Google Summer of Code period is over. SymPy is an amazing project, mixing Python and Computer Algebra, and I want to help it grow. I may even apply again in a future year to implement some other thing in SymPy, or maybe apply as a mentor for SymPy to help someone else improve it.

Advice
What follows is some general advice for someone who wants to apply for Google Summer of Code. Some of the advice pertains specifically to SymPy, and some of it is general advice that I think would apply to any project.

- Get involved early. As soon as you decide that you want to participate in Google Summer of Code, start getting involved in the project. Get into contact with them and discuss possible projects. If you are looking before the participating organizations are announced, look at the organizations from previous years. For some organizations, it will vary; for others (like Python), it is almost given that they will be accepted every year.

- Some projects (including SymPy) require you to send in a patch that passes review to be accepted. This will give you a change to start familiarizing yourself with the code base. If you are applying to SymPy, the Wester example I mentioned above is a really good way to learn what SymPy can do and how it works.

- Subscribe to the mailing list, and once you are comfortable with it, participate. Also, it is a good idea to idle in IRC (SymPy is on freenode at #sympy). This will help you get to know the main contributors for the project.

- For you application, see if the people in the project you are applying for will review it. If they like your project idea, they will try to help you write a good application so you can be accepted and you can implement it. If they don’t like your idea, then they will tell you and you should change it, otherwise you will not be accepted, no matter how well written your proposal is. I have my proposal on the wiki (see link above). I am not saying that it is necessarily a very good proposal, but it did get accepted. If you are applying to SymPy, Ondrej will proofread your applications for you.

- If you are an IRC fan, there is also #gsoc on freenode, where you can ask all your GSoC related questions. Be warned that it does get pretty noisy in the application period, especially right before the applications are due and right before proposals are accepted.

- I cannot stress this one enough. If you have never worked with a version control system before, it is perhaps more important to spend your time learning it than it is to learn the code base for your project. These things have a steep learning curve if you have never used them before. Once you master them though, they can make your life much easier. Also, the sooner you learn to use them well, the easier your life will be later on down the road. I spent a good part of the last week of GSoC cleaning up my commit history from the first half of the summer when I bad very poor committing/log habits. If your project uses git, such as SymPy does, you might look at this tutorial. If it uses something else, good luck. Seriously, git is the only good version control system. See this video.

- Expect to spend only about half of the summer actually implementing stuff. You may think that you are a good programmer and that your code will not be so buggy that you will need to spend that much time fixing bugs, and you may be right. But the fact is, you will be working on code bases written by may programmers that are not so good. You will need to fix several already existing bugs to make your code work, which means that you will need to learn the code base well, learn how to read other people’s code, and how to fix bugs that you had no part in creating. You will be glad if a bug is in your code because you will usually know immediately what causes it and how to fix it. But if a bug is somewhere else, you will need to find it, figure out why it happens, what is supposed to happen, and how to fix it without breaking anything else. This is also why it is important to be active in the developer community.

- Good luck.


Undetermined Coefficients

August 17, 2009

[Sorry for the delay in this post. I was having some difficulties coming up with some of the rationales below. Also, classes have started, which has made me very busy.]

If there was one ODE solving method that I did not want to implement this summer, it was undetermined coefficients. I didn’t really like the method too much when we did it my my ODE class (though it was not as unenjoyable as series methods). The thing that I never really understood very well is to what extent you have to multiply terms in the trial function by powers of x to make them linearly independent of the solution to the general equation. We did our ODEs homework in Maple, so I would usually just keep trying higher powers of x until I got a solution. But to implement it in SymPy, I had to have a much better understanding of the exact rules for it.

From a user’s point of view, the method of undetermined coefficients is much better than the method of variation of parameters. While it is true that variation of parameters is a general method and undetermined coefficients only works on a special class of functions, undetermined coefficients requires no integration or advanced simplification, so it is fast (very fast, as well shall see below). All that the CAS has to do is figure out what a trial function looks like, plug it into the ODE, and solve for the coefficients, which is a system of linear equations.

On the other hand, from the programmer’s point of view, variation of parameters is much better. All you have to do is take the Wronskian of the general solution set and use it to set up some integrals. But the Wronskian has to be simplified, and if the general solution contains sin’s and cos’s, this requires trigonometric simplification not currently available in SymPy (although it looks like the new Polys module will be making a big leap forward in this area). Also, integration is slow, and in SymPy, it often fails (hangs forever).

Figuring out what the trial function should be for undetermined coefficients is way more difficult to program, but having finnally finished it, I can say that it is definitely worth having in the module. Problems that it can solve can run orders of magnitude faster than the variation of parameters, and often variation of parameters can’t do the integral or returns a less simplified result.

So what is this undetermined coefficients? Well, the idea is this: if you knew what each linearly independent term of the particular solution was, minus the coefficients, then you could just set each coefficient as an unknown, plug it into the ODE, and solve for them. It turns out that resulting system of equations is linear, so if you do the first part right, you can always get a solution.

The key thing here is that you know what form the particular solution will take. However, you don’t really know this ahead of time. All you have is the linear ode a_ny^{(n)}(x) + \dots + a_1y'(x) + a_0y(x) = F(x) (as far as I can tell, this only works in the case where the coefficients a_i are constant with respect to x. I’d be interested to learn that it works for other linear ODEs. At any rate, that is the only one that works in my branch right now.). The solution to the ode is y(x) = y_g(x) + y_p(x), where y_g(x) is the solution to the homogeneous equation f(x) \equiv 0, and y_p(x) is the particular solution that produces the F(x) term on the right hand side. The key here is just that. If you plug y_p(x) into the left hand side of the ode, you get F(x).

It turns out that this method only works if the function F(x) only has a finite number of linearly independent derivatives (I am unsure, but this might be able to work in other cases, but it would involve much more advanced mathematics). So what kind of functions have a finite number of linearly independent solutions? Obviously, polynomials do. So does e^x, \cos{x}, and \sin{x}. Also, if we multiply two or more of these types together, then we will get a finite number of linearly independent solutions after applying the product rule. But is that all? Well, if we take the definition of linear independence from linear algebra, we know that a set of n vectors \{\boldsymbol{v_1}, \boldsymbol{v_2}, \boldsymbol{v_3}, \dots, \boldsymbol{v_n}\}, not all zero, are linearly independent only if a_1\boldsymbol{v_1} + a_2\boldsymbol{v_2} + a_3\boldsymbol{v_3} + \dots + a_n\boldsymbol{v_n}=0 holds only when a_1 \equiv 0, a_2 \equiv 0, a_3 \equiv 0, \dots, a_n \equiv 0, that is, the only solution is the trivial one (remember, this is the definition of linear independence). They are linearly dependent if there exist weights a_1, a_2, a_3, \dots, a_n, not all 0, such that the equation a_1\boldsymbol{v_1} + a_2\boldsymbol{v_2} + a_3\boldsymbol{v_3} + \dots + a_n\boldsymbol{v_n}=0 is satisfied. Using this definition, we can see that a function f(x) will have a finite number of linearly independent derivatives if it satisfies a_nf^{(n)}(x) + a_{n - 1}f^{(n - 1)}(x) + \dots + a_1f'(x) + a_0f(x) = 0 for some n and with a_i\neq 0 for some i. But this is just a homogeneous linear ODE with constant coefficients, which we know how to solve. The solutions are all of the form ax^ne^{b x}\cos{cx} or ax^ne^{b x}\sin{cx}, where a, b, and c are real numbers and n is a non-negative integer. We can set the various constants to 0 to get the type we want. For example, for a polynomial term, b will be 0 and c will be 0 (use the cos term).

So this gives us the exact form of functions that we need to look for to apply undetermined coefficients, based on the assumption that it only works on functions with a finite number of linearly independent derivatives.

Well, implementing it was quite difficult. For every ODE, the first step in implementation is matching the ODE, so the solver can know what methods it can apply to a given ODE. To match in this case, I had to write a function that determined if the function matched the form given above, which was not too difficult, though not as trivial as just grabbing the right hand side in variation of parameters. The next step is to use the matching to format the ODE for the solver. In this case, it means finding all of the finite linearly independent derivatives of the ODE, so that the solver can just create a linear combination of them solve for the coefficients. This was a little more difficult, and it took some lateral thinking.

At this point, there is one more thing that needs to be noted. Since the trial functions, that is, the linearly independent derivative terms of the right hand side of the ODE, are of the same form as the solutions to the homogeneous equation, it is possible that one of the trial function terms will be a solution to the homogeneous equation. If this happens, plugging it into the ODE will cause it to go to zero, which means that we will not be able to solve for a coefficient for that term. Indeed, that term will be of the form C1*\textrm{term} in the final solution, so even if we had a coefficient for it, it would be absorbed into this term from the solution to the homogeneous equation. For example, variation of parameters will give a coefficient for such terms, even though it is unnecessary. This is a clue that Maple uses variation of parameters for all linear constant coefficient ODE solving, because it gives the unnecessary terms with the coefficients that would be given by variation of parameters, instead of absorbing them into the arbitrary constants.

We can safely ignore these terms for undetermined coefficients, because their coefficients will not even appear in the system of linear equations of the coefficients anyway. But, without these coefficients, we will run into trouble. It turns out that if a term x^ne^{ax}\sin{bx} or x^ne^{ax}\cos{bx} is repeated solution to the homogeneous equation, and x^{n + 1}e^{ax}\sin{bx} or x^{n + 1}e^{ax}\cos{bx} is not, so that n is the highest x power that makes it a solution to the homogeneous equation, and if the trial solution has x^me^{ax}\sin{bx} or x^me^{ax}\cos{bx} terms, but not x^{m + 1}e^{ax}\sin{bx} or x^{m + 1}e^{ax}\cos{bx} terms, so that m is the highest power of x in the the trial function terms, then we need to multiply these trial function terms by x^{n + m} to make them linearly independent with the solutions of the homogeneous equation.

Most references simply say that you need to multiply the trial function terms by “sufficient powers of x” to make them linearly independent with the homogeneous solution. Well, this is just fine if you are doing it by hand or you are creating the trial function manually in Maple and plugging it in and solving for the coefficients. You can just keep upping the powers of x until you get a solution for the coefficients. Creating those trial functions in Maple, plugging them into the ODE, and solving for the coefficients is exactly what I had to do for my homework when I took ODEs last spring, and this “upping powers” trial and error method is exactly the method I used. But when you are doing it in SymPy, you need to know exactly what power to multiply it by. If it is too low, you will not get solution to the coefficients. If it is too high, you can actually end up with too many terms in the final solution, giving a wrong answer.

Fortunately, my excellent ODEs textbook gives the exact cases to follow, and so I was able to implement it correctly. The textbook also gives a whole slew of exercises, all for which the solutions are given. As usual, this helped me to find the bugs in my very complex and difficult to write routine. It also helped me to find a match bug that would have prevented dsolve() from being able to match certain types of ODEs. The bug turned out to be fundamental to the way match() is written, so I had to write my own custom matching function for linear ODEs.

The final step in solving the undetermined coefficients is of course just creating a linear combination of the trial function terms, plugging it into the original ODE, and setting the coefficients of each term on each side equal to each other, which gives a linear system. SymPy can solve these easily, and once you have the values of the coefficients, you can use them to build your particular solution, at which point, you are done.

The results were astounding. Variation of parameters would hang on many simple inhomogeneous ODEs because of poor trig simplification of the Wronsikan, but my undetermined coefficients method handles them perfectly. Also, there is no need to worry about absorbing superfluous terms into the arbitrary constants as with variation of parameters, because they are removed from within the undetermined coefficients algorithm.

But the biggest thing was speed. Here are some benchmarks on some random ODEs from the test suite. WordPress code blocks are impervious to whitespace, as I have mentioned before, so no pretty printing here. Also, it truncates the hints. The hints used are 'nth_linear_constant_coeff_undetermined_coefficients' and 'nth_linear_constant_coeff_variation_of_parameters':

In [1]: time dsolve(f(x).diff(x, 2) - 3*f(x).diff(x) - 2*exp(2*x)*sin(x), f(x), hint='nth_linear_constant_coeff_undetermined_coefficients')
CPU times: user 0.07 s, sys: 0.00 s, total: 0.08 s
Wall time: 0.08 s
Out[2]:
f(x) == C1 + (-3*sin(x)/5 - cos(x)/5)*exp(2*x) + C2*exp(3*x)

In [3]: time dsolve(f(x).diff(x, 2) - 3*f(x).diff(x) - 2*exp(2*x)*sin(x), f(x), hint='nth_linear_constant_coeff_variation_of_parameters')
CPU times: user 0.92 s, sys: 0.01 s, total: 0.93 s
Wall time: 0.94 s
Out[4]:
f(x) == C1 + (-3*sin(x)/5 - cos(x)/5)*exp(2*x) + C2*exp(3*x)

In [5]: time dsolve(f(x).diff(x, 4) - 2*f(x).diff(x, 2) + f(x) - x + sin(x), f(x), hint='nth_linear_constant_coeff_undetermined_coefficients')
CPU times: user 0.06 s, sys: 0.00 s, total: 0.06 s
Wall time: 0.06 s
Out[6]:
f(x) == x - sin(x)/4 + (C1 + C2*x)*exp(x) + (C3 + C4*x)*exp(-x)

In [7]: time dsolve(f(x).diff(x, 4) - 2*f(x).diff(x, 2) + f(x) - x + sin(x), f(x), hint='nth_linear_constant_coeff_variation_of_parameters')
CPU times: user 5.43 s, sys: 0.03 s, total: 5.46 s
Wall time: 5.52 s
Out[8]:
f(x) == x - sin(x)/4 + (C1 + C2*x)*exp(x) + (C3 + C4*x)*exp(-x)

In [9]: time dsolve(f(x).diff(x, 5) + 2*f(x).diff(x, 3) + f(x).diff(x) - 2*x - sin(x) - cos(x), f(x), 'nth_linear_constant_coeff_undetermined_coefficients')
CPU times: user 0.10 s, sys: 0.00 s, total: 0.10 s
Wall time: 0.11 s
Out[10]:
f(x) == C1 + (C2 + C3*x - x**2/8)*sin(x) + (C4 + C5*x + x**2/8)*cos(x) + x**2

In [11]: time dsolve(f(x).diff(x, 5) + 2*f(x).diff(x, 3) + f(x).diff(x) - 2*x - sin(x) - cos(x), f(x), 'nth_linear_constant_coeff_variation_of_parameters')


The last one involves a particularly difficult Wronskian for SymPy (run it with hint=’nth_linear_constant_coeff_variation_of_parameters_Integral’, simplify=False).

Wall time comparisons reveal amazing speed differences. We’re talking orders of magnitude.

In [13]: 0.94/0.08
Out[13]: 11.75

In [14]: 5.52/0.06
Out[14]: 92.0

In [15]: oo/0.11
Out[15]: +inf


Of course, variation of parameters has the most difficult time when there are sin and cos terms involved, because of the poor trig simplification in SymPy. So let’s see what happens with an ODE that just has exponentials and polynomial terms involved.

In [16]: time dsolve(f(x).diff(x, 2) + f(x).diff(x) - x**2 - 2*x, f(x), hint='nth_linear_constant_coeff_undetermined_coefficients')
CPU times: user 0.10 s, sys: 0.00 s, total: 0.10 s
Wall time: 0.10 s
Out[17]:
f(x) == C1 + x**3/3 + C2*exp(-x)

In [18]: time dsolve(f(x).diff(x, 2) + f(x).diff(x) - x**2 - 2*x, f(x), hint='nth_linear_constant_coeff_variation_of_parameters')
CPU times: user 0.19 s, sys: 0.00 s, total: 0.19 s
Wall time: 0.20 s
Out[19]:
f(x) == C1 + x**3/3 + C2*exp(-x)

In [20]: time dsolve(f(x).diff(x, 3) + 3*f(x).diff(x, 2) + 3*f(x).diff(x) + f(x) - 2*exp(-x) + x**2*exp(-x), f(x), hint='nth_linear_constant_coeff_undetermined_coefficients')
CPU times: user 0.09 s, sys: 0.00 s, total: 0.09 s
Wall time: 0.09 s
Out[21]:
f(x) == (C1 + C2*x + C3*x**2 + x**3/3 - x**5/60)*exp(-x)

In [22]: time dsolve(f(x).diff(x, 3) + 3*f(x).diff(x, 2) + 3*f(x).diff(x) + f(x) - 2*exp(-x) + x**2*exp(-x), f(x), hint='nth_linear_constant_coeff_variation_of_parameters')
CPU times: user 0.29 s, sys: 0.00 s, total: 0.29 s
Wall time: 0.29 s
Out[23]:
f(x) == (C1 + C2*x + C3*x**2 + x**3/3 - x**5/60)*exp(-x)


The wall time comparisons here are:

In [24]: 0.20/0.10
Out[24]: 2.0

In [25]: 0.29/0.09
Out[25]: 3.22222222222

So we don’t have orders of magnitude anymore, but it is still 2 to 3 times faster. Of course, most ODEs of this form will have sin or cos terms in them, so the order of magnitude improvement over variation of parameters can probably be attributed to undetermined coefficients in general.

Of course, we know that variation of parameters will still be useful, because functions like \ln{x}, \sec{x} and \frac{1}{x} do not have a finite number of linearly independent derivatives, and so you cannot apply the method of undetermined coefficients to them.

There is one last thing I want to mention. You can indeed multiply any polynomial, exponential, sin, or cos functions together and still get a function that has a finite number of linearly independent solutions, but if you multiply two or more of the trig functions, you have to apply the power reduction rules to the resulting function to get it in terms of sin and cos alone. Unfortunately, SymPy does not yet have a function that can do this, so to solve such a differential equation with undetermined coefficients (recommended, see above), you will have to apply them manually yourself. Also, just for the record, it doesn’t play well with exponentials in the form of sin’s and cos’s or the other way around (complex coefficients on the arguments), so you should back convert those first too.

Well, this concludes the first of two blog posts that I promised. I also promised that I would write about my summer of code experiences. Not only is this important to me, but it is a requirement. I really hope to get this done soon, but with classes, who knows.


Los Alamos “Sprint”

August 17, 2009

Last weekend, Luke came to visit Ondrej in Los Alamos, so I decided to drive him up from Albuquerque and visit him again. It was nice meeting Luke and seeing Ondrej again.

Aside from coding (the main thing that I did was fix an ugly match bug that was preventing dsolve() from recognizing certain ODEs), we visited the atomic museum in Los Alamos, the Valles Caldera, and some of the surrounding hot springs.

Here are some pictures that Luke took with his iPhone. Stupid WordPress seems to insist on flipping some of them (I can’t fix it):

This is one of three posts that I plan on doing this week. I just finished my GSoC project today/last night, so I will be blogging about that. I plan on doing a post on the method of Undetermined Coefficients, as well as some other things that I managed to do. The other post will be my general musings/advice for GSoC. That will probably be my last post here in a while. I plan on continuing work with SymPy, but I get very busy with classes, so I most likely won’t be doing much until next summer.


Testing implicit solutions to ODEs

August 12, 2009

So, the hard deadline for GSoC it Monday, so this will probably be my last post until then (I am very busy trying to finish up the ode module by then). But this is one of those things that you just have to blog about.

So I have this checksol function in test_ode.py that attempts to check it the solutions to odes are valid or not. It was a relic of the old ode module. For that, it would just substitute the solution into the ode and see if it simplified to 0. That is what it still does, if the solution is solved for f(x) (the function for all of my ode tests). But if the solution is implicit in f, either because solve() is not good enough to solve it or because it cannot be solved, then that method obviously does not work. So what I was trying to do is what my textbook suggested. Take the derivative of the solution implicitly n times, where n is the order of the ode, and see if that is equal to the ode. Basically, I was subtracting the ode from it and seeing if it reduced to 0.

However, it wasn’t really working at all for most of my implicit solutions, even the really simple ones. I ended up XFAILing most of my implicit checksol tests. I think every single homogeneous coefficients had an implicit solution, and none of them were working with checksol().

So I started to ask around on IRC to see if anyone had any better ideas for testing these. Ondrej couldn’t think of anything. Luke and Chris worked on an example that I gave them, and it seemed to be that it wasn’t correct (which I didn’t believe for a second, because the solution was straight out of my text, and both homogeneous coefficients integrals produced that same solution). It turns out that we were mixing up \log{\frac{y}{x}} and \log{\frac{x}{y}} terms. One of those appeared in the ode and the other appeared in the solution (the ode was y dx  + x\log{\frac{y}{x}}dy - 2x dy = 0 and the solution is \frac{y}{1 + \log{\frac{x}{y}}}=C, number 9 from my odes text, pg. 61.

So Chris had a novel idea. For 1st order odes, you can take the derivative of the solution and solve for \frac{dy}{dx}, which will always be possible, because differentiation is a linear operator. Then substitute that into the original ode, and it will reduce.

So we were talking about this on IRC later, and I had an epiphany as to why my original method wasn’t working. After trying it manually on an ode, I found that I had to multiply through the solution’s derivative by \frac{x}{f(x)} to make it equal to the ode. Then, that reminded me of an important solution method that I didn’t have time to implement this summer: integrating factors. I remember that my textbook had mentioned that there is a theorem that states that every 1st order ODE that is linear in the derivative has a unique integrating factor that makes it exact. And I realized, the derivative of the solution will be equal to the ODE if and only if the ODE is exact. I checked my exact tests and verified my hunch. I had to XFAIL all of my implicit homogeneous coefficients solutions, but all of my exact checksols were working just fine.

So I refactored my checksol function to do this, and it now can check almost every one of my failing checksols. The exceptions are some where trigsimp() cannot simplify the solution to 0 (we have a poor trigsimp), a second order solution (the above trick only works on 1st order odes, I believe), and some other simplification problems.

The only down side to this new routine is that it is kind of slow (because of the simplification). I am going to have to skip a test of only 6 solutions because it takes 24 seconds to complete.


Homogeneous coefficients corner case

August 10, 2009

Before I started the program, I implemented Bernoulli equations. But the general solution to Bernoulli equations involves raising something to the power of \frac{1}{1-n}, where n is the power of the dependent term (see the Wikipedia page for more info). This works great, as I soon discovered, unless n == 1. Then you get something to the power of \infty. So I had to go in and remove the corner case.

So you think that after that I would have been more careful after that about checking that if general solution that divides by something I would test to see if that something is not zero before returning it as a solution.

Well, as I was just trying to implement some separable equation tests, I was going through the exercises of my ode text as I usually do for tests, and I came across xy' - y = 0. If you recall, this equation also has coefficients that homogeneous of the same order (1). From the general solution to homogeneous coefficients, you would plug it into \int{\frac{dx}{x}}=\int{\frac{-Q(1,u)du}{P(1,u)+uQ(1,u)}}+C where u = \frac{y}{x} or \int{\frac{dy}{y}}=\int{\frac{-P(u,1)du}{uP(u,1)+Q(u,1)}}+C where u = \frac{x}{y} (here, P and Q are from the general form P(x,y)dx+Q(x,y)dy=0). Well, it turns out that if you plug the coefficients from my example into those equations, the denominator will become 0 for each one. So I (obviously) need to check for that P(1,u)+uQ(1,u) and uP(u,1)+Q(u,1) are not 0 before running the homogeneous coefficients solver on a differential equation.


Variation of Parameters and More

August 1, 2009

Well, the last time I posted a project update, I had resigned myself to writing a constant simplifying function and putting the Constant class on the shelf. Well, just as I suspected, it was hell writing it, but I eventually got it working. Already, what I have in dsolve() benefits from it. I had many solutions with things like \frac{x}{C_1} or -C_1x in them, and they are now automatically reduced to just C_1x. Of course, the disadvantage to this, as I mentioned in the other post, is that it will only simplify once. Also, I wrote the function very specifically for expressions returned by dsolve. It only works, for example, with constants named sequentially like C1, C2, C3 and so on. Even with making it specialized, it was still hell to write. I was also able to get it to renumber the constants, so something like C2*sin(x) + C1*cos(x) would get transfered to C1*sin(x) + C2*cos(x). It uses Basic._compare_pretty() (thanks to Andy for that tip), so it will always number the constants in the order they are printed.

Once I got that working, it was just little work to finish up what I had already started with solving general linear homogeneous odes (a_ny^{(n)} + a_{n-1}y^{(n-1)} + \dots + a_2y'' + a_1y' + a_0y = 0 with a_i constant for all i). Solving these equations is easy. You just set up a polynomial of the form a_nm^n + a_{n-1}m^{n-1} + \cdots + a_2m^2 + a_1m + a_0 = 0 and find the roots of it. Then you plug the roots into an exponential times x^i for i from 1 to the multiplicity of the root (as in Cx^ie^{root \cdot x}). You usually expand the real and complex parts of the root using Euler’s Formula, and, once you simplify the constants, you get something like x^ie^{realpart \cdot x}(C_1\sin{(impart \cdot x)} + C_2\cos{(impart \cdot x)}) for each i from 1 to the multiplicity of the root. Anyway, with the new constantsimp() routine, I was able to set this whole thing up as one step, because if the imaginary part is 0, then the two constants will be simplified into each other. Also, SymPy has some good polynomial solving, so I didn’t have any problems there. I even made good use of the collect() function to factor out common terms, so you get something like (C_1 + C_2x)e^{x} instead of C_1e^{x} + C_2xe^{x}, which for larger order solutions, can make the solution much easier to read (compare for example, ((C_1 + C_2x)\sin{x} + (C_3 + C_4x)\cos{x})e^{x} with the expanded form, C_1e^{x}\sin{x} + C_2xe^{x}\sin{x} + C_3\cos{x}e^{x} + C_4x\cos{x}{e^x} as the solution to {\frac {d^{4}}{d{x}^{4}}}f \left( x \right) -4\,{\frac {d^{3}}{d{x}^{3}}}f \left( x \right) +8\,{\frac {d^{2}}{d{x}^{2}}}f \left( x \right) -8\,{\frac {d}{dx}}f \left( x \right) +4\,f \left( x \right) =0).

I entered all 30 examples from the relevant chapter of my text (Ordinary Differential Equations by Morris Tenenbaum and Harry Pollard), and the whole thing runs in under 2 seconds on my machine. So it is fast, though that is mostly due to fast polynomial solving in SymPy.

So once I got that working well, I started implementing variation of parameters, which is a general method for solving all equations of form a_ny^{(n)} + a_{n-1}y^{(n-1)} + \dots + a_2y'' + a_1y' + a_0y = F(x). The method will set up an integral to represent the particular solution to any equation of this form, assuming that you have all n linearly independent solutions to the homogeneous equation a_ny^{(n)} + a_{n-1}y^{(n-1)} + \dots + a_2y'' + a_1y' + a_0y = 0. The coefficients a_i do not even have to be constant for this method to work, although they do have to be in my implantation because otherwise it will not be able to find general solution to the homogeneous equation.

So, aside from doing my GSoC project this summer, I am also learning Linear Algebra, because I could not fit it in to my schedule next semester and I need to know it for my Knot Theory class. It turns out that it was very useful in learning the method of variation of parameters. I will explain how the method works below, but first I have a little rant.

Why is the Wikipedia article on variation of parameters the only website anywhere that covers variation of parameters in the general case? Every other site that I could find only covers 2nd order equations, which I understand is what is taught in most courses because applying it to anything higher can be tedious and deriving the nth order case requires knowledge of Cramer’s Rule, which many students may not know. But you would think that there would at least be sites that discuss what I am about to discuss below, namely, applying it to the general case of an nth order inhomogeneous linear ode. Even the Wolphram MathWorld article only explains the derivation for a second order linear ODE, mentioning at the bottom that it can be applied to nth order linear ODEs. I did find a website called Planet Math that covers the general case, but it wasn’t on the top of the Google results list and took some digging to find. It also has problems of its own, like being on a very slow server and some of the LaTeX on the page not rendering among them.

This partially annoys me because the Wikipedia article is not very well written. You have to read through it several times to understand the derivation (I will try to be better below). The Planet Math site is a little better, but like I said, it took some digging to find, and I actually found it after I had written up half of this post already.

But it is also part of a larger attitude that I am finding more and more of where anything that is not likely to be directly applied is not worth knowing and thus not worth teaching. Sure, it is not likely that any person doing hand calculations will ever attempt variation of parameters on an ode of order higher than 2 or 3, but that is what computer algebra systems like SymPy are for. Unfortunately, it seems that they are also in a large part for allowing you to not know how or why something mathematically is true. What difference does it make if variation of parameters can be applied to a 5th order ODE if I have to use Maple to do actually do it anyway. As long as the makers of Maple know how to apply variation of parameters to a nth order ODE, I can get along just fine. At least with SymPy, the source is freely available, so anyone who does desire to know how things are working can easily see. Anyway, I am done ranting now, so if you were skipping that part, this would be the point to start reading again.

So you have your linear inhomogeneous ODE: a_ny^{(n)} + a_{n-1}y^{(n-1)} + \dots + a_2y'' + a_1y' + a_0y = F(x). a_n cannot be zero (otherwise it would be a n-1 order ODE), so we can and should divide through by it. Lets pretend that we already did that, and just use the same letters. Also, I will rewrite a_n as a_n(x) to emphasize that the coefficients do not have to be constants for this to work. So you have your linear inhomogeneous ODE: y^{(n)} + a_{n-1}(x)y^{(n-1)} + \dots + a_2(x)y'' + a_1(x)y' + a_0(x)y = F(x). So, as I mentioned above, we need n linearly independent solutions to the homogeneous equation y^{(n)} + a_{n-1}(x)y^{(n-1)} + \dots + a_2(x)y'' + a_1(x)y' + a_0(x)y = 0 to use this method. Let us call those solutions y_1(x), y_2(x), \dots, y_n(x). Now let us write our particular solution as y_p(x) = c_1(x)y_1(x), c_2(x)y_2(x), \dots, c_n(x)y_n(x). Now, if we substitute our particular solution in to the left hand side of our ODE, we should get F(x) back. So we have (y_p)^{(n)} + a_{n-1}(x)(y_p)^{(n-1)} + \dots + a_2(x)y_p'' + a_1(x)y_p' + a_0(x)y_p = F(x). Now, let me rewrite y_p as a summation to help keep things from getting too messy. I am also going to write c_i instead of c_i(x) on terms for additional sanity. Every variable is a function of x. y_p(x) = \sum_{i=1}^{n} c_i y_i. The particular solution should satisfy the condition of the ODE, so
y_p^{(n)} + a_{n-1}y_p^{(n-1)} + \dots + a_2y_p'' + a_1y_p' + a_0y_p = F(x).

(\sum_{i=1}^{n} c_i y_i)^{(n)} + a_{n-1}(\sum_{i=1}^{n} c_i y_i)^{(n-1)} + \dots + a_2(\sum_{i=1}^{n} c_i y_i)^{(2)} +
a_1(\sum_{i=1}^{n} c_i y_i)^{(1)} + a_0\sum_{i=1}^{n} c_i y_i = F(x).

Now, if we apply the product rule to this, things will get ugly really fast, because we have to apply the product rule on each term as many times as the order of that term (the first term would have to be applied n times, the second, n-1 times, and so on). But there is a trick that we can use. In the homogeneous case, there is no particular solution, so in that case the c_i terms must all vanish identically because the solutions are linearly independent of one another. Thus, if we plug the particular solution into the homogeneous case, we get

(\sum_{i=1}^{n} c_i y_i)^{(n)} + a_{n-1}(\sum_{i=1}^{n} c_i y_i)^{(n-1)} + \dots + a_2(\sum_{i=1}^{n} c_i y_i)^{(2)} +
a_1(\sum_{i=1}^{n} c_i y_i)^{(1)} + a_0\sum_{i=1}^{n} c_i y_i = 0.

We already know that if we plug the y_i terms in individually of the c_i terms, that the expression will vanish identically because the y_i terms are solutions to the homogeneous equation. The product rule on each term will be evaluated according to the Leibniz Rule, which is that (c_i \cdot f_i)^{(n)}=\sum_{k=0}^n {n \choose k} c_i^{(k)} y_i(x)^{(n-k)}. Now the c_i y_i^{(n)} terms will vanish because we can factor out a c_i and they will be exactly the homogeneous solution. Because the expression is identically equal to zero, the remaining terms must vanish as well. If we assume that each \sum_{i=1}^n c_i' y_i^{(j)}=0 for each j from 0 to n-2, then this will take care of this; the terms with higher derivatives on c_i will also be 0, if this is true, though we do not need them for our derivation. In other words,
 c_1' y_1  + c_2' y_2 + \cdots + c_n' y_n = 0
c_n' y_1' + c_n' y_2' + \cdots + c_n' y_n' = 0
\vdots
c_n' y_1^{(n-2)} + c_n' y_2^{(n-2)} + \cdots + c_n' y_n^{(n-2)} = 0.

So, turning back to our original ODE with the particular solution substituted in, we have
(\sum_{i=1}^{n} c_i y_i)^{(n)} + a_{n-1}(\sum_{i=1}^{n} c_i y_i)^{(n-1)} + \dots + a_2(\sum_{i=1}^{n} c_i y_i)^{(2)} +
a_1(\sum_{i=1}^{n} c_i y_i)^{(1)} + a_0\sum_{i=1}^{n} c_i y_i = F(x).
But we know that most of the terms of this will vanish, from our assumption above. If we remove those terms, what remains is \sum_{i=1}^{n} c_i' y_i^{(n-1)} = F(x). So this is where it is nice that I learned Cramer’s Rule literally days before learning how to do Variation of Parameters in the general case. We have a system of n equations (the n-1 from above, plus the one we just derived), of n unknowns (the c_i terms). The determinant that we use here is used often enough to warrant a name: the Wronskian. We have that c_i' = \frac{W_i(x)}{W(x)}, or c_i = \int \frac{W_i(x)}{W(x)}, where W_i(x) is the Wronskian of the fundamental system with the ith column replaced with \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 0 \\ F(x) \end{bmatrix}. So we finally have y_p = \sum_{i=1}^n \int \frac{W_i(x)}{W(x)} y_i.

Well, that’s the theory, but as always here, that is only half of the story. A Wronskian function is already implemented in SymPy, and finding W_i(x) simply amounts to F(x) times the Wronskian of the system without the ith equation, all times (-1)^i. So implementing it was easy enough. But it soon became clear that there would be some problems with this method. Sometimes, the SymPy would return a really simple Wronskian, something like -4e^{2x}, but other times, it would return something crazy. For example, consider the expression that I reported in SymPy issue 1562. The expression is (thanks to SymPy’s latex() command, no thanks to WordPress’s stupid auto line breaks that have forced me to upload my own image. If it wasn’t such a pain, I would do it for every equation, because it looks much nicer.):
Crazy Trig Wronskian (SymPy).
This is the Wronskian, as calculated by SymPy’s wronskian() function, of
\begin{bmatrix}x \sin{x}, & \sin{x}, & 1, & x \cos{x}, & \cos{x}\end{bmatrix}, which is the set of linearly independent solutions to the ODE {\frac {d^{5}}{d{x}^{5}}}f \left( x \right) +2\,{\frac {d^{3}}{d{x}^{3}}}f \left( x \right) +{\frac {d}{dx}}f \left( x \right) -1. Well, the problem here is that, as verified by Maple, that complex Wronskian above is identically equal to -4. SymPy’s simplify() and trigsimp() functions are not advanced enough to handle it. It turns out that in this case, the problem is that SymPy’s cancel() and factor() routines do not work unless the expression has only symbols in it, and that expression requires you to cancel and factor to find the \cos^2{x} + \sin^2{x} (see the issue page for more information on this). Unfortunately, SymPy’s integrate() cannot handle that unsimplified expression in the denominator of something, as you could imagine, and it seems like almost every time that sin’s and cos’s are part of the solution to the homogeneous equation, the Wronskian becomes too difficult for SymPy to simplify. So, while I was hoping to slip along with only variation of parameters, which technically solves every linear inhomogeneous ODE, it looks like I am going to have to implement the method of undetermined coefficients. Variation of parameters will still be useful, as undetermined coefficients only works if the expression on the right hand side of the equation, F(x) has a finite set of linearly independent derivatives (such as sin, cos, exp, polynomial terms, and combinations of them (I’ll talk more about this whenever I implement it).

The good news here is that I discovered that I was wrong. I had previously believed that among the second order special cases were cases that could only be handled by variation of parameters or undetermined coefficients, but it turns out I was wrong. All that was implemented were the homogeneous cases for second order linear with constant coefficients. In addition to this, there was one very special case ODE that Ondrej had implemented for an example (examples/advanced/relativity.py). The ODE is
-2({\frac{d}{dx}}f(x)){e^{-f(x)}}+x({\frac{d}{dx}}f(x))^{2}{e^{-f(x)}}-x({\frac{d^{2}}{d{x}^{2}}}f(x)){e^{-f(x)}}, which is the second derivative of xe^{-f(x)} with respect to x. According to the example file, it is know as Einstein’s equations. Maple has a nice odeadvisor() function similar to the classify_ode() function I am writing for SymPy that tells you all of the different ways that an ODE can be solved. So, I plugged that ODE into it and got a few possible methods out that I could potentially implement in SymPy to maintain compatibility with the example equation. The chief one is that the lowest order of f in the ODE is 1 (assuming you divide out the e^{-f(x)} term, which is perfectly reasonable as that term will never be 0. You can then make the substitution u = f'(x), and you will reduce the order of the ODE to first order, which in this case would be a Bernoulli equation, the first thing that I ever implemented in SymPy.

But I didn’t do that. Reduction of order methods would be great to have for dsolve(), but that is a project for another summer. Aside from that method, Maple’s odeadvisor() also told me that it was a Liouville ODE. I had never heard of that method, and neither it seems has Wikipedia or “Uncle Google” (as Ondrej calls it). Fortunately, Maple’s Documentation has a nice page for each type of ODE returned by odeadvisor(), so I was able to learn the method. The method relies on Lie Symmetries and exact second order equations, neither of which I am actually familiar with, so I will not attempt to prove anything here. Suffice it to say that if an ODE has the form
{\frac{d^{2}}{d{x}^{2}}}y(x)+g(y(x))({\frac{d}{dx}}y(x))^{2}+f(x){\frac{d}{dx}}y(x)=0, then the solution to the ODE is
\int^{y(x)}{e^{\int g(a){da}}}{da}+C1\int{e^{-\int f(x){dx}}}{dx}+C2=0
You could probably verify this by substituting the solution into the original ODE. See the Maple Documentation page on Liouville ODEs, as well as the paper they reference (Goldstein and Braun, “Advanced Methods for the Solution of Differential Equations”, see pg. 98).

The solution is very straight forward–as much so as first order linear or Bernoulli equations, so it was a cinch to implement it. It looks like quite a few differential equations generated by doing F''(y(x), x) for some function or x and y F(y(x), x) generates equations of that type, so it could be actually useful for solving other things.

Before I sign off, I just want to mention one other thing that I implemented. I wanted my linear homogeneous constant coefficient ODE solver to be able to handle ODEs for which SymPy can’t solve the characteristic equation, for whatever reason. SymPy has RootOf() objects similar to Maple that let you represent the roots of a polynomial without actually solving it, or even being able to solve it, but a you can only use RootOf’s if you know that none of the roots are repeated. Otherwise, you would have to know which terms require an additional x^i to preserve linear independence. Well, it turns out that there is a way to tell if a polynomial has repeated roots without solving for them. There is a number associated with every polynomial of one variable called the discriminant. For example, the discriminant of the common quadratic polynomial ax^2 + bx + c is the term under the square root of the famous solution b^2 - 4ac. It is clear that a quadratic has repeated roots if and only if the discriminant is 0. Well, the same is true for the discriminant of any polynomial. I am not highly familiar with this (ask me again after I have taken my abstract algebra class next semester), but apparently there is something called the resultant, which is the product of the differences of roots between two polynomials and which can also be calculated without explicitly finding the roots of the polynomials. Clearly, this will be 0 if and only if the two polynomials share a root. So the discriminant is built from the fact that a polynomial has a repeated root iff it shares a root with its resultant. So it is basically the resultant of a polynomial and its derativave, times an extra factor. It is 0 if and only if the polynomial has a repeated root.

Fortunately, SymPy’s excelent Polys module already had resultants implemented (quite efficiently too, I might add), so it was easy to implement the discriminant. I added it as issue 1555. If you are a SymPy developer and you have somehow managed to make yourself read this far (bless your heart), please review that patch.

Well, this has turned out to be one hella long blog post. But what can I say. You don’t have to read this thing (except for possibly my mentor. Sorry Andy). And I haven’t been quite updating weekly like I am supposed to be, so this compensates. If you happened upon this blog post because, like me, you were looking for a general treatment of variation of parameters, I hope you found my little write up helpful. And if you did, and you now understand it, could you go ahead and improve the Wikipedia article. I’m not up to it.


Modifying a list while looping through it in Python

July 20, 2009

Here is an interesting thing that I found. x is a SymPy Symbol:
Code block 1

I would have expected to get a = [], but it only removes the first item. And yes, x + 1 passes the condition:

>>> (x + 1).has(x)
True

Clearly, it is a bad idea to modify a list while I am looping through it. I should instead be doing something like this:
Code block 2
But I am intrigued as to why exactly this fails. If any of the readers of the blog thinks that he know, please post in the comments. Or, if I figure it out, I will post an update. Also, here is a similar example, with a strange result:
Code block 3

(Sorry for the images by the way. WordPress’s so called “code” blocks are impervious to indentation.)

UPDATE (a few minutes later):
Well, I figured it would come to me as to why this was happening, and it didn’t take long. While I haven’t read the actual Python Language Reference, this is what I am assuming is happening. This is all just my guessing on how Python is implemented. Please correct me if I am wrong.

So, obviously, a Python list is just a C array. It is probably an array of pointers, which is the only way I can see that would let it be mutable with different objects (this is how compiled languages with dynamic typing like Objective-C more or less pull it off). Now C does not have the for i in list syntax that Python has (nor does any other language that I know of. That is one of the reasons that Python is so awesome!), so if you want to recurse a list (C array), you have to do the usual for (i=0; i<=len(list); i++) { from C (or it probably uses a while loop, which would allow for things like iterators, but a for loop in C is literally just a wrapper around a while loop anyway). Then of course, inside of the loop, you just have list[i] blocks. So when I was going through my list, for example, the list of numbers in the last example, it would hit item 0 (the first item), remove it, which would amount to rebuilding the list as [2, 3, 4, 5], then it would hit item 1, which is now 3, remove it, rebuilding the list, and so on. So the even numbered elements remain because it skips every element after one that it removes. CPython must have good error handling, because eventually this would cause the indices to go beyond the length of the list. It seems to me that this behavior is not very well defined. Personally, I think that whatever you are looping through in a for loop should become immutable within the loop block. I checked Python 3.1, and the behavior is exactly the same.

Based on this, .remove() rebuilds the list each time. I would have thought it would just set the value in the array to Null, but I guess that would make it more difficult to test equality with an equivalent list that doesn’t have Null values. It is good to know that .remove() does that, because it means that can be an expensive operation.


Constant stuff

July 16, 2009

So I was able to get a working version of the Constant class, but because the code cluttered up the internal Add and Mul classes too much, Ondrej convinced me that to make a function that does the simplification instead, and I reluctantly agreed. After begining work on it, I realized that it will be much easier to make it just an internal function that handles the special cases presented by dsolve(). That means that it will only handle arbitrary constants that are independent of one variable, and it will only work with constants that are named as “C1″, “C2″, and so on.

If we ever get the sympyx core that Ondrej and I worked on when I was in Los Alamos in, it will be easy for my to use a Constant class, because it will have handler logic that will allow for the Constant class to exist independent of Add and Mul. It already can exist independent of Pow with a minor code addition, but simplifying powers is easier than simplifying addition and multiplcation because exponentiation is neither commutative nor associative, meaning that you don’t have to worry about absorbing stuff on the other side of something, like 2 + x + C.

See my constant-Mul branch for my working version of a Constant class the implements in Mul and Add. See my constant-function branch for my work on the internal function.

Because I have decided to make thing simple and make the function internal only, I should have things up and running soon. Then, it will be simple to fix up my nth order homogeneous stuff that I already have so that it works, and then to implement variation of parameters!


Meeting Ondrej in Los Alamos

July 13, 2009

So Ondrej was kind enough to have me over to his house in Los Alamos for Friday and Saturday. We spent a lot of time coding together. While we worked on some other things too, we mainly worked on my constant class. Ondrej and I came up with an idea for restructured Mul and Add classes that would allow different objects in them to handle the other objects in them. The way it is now, if I want my Constant object to absorb other objects, like 2*a*C*x => C*x, I have to hardcode it into Mul. The same is true with Add. This makes the Mul and Add classes muddy. Right now, there is already special handling for other such dynamic objects such as Order classes (e.g., O(x)), and Infinity class. See the Add.flatten() and Mul.flatten() methods in sympy/core/add.py and sympy.core/mul.py to see what I mean.

We came up with a system where symbols and numbers are handled the same way, because we need them to be fast, but if an expression has an object that has a handle_mul() method, it will call that method with the other objects in the expression in the Mul/Add and the object will take care of the special handling. Ondrej was able to get most of it working in his experimental core that doesn’t use assumptions here. We will hopefully end up using it, but we need to wait until we merge the new assumptions system.

So in the mean while, I have a working Constant class that modifies Mul and Add here. Since it will take a while until the new assumptions system is done (Fabian Seoane is doing it for his Google Summer of Code project), we may end up temporarily adding in my Constant branch. Once I have a working Constant class, I can solve homogeneous differential equations (not to be confused with first order differential equations with homogeneous coefficients) with one case (there are several cases depending on whether the roots of the so called characteristic equation are real, imaginary, or complex, but we can actually handle them in one case if we have constants that combine into each other. More on this in a later post). Once I have that (which I have everything already except for the arbitrary constants), I can then implement variation of parameters, which, along with separable, will probably be the most used solver of the ones that I will implement this summer.

Once I get those, I can clean up a lot of the 2nd order differential equation code in dsolve (currently it is just a hack with a bunch of special cases all covered by variation of parameters). With that code cleaned up, I can refactor dsolve to use my proposed hints engine, which will allow the user to choose which methods they want to use to solve an equation (more on that in a later post too).


  • Archives