Flying Low

Standard

This week’s post is inspired by conversations with my friend @rdelcueto and by watching the excellent Jeff and Casey show on Youtube.

Being close to the machine is a very desirable property for platforms. It is not unique to C and C++. Lisp had it in the 80s. It still has it now but nobody uses Lisp unless it is Clojure.

The urban dictionary defines “Flying Low” as a euphemism for having your zipper down, but that’s not what I have in mind.

A platform that flies low is a platform that keeps the distance between the programmer and the CPU short.

A formal-esque definition:

Flying low is the property of having a small number of easily understandable layers of abstraction between the platform and the machine.

Continue reading

pyCave: Porting to mac.

Standard

So for the past couple of days I have been porting pyCave to my mac. Yesterday was the day I thought the work was pretty much finished (It’s not).

I did it by starting with a GL demo written from scratch, and I went up from there adding features until I had a working copy of pyCave. Obviously, there was a lot of copy-pasting, but the overall writing involved a lot of redesign that ended up touching the whole program. The diff file was huge.

This wasn’t the original plan. The ‘project’ was called ‘glui’. It was only a module to build GL user interfaces. It eventually morphed into a new version of pycave =). There was only a single commit. Bad idea. But the code is the product of an intense non-stop coding session.

One of the first changes was adding an opengl extension checking mechanism. I found a nifty thing that pyOpenGL had. Every extension module has a member called EXTENSION_NAME. I used that to check for extensions in a very concise manner.
Little did I know that the nifty feature is actually only featured in the latest releases of pyOpenGL 3 (ubuntu beta releases don ‘t have it).

So i’m thinking that I’ll include a subset of the latest pyOpenGL (3.0.0) so that I won’t get emails from ubuntu users with Nvidia cards saying that the game doesn’t run.

The game runs almost perfectly on my Macbook with Python2.6. I was going to write about the hell I went through compiling pygame for Python2.6 on a mac (it’s practically impossible), but it’s a long story. I had to do it because I had screwed up the default Leopard python instalation thanks to a very stupid bash command.
Today I fixed my pretty apple-provided Python and tried to run pyCave (practically all my users will do the same).

It actually crashes.

The 2.5 and 2.6 versions have the same PyOpengl (release candidate 1). It is a very strange bug and I think my frankenstein python2.5 instalation is to blame. I am not sure.
A friend of mine just got a brand new Macbook (lower-end) and I would love for him to let me borrow it for a half hour to do some testing.

I am very optimistic that my 6 year old laptop will be able to run the game. It already does, but there is an odd bug in which all the edges are highlighted. It looks awesome with the proceduraly generated tunnel =) It also has the extension-checking bug, which should be fixed by adding the OpenGL module inside the pyCave distribution.

Lisp after Python after Lisp.

Standard

(Or, Python vs Clojure rant)

This post assumes some familiarity with Lisp and Python from the reader.

Like loads of programmers, I have read most of Paul Graham’s essays, and I always finish them with that feeling that if I’m not using Lisp then I must be an idiot. Like most programmers, I certainly don’t want to be an idiot, so I went ahead and learned Lisp, Common Lisp.

At first, I was horrified by all those damned parentheses. Still, I kept at it until I was able to write code half-decently. I didn’t get very far, and truly diving into Lisp was buried deep into my to-do list.

I didn’t really got Lisp until I learned about Clojure.
Clojure is a new functional Lisp that runs on the JVM. It is a great language with brilliant design and it has the potential to bring some great research ideas into the mainstream.

Learning Clojure has gotten me to really appreciate Lisp for what it is.
Some months ago I wrote about how I loved LINQ. With Lisp, I can’t complain if I don’t have LINQ, I can just write a macro to implement it.
I understand that when I’m writing Lisp code, what I am writing is an abstract syntax tree (AST) in human-readable form. You can see code as the very data structure that represents your program, and you can modify it as such. That is why Lisp must be just the perfect language for Genetic Programming, where code needs to modify itself.

I love the fact that in Lisp everything is an s-expression. S-expressions have an advantage that I hadn’t anticipated. They make editing easier in an editor like Emacs where you have commands that explicitly handle sexps.
Editing Lisp inside Emacs, after a while, is a very enjoyable experience. You can transpose,kill,navigate through and mark sexps as if they were letters or words. And since everything in Lisp is an s-expresion, it becomes very malleable.

So, after having my Lisp epiphany, I started seeing Python through a new light. I saw that Python was just like Lisp in so many ways, yet it was different in many others.

Python is dynamic and has a REPL. Both are huge advantages that Lisp has had for decades.
In Python, there is a difference between statements and expressions. This was my first big no-no on my comeback to Python.
I would like to write something like this:

a = if (x == 2):
           "Hello"
    else:
           "Goodbye"


Lisp code is elegant, and Lisp as a language is mathematically beautiful. It is also true that once you get used to all the parentheses you start not to notice them. However, Lisp is just not as pretty as Python. There’s only so much you can do when your language requires the programmer to write the AST directly.

Python has the most beautiful syntax I have yet to see. Letting whitespace have syntactic meaning is a great design decision. The resulting code is indented the way people should indent their C-style programs anyway.
I have found downsides to this approach. In programs with lots of consecutive, horrible OpenGL API calls, things would look prettier if the language would let you indent at will. 99% of the time, however, it is a feature rather than a bug.

Every programmer has some personal pseudocode. This pseudolanguage is the language we think in. Python is as close to most programmers’ pseudocode as you can get. I get this warm, fuzzy feeling when I code in Python. Its philosophy dictates that programming is so hard that the language should really just get out of the way of the programmer.

Programming in Clojure has definitely changed the way I program. More thanks to the fact that it is functional than to the fact that it is a Lisp. After you pass the brain-freeze that is inevitable when dealing with lack of state, you reach a point of enlightenment. That is that you realize that programming functionally frees you from worrying about post-conditions and pre-conditions. It is such a nice feeling when you know that a function isn’t really changing the world, that it will change the way you code in other languages. Being stateless lets you stop worrying about a huge set of problems. It makes you smarter, since you are keeping less things in your head, and it reduces the bugs you can create. I am not saying you’ll absolutely loathe state after spending time with a functional language. I love a for loop as much as the next dude, I am saying that you will avoid state when you can. I agree with Tim Sweeney in that the functional paradigm should be the default. And that modifying state should be made explicit. I think this concept is present in Clojure’s STM approach, although it hasn’t really been proven effective in huge, complex applications.

I still prefer to code in Python than to code in Clojure, just like I still prefer to speak in Spanish, my native language, than to speak in French.

When I code in Lisp, I write pseudo-code that is pretty similar to Python and then I translate to Lisp as I write. Yet now, when I’m writing in Python, I often say “This would could be easier to express with Clojure”. I would like to know if my pseudolanguage is a product of having been exposed to imperative programming all these years. I don’t know if there are programmers out there whose pseudo-language is functional. Maybe it’s just human nature to think imperatively.
If my pseudolanguage ever starts to look like Clojure, I guess I’ll have my answer

Word Challenge cheating program

Standard

I have been playing Word Challenge on facebook, it is a very fun game. It made me start playing with anagrams, and I decided to write an anagram-solving program. However, I read about Donald Knuth’s insight that if you sort a word and its anagram alphabetically, it will give the same word.
Example:
sergio -> egiors
orgies -> egiors
That means ‘orgies’ is an anagram of my name =P.
So, using a hash table it is pretty easy to find anagrams in constant time. You just have to associate a word made up of ordered letters to a list of all the possible anagrams. You can do it in O(n) by iterating through a dictionary file. There’s no problem to solve when there is already an optimal solution.
I then decided to write a program that finds every possible word that can be formed by a group of letters (Which would be perfect to use if I ever wanted to cheat on Word Challenge. Which of course I never would! =P) .

I wrote an algorithm on top of Knuth’s idea that is stupidly slow.
It takes n letters, and sorts them. It then finds all the anagrams associated to that sorted word and adds them to a set. Then it recurses to each possible word created by removing one letter, until it reaches the minimum number of letters. Which is, by default, 3.

I haven’t proven it, but if my math is right, my algorithm is O( (n^2)(n-1)! ).
It works instantly on my machine when n<9.

http://sites.google.com/site/bigmonachus/Home/wc-py
It reads any dictionary file that is composed of single lines containing words on some language.
It is written in English, but there is some Spanish output. (You can probably figure out what it says.)

I was thinking of not uploading it, but there are probably already a lot of web applets that do the exact same thing, and I guess your average cheater wouldn’t go through the trouble of running a Python script if he/she can’t go through the trouble of practicing to get better at the game.

There are some obvious optimizations that could be made, like taking into account repeated letters.
There is also the option to change the data structure: First, load the dictionary to a list. Then order that list by the size of the words (from short to long).
For each word, create a key and append that word to the hash map with that key.
Create n sub-keys, where n is the number of letters in that key. Each sub-key has a letter removed. Since the key is sorted, each sub-key is also sorted. That means it is a valid key. Since it is shorter, it has already been added to the hash map (unless it doesn’t have any anagrams or it is below our limit. In those cases, we ignore it). We add the values associated to these sub-keys to the value of the hash-map for the current key.
We are using dynamic programming, since we are eliminating the need for recursion by saving state. I don’t know if I explained myself right..
There is no implementation for this idea since I came up with it as I was writing it =D. The new data structure would allow us to get all possible words for n letters in constant time!. And that structure can probably be filled pretty fast.

I’ll edit this post when I have a working implementation =D

Update: New implementation.

I implemented the idea I wrote about, and it worked! (Sort of..)
First off, it takes a lot more memory. I have a Spanish dictionary with around 80,000 words, and my program is consuming about 177 MB of memory. This can be hugely improved by using indexes instead of storing strings.
The good news is that if you input any word to the program, and that word or any of its anagrams are in my dictionary, the program outputs all the possible words that can be formed by those letters in constant time.
The bad news is that if you input any word whose key is not in the hash table, the program can’t tell you anything.
The solution to this was to check for this problem, and in those cases, iterate recursively through the possible sub-keys until we find the ones that are in the hash map.
This solution will cause the program to be extremely slow when you input large strings whose keys are not in the hash table.
Link to the new version.
Again, there are a lot of things that can still be improved. But this version works better than the last one.

I had to post this

Standard

It’s only been a couple of hours since the last post… but this is cool:

Lighting works:

The problem was that the python image library needed integers for the color value. Now it’s using opengl (just  to draw points), where there is a color function that takes r,b,g float parameters.

It’s starting to look nice! =) And adding reflections is going to be pretty trivial..

===Edit:
I forgot to mention that I managed to make it quite faster by removing a lot of calls to the Python C API. (i.e. defining C structures and using them instead of python tuples)
Now it is running at 1.08 the speed of the original C module. That means that I traded 8% performance for making the code less than half the size and much more easier to maintain… decent trade-off

Switched to cython

Standard

I had to spend a couple of hours today rewriting some code after reading about Cython.
I rewrote the C module to a Cython file. It runs at about 1.33 the speed of the C module, but I believe that’s mainly because I’m using tuples all over the place instead of C structs. My guess is that the decrease in speed would become very moderate after optimizing. The new module also got rid of a bug in the C module where objects would disappear when being too far from the viewing plane.
Even if after optimizing there isn’t a huge performance gain, I think the penalty would be well worth it. The code is less than half the length and didn’t take long to write..

Here is an image:


Again, two identical, separated spheres.
There is also some primitive lighting. Note the “banding”… It is caused because the program is rounding floats to ints. Shouldn’t be very tough to fix.

More ray tracing

Standard

Yesterday I had free time for the first time in weeks!! So I worked a lot on the ray tracer…
All data structures are Python objects. However, all the ray calculation is currently done by a module written in C.
Something that is really cool, is that at the innermost loop of the ray calculation, I have a call to a python function, named “shader”.  That call isn’t really that expensive and it allows me to run python code at the main loop!! Not only that, the objects (for now only spheres) actually have a reference to the shader they want to use. So I can call basically any python function if I plug it into one of the spheres =)

For now, there is only one loop. The shaders will be in charge of running new loops to calculate lighting, make shadows, reflections, etc.. But for now they only fill the pixel with a determined color.

Here is a 600×300 image that took only 0.079 seconds to draw. It is noteworthy that most of that time was spent drawing pixels with the Python Image Library. A faster library will improve it a lot, and adding support for multiple processors will be easy.


They are two spheres of the same size, the green one is farther away from the eye. (The last image I posted had two spheres of different sizes with no perspective)

Python ray tracing

Standard

I haven’t really had time to do anything significant, but I have written an extremely primitive ray tracer in python.
It may seem like a stupid idea to make such a computing intensive task in Python, but that’s part of the point… to optimize everything and get some decent speeds. The plan is to use numpy arrays and some inlined C code (with scipy.weave) to optimize the bottlenecks, but to really know where the bottlenecks are I have first to write a working ray tracer. However, in order to do that I need free time, and that’s something I don’t have right now (says he who is writing on his journal), and the free time I have had I have been spending stupidly deliberating on language choice..
Before 23:59 tonight, I have to turn in a little programming homework, to make an blur image filter. It’s a pretty interesting homework for a first programming course, and we have had projects like that all semester.
I am taking a seminar on Mathematics for Computer Graphics, and for the final project I’ll be writing a curve (splines) and surface (nurbs) renderer from scrath. So I will be posting that here =)

==Edit:

I forgot, here is a little dump from the raytracer:

Two spheres without any kind of lighting effect =P