Sunday, February 1, 2009

UCT Algorithm Circle

After much grinding away, we had our first class of the UCT Algorithm Circle this past Thursday. We invited 32 of the most talented school kids we could find in the Cape Town area to some training. We're also slowly inviting kids outside of Cape Town to train online. We're teaching them from the very basics of programming right through to the advanced algorithms and data structures required for the IOI.

For the first class, we introduced the basics of Python. We were amazed at how quickly the kids caught on. After a 20-minute lecture and a 60-minute practical session, they understood operations, variables, stdio and more. The majority of these kids are in grades 9 and 10, and amazingly half are girls.

If things continue at the rate they're going now, this could provide a serious boost to our IOI results in upcoming years. We've always been open to the idea of training a wider audience, but finding the talented kids and getting them interested has always been a brick wall we couldn't knock down. This time though, collaboration with one of the people involved in training kids for the IMO has seriously helped change all that.

To see what the kids we have are capable of, just check out some of their introductions in this thread.

Prototype

After completing the meat of my background chapter in December, I spent most of January working on a prototype for my master's project. So now I get to start showing off all my pretty pictures. :)

First of all, I should mention that I am writing an extension for VMD, so I most certainly did not develop what you see below from the ground up. In an effort to simplify the process of porting my work to other molecular visualisation applications (e.g. PyMOL), I decided to do all the core computation in an application-independent C++ module which communicates with an application-specific plugin via sockets. For VMD, this plugin is written in Tcl, which I have come to hate.
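
To give a feel for the split between the core and the plugin, here is a minimal sketch of a line-based request/reply exchange over a socket. It is written in Python purely for brevity (the real core is C++ and the real plugin is Tcl), and everything about the protocol is an assumption made up for illustration: the SCORES command, the OK/ERR replies and the dummy evenly spaced scores are all hypothetical, not what my module actually speaks.

```python
import socket
import threading


def handle_request(line):
    """Core side: answer one command of the made-up text protocol."""
    parts = line.split()
    if len(parts) == 2 and parts[0] == "SCORES" and parts[1].isdigit():
        n = int(parts[1])
        # Dummy conservation scores, evenly spaced in [0, 1].
        scores = [i / (n - 1) if n > 1 else 0.0 for i in range(n)]
        return "OK " + " ".join(f"{s:.3f}" for s in scores)
    return "ERR unknown command"


def serve(conn):
    """Core side: answer newline-terminated requests until the peer closes."""
    with conn, conn.makefile("r", encoding="utf-8") as reader, \
            conn.makefile("w", encoding="utf-8") as writer:
        for line in reader:
            writer.write(handle_request(line.strip()) + "\n")
            writer.flush()


def request(reader, writer, command):
    """Plugin side: send one command and wait for the one-line reply."""
    writer.write(command + "\n")
    writer.flush()
    return reader.readline().strip()


def demo():
    # A connected socket pair stands in for the real TCP connection.
    core_end, plugin_end = socket.socketpair()
    worker = threading.Thread(target=serve, args=(core_end,))
    worker.start()
    with plugin_end, plugin_end.makefile("r", encoding="utf-8") as reader, \
            plugin_end.makefile("w", encoding="utf-8") as writer:
        replies = [request(reader, writer, "SCORES 3"),
                   request(reader, writer, "HELLO")]
    worker.join()  # the core loop ends once the plugin side has closed
    return replies
```

The appeal of a plain text protocol like this is that the application-specific side only needs to open a socket and read and write lines, which keeps each new plugin small.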

When you first launch VMD, you get a simple protein. Launch my extension and it churns away, calculating conservation scores (dummy values for now) and the solvent-accessible surface of the protein. The protein is then coloured based on the conservation scores, which you can see below for a sample protein.
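
As a rough illustration of the colouring step, the sketch below maps one score per residue onto a colour ramp. The details are assumptions for the sake of the example: the scores are rescaled to [0, 1] over the protein, and the ramp runs linearly from blue (least conserved) to red (most conserved). The function names are hypothetical and this is not the colour scale VMD or my extension necessarily uses.

```python
def score_to_rgb(score, low=(0.0, 0.0, 1.0), high=(1.0, 0.0, 0.0)):
    """Blend linearly from `low` (blue) to `high` (red) for a score in [0, 1]."""
    t = min(1.0, max(0.0, score))  # clamp out-of-range scores
    return tuple(a + t * (b - a) for a, b in zip(low, high))


def colour_residues(scores):
    """Rescale raw per-residue scores to [0, 1] and return one RGB per residue."""
    lo, hi = min(scores), max(scores)
    span = hi - lo
    # If every residue has the same score, fall back to the low colour.
    return [score_to_rgb((s - lo) / span if span else 0.0) for s in scores]


if __name__ == "__main__":
    for rgb in colour_residues([2.0, 4.0, 6.0]):
        print(rgb)
```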


After visualising the conservation scores, the final application will visualise its prediction of binding sites. The user will then be given the option of doing further analysis of the binding sites. We're currently considering two forms of analysis: select a residue and predict what a binding site containing this residue would look like; and select some residues and predict what the binding sites would look like if we excluded these residues from the predicted binding sites. Below is a sample of user-selected residues (in red).


Then the user can choose to visualise the solvent-accessible surface. We calculate this surface using marching tetrahedra to extract an isosurface and kd-trees to calculate the isovalues. The surface is coloured by the conservation scores, just like in the previous shots. Currently I don't have residue selection working in this mode, although I plan to add it. The meat of my computation will be using the conservation scores and the solvent-accessible surface to predict the binding sites.
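
To make the kd-tree part a little more concrete, here is a minimal pure-Python sketch of how an isovalue could be sampled at a grid point. The definition of the isovalue is an assumption for illustration: distance to the nearest atom centre minus (atom radius + probe radius), with one uniform atom radius, so that the surface sits where the value is zero. The 1.5 Å atom radius is a placeholder and 1.4 Å is the usual water-sized probe. With per-element radii the nearest centre alone would not be enough, and a real implementation would be C++ rather than this.

```python
import math


def build_kdtree(points, depth=0):
    """Return a nested (point, left, right) tuple, or None for no points."""
    if not points:
        return None
    axis = depth % 3  # cycle through x, y, z
    ordered = sorted(points, key=lambda p: p[axis])
    mid = len(ordered) // 2
    return (ordered[mid],
            build_kdtree(ordered[:mid], depth + 1),
            build_kdtree(ordered[mid + 1:], depth + 1))


def nearest_distance(node, query, depth=0, best=math.inf):
    """Distance from `query` to the closest point stored in the tree."""
    if node is None:
        return best
    point, left, right = node
    best = min(best, math.dist(point, query))
    axis = depth % 3
    diff = query[axis] - point[axis]
    near, far = (left, right) if diff < 0 else (right, left)
    best = nearest_distance(near, query, depth + 1, best)
    # Only cross the splitting plane if a closer point could lie beyond it.
    if abs(diff) < best:
        best = nearest_distance(far, query, depth + 1, best)
    return best


def isovalue(tree, query, atom_radius=1.5, probe_radius=1.4):
    """Assumed field: negative inside the surface, zero on it, positive outside."""
    return nearest_distance(tree, query) - (atom_radius + probe_radius)


if __name__ == "__main__":
    atoms = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0), (0.0, 5.0, 0.0)]
    tree = build_kdtree(atoms)
    for x in range(-4, 9, 2):
        print(x, isovalue(tree, (float(x), 0.0, 0.0)))
```

Sampling a field like this at every grid vertex gives the values that marching tetrahedra would then interpolate along tetrahedron edges to place the surface vertices.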


Finally, VMD is a very feature-rich tool, and the least you can do with it is rotate the protein to get a view of the entire protein, as you can see below. There is much more you can do with it, but I'll leave interested readers to explore for themselves.


Next week I'm off to the Afrigraph Conference in Pretoria, after which I have to attend a 6-week bioinformatics course in Stellenbosch. Lectures 09:00-18:00 every day for 6 weeks. Not sure how I'm going to last.