If you’re a gamer (or at least have met one), you know that for a computer to run a fairly recent game, it needs a little more than just a central processing unit (aka CPU): it also needs a graphics processing unit (aka GPU). This is because these games require an incredible amount of computing, and the elements to be computed are very special: arrays. Arrays are kinda like matrices (or other multi-dimensional structures), but they are stored in a way that makes it easier to do calculations on them.
One example of array computing is image processing. Digital images are composed of pixels, and each pixel stores a specific value. Image editing software (such as Adobe Photoshop or GIMP) works by manipulating the value of each pixel.
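To make the pixel idea concrete, here is a minimal sketch in plain Python. It assumes a grayscale image stored as a list of rows with values from 0 to 255; the function name `brighten` and the tiny sample image are made up for illustration, and real editors do this with far more optimized code.

```python
def brighten(image, amount):
    """Return a new grayscale image with every pixel raised by `amount`.

    `image` is a list of rows; each pixel is an integer from 0 to 255.
    Results are clamped so they stay inside that range.
    """
    return [[min(255, max(0, pixel + amount)) for pixel in row] for row in image]


# A tiny, made-up 2x3 grayscale image.
image = [
    [0, 100, 200],
    [50, 150, 250],
]

brighter = brighten(image, 40)
print(brighter)
```

The same small operation is applied to every pixel independently, which is exactly the kind of job that can be spread over many workers.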
One of the biggest advances in modern computing is something called parallelism. With it, a processing unit can work on several elements of an array at the same time, in parallel, hence the name. Each piece of work runs as a thread (that’s why parallelization is sometimes called threading). You might remember that a few years ago, dual-core processors started showing up in personal computers, and they were all the rage. These CPUs could split jobs between the two cores and were crazy fast for their time. Today, high-end machines can have a CPU with something like 12 or 16 cores.
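The pattern of handing array elements to a pool of threads can be sketched with Python's standard library. The helper names `square` and `parallel_map` are made up for this example. One caveat: in standard CPython the global interpreter lock keeps threads from speeding up pure-Python arithmetic, so this only shows the shape of the idea, not a real speed-up.

```python
from concurrent.futures import ThreadPoolExecutor


def square(value):
    """The work done on one element of the array."""
    return value * value


def parallel_map(func, values, workers=4):
    """Apply `func` to every element using a pool of worker threads.

    Results come back in the same order as the input.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(func, values))


data = list(range(8))
print(parallel_map(square, data))
```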
On the other hand, GPUs (often loosely called video cards or graphics cards) can work on many, many more threads than CPUs, because they are optimized for that type of job (dealing with arrays). In fact, the NVIDIA graphics card I have in my notebook has 2 multiprocessors with 192 cores each, and it can work with 2048 threads per multiprocessor. It can actually run games just a little better than a PlayStation 3.
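Just to put those numbers side by side, here is the arithmetic as a few lines of Python. The variable names are mine; the figures are the ones quoted above. Keep in mind that a GPU core is a much simpler thing than a CPU core, so the totals are not directly comparable.

```python
# Figures quoted in the text for the notebook's GPU.
multiprocessors = 2
cores_per_multiprocessor = 192
threads_per_multiprocessor = 2048

total_cores = multiprocessors * cores_per_multiprocessor
max_resident_threads = multiprocessors * threads_per_multiprocessor

print(total_cores)           # 384
print(max_resident_threads)  # 4096
```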
But besides games and image/video processing, there are many other applications that use array-type structures, including scientific ones. For instance, astronomers constantly work with arrays of data coming from observations, signal processing, multi-dimensional simulations and so on. Is it possible to make GPUs work for science?
The answer is yes. In fact, NVIDIA has developed a programming environment especially for end-users to be able to harness the power of their graphics cards. It is called CUDA. It’s kinda like a new programming language, but not quite. CUDA revolves around writing very specific pieces of code that stand in for normal C code blocks, almost like transplanting in a (sometimes) better and more efficient block than the original one. Since it’s a “foreign body” inside the C code, the user has to be careful to make the original code able to talk with CUDA. This might not seem like a big deal for programmers who’ve always worked with C, but it’s very weird to, e.g., Python users. And here comes some good and bad (?) news.
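For Python users wondering what such a block looks like in spirit: in CUDA, the transplanted piece is a function (a "kernel") that is run by many threads, and each thread works out which array element is its own from its block and thread indices. Below is a rough emulation in plain Python, not real CUDA. The `launch` helper and all names are hypothetical, and the threads run one after another here instead of concurrently.

```python
def add_kernel(block_idx, block_dim, thread_idx, a, b, out):
    """Body executed by ONE thread: add a single pair of elements."""
    i = block_idx * block_dim + thread_idx  # this thread's global index
    if i < len(out):  # guard: the last block may have spare threads
        out[i] = a[i] + b[i]


def launch(kernel, grid_dim, block_dim, *args):
    """Hypothetical stand-in for a kernel launch.

    A real GPU would run these threads concurrently; here they run
    one after another so the indexing is easy to follow.
    """
    for block_idx in range(grid_dim):
        for thread_idx in range(block_dim):
            kernel(block_idx, block_dim, thread_idx, *args)


a = [1.0, 2.0, 3.0, 4.0, 5.0]
b = [10.0, 20.0, 30.0, 40.0, 50.0]
out = [0.0] * len(a)

block_dim = 4                                     # threads per block
grid_dim = (len(a) + block_dim - 1) // block_dim  # enough blocks to cover `a`

launch(add_kernel, grid_dim, block_dim, a, b, out)
print(out)
```

With 5 elements and 4 threads per block, two blocks are launched, and the bounds check keeps the 3 spare threads of the second block from touching memory past the end of the arrays.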
The first good news is that it is possible to use CUDA programming in Python codes, and that is a huge thing for scientists, because many of us use this language. There is an open-source tool for that: PyCUDA. It works as a library that lets Python codes talk with the CUDA software and hardware in your machine. I’ve been able to install and run the sample codes that come with PyCUDA on my notebook. However, and this is the bad news, attaching a CUDA block is not as easy and bureaucracy-free as the Python codes we are used to. Because it is very much like C, CUDA code inherits all the caveats of that language, which include dealing with variable declarations, memory allocation and the dreaded pointers. PyCUDA includes a bunch of neat “shortcut” functions that make things easier for Python programmers, such as multiplying arrays, but to do more complicated calculations, you probably need to know your C and CUDA.
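As a taste of the “shortcut” route, here is a minimal sketch built on PyCUDA's `gpuarray` module, which lets you multiply an array on the card without writing any C. The wrapper function `double_on_gpu` is my own invention, and I made it fall back to plain Python when PyCUDA, NumPy or a usable CUDA device is missing, so the snippet should run anywhere. Take it as an illustration, not a benchmark.

```python
def double_on_gpu(values):
    """Double every element, on the GPU if PyCUDA can be set up."""
    try:
        import numpy as np
        import pycuda.autoinit  # noqa: F401  (creates a CUDA context on import)
        import pycuda.gpuarray as gpuarray
    except Exception:
        # PyCUDA, NumPy, or a usable CUDA device is missing: use the CPU.
        return [2.0 * v for v in values]

    # Copy the data to the card, multiply there, and copy the result back.
    a_gpu = gpuarray.to_gpu(np.asarray(values, dtype=np.float32))
    doubled = (2 * a_gpu).get()
    return [float(v) for v in doubled]


print(double_on_gpu([1.0, 2.5, 4.0]))
```

Note that the GPU path works in single precision (`float32`), which is a common choice on consumer cards; the sample values were picked so that both paths give the same result.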
I am a bit hung up on that part. I’ve been re-studying a bit of C in my free time, whenever the hot weather gives in a little and I can work on my PC without quickly wearing out (no air-conditioning here). I’ve also been reading a bit about CUDA coding, so I guess it will take me some time to be able to use PyCUDA as I really intended to.
And here comes the second good news: maybe there is no need for all that. Some really clever people have worked on a tool that can help Python users harness the power of their GPUs without having to deal with C and most of CUDA coding. It’s Continuum Analytics’ Anaconda. The way it works is that you attach decorators to your Python functions, and the software behind Anaconda does most of the work of translating them to CUDA and sending the code to be processed on the GPU. There are some tutorials on YouTube about this tool, and I think it’s really handy. But, as you might have guessed, there is also a bit of bad news: Anaconda is not free. However, students and people affiliated with educational institutions are allowed a free license of Anaconda, with all the features that it has. And that is really, really cool on their part. I have downloaded and installed Anaconda on my notebook, with no big hassles, but I haven’t tried using it in my codes yet.
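If decorators are new to you, the mechanism itself is easy to sketch. The `elementwise` decorator below is a hypothetical stand-in that I wrote for illustration; it is not Anaconda's API, and it just loops on the CPU. The point is only that a function written for single numbers can be wrapped, with one extra line, into something that takes whole arrays. A GPU-aware tool would do the compiling and dispatching at that same spot.

```python
import functools
import math


def elementwise(func):
    """Hypothetical decorator: turn a scalar function into an array one.

    A GPU-aware tool would compile `func` and launch it on the card;
    this stand-in simply loops over the elements on the CPU.
    """
    @functools.wraps(func)
    def wrapper(*arrays):
        return [func(*items) for items in zip(*arrays)]
    return wrapper


@elementwise
def hypotenuse(x, y):
    # Written for single numbers; the decorator handles the arrays.
    return math.sqrt(x * x + y * y)


print(hypotenuse([3.0, 5.0], [4.0, 12.0]))
```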
Speaking of hassles, I have to say, installing CUDA on a Linux machine is probably not one of the easiest things. The installer from NVIDIA’s website will, by default, install the latest driver for your video card, and we all know that dealing with official NVIDIA drivers on Linux can be a huge pain in the ass, depending on your GPU and your distro, and may actually become downright frustrating. Ugh. But I think it pays off. Actually, for me, the payoff for using the latest driver is great. The previous driver I was using, version 331.113, which is currently listed on Linux Mint’s driver manager, didn’t work optimally in accelerating the graphics on my notebook. But when I installed the driver version 340.29 that came with CUDA, everything worked like a charm. Now I have pretty satisfying acceleration and games play beautifully on Steam. And even so, power consumption is still pretty much the same (but, of course, when I want to use a lot of graphics acceleration, I just plug it in).
So, there you go, you have my take on CUDA computing for now. I still want to talk about how some Python codes improve when run on the GPU, and compare them with their CPU counterparts, but I will leave that for another post. I plan on putting some GPU-accelerated pieces of code out in public so anyone can try them and see how they play on their video cards. I could do it with either PyCUDA or Anaconda, but the second option will only work for people who have a license, and that is kind of a bummer. It is difficult to produce open-source codes if part of them depends on licensed software. Actually, could it even be called open-source in that case?
Featured image: Sapphire ATI Radeon HD 4550 GPU, by William Hook on Flickr