Sunday, February 27, 2011

Some Python3 fun

I wasn't feeling very productive today, so I decided to take a look at Python 3.2. Specifically, I wanted to find out how hard it would be to set up a development environment that doesn't affect my Python 2.6 related stuff, and how much time it would take to port a simple 2.x script to 3.x.

Luckily, Python 3.x is available in the FreeBSD ports tree and, moreover, it doesn't conflict with the python2.x stuff, so all I had to do was cd /usr/ports/lang/python32 && sudo make install clean (I actually spotted a minor problem with the port, but it doesn't affect anything, for me at least). Voilà, it's already there.

Then I decided to pick a simple script and see how much effort it would take to make it run with 3.2. A script that I created some time ago to parse urbandictionary's output seemed like a good fit for this task.

I ran 2to3 over my old script and it generated a patch, which I applied successfully. Apart from that I had to fix a bytes/string problem (not really hard, and there is lots of information about it on the net), and my script just worked!
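
For illustration, the bytes/string issue typically looks like the following in Python 3. This is a minimal sketch and not the actual urbandictionary script: the sample payload and the helper name `extract_title` are hypothetical. The point is only that data read from the network arrives as `bytes` and has to be decoded before string operations work on it.

```python
import re


def extract_title(raw_page: bytes, encoding: str = "utf-8") -> str:
    """Decode a raw HTTP body and return its <title> text (hypothetical helper)."""
    # In Python 3 an HTTP response body is bytes, not str. Searching bytes
    # with a str pattern raises TypeError, so decode first.
    text = raw_page.decode(encoding)
    match = re.search(r"<title>(.*?)</title>", text, re.S)
    return match.group(1).strip() if match else ""


# Stand-in for what urllib.request.urlopen(url).read() would return.
sample = b"<html><title> Urban Dictionary: foo </title></html>"
print(extract_title(sample))
```

In Python 2 the same code worked without the explicit `decode()` because `str` was already a byte string, which is why 2to3 alone cannot fix this class of problem.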

So it looks like the 2-to-3 migration is not very hard. Obviously, things are harder for larger projects, but with larger projects everything is harder anyway.

The only thing I haven't figured out yet is how to have different versions of easy_install for both 2.6 and 3.2 on FreeBSD, though I haven't spent much time digging into it.

PS I went further with the urbandictionary script: I created a package for it, uploaded it to PyPI, improved the formatting and made it easier to use from Python. Here's a screenshot:



Check it out on github if you're interested.

Thursday, February 24, 2011

ChangeLog / NEWS importance

I was scanning portscout recently and noticed that a new version of xplanet had been released.

As usual, I grabbed the new tarball and started compiling it, meanwhile checking ChangeLog and NEWS for the list of changes in the new version. I was a bit disappointed when I couldn't find anything I was looking for. I thought I could probably find an announcement on the forums or on the project's website, but had no luck there either. Now it's not even funny.

Of course, I can run diff -ruN against the old version, but it's not easy to understand what has changed without knowing the context, and that approach won't work for all users. So it's quite a strange situation for a typical user, who has no clue why they should upgrade.

While providing a list of changes for a release is a very simple thing, it's very important from the user's perspective and unfortunately it sometimes gets forgotten.

PS I almost forgot to mention that I tried to check the project's svn, but sourceforge is slow as hell or even slower, so I was able neither to check out the sources nor to view them on the website.

Saturday, February 12, 2011

Argument-based methods synchronization in Python

I've been implementing yet another client for a REST-like service and spotted a problem I hadn't typically faced while working with REST services. I needed to modify a resource (that is, issue a PUT request on /api/resource/id), but the changes didn't apply immediately; instead, the resource went into a 'building' state or something like that. I had a method like modify_bar(self, resource_id), and things worked great with a single thread but caused problems with multiple active threads. Obviously, I checked that resource_id was ready for operations, but between the end of the check and the start of the actual work another thread could get in and cause trouble.

Let's check sample code:
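
A minimal sketch of the kind of code in question, assuming a toy class `Foo` whose `modify_bar` sleeps to simulate the slow operation. The class name, the `events` list, the helper `run_threads` and the sleep duration are illustrative assumptions; only the method name and the printed messages come from the text.

```python
import threading
import time


class Foo:
    """Toy client; modify_bar stands in for a slow PUT on a resource."""

    def __init__(self):
        # Recorded so the interleaving can be inspected afterwards.
        self.events = []

    def modify_bar(self, bar_id):
        self.events.append(("working", bar_id))
        print("working on bar id = %d" % bar_id)
        time.sleep(0.05)  # simulate the resource sitting in a 'building' state
        self.events.append(("done", bar_id))
        print("done with bar id = %d" % bar_id)


def run_threads(foo, bar_ids):
    """Call foo.modify_bar once per id, each call in its own thread."""
    threads = [threading.Thread(target=foo.modify_bar, args=(i,)) for i in bar_ids]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


# Three threads hit the same resource; nothing stops them from overlapping.
run_threads(Foo(), [1, 1, 1])
```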



And its output:


(14:46) novel@nov-testing2:~/sync %> ./sync.py
working on bar id = 1
working on bar id = 1
working on bar id = 1
done with bar id = 1
done with bar id = 1
done with bar id = 1
(14:47) novel@nov-testing2:~/sync %>


But the result I wanted to achieve:


(14:46) novel@nov-testing2:~/sync %> ./sync.py
working on bar id = 1
done with bar id = 1
working on bar id = 1
done with bar id = 1
working on bar id = 1
done with bar id = 1
(14:47) novel@nov-testing2:~/sync %>


I thought about how to do this without redesigning the class or changing the API, and after some time came up with a way to synchronize methods based on their arguments.

Implementation looks like this:



Basically, we keep a list of ids of the objects we're working with. When we want to start working with an object, we check the list first: if its id is in the list, the object is currently being worked on and we have to wait; otherwise we push the id onto the list and start working.
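
The idea can be sketched as a decorator. This is one possible implementation under stated assumptions, not necessarily the exact code used for the outputs shown here: the decorator name `synchronized_by_arg` and the class `Foo` are hypothetical, a `set` is used in place of a list, and a `threading.Condition` handles the waiting.

```python
import functools
import threading
import time


def synchronized_by_arg(method):
    """Serialize calls to `method` that share the same first argument."""
    busy = set()                  # ids currently being worked on
    cond = threading.Condition()  # guards `busy` and wakes up waiters

    @functools.wraps(method)
    def wrapper(self, key, *args, **kwargs):
        with cond:
            # The check and the registration happen under one lock, so no
            # other thread can slip in between them.
            while key in busy:
                cond.wait()
            busy.add(key)
        try:
            return method(self, key, *args, **kwargs)
        finally:
            with cond:
                busy.discard(key)
                cond.notify_all()

    return wrapper


class Foo:
    def __init__(self):
        self.events = []  # illustrative: lets us inspect the ordering

    @synchronized_by_arg
    def modify_bar(self, bar_id):
        self.events.append(("working", bar_id))
        print("working on bar id = %d" % bar_id)
        time.sleep(0.05)  # simulate slow work on the resource
        self.events.append(("done", bar_id))
        print("done with bar id = %d" % bar_id)


foo = Foo()
threads = [threading.Thread(target=foo.modify_bar, args=(i,)) for i in (1, 1, 1, 0)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Note that in this sketch the set of busy ids is shared by all instances of the class, which seems appropriate when the id refers to a resource on a remote service. Calls with different ids never wait for each other.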

And the result is:


(15:09) novel@nov-testing2:~/sync %> ./sync2.py
working on bar id = 1
done with bar id = 1
working on bar id = 1
done with bar id = 1
working on bar id = 1
done with bar id = 1
(15:10) novel@nov-testing2:~/sync %>


As an experiment, let's start another thread with a different bar_id value to make sure it's not blocked:



And the output now:


(15:13) novel@nov-testing2:~/sync %> ./sync2.py
working on bar id = 1
working on bar id = 0
done with bar id = 1
working on bar id = 1
done with bar id = 0
done with bar id = 1
working on bar id = 1
done with bar id = 1
working on bar id = 1
done with bar id = 1
(15:14) novel@nov-testing2:~/sync %>


As you can see, things are now working as expected.

Friday, February 4, 2011

Mocking libcloud using Mox

If you're using libcloud to interact with cloud providers, you would probably like to cover this code with tests, if you haven't done so already.

However, you most likely don't want libcloud to make any real calls to the cloud provider's API, as that could create additional noise on your account, add extra load and cost you money (especially if the tests run regularly on a Continuous Integration server).

One possible solution is to mock these things up using the Mox mocking framework. Libcloud uses a two-step initialisation process for a connection class (first you obtain the driver class, then you instantiate it), and I always seem to forget how to mock it right, so I decided to write it down.

For example, we have a simple class that returns a list of names of currently available nodes:



And we want to test it without making actual API calls. Let's see what the test code looks like for this task. For the sake of simplicity, the default unittest framework will be used.



Basically, this is what we are doing in the test:


  • Create a mock for the RackspaceNodeDriver class and record a list_nodes() call on it; we also make it return two node objects. The nodes are fake objects created with type(), so we don't have to bother constructing real Node objects, which require many more properties that we don't care about

  • Use mox's StubOutWithMock routine (see the mox docs for details) to stub out the libcloud.providers module. For get_driver() we use a lambda, because get_driver() returns the class itself rather than an instance of it, and a lambda seems clearer here than recording __call__() for a class

  • All the standard ReplayAll/VerifyAll mox routines
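
The same pattern can be sketched with the standard library's unittest.mock instead of Mox, so that it runs without either Mox or libcloud installed. Everything here is an assumption made for illustration: `providers` is a stand-in namespace for libcloud's providers module, the `"RACKSPACE"` string stands in for the provider constant, and the credentials are placeholders.

```python
import types
import unittest
from unittest import mock

# Stand-in for libcloud's providers module (assumption, so the sketch is
# self-contained); real code would import the module instead.
providers = types.SimpleNamespace(get_driver=None)


class Example:
    def node_names(self):
        driver_cls = providers.get_driver("RACKSPACE")  # step 1: obtain the class
        conn = driver_cls("username", "api_key")        # step 2: instantiate it
        return [node.name for node in conn.list_nodes()]


class ExampleTest(unittest.TestCase):
    def test_node_names(self):
        # Fake nodes built with type(): only the `name` attribute matters.
        nodes = [type("Node", (object,), {"name": n}) for n in ("node1", "node2")]
        driver = mock.Mock()
        driver.list_nodes.return_value = nodes

        # get_driver() returns a class, so the stub returns a callable that
        # yields the mock driver when it is "instantiated".
        with mock.patch.object(providers, "get_driver",
                               return_value=lambda *args, **kwargs: driver):
            names = Example().node_names()

        self.assertEqual(names, ["node1", "node2"])
        # Plays the role of mox's VerifyAll(): fails if node_names() returned
        # a hard-coded list without ever calling the driver.
        driver.list_nodes.assert_called_once_with()
```

Saved to a file, the test can be run with python -m unittest.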


As an experiment, you can replace the body of the node_names() method of the Example class with just return ["node1", "node2"]. The assertEqual() will pass, but mox will complain that list_nodes() wasn't called and the test will fail.