Posts Tagged ‘Django’

Django’s assertRedirects little gotcha

April 23, 2010 1 comment

Something we’ve been trying to pay more attention to with our newest greenfield development projects is the running time of our unit test suites.  One of the projects was running ~200 unit tests in 2 seconds.  As development continued and the number of test cases grew, the suite started taking 10 seconds, then over 30 seconds. Something wasn’t right.

The first challenge was to determine which tests were running slowly.  A little googling found this useful patch to the Python code base.  Since we are using Python 2.5 and virtual environments, we decided to simply monkey patch it in.  This makes verbosity level 2 print the run time for each test.  We then went one step further and made the following code change to _TextTestResult.addSuccess:

    def addSuccess(self, test):
        TestResult.addSuccess(self, test)
        if self.runTime > 0.1:
            self.stream.writeln("\nWarning: %s runs slow [%.3fs]" % (self.getDescription(test), self.runTime))
        if self.showAll:
            self.stream.writeln("[%.3fs] ok" % (self.runTime))
        elif self.dots:
            self.stream.write('.')
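
On a current Python, where the patched runTime attribute does not exist, a similar warning can be produced without monkey patching. What follows is only a minimal sketch of that idea, not the patch we used: TimingResult, SLOW_THRESHOLD, slow_tests, ExampleTests and run_example are names made up for this illustration.

```python
import io
import time
import unittest

SLOW_THRESHOLD = 0.1  # seconds; same cut-off as the patched addSuccess above


class TimingResult(unittest.TextTestResult):
    """Text result that times every test and remembers the slow ones."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.slow_tests = []  # (test description, seconds) pairs

    def startTest(self, test):
        # Record the start time before the test (and its setUp) runs.
        self._started_at = time.perf_counter()
        super().startTest(test)

    def addSuccess(self, test):
        run_time = time.perf_counter() - self._started_at
        super().addSuccess(test)
        if run_time > SLOW_THRESHOLD:
            self.slow_tests.append((self.getDescription(test), run_time))
            self.stream.writeln("\nWarning: %s runs slow [%.3fs]"
                                % (self.getDescription(test), run_time))


class ExampleTests(unittest.TestCase):
    """Hypothetical tests used only to exercise TimingResult."""

    def test_fast(self):
        self.assertEqual(1 + 1, 2)

    def test_slow(self):
        time.sleep(0.15)  # stands in for an unmocked external service


def run_example():
    """Run ExampleTests with TimingResult and return the result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ExampleTests)
    runner = unittest.TextTestRunner(stream=io.StringIO(), verbosity=1,
                                     resultclass=TimingResult)
    return runner.run(suite)
```

Passing resultclass to TextTestRunner keeps the change local to the test run. Python 3.12 and later also offer a --durations option on the unittest command line, which may be enough on its own.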

With it now easy to tell which of our tests were slow, we set out to make them all fast again. As expected, the majority of the cases were an external service not being mocked correctly. Most of these were easily solved. But there were a few tests where we couldn’t find what hadn’t been mocked. Adding a few timing statements within these tests revealed the culprit: the Django framework’s assertRedirects method.

    def assertRedirects(self, response, expected_url, status_code=302,
                        target_status_code=200, host=None):
        """Asserts that a response redirected to a specific URL, and that the
        redirect URL can be loaded.

        Note that assertRedirects won't work for external links since it uses
        TestClient to do a request.
        """
        if hasattr(response, 'redirect_chain'):
            # The request was a followed redirect
            self.failUnless(len(response.redirect_chain) > 0,
                ("Response didn't redirect as expected: Response code was %d"
                " (expected %d)" % (response.status_code, status_code)))

            self.assertEqual(response.redirect_chain[0][1], status_code,
                ("Initial response didn't redirect as expected: Response code was %d"
                 " (expected %d)" % (response.redirect_chain[0][1], status_code)))

            url, status_code = response.redirect_chain[-1]

            self.assertEqual(response.status_code, target_status_code,
                ("Response didn't redirect as expected: Final Response code was %d"
                " (expected %d)" % (response.status_code, target_status_code)))

        else:
            # Not a followed redirect
            self.assertEqual(response.status_code, status_code,
                ("Response didn't redirect as expected: Response code was %d"
                 " (expected %d)" % (response.status_code, status_code)))

            url = response['Location']
            scheme, netloc, path, query, fragment = urlsplit(url)

            redirect_response = response.client.get(path, QueryDict(query))

            # Get the redirection page, using the same client that was used
            # to obtain the original response.
            self.assertEqual(redirect_response.status_code, target_status_code,
                ("Couldn't retrieve redirection page '%s': response code was %d"
                 " (expected %d)") %
                     (path, redirect_response.status_code, target_status_code))

        e_scheme, e_netloc, e_path, e_query, e_fragment = urlsplit(expected_url)
        if not (e_scheme or e_netloc):
            expected_url = urlunsplit(('http', host or 'testserver', e_path,
                e_query, e_fragment))

        self.assertEqual(url, expected_url,
            "Response redirected to '%s', expected '%s'" % (url, expected_url))

You’ll notice that if your GET request uses the follow=False option (the test client’s default), you end up in the else branch, at line 34 of this code snippet (the response.client.get call), which kindly fetches the page you are redirecting to and checks that it returns a 200. Which is great, unless you don’t have the correct mocks set up for that page too. Mocking out the content for a page you aren’t actually testing also didn’t seem quite right. We didn’t care about the other page loading; it had its own test cases. We just wanted to make sure the page under test was redirecting to where we expected. The simple solution: write our own assertRedirects method.

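def assertRedirectsNoFollow(self, response, expected_url):
    # Compare the Location header only; the target page is never requested.
    # settings.TESTSERVER is a project-specific setting (not a Django built-in)
    # holding the test host prefix that is prepended to relative URLs.
    self.assertEqual(response._headers['location'], ('Location', settings.TESTSERVER + expected_url))
    self.assertEqual(response.status_code, 302)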
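
The same check can be written without the private _headers attribute or a custom setting. The sketch below is one possible variant under stated assumptions, not what we shipped: assert_redirects_no_follow and FakeResponse are hypothetical names, and FakeResponse only imitates the two things used here from a Django response (status_code and header lookup by name).

```python
from urllib.parse import urlsplit, urlunsplit


def assert_redirects_no_follow(response, expected_url, status_code=302,
                               host="testserver"):
    """Check the redirect status and target without requesting the target."""
    assert response.status_code == status_code, (
        "Response code was %d (expected %d)"
        % (response.status_code, status_code))

    # Resolve relative URLs against the test host so that "/login/" and
    # "http://testserver/login/" compare equal.
    def absolute(url):
        scheme, netloc, path, query, fragment = urlsplit(url)
        if not (scheme or netloc):
            return urlunsplit(("http", host, path, query, fragment))
        return url

    actual = absolute(response["Location"])
    expected = absolute(expected_url)
    assert actual == expected, (
        "Response redirected to '%s', expected '%s'" % (actual, expected))


class FakeResponse(object):
    """Hypothetical stand-in for a Django test client response."""

    def __init__(self, status_code, location):
        self.status_code = status_code
        self._location = location

    def __getitem__(self, header):
        if header.lower() == "location":
            return self._location
        raise KeyError(header)


# Passes: same target, written once as an absolute and once as a relative URL.
assert_redirects_no_follow(
    FakeResponse(302, "http://testserver/login/"), "/login/")
```

Later Django releases added a fetch_redirect_response argument to assertRedirects; passing False there skips loading the target page, which may make a custom helper unnecessary.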

Back to a 2 second unit test run time and all is right with the world again.

Dustin Bartlett

PyCon 2010 Atlanta Event Report

March 16, 2010 Comments off

I have been back from PyCon 2010 Atlanta for a while, and I am still absorbing the huge volume of knowledge and information I gathered during the conference. It was an amazing experience to see first-hand what the Python community is doing. There was something for everyone at PyCon 2010, from beginners to advanced Python users. The tone of the conference was very friendly; it was a totally different experience from corporate-sponsored technology conferences.

During the two tutorial days prior to the conference days, I attended four tutorials:
Faster Python Programs through Optimization – The tutorial presented guidelines and strategies for optimizing Python programs. It demonstrated techniques for measuring speed (test.pystone), profiling CPU usage (cProfile), and profiling memory usage (the Guppy-PE framework), none of which I had known before. The tutorial also detailed the essential performance differences among Python’s built-in data types, which was helpful for me too.
Pinax Long Tutorial – Pinax is an open-source platform built on the Django web framework. In the tutorial, the Pinax core developers presented Pinax installation, project creation, use of reusable Django applications, template modification, Pinax-specific settings, media handling, and deployment. The most impressive parts of the tutorial for me were how Pinax takes advantage of virtualenv and pip, both written by Ian Bicking, to streamline the installation process, and how Pinax leverages reusable Django applications. The tutorial also introduced lots of open-source reusable Django applications that are worth looking at in the coming days; to name a few: django-frontendadmin, django-flatblocks, django-ajax-validation, django-openid, and django-pagination.
Django in Depth – Django was one of my favourite topics at the conference. “In this tutorial, we’ll take a detailed look under the hood, covering everything from the guts of the ORM to the innards of the template system to how the admin interface really works”. James Bennett led us on a deep dive into the internals of the Django web framework and showed us the bits and pieces of Django’s ORM, forms and validation, template system, request processing, views, and admin interface, going far beyond what the Django documentation covers.
Django Deployment Workshop – Another Django tutorial, presented by Jacob Kaplan-Moss, covered the creation of a full Django deployment environment running on a cluster of (virtual) machines. He walked us through a live demo of how to set up a production-ready deployment environment in the cloud (Rackspace, Amazon EC2) by removing the single points of failure one by one.

During the following three conference days, I attended lots of talks. Here are the highlights of the topics that interested me the most:
NoSQL databases were a hot topic during the conference; MongoDB, Cassandra, and Neo4j attracted a lot of attention. Mark Ramm and Rick Copeland from SourceForge.net presented a comparison between relational and NoSQL databases, a practical guide to deciding which to use in a project, and an account of how they quickly migrated one of their high-traffic websites from PHP to Python using TurboGears, MongoDB, and Jinja templates. It was a live example of how NoSQL databases can be used in real-world projects.

Testing in Python was another topic I followed closely at the conference. Ned Batchelder, of coverage.py fame, gave a talk on tests and testability; there was not much new for me, as we emphasize TDD in our daily software development here at Point2. Michael J Foord, creator of the well-known Python Mock library, gave a talk titled New *and* Improved: Coming changes to unittest, the standard library test framework, which covered lots of great features coming to Python’s unittest; the most attractive are test discovery and more convenient assertion methods. By the way, you do not need to wait until you upgrade to Python 2.7 or 3.2 to take advantage of the new unittest features: they have been backported to Python 2.4+ as the unittest2 package. It is great to see testing picking up speed in the Python community.

Using Django in Non-Standard Ways, given by Eric Florenzano, was an interesting talk. Eric covered how to use Django with alternatives to what Django offers and how to use the pieces Django offers in other contexts. He gave examples of using the Jinja2 template engine with Django, not using django.contrib.auth in a Django application, not using the ORM in Django, and using Django’s ORM stand-alone. It is amazing to see how you can take advantage of the Django framework, even in non-standard ways.

Overall, my experience of PyCon 2010 Atlanta was amazing.  The conference was both great fun and informative, and it was great to be there.

Stubbing a Test Mail Server

February 17, 2009 1 comment

Django has built-in support for sending email. We make use of this in our app, but when testing, we wanted to be able to access the emails sent so we could assert on their content and pull data out of the body. In the unit tests, that’s easily solved by mocking the email client call, but we wanted to do this as a black-box regression test. That’s where Python’s built-in smtpd.SMTPServer comes in handy:

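# Note: smtpd and asyncore were removed from the standard library in Python 3.12.
import asyncore
import smtpd
import threading

email_server = None


class FakeServer(smtpd.SMTPServer):
    """Stub SMTP server that keeps received messages in memory."""

    def __init__(self, localaddr, remoteaddr):
        smtpd.SMTPServer.__init__(self, localaddr, remoteaddr)
        # Maps a sanitized recipient address to the list of raw messages sent to it.
        self.emails = {}

    def process_message(self, peer, mailfrom, rcpttos, data, **kwargs):
        # **kwargs absorbs the extra keyword arguments that Python 3 passes in.
        for recipient in rcpttos:
            idx = recipient.replace('@', '_at_').replace('.', '_dot_')
            self.emails.setdefault(idx, []).append(data)

    def stop(self):
        # SMTPServer is itself the listening dispatcher, so close it directly.
        self.close()


def main():
    global email_server
    email_server = FakeServer(('localhost', 10920), None)
    # The daemon thread dies with the main thread, so the caller has to keep
    # the process alive for as long as the server is needed.
    thread = threading.Thread(target=asyncore.loop)
    thread.setDaemon(True)
    thread.start()


if __name__ == '__main__':
    main()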

This starts an email server on port 10920 which, on receipt of a new email, stores it in a dictionary keyed by a modified version of the recipient’s email address. Multiple emails to the same address are appended to that address’s list. There is no reason in this context to replace the ‘.’ and ‘@’ symbols in the email address, but it was appropriate for our usage.
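
The storage scheme described above can be pulled out of the server so that it can be exercised without opening a socket. This is only a sketch of that idea; Mailbox, mailbox_key, deliver and messages_for are hypothetical names and are not part of our stub server.

```python
def mailbox_key(address):
    """Turn an email address into the dictionary key the stub server uses."""
    return address.replace('@', '_at_').replace('.', '_dot_')


class Mailbox(object):
    """In-memory store of raw messages, grouped by recipient."""

    def __init__(self):
        self.emails = {}

    def deliver(self, rcpttos, data):
        # Mirror process_message: one copy of the message per recipient.
        for recipient in rcpttos:
            self.emails.setdefault(mailbox_key(recipient), []).append(data)

    def messages_for(self, address):
        # Unknown addresses simply have no messages.
        return self.emails.get(mailbox_key(address), [])


box = Mailbox()
box.deliver(['jo@example.com'], 'first message')
box.deliver(['jo@example.com', 'al@example.com'], 'second message')
```

After these two deliveries, the key jo_at_example_dot_com holds both messages and al_at_example_dot_com holds only the second one.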

We run the stub mail server on a separate box and make the emails available via a GET request on a test HTTP server, which allows us to run tests scripted entirely in Selenium. I will blog again with details of how we manage our test HTTP server. You can use this code as the basis for embedding the mail server in a unittest setup too.

By: Chris Tarttelin

Creating Fixtures from Within Tests

February 3, 2009 Comments off

Django gives you a ‘dumpdata’ management command which will create a fixture from all the records in your schema. For what we wanted, this was overkill. We had an existing unit test that created data, and it was just the right amount of data for what we wanted. After searching through the Django codebase, it became clear that we could pass the objects we had in that test straight to the JSON serializer and write the output to a file. This ended up looking something like this:

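# On Python 2.5, first add: from __future__ import with_statement
from django.core.serializers import json

# MyModel stands in for whichever model class your test populates.
serializer = json.Serializer()
objectsToSerialize = MyModel.objects.filter(column='restriction')
with open('my_fixture.json', 'w') as f:
    f.write(serializer.serialize(objectsToSerialize, indent=4))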

And that’s it! You can also spin up a shell, using:

python manage.py shell

and load the data you want to create fixtures from if you don’t have any tests that create the data you want already.
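
Before committing a generated fixture, it can be worth a quick sanity check of its shape. Django’s JSON fixtures are a list of records, each with model, pk and fields keys. The sketch below uses only the standard json module; check_fixture and the myapp.mymodel label are hypothetical names chosen for illustration.

```python
import json


def check_fixture(text):
    """Return the set of model labels in a fixture, or raise ValueError."""
    records = json.loads(text)  # invalid JSON raises ValueError as well
    for record in records:
        missing = {"model", "pk", "fields"} - set(record)
        if missing:
            raise ValueError("fixture record is missing %s" % sorted(missing))
    return {record["model"] for record in records}


# A hand-written sample in the same layout as a serialized fixture.
sample = json.dumps([
    {"model": "myapp.mymodel", "pk": 1, "fields": {"column": "restriction"}},
    {"model": "myapp.mymodel", "pk": 2, "fields": {"column": "restriction"}},
], indent=4)

labels = check_fixture(sample)
```

To check a real file, read my_fixture.json and pass its contents to check_fixture in the same way.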

By: Chris Tarttelin

Web Application Scalability

February 2, 2009 Comments off

For most small applications, scalability is usually not something that receives much consideration. For applications that have the potential to grow to tens of thousands of users and up, however, scalability may eventually become a concern.

Many web application success stories owe their scalability to the fact that they are written in Python/Django. Django is extremely lightweight, as web application architectures go, and allows for much flexibility. Some of the things that we’ve kept in mind while coding that help with scaling are:

• Minimize external dependencies.
• Replace, refactor, or migrate components and modules as they become problematic (Python components help to streamline this process).
• Emphasize ‘low coupling’ of code bases.
• Attempt to localize and modularize failures (try to prevent them from spilling into other modules or applications).
• Limit the number of queries within loops (object.get(), object.filter(), etc.).
  • It is much more efficient to fetch all the necessary records in one query and work with the retrieved dataset within the loop than to query once per iteration.
• Get rid of all obviously unnecessary leaf services.
• Customize reliable open-source software – bend it to your will.
• Consider the ‘psyco’ compiler, a specializing compiler for Python aimed at speeding up execution.
  • Processor-heavy or frequently executed functions can be ‘psyco-ized’.
  • We have not yet tried this compiler; however, some development companies report as much as a 400% performance boost from using it properly.
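
The point about queries inside loops can be illustrated with the standard sqlite3 module, so that it runs without a Django project. This is a sketch under assumptions: the listing table, its columns and the function names are invented for the example. In Django itself, the single-query version would typically be one filter() call with an __in lookup.

```python
import sqlite3


def make_db():
    """Create a small in-memory database with a hypothetical listing table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE listing "
                 "(id INTEGER PRIMARY KEY, agent_id INTEGER, title TEXT)")
    conn.executemany("INSERT INTO listing (agent_id, title) VALUES (?, ?)",
                     [(1, "A"), (1, "B"), (2, "C"), (3, "D")])
    return conn


def titles_query_per_loop(conn, agent_ids):
    """One query per agent: the pattern to avoid."""
    result, queries = {}, 0
    for agent_id in agent_ids:
        rows = conn.execute(
            "SELECT title FROM listing WHERE agent_id = ? ORDER BY id",
            (agent_id,)).fetchall()
        queries += 1
        result[agent_id] = [title for (title,) in rows]
    return result, queries


def titles_single_query(conn, agent_ids):
    """One query for all agents; the grouping happens in Python afterwards."""
    placeholders = ", ".join("?" for _ in agent_ids)
    rows = conn.execute(
        "SELECT agent_id, title FROM listing "
        "WHERE agent_id IN (%s) ORDER BY id" % placeholders,
        list(agent_ids)).fetchall()
    result = dict((agent_id, []) for agent_id in agent_ids)
    for agent_id, title in rows:
        result[agent_id].append(title)
    return result, 1


conn = make_db()
looped, looped_queries = titles_query_per_loop(conn, [1, 2, 3])
batched, batched_queries = titles_single_query(conn, [1, 2, 3])
```

Both functions return the same mapping, but the first issues one query per agent while the second issues a single query however many agents are requested.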

A few things to keep in mind so that you avoid running down rabbit holes:

• Always be aware of the difference between “fast” and “fast enough”.
  • Python/Django has many scalability optimizations built right in; implement your functionality and test it under an appropriate load-testing environment first, before spending too much time optimizing manually.
• Strive for hardware efficiency, but do not obsess over it.
• Do not assume that a code optimization technique from one language will work in the language you are working in.
• Keep in mind that eventually you will have ‘no cards left to play’.

By: Brett McClelland