Archive for June, 2009

Levels of Testing

June 24, 2009

I’ve had reason recently to do some thinking about the various “levels” of software testing. I think there’s a rough hierarchy here, though there’s some debate about the naming and terminology in some cases. The general principles are pretty well accepted, however, and I’d like to list them here and expound on what I think each level is all about.

An important concern in each of these levels is to achieve as high a level of automation as possible, along with some mechanism to report to the developers (or other stakeholders, as required) when tests are failing, in a way that doesn’t require them to go somewhere and look at something. I’m a big fan of flashing red lights and loud sirens, myself 🙂

Unit testing is one of the most common, and yet in many ways, misunderstood levels of test. I’ve got a separate rant/discussion in the works about TDD (and BDD), but suffice it to say that unit-level testing is a fundamental of test-driven development.

A unit test should test one class (at most – perhaps only part of a class). All other dependencies should be either mocked or stubbed out. If you are using Spring to autowire classes into your test, it’s definitely not a unit test – it’s at least a functional or integration test. There should be no databases or external storage involved – all of those are external and superfluous to a single class that you’re trying to verify is doing the right thing.
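To make that concrete, here’s a minimal sketch in Python using the standard library’s `unittest.mock` (the same idea applies in Java with a mocking library like Mockito). The `PriceCalculator` class and its repository dependency are hypothetical, invented purely for illustration:

```python
import unittest
from unittest.mock import Mock

class PriceCalculator:
    """Class under test: depends on a repository, which the test will mock."""
    def __init__(self, repository):
        self.repository = repository

    def total(self, item_id, quantity):
        unit_price = self.repository.price_for(item_id)
        return unit_price * quantity

class PriceCalculatorTest(unittest.TestCase):
    def test_total_multiplies_unit_price_by_quantity(self):
        # The dependency is a mock: no database, no wiring, no framework.
        repo = Mock()
        repo.price_for.return_value = 5
        calc = PriceCalculator(repo)
        self.assertEqual(calc.total("sku-1", 3), 15)
        # Verify the collaboration, not just the result.
        repo.price_for.assert_called_once_with("sku-1")
```

Note that exactly one class is exercised; everything it talks to is faked, which is what keeps the test fast and the failure diagnosis obvious.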

Another reason to write comprehensive unit tests is that the unit level is the easiest place to fix a bug: there are fewer moving parts, and when a simple unit test breaks it should be entirely clear what’s wrong, what needs to be fixed, and how to fix it.

As you go up the stack to more and more complex levels of testing, it becomes harder and harder to tell what broke and how to fix it.

Generally the unit tests for any given module are executed as part of every build before a developer checks in code – sometimes this will include some functional tests as well, but it’s generally a bad idea for higher-level tests to be run before each and every check-in (due to the impact on developer cycle time). Instead, you let your CI server handle those, often on a scheduled basis.

Some people suggest that functional and integration are not two separate types, but I’m separating them here. The key differentiation is that a functional test will likely span a number of classes in a single module, but not involve more than one executable unit. It likely will involve a few classes from within a single classpath space (e.g. from within a single jar or such). In the Java world (or other JVM-hosted languages), this means that a functional test is contained within a single VM instance.

This level might include tests that involve a database layer with an in-memory database, such as hypersonic – but they don’t use an *external* service, like MySQL – that would be an integration test, which we explore next.

Generally in a functional test we are not concerned with the low-level sequence of method or function calls, like we might be in a unit test. Instead, we’re doing more “black box” testing at this level, making sure that when we pour in the right inputs we get the right outputs out, and that when we supply invalid input that an appropriate level of error handling occurs, again, all within a single executable chunk.
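Translated into Python for brevity (sqlite3’s in-memory mode plays the same role here that hypersonic plays in the Java world), a functional test might look like the following. `UserStore` is a hypothetical persistence class, named only for this example:

```python
import sqlite3
import unittest

class UserStore:
    """Persistence layer under test; accepts any DB-API connection."""
    def __init__(self, conn):
        self.conn = conn
        self.conn.execute("CREATE TABLE IF NOT EXISTS users (name TEXT)")

    def add(self, name):
        self.conn.execute("INSERT INTO users (name) VALUES (?)", (name,))

    def names(self):
        return [row[0] for row in self.conn.execute("SELECT name FROM users")]

class UserStoreFunctionalTest(unittest.TestCase):
    def test_round_trip(self):
        # In-memory database: several real classes collaborate,
        # but nothing leaves this single process.
        store = UserStore(sqlite3.connect(":memory:"))
        store.add("alice")
        self.assertEqual(store.names(), ["alice"])
```

The test is black box in spirit: valid input goes in, expected output comes out, and no external server is involved.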

As soon as you have a test that requires more than one executable to be running in order to test, it’s an integration test of some sort. This includes all tests that verify API contracts between REST or SOAP services, for instance, or anything that talks to an out-of-process database (as then you’re testing the integration between your app and the database server).

Ideally, this level of test should verify *just* the integration, not repeat the functionality of the unit tests exhaustively, otherwise they are redundant and not DRY.

In other words, you should be checking that the one service connects to the other, makes valid requests and gets valid responses, not comprehensively testing the content of the request or response – that’s what the unit and functional tests are for.

An example of an integration test is one where you fire up a copy of your application with an actual database engine and verify that the operation of your persistence layer is as expected, or where you start the client and server of your REST service and ensure that they exchange messages the way you wanted.
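As an illustration of the shape such a contract check might take (hedged: a real integration test would launch the server as a separate process or deployed environment; here a toy in-process server stands in for the remote service, and the `/status` endpoint and its payload are invented for the example):

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import urlopen

class StatusHandler(BaseHTTPRequestHandler):
    """Toy stand-in for the real service under test."""
    def do_GET(self):
        body = json.dumps({"status": "ok"}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep test output quiet

def test_status_contract():
    server = HTTPServer(("localhost", 0), StatusHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        url = "http://localhost:%d/status" % server.server_port
        with urlopen(url) as resp:
            # Verify just the contract: status code and response shape,
            # not the full content of every field.
            assert resp.status == 200
            payload = json.loads(resp.read())
            assert "status" in payload
    finally:
        server.shutdown()
        server.server_close()
```

The point is what gets asserted: the connection works and the message shape is valid, while exhaustive content checks stay down at the unit and functional levels.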

Acceptance tests often take the same form as a functional or integration test, but the author and audience are usually different: in this case an acceptance test should be authored by the story originator (the customer proxy, sometimes a business analyst), and should represent a narrative sequence of exercising various application functionality.

Again, they are not exhaustive in the way that unit tests attempt to be: they don’t necessarily need to exercise all of the code, just the code required to support the narrative defined by a series of stories.

FitNesse, Specs, easyb, RSpec and GreenPepper are all tools designed to assist with this kind of testing.

If your application or service is designed to be used by more than one client or user, then it should be tested for concurrency. This is a test that simulates simultaneous concurrent load over a short period of time, and ensures that the replies from the service remain successful under that load.

For a concurrency test, we might verify just that the response contains some valid information, and not an error, as opposed to validating every element of the response as being correct (as this would again be an overlap with other layers of testing, and hence be redundant).
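A sketch of that idea in Python, using a thread pool to simulate simultaneous clients. `call_service` stands in for whatever client call your service exposes, and the `ok` attribute on its response is an assumption made for this example:

```python
from concurrent.futures import ThreadPoolExecutor

def check_concurrent_responses(call_service, clients=20, requests_each=5):
    """Hammer the service from many threads at once and report whether
    every reply came back successful. Only success/failure is checked
    here, not full response content -- that overlap belongs to the
    lower testing levels."""
    def one_client(_):
        return all(call_service().ok for _ in range(requests_each))

    with ThreadPoolExecutor(max_workers=clients) as pool:
        return all(pool.map(one_client, range(clients)))
```

In a real suite you’d point `call_service` at an HTTP client bound to a running instance of the application.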

Performance, not to be confused with load and scalability, is a timing-based test. This is where you load your application with requests (either concurrent or sequential, depending on its intended purpose) and ensure that each request receives a response within a specified time frame (for interactive apps a common rule is the “two second rule”, as it’s thought that users will tolerate a delay up to that level).

It’s important that performance tests be run singly and on an isolated system under a known load, or you will never get consistency from them.

Performance can be measured at various levels, but is most commonly checked at the integration or functional levels.
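A minimal Python sketch of such a timing check; the helper name and the two-second default are illustrative, not a standard API:

```python
import time

def assert_response_time(call, limit_seconds=2.0, samples=10):
    """Time repeated calls and fail if any single one exceeds the
    limit (the 'two second rule' for interactive apps)."""
    for _ in range(samples):
        start = time.perf_counter()
        call()
        elapsed = time.perf_counter() - start
        assert elapsed <= limit_seconds, (
            "response took %.3fs, limit is %.1fs" % (elapsed, limit_seconds))
```

As the post notes, numbers like these only mean something when gathered on an isolated system under a known load.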

A close relative of, but not identical with concurrency tests are load and/or scalability tests. This is where you (automatically) pound on the app under a simulated user (or client) load, ideally more than it will experience in production, and make sure that it does not break. At this point you’re not concerned with how slow it goes, only that it doesn’t break – e.g. that you *can* scale, not that you can scale linearly or on any other performance curve.

Quality Assurance
Many Agile and Lean teams eschew a formal quality assurance group, and the testing such a group does, in favor of the concept of “built in” QA. Quality assurance, however, goes far beyond determining if the software performs as expected. I have a detailed post in the works that talks about how else we can measure the quality of the software we produce, as it’s a topic unto itself.

Alpha/Beta deployments
Not strictly testing at all, the deployment of alpha or beta versions of an application nonetheless relates to testing, even though it is far less formalized and rigorous than mechanized testing.

This is a good place to collect more subjective measures such as usability and perceived responsiveness.

Manual Tests
The bane of every agile project, manual tests should be avoided like the undying plague, IMO. Even the most obscure user interface has an automated tool for scripting the testing of the actual user experience – if nothing else, you should be recording any manual tests with such a tool, so that when it’s done you can “replay” the test without further manual interaction.

At each level of testing here, remember, have confidence in your tests and keep it DRY. Don’t test the same thing over and over again on purpose, let the natural overlap between the layers catch problems at the appropriate level, and when you find a problem, drive the test for it down as low as possible on this stack – ideally right back to the unit test level.

If all of your unit tests were perfect and passing, you’d theoretically never see a failing test at any of the other levels. I’ve never seen that kind of testing nirvana achieved entirely, but I’ve seen projects come close – and those were projects with a defect rate so low it was considered practically unattainable by other teams, yet it was achieved at a cost that was entirely reasonable.

By: Mike Nash

If I Shorten the Race Can I Sprint Faster?

June 19, 2009

When Point2 adopted Scrum over a year ago we started with 30 day iterations. The process was new to us and it seemed like a reasonable starting point. As we ironed out our practices and got more experience we naturally started to make changes to improve how we did things.

After running through a few 30 day Sprints we soon realized that they were just too long. We were finding that at the end of a Sprint we were unable to remember what we did at its start. This became glaringly apparent during retrospectives. The decision was made to move to two week Sprints, and that is the way our six teams have been operating ever since.

Over the last few months my team had been throwing around the idea of shortening our Sprints even further, and at our last Retrospective we decided to actually take the plunge. Even though we were being reasonably successful with two week iterations, we still felt that we could go faster. One week (5 day) iterations seemed like a good way to help take the team forward. Here’s how we think the change will affect the team.

  • Even though two weeks was not long, it still left enough room for a large number of stories to be part of the Sprint. This meant we had a broad focus. One week iterations would allow us to easily visualize the Sprint, and our focus would be much sharper.
  • One week Sprints would hopefully allow us to better accommodate change that comes in an Agile environment.
  • With weekly retrospectives would come more organized discussion to fine tune our process even further (that’s not to say we would not continue to make mid-Sprint changes if need be).

The team plans to make the transition at the start of the next Sprint. We understand there may be a bit of extra overhead in regards to planning and estimation meetings, but feel that these will get considerably shorter, having less impact on the actual Sprint.

Without actually having tried it yet the team just has a “gut” feeling that this will make us go faster. Just think about it for a second. If you are in a 400 metre race, you will have to run at a slower pace in order to finish. Cut that race down to 100 metres and you should be able to go all out. I look forward to seeing how this will impact the team and will do follow-up posts reporting on our progress.

By Hemant J. Naidu

Vim Tip – Registers and Macros

June 19, 2009

Let’s say you’re a beginning vim user. You know how to yank, put, substitute and change. You know how to do forward searches within a line or do searches ’til’ a certain character; you can use /, ?, *, # effectively. You can combine your searches with commands to work some serious magic. You’ve seen the light. Vim is your One True Editor.

So what’s next? If you’ve mastered the basics, registers and macros are a good place to turn pro. I’ll briefly explain both concepts, and then use an example to show how these features complement each other.


A register is like a variable in a typical programming language. It’s a location in memory where you can store something. We’ll look at two usages for registers: storing text, and storing macros.

Normally when you yank a line with yy, its contents get saved into the default, unnamed register, called "" (quote-quote). If you want to yank something into another named register, say register a, you have to prefix your yank with "{register}. The same goes for puts.

So for example…

yy    yank the current line into the unnamed register
"ayy    yank the current line into register a
"ap     put the contents of register a

Simple, right? At first it might seem that this is only marginally useful, but registers can greatly simplify some macros. So we’ll talk about that next.


The q{register} command starts recording a macro. While you’re recording a macro, press q again to signal that you’re finished.

I like to use the q, w, e, and r registers for macros, simply because they’re easiest to reach. So for example, to start recording a macro, I’ll press qw, record my actions, and press q to signal the end of the macro.

The @{register} command executes a macro. So I can execute my new macro with @w. Usually, once I know that it works, I’ll run it many times with something like 100@w.

Tying it together

This came up a few weeks ago when I was generating some characterization tests. I had written a quick script to generate some test data in CSV; in reality it was a bit more involved than what follows, but this gets the point across:

Input, Output
34523451, true
65434092, true
45810353, false
(~1000 more of these)

I had also written a single test at the bottom of the file, and I wanted a thousand more that all followed the template.

[Test]
public void Test() {
    Assert.AreEqual(IsNumberValid(INPUT), OUTPUT);
}

So I used a vim macro — the vim keystrokes are in parentheses after each step:

  1. Cut the test template into register a (/\[Test<Enter>"a4dd)
  2. Move to the beginning of the file, start the macro (ggqq)
  3. Copy 34523451 into register b ("byw)
  4. Copy true into register c (f l"cyw)
  5. Delete the current line (dd)
  6. Put a copy of the test template at the end of the file (G"ap)
  7. Replace INPUT with the contents of register b (/INPUT, dw, "bP)
  8. Replace OUTPUT with the contents of register c (/OUTPUT, dw, "cP)
  9. Move the cursor back to the beginning of the file (gg)
  10. End the macro (q)
  11. Execute the macro 1000 times (1000@q)

Using the above, I was able to generate a ton of code in a minute or so. If I had tried to cobble together a script to do the same it would have taken at least several times as long. I use vim macros almost daily. I don’t often have to combine them with registers, but I’ve found it to be an invaluable technique in certain situations.

By Kevin Baribeau

Day Three at BSCE

June 11, 2009

Another eventful day has come to a close here in Vegas. For those who have missed out, I’ll give you a summary of a day here at the Better Software Conference & Expo.

The morning begins at 7:30 with a light breakfast that one of the lovely hosts refers to as, “crumbs and juice.” Quite good and they had my favourite, blackberries! Then time enough to go to the wifi lounge to check email.

The day’s festivities begin with a keynote address by Tim Lister discussing Some Not-So-Crazy Ways to Do More with Less. A thoughtful dissertation on how learning to make do tends to create innovative solutions to problems. Disruption can cause turmoil, but in the end we often tend to be better off – check out the Satir Change Model.

My first lecture of the day was a provocative discussion, In Defense of Waterfall, by Ken Katz. This was a very lively debate in which Ken was actually a proponent of the Agile Manifesto but warned that there is no panacea. It is up to us to find what works best, and then improve upon it; heh, sounds very Lean to me :)

By this point in the day it is time for lunch and networking. Lunch is a fantastic Mexican-ish buffet. In the same space is a small exposition, so an hour for lunch is nowhere near enough time to talk to our peers and all of the vendors. In fact, I lost track of time and missed my next presentation – oops. Oh well, I had a great chat with the folks from ThoughtWorks.

I then ran off to The Agile PMP: Teaching an Old Dog New Tricks. I’ve done some courses based on the PMBOK but never carried through to completion on attaining the PMP certification – I wasn’t sure that “best practices” would remain relevant in the software world. Michael Cottmeyer did a good job of showing how an Agile PMP is relevant.

The highlight of my day was learning Andy Kaufman’s Dirty Little Secret of Business. You want the secret, too? Relationships, nothing scandalous (what happens in Vegas, does not stay in Vegas), just that building relationships is extremely important – even for those of us who would prefer to spend quality time with our favourite Mac.

17:30, the day is done and I am drained… but wait, there’s more. There’s a reception, I am tired but the talk of free snacks and beer lures me in. And it gives me a chance to talk to Andy Kaufman a bit (BTW, I did not talk to Latka, I didn’t drink that much). Thanks for the pep talk, Andy. All right, now I’m ready to network.

As for Dave, I haven’t seen him since he entered the high stakes poker tournament. I guess he’s networking, too.

By: Kevin Bitinsky

How to Mask Command Line Output from the Python run() Method

June 10, 2009

Recently, we were tasked with creating automated deployment for our Python Django project. For our purposes, this involved creating Python modules for programmatically logging onto servers and carrying out all the necessary deployment tasks. We decided to use Fabric to make this work. We encountered difficulty, however, when using the run() method to execute commands in remote environments.

The problem is that with every command Fabric executes on the remote machine, the command itself is echoed to standard out when the run() method executes, displaying (in plain text!) any password arguments you may be passing to the command being executed.

We worked around this by “monkey-patching” stdout itself with a wrapper class that checks specifically for the password provided, and replacing any output to standard out matching the password with something else (like “****”).

The wrapper class looks something like this:

class StdoutWrapper:

  def __init__(self, stdout, mask):
    self.stdout = stdout
    self.mask = mask

  def __getattr__(self, attr_name):
    return getattr(self.stdout, attr_name)

  def write(self, text):
    # Mask any occurrence of the password; text without it passes through unchanged
    self.stdout.write(text.replace(self.mask, '****'))

Then, when you want to execute the command whose output you wish to mask, you simply use the StdoutWrapper class like so:

import sys
from getpass import getpass

def some_fabric_method():
  config.username = prompt('Enter Username:')
  config.passwd = getpass('Enter Password:')
  config.app_path = '/path/to/your/app'
  config.svn_url = 'http://some.svn.checkout/'
  sys.stdout = StdoutWrapper(sys.stdout, config.passwd)
  try:
    run("svn revert -R $(app_path)")
    run("svn switch --username $(username) --password $(passwd) $(svn_url) $(app_path)")
  finally:
    sys.stdout = sys.__stdout__

Obviously, this is a less-than-ideal solution; it accomplishes what we want, but having to resort to temporarily monkey-patching standard out itself seems like overkill. Has anyone else out there found a more preferable solution for suppressing plain-text password output while using Fabric for Python?

By: Brett McClelland

Risk and Testing in Las Vegas

June 9, 2009

So, here I am in my hotel room, at the end of the second exhausting day of the Better Software Conference and Expo, in Las Vegas, Nevada.

Where to begin? The plane trip was good, the cab ride was terrifying, the Strip is unlike any place I’ve ever been. If you want more details about Vegas, send me a note, I’ll give you my full opinion. But let’s talk about software.

My first Tutorial was Ken Collier’s talk on Agile Modeling. Really useful stuff. I didn’t know exactly what I was jumping into here, so it was good to see that there was dialogue on a level I could participate in fully. Ken spoke well about the core principles of Agile Modeling, including the need to ‘travel light’ – no model will meet every need, so make the model that meets the particular needs of that stage in development. He went on to discuss domain modeling, architectural modeling, and usability modeling. He also gave a really useful run-through of use cases, use case personas, scenarios and user stories.

After that was Jeff Patton’s talk on User Story Mapping – building a better backlog. Jeff’s talk was good also – covering some of the same material as Ken did around user stories. One of the refreshing things about a conference like this is that you get to hear how different groups are using the same Agile methodologies. Or put another way, over dinner last night (a quiet little spot near the hotel that had slot machines right in the restaurant, imagine!) I remarked to Kevin that I’d heard two different definitions of a user story in the course of a single day.

One of Jeff’s excellent points was writing out a use case scenario and then making stories based on every verb in your scenario. This is a point that I as a Business Analyst sometimes miss – that we’re not trying to enable our users to be something, we’re enabling them to do things. This is one of those little shifts in approach that I know will make my work a lot easier when I get back home.

Today started with Julie Gardiner’s talk on risk-based testing. She began by exploring how risk management involves three steps – identification, analysis and mitigation. She then walked us through four different approaches to risk management – the TMap approach, heuristic management, the STEP approach and an approach pioneered by Paul Gerrard. The latter was particularly enlightening, as it includes a means of factoring existing testing capability into prioritization. Definitely lots to take away.

As it was yesterday, lunch at the Venetian Hotel was a subdued and somewhat spartan affair. This place puts itself up as ‘The Italian Vegas Experience’ and yet, there was only one course of appetizers. Amateurs. You know what they say though – the problem with Italian food is that four or five days later, you’re hungry again. I couldn’t help but feel sorry, however, that I had to miss the Point2 Tuesday lunch meeting. Nobody back in Saskatoon ever gets after me for using the wrong fork. I guess I’m just homesick.

The high point of today, however, was Dan North’s presentation on Behaviour Driven Development. Really enlightening. A couple of points to consider – one is the simple observation that we use all of these construction metaphors – builds, architecture et cetera – for an industry that’s inherently different and much, much younger. He summed it up with ‘The thing about software is, it’s soft.’ With that, he went on a fantastic rant (now I know where Aidan gets it!) about moving towards a stakeholder-centric process that ensures just enough work to get just the right results.

…and yet another definition of what makes up a user story. As a conference participant I need a consistent definition of a user story so that I’ll know what the heck to put in my Jiras when I get home!

Well, I’m off. The sun has set on the Strip as I write this. If you’re in Vegas, look us up – we’re at the Imperial Palace. Kevin’s the one with the cane and the feather boa at the craps table. I’m the one writing the blog posts.


By: Dave Kellow

Why don’t people like my ideas?!

June 9, 2009

Have you found yourself asking that question? I’m sure many developers have, although this topic applies universally to anybody in any profession (or personal life for that matter).

By: Chris Dagenais