Where to put Unit Tests in .NET

When you write a .NET application that consists of (for the sake of argument) several DLLs, you have various options for where to compile your NUnit test fixtures. They are:

1. Create one ‘testing and production’ DLL that contains all the classes in your application (production classes and test fixtures), and run NUnit against just that.

2. Put the Test Fixtures in the same (production) DLL as the classes they are testing, then run NUnit across all the production DLLs.

3. Create one ‘testing’ DLL for your application and put all your test fixtures and supporting test classes in it. Run NUnit against this, which itself calls the production DLLs.

4. Create one ‘testing’ DLL for each production DLL. Run NUnit against these, which themselves call the production DLLs.

I’ve seen most of these used. At my current client we are using option (1), as it offers the easiest and quickest way of compiling and running all the unit tests in your application. The problems with it, though, are:

– it breaks the encapsulation between your projects

– if you miss something out when compiling this special DLL, you end up not actually testing what goes into production.

Option (2) is what we are using in CruiseControl.NET. There are some benefits to this choice:

– You only need to compile production DLLs

– Your tests are available in production for debugging if necessary

This second benefit is something I was discussing with a colleague the other day, as I think it may actually be a drawback of this method. I have a bad feeling, from a security and efficiency point of view, about putting test classes into production. That said, if you are developing a bespoke server application deployed locally (the situation that would benefit most from such debugging opportunities), and you have tight security, maybe it is worthwhile.

A drawback of options (1) and (2) is that, for development’s sake, you’d probably create sub-namespaces for testing. For example, if I’m testing the Sheep class in the Farmyard.Animals namespace, I’d probably create a SheepTest class in the Farmyard.Animals.Test namespace. This means that in my SheepTest class I need to add a using statement for the actual namespace I’m testing.
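Something like this minimal sketch (the Sheep API, with its MakeNoise method, is an invented detail just for illustration):

```csharp
using NUnit.Framework;
using Farmyard.Animals; // extra using needed because the fixture lives in a sub-namespace

namespace Farmyard.Animals.Test
{
    [TestFixture]
    public class SheepTest
    {
        [Test]
        public void MakesExpectedNoise()
        {
            Sheep sheep = new Sheep();
            Assert.AreEqual("Baa", sheep.MakeNoise());
        }
    }
}
```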

Options (3) and (4) are similar to each other in that both allow you to test the real production DLLs while deploying only production classes to production. They also both enable you to write your test classes in a separate project and DLL, yet still use the same namespace as the target class (the real DLL doesn’t depend on the test DLL, so IntelliSense still works nicely). If you choose not to deploy the test DLLs to production, you can always save the binaries, or recompile them later, should you want to run the tests in a production environment.
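For contrast, here is the same hypothetical fixture under options (3) and (4): it compiles into a separate test DLL, but is declared in the production namespace, so the extra using statement disappears:

```csharp
using NUnit.Framework;

// Compiled into Farmyard.Animals.Tests.dll, which references Farmyard.Animals.dll,
// but declared in the production namespace, so no extra using statement is needed.
namespace Farmyard.Animals
{
    [TestFixture]
    public class SheepTest
    {
        [Test]
        public void MakesExpectedNoise()
        {
            Assert.AreEqual("Baa", new Sheep().MakeNoise());
        }
    }
}
```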

The drawback of these options is that you end up compiling more DLLs, and having more Visual Studio projects, than you have production DLLs.

Weighing up options (3) and (4), we see the following comparisons:

– Option 3 allows you to run all the tests for your application by running NUnit against just one DLL, which is good for development speed.

– Option 4 more closely models the componentization of your application (see the sketch below). This gives you a natural componentization for your test DLLs: you can easily run just the test fixtures for one component, and if you move components between applications you can more easily move the unit tests with them. That said, if you change the structure of your components, you also need to change the structure of your test DLLs.
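To make option (4) concrete, a solution might be laid out something like this (Farmyard.Equipment is a second, hypothetical component):

```
FarmyardApp.sln
    Farmyard.Animals            -> Farmyard.Animals.dll
    Farmyard.Animals.Tests      -> Farmyard.Animals.Tests.dll   (references Farmyard.Animals)
    Farmyard.Equipment          -> Farmyard.Equipment.dll
    Farmyard.Equipment.Tests    -> Farmyard.Equipment.Tests.dll (references Farmyard.Equipment)
```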

So, which of these to use? Well, more and more I’ve been using the <solution> task in NAnt to compile my .NET projects. It naturally fits, and works well with, the Visual Studio .NET model of an application. VS has its drawbacks, but more and more I see the benefit of a build tool that works closely with it, and it also models intra-application / inter-project dependencies quite nicely. When using a combination of <solution> and Visual Studio, having extra projects really is little overhead. Therefore, the ‘extra baggage’ drawback of options (3) and (4) is negated, to the extent that I think they feel like the ‘right’ thing to do. Right now I prefer option (4), due to its better fit with the structure of the application, but it really is a close call whether it is better than option (3).
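For what it’s worth, here is a minimal sketch of the kind of NAnt build file I mean; the file and project names are hypothetical, and the <solution> task simply compiles whatever the Visual Studio solution defines, test projects included:

```xml
<?xml version="1.0"?>
<project name="Farmyard" default="test">

  <!-- Compile every project in the solution, test projects included -->
  <target name="compile">
    <solution solutionfile="FarmyardApp.sln" configuration="Debug" />
  </target>

  <!-- Run NUnit's console runner against the compiled test assembly -->
  <target name="test" depends="compile">
    <exec program="nunit-console.exe"
          commandline="Farmyard.Animals.Tests\bin\Debug\Farmyard.Animals.Tests.dll" />
  </target>

</project>
```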

Spam Virus Gone Crazy

As reported all over the place, there’s a new strain of the ‘Sobig’ spam virus circulating today, and I’m being hit badly. My own laptop is not infected, but through no fault of my own, over the last 3 hours I’ve been receiving the virus at a rate of roughly one per minute, each copy carrying a 100-kilobyte payload.

That means my account will receive more than 100 megabytes of spam today (one per minute at 100 kilobytes each works out at roughly 144 megabytes per day)!

This is the latest in a succession of variants of the virus, and it has me worried. It looks to me like virus-generated spam is increasing at a huge rate. If no measures are found to stop this kind of thing soon enough, and such viruses grow faster than the available bandwidth of the internet, what happens?

OK, I’ll stop being apocalyptic now and get back in my box.

Popping the Why Stack

A while ago, at a previous company of mine, a new guy started on the team. He was discussing part of the architecture of the application with a team member. I can’t remember exactly how the conversation went, but it was something like:

Newbie: So why are you using Stateful Session Beans? (It may not have been stateful session beans, but they’re always a good thing to question 🙂)

Old Hand: Because they’re better to use than Entity or Stateless beans in this context.

Newbie: Why?

Old Hand: Because they’re the standard practice in this case.

Newbie: Why?

Old Hand: Because we need to track session state.

Newbie: Why?

Old Hand (starting to get annoyed now): Because otherwise the customer would lose their session across requests.

Newbie: Why is that a problem?

Old Hand (volume rising): Because the next page we serve depends on what the customer did on previous pages.

Newbie: Why?

Old Hand (downright stressed now): Because we are writing a shopping application, and as a customer I want to put something in my shopping basket so that it’s still there when I go to another page!!!!

Newbie: Oh, right, thanks! 🙂

So let’s dissect this a little. We have a new developer (who’s pretty clever, as it happens) asking a bunch of what look like inane questions of someone who’s been on the team for ages. At the start of the conversation the newbie gets a whole load of technical detail, but what they really want to know is what customer story is driving this design decision. It annoys the heck out of the ‘old hand’, since they know the application inside out, but this popping of the Why Stack is valid because:

– It’s a valid way for newcomers to see how design and customer stories are related

– If you can’t eventually pop out to a story, it probably means you have some unused architecture in your application

– If it’s describing why a new piece of architecture needs to be added to the system, it can be used to show which customer story should apply to that new technical addition, e.g.:

Tech Lead to Project Manager: We should spend 2 weeks adding encryption to this module.

Proj Manager: Why?

Tech Lead: To make our application more secure.

Proj Manager: Why do we need that?

Tech Lead: Because as a customer I want to be able to add my Credit Card details to my account so that they can’t be seen by any hacker who may be trying to get in the system.

Proj Manager: Oh, ok!

Good Practice rather than Best Practice

I’m a bit of a stickler when it comes to the use of language. I’m not very good at it, but I appreciate the importance of trying to use it well. Communication often produces an emotional response, and as a communicator you want the people you are addressing to have the response you were aiming for. As a responsible communicator, you also have a responsibility not to try to provoke a response that isn’t justified.

As an example of all this, take the phrase Best Practice, which seems to be in fashion in the software industry. It doesn’t sit quite right with me, and I think I’d rather use the alternative phrase Good Practice. Why?

Best Practice is often used to describe a technique outside of any particular context. In most cases there are many alternative techniques, and you can only pick the best one once you have a context to choose it within. Martin’s book on Enterprise Patterns stresses this point: Transaction Script, Domain Model, and Table Module are all Good Practices, but the best one depends on the context of the application you are developing. As a responsible communicator, I should not say that one of these is always the best.

Best Practice can also invoke a negative response in the listener. By saying ‘You should use pair programming because it is a Best Practice’, I am implying that my opinion is better than yours, no matter what your reason for not using pair programming. Before we even start discussing why I think pair programming is useful, you may already have a negative feeling towards the conversation, and are therefore less likely to accept what I’m going to say.

I’ve used the phrase Best Practice frequently in the past. I’m going to try just to use Good Practice in future.

Formal Methods versus TDD

Back when I was taking my Computer Science degree at university, I learned all about ‘Formal Methods’ (when I wasn’t playing pool, badly, that is). The main reason to use Formal Methods is to prove the correctness of programs. By stating a logical pre-condition for a block of code, you can apply a sequence of logical steps to it, each corresponding to the behaviour of one of the statements in the block. If, when you reach the end of the block, your transformed pre-condition implies the logical post-condition, then the block is proven correct. Do this for your entire program and it is proven correct as a whole.
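As a minimal sketch of the idea, for a one-statement ‘program’ in standard Hoare-triple notation (my example, not one from any particular course):

$$\{\, x \ge 0 \,\} \quad y := x + 1 \quad \{\, y \ge 1 \,\}$$

The rule for assignment says this triple holds if the pre-condition implies the post-condition with $x + 1$ substituted for $y$. That substitution gives $x + 1 \ge 1$, i.e. $x \ge 0$, which is exactly the stated pre-condition, so this block is proven correct.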

So why doesn’t anyone really do this outside of academia? Basically because:

– Coming up with pre- and post-conditions that represent exactly what you want your program to do is hard

– The actual process is equally hard

– … meaning that the whole thing is slow

OK, now flip to what I do now. I’m a big fan of ‘Extreme Programming’ techniques, including Test Driven Development (TDD), which is an evolution of unit testing. Why do we do unit testing? Partly to try to ‘prove’ that each part of our program works. It’s not really absolute proof, since we never know whether our unit tests uncover all possible bugs in the code, but it’s a good approximation. In fact, as we add more unit tests, it becomes an even better approximation of a proof.

Anyway, why am I rambling about this? Well, the academic in me (which isn’t allowed out very often) always had a bit of a bad feeling about unit testing because of this lack of proof. But recently I was talking to a couple of ThoughtWorkers about analytical and numerical methods in maths. With analysis (or calculus) you prove the answer to a problem using mathematical logic. Some problems, though, are either (1) too hard to solve this way or (2) take too long to solve, so you can instead use numerical methods. These never give you the exact solution, but through a succession of repeated calculations you can tend towards (or home in on) the solution until you have an answer that is ‘good enough’.
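As a minimal sketch of the numerical half of that analogy (Newton’s method for square roots; my example, nothing to do with the original conversation):

```csharp
using System;

public class NewtonSqrt
{
    // Repeatedly refine a guess until it is 'good enough': the numerical
    // analogue of piling up more unit tests. Assumes n > 0.
    public static double Sqrt(double n, double tolerance)
    {
        double guess = n / 2.0 + 0.5;
        while (Math.Abs(guess * guess - n) > tolerance)
        {
            guess = (guess + n / guess) / 2.0; // one Newton iteration
        }
        return guess;
    }

    public static void Main()
    {
        Console.WriteLine(Sqrt(2.0, 1e-10)); // ~1.41421356...
    }
}
```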

To me, this relationship between analysis and numerical methods seems strikingly similar to that between Formal Methods and unit testing. Since numerical methods are academically valid, this makes the academic in me happier about unit testing.

TDD isn’t just about testing, though: it’s also a method for actually deciding how to write a program. You start with a failing test, write some code that fixes the test in the simplest way possible, refactor your program to remove duplication, and then repeat. In my Formal Methods studies at university I also saw how you could derive a program from a logical specification, using a set of rules related to the ‘correctness-proving’ rules (see here). So both TDD and Formal Methods offer a way of ‘generating’ code from a testable specification (programmatic assertions for TDD, logical specifications for Formal Methods).
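As a minimal sketch of one turn around that loop (the Adder class is a hypothetical example of mine):

```csharp
using NUnit.Framework;

// Step 1: write a failing test that states the specification as an assertion.
[TestFixture]
public class AdderTest
{
    [Test]
    public void AddsTwoNumbers()
    {
        Assert.AreEqual(5, Adder.Add(2, 3));
    }
}

// Step 2: write the simplest code that makes the test pass.
public class Adder
{
    public static int Add(int a, int b)
    {
        return a + b;
    }
}

// Step 3: refactor to remove duplication, keep the tests green, and repeat
// with the next failing test.
```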

In conclusion, this argument itself doesn’t prove anything (!), but in my own mind at least, showing a relationship between Formal Methods and TDD strengthens the case for TDD being a valid, and valuable, technique for writing software.