Writing the first test for a Real System, Part II

Guilt. That's what I've been feeling all these days.

Ok, it's not that I spent the past month in a deep depression. I should confess that I'm not that kind of guy. But naming a post "blah blah part I" is kinda making a commitment. And the longer I kept postponing the second part, the worse I felt about it. So, I finally decided that I won't leave my workplace until I finish it.

In Part I, I defined some guidelines that would help me write my first test. What was missing is the requirements for the particular system I'm building (Chpokk, an online C# code editor). So, let me remind you of these guidelines, this time with a practical application.

I want to produce some business value as soon as possible

What I want my application to do is something that people will actually find useful. "As soon as possible" means the Minimum Viable Product from your marketing course. Perhaps it means that "at least one person might find it useful for at least a small something". In other words, it might be good for a demo. It is not big enough to be used in daily life, but at least it's better than having just an AbstractFactoryFactoryAdapter. So, instead of building Version One piece by piece, I want to build Version 0.01, not caring much about clean and maintainable code (I'm postponing that until the Refactoring phase). This version, although far from perfect in every aspect, will actually solve someone's problem. And of course, among all the problems that my application is going to solve, I'm choosing the most important one.

This is also very good for getting user feedback as soon as possible. Since I'm solving a real-world problem, there must be some user out there who has this exact problem, so if I provide a solution (for free), this user is going to be my best friend, providing valuable feedback for the rest of her life.

Back to the point

So, since I'm building an online code editor, the "something useful" is, well, being able to edit the code. What does "edit" mean? It means load, change, save. In my case, "load" is "clone a Git repository", since that's where most of the code is stored today (yes, I know there are other source control systems, and some people actually use the one named "file system", but I decided to start with Git), and display a file in the editor. "Change" is "modify a file in an online editor", which involves intellisense, refactoring, autosaving and whatnot, but let's forget about it until later. "Save" involves... ok, let's not talk about it now.

This is the point where I'm going to contradict myself.

Theoretically, I should cover all three parts with one test. The problem is, that would involve multiple user interactions, meaning I should probably add UI testing (in which I'm not very proficient), meaning I should decide on the UI (which I don't want to do yet), meaning I'd still have to do a lot of things at once. Instead, I split my task into several smaller ones. I'm not splitting it from the developer's perspective (I'm not test-driving specific classes), but rather by user actions.

The first such action is cloning a remote repository. I agree that there is not much business value in it, so I'd rather not show it to end users (except maybe my 5-year-old Alice), but still it's something that can be shown to somebody -- "hey, this thing can clone a repository!"

I don't want my tests to be fragile.

While I don't want certain parts to be fully implemented yet, I also don't want my tests to be dependent on them. These implementation details should be hidden from my tests, and never be tested themselves.

The main reason for a test to be fragile, meaning it can suddenly stop passing while the application behaves correctly, is that it depends on some implementation details. Whenever such a detail changes without affecting the application (for example, as a result of refactoring), the test breaks. Such a test does more harm than good, since we cannot rely on it when checking our application's health. It becomes a burden -- we have to support it in addition to supporting our main application.

A typical example is a UI test that relies on element IDs. If we are doing ASP.NET WebForms, we often let the framework generate these IDs for us. As a result, each time we change our control structure without affecting its functionality (say, add a new NamingContainer), the IDs (which are implementation details) change, and our test fails. Don't think that ASP.NET MVC is free from that -- simply renaming an action method would make a test that depends on its name fail.

In our case, one of the implementation details is the physical location of the local repository. I can guess that I should do something to prevent different users or projects from using the same folder, but I don't want to care about it right now. So, what I'm doing is encapsulating it, making sure it doesn't leak. For now, I hardcode it to a particular value, but I don't use this value in my tests. So, my tests, just like my production code, use this value without actually knowing it. Later, when the implementation changes, the tests won't break, since they rely on the same implementation.

It also allows me, as agile gurus advise, to delay the implementation until it's really necessary.
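Here's a minimal sketch of what such hiding might look like. RepositoryInfo and its Path property show up in the test later in this post, but the hardcoded folder name and the RepositoryLocator helper below are assumptions made up for illustration, not the actual Chpokk code.

```csharp
using System;

// The single place that knows where repositories live.
// The folder name is an illustrative assumption.
public class RepositoryInfo {
	// Hardcoded for now; later it may depend on the user or the project
	public string Path {
		get { return "Repositories"; }
	}
}

// Hypothetical helper that both the production code and the tests could call
public static class RepositoryLocator {
	public static string GetPhysicalPath(string applicationRoot, RepositoryInfo info) {
		// Fully qualified so it can't be confused with RepositoryInfo.Path
		return System.IO.Path.Combine(applicationRoot, info.Path);
	}
}
```

Neither the test nor the controller ever mentions "Repositories". Both ask RepositoryInfo, so when the hardcoded value is replaced by something smarter, only this class changes.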

The tests should not use any knowledge from outside.

This one has been covered a bit in the previous post, but I'd like to elaborate on it. Sometimes you look at an assert, and you don't understand it at all. I mean, it doesn't have to be crystal clear (that's what the name of the test is for), but at least I should be (eventually) able to understand it, because if it's broken, I have to fix it somehow. Whenever a test depends on some external resource, it becomes totally unclear. In addition, it encourages reusing this external resource, which makes the resource grow complex as it tries to satisfy the needs of all tests (I'm looking at you, test repository for LibGit2Sharp).

Of course, this doesn't apply to the infrastructure. Even unit tests are sometimes better run with a real database, and we're talking integration tests here.

This is why, rather than taking a database that has some rows pre-inserted, I prefer taking a clean database and inserting the rows I need during the Arrange phase. Another example: when I need to parse a .sln file, rather than using an existing one, I create it in the test. This way I'm free to create as many versions as I need for testing various cases, and I have a unique resource for each test that contains only the stuff I need for this particular test. But the main benefit is that everything my test needs is just around the corner.
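To illustrate the .sln example, here's a rough sketch of a helper that builds a throwaway solution file during the Arrange phase. The SolutionFixture class, its Create method and the Test.sln file name are hypothetical names I'm inventing here, and the file it writes is deliberately stripped down: enough for a parser that looks for Project entries, probably not enough for Visual Studio to open it.

```csharp
using System;
using System.IO;
using System.Linq;

// Hypothetical test helper: creates a unique, minimal .sln file for each test
public static class SolutionFixture {
	// Project type GUID that Visual Studio uses for C# projects
	const string CSharpProjectType = "FAE04EC0-301F-11D3-BF4B-00C04F79EFBC";

	// Writes a bare-bones solution listing the given projects and returns its path
	public static string Create(params string[] projectNames) {
		// A fresh folder per call, so tests never share a resource
		var folder = Path.Combine(Path.GetTempPath(), Guid.NewGuid().ToString());
		Directory.CreateDirectory(folder);
		var header = new[] { "Microsoft Visual Studio Solution File, Format Version 11.00" };
		var projects = projectNames.SelectMany(name => new[] {
			"Project(\"{" + CSharpProjectType + "}\") = \"" + name + "\", \"" +
				name + "\\" + name + ".csproj\", \"{" +
				Guid.NewGuid().ToString().ToUpper() + "}\"",
			"EndProject"
		});
		var path = Path.Combine(folder, "Test.sln");
		File.WriteAllLines(path, header.Concat(projects));
		return path;
	}
}
```

A test would then start with something like var slnPath = SolutionFixture.Create("Web", "Tests"); and hand slnPath to whatever parser it is testing. Everything the assert refers to is visible right there in the test.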

The requirements

Before we start writing our test, we should write down the requirements. As I keep insisting, our requirements shouldn't be developer-centric, i.e. they shouldn't be, like, "method xxx should return an instance of yyy with such and such property values". They should be formulated in terms of user expectations.

So, my task for today is to start implementing the "clone" story. I'm coming up with the following requirements:

  1. When a user clicks the "open project" button, a popup appears that invites her to enter the repository URL (we assume that the repository is publicly readable). After entering a URL and clicking "Ok", the user sees a progress indicator.
  2. After the process is finished, the user is redirected to the "Project" page.
  3. This page should display the list of the files.

Now, as I mentioned previously, I'd like to split this into several tests. For a particular test that I'll be writing now, which is going to be entirely server-side, I've got the following story:

If a user submits the URL of a remote, publicly readable repository to our system, it should clone the repository to a folder where this user can access it later.

Note that I don't specify the exact location of the target folder here (see Rule #2). Instead, there's a rather vague "where... can access it later", and it's not clear how to program that. Let's say this means the result of some property or method which will be introduced by our code.

Finally, the code

I was delaying it as long as I could, but finally here it is:

using System;
using System.IO;
using Chpokk.Tests.GitHub.Infrastructure;
using ChpokkWeb.Features.Exploring;
using ChpokkWeb.Features.Remotes;
using LibGit2Sharp.Tests.TestHelpers;
using MbUnit.Framework;
using StructureMap;

// Namespace should correspond to the feature we're developing.
// Later it's going to help me find this particular test.
namespace Chpokk.Tests.Cloning {
	[TestFixture]
	public class WhenYouSendACloneCommandToAServer {
		private string _fileName;
		private string _targetFolder;

		// First, prepare the context
		[SetUp]
		public void Setup() {
			const string repoUrl = "git://github.com/uluhonolulu/Chpokk-Scratchpad.git";
			// Create a random filename
			_fileName = Guid.NewGuid().ToString();
			// Commit a new file to the remote repository
			var content = "stuff";
			// Prepare the target folder
			// This is where we get the relative repository path
			// See discussion about Rule #2
			var repositoryInfo = ObjectFactory.GetInstance<RepositoryInfo>();
			_targetFolder = Path.Combine(Path.GetFullPath(@".."), repositoryInfo.Path);
			// We cannot clone into a nonempty directory, so delete it
			if (Directory.Exists(_targetFolder))
				Directory.Delete(_targetFolder, true);

			// ACT
			// Get an instance of our controller.
			// I'm using a container so that I don't have to rewrite it
			// each time I change the signature of the constructor.
			// For unit tests, use automocking container.
			var controller = ObjectFactory.GetInstance<CloneController>();
			// Create a model for using with our Action Method.
			// PhysicalApplicationPath is databound automatically,
			// but in our test we need to submit it.
			var model = new CloneInputModel {
				PhysicalApplicationPath = Path.GetFullPath(".."),
				RepoUrl = repoUrl
			};
			// Finally, execute the Action method.
		}

		// ASSERT
		[Test]
		public void RepositoryFilesShouldAppearInTheDestinationFolder() {
			var expectedFile = Path.Combine(_targetFolder, _fileName);
			var existingFiles = Directory.GetFiles(_targetFolder);
			Assert.AreElementsEqual(new[] { expectedFile }, existingFiles);
		}
	}
}

Your posts are too long, and I don't understand the main idea

Ok, here's the summary:

1. Produce business value as soon as possible. It means: identify the main benefit that your product provides, and write a test for it, end to end. You are allowed, however, to split it into several tests, but each one should represent a significant feature.

Benefits: When you make this test green, you actually have a working product. You can receive early feedback on it, you can show it to your Mom, but what's most important, it feels great!

2. Don't let the implementation details leak into your tests. It means: whenever you have to use some knowledge in your test that's irrelevant to the end user, hide it behind a piece of production code that's implemented as simply as possible.

Benefits: You don't have to think about these details right now and can go straight to goal #1. Plus, and this is actually the most important thing, you get solid tests, because one of the main reasons that tests break over time is that they depend on implementation details, and those details change.

3. Always prepare your context in the test code. It means: other than infrastructure, everything that your test uses should be created as part of it.

Benefits: Such tests are much easier to support. If your test breaks, at least you can be sure it's not because some external resource has been changed. In addition, it is much easier to figure out what's going on in this test.

Wrapping up

This post (both parts) took too long to write. Probably much longer than it deserves. I think I should have written an ebook instead, sold it for $37, and become rich. Anyway, I hope it made somebody's life less boring.
