Importance of TDD

Last post I wrote about a series renamer that I wrote in node using TDD. The other day I went to use the renamer on a bunch of files and found that it threw the following error:

TypeError: Cannot read property '1' of null
      at Object.getShowDetails (nameParser.js:25:47)

From looking at the show titles I was renaming I realised there was a filename that I had not taken into account namely one like this:

cool_show_107.mp4

This title means that it is episode 7 of series 1. Previously the only formats accounted for were when the season and episode were prefixed with an ‘s’ and ‘e’ respectively. Which brings me round to today’s topic. How important it is to build your software using a TDD approach and the advantages it gives you.

Upon seeing this error I could tell straight away from the stack trace that the error was in nameParser.js. So the first thing I did was write a test to reproduce this error.


describe('and the series and episode number are in 3 digit format', function(){
    it('should return the information for the series and episode correctly', function(){
        var result = nameParser.getShowDetails('cool_show_102');
        result.seriesNumber.must.be(1);
        result.episodeNumber.must.be(2);
    });
});

When I run the tests after adding this unit test I get the same error so I know I have reproduced the defect. Now I have a failing test that once I get passing will mean the defect will never come back. Also note how easy this was for me to track down the defect because the code is decoupled and already heavily tested.

Once the test is written all that is left is to get it to pass by adding another regex to the nameParser to detect 3 consecutive numbers. Once this was passing I ran the renamer again on the list of files and found a different error. Some of the files were in this format:

cool_show_S01E07_encoding_x264.mp4

This caused an issue because 3 consecutive numbers were being picked up by the nameParser and because that regex was running before the one looking for seasons and episodes prefixed with ‘S’ and ‘E’ it was incorrectly giving season 2 and episode 64 for the above filename.

So to fix again using TDD this is very simple. Add another test reproducing the defect:


describe('and the series and episode number are specified and a 3 digit number appears after them', function(){
    it('should return the information for the series and episode correctly', function(){
        var result = nameParser.getShowDetails('cool_show_S01E02_cool_x264');
        result.seriesNumber.must.be(1);
        result.episodeNumber.must.be(2);
    });
});

To fix all I had to do was simply rearrange the order the regex statements run in the nameParser. 39 tests now passing. I then ran the series renamer on the files in question and it worked perfectly.

The lesson here is that by building software using a TDD approach it forces you to write decoupled components each with their own job. Which means that it’s easy to isolate defects when they occur and write tests that reproduce the defects. Once the test is passing that defect cannot reoccur without failing a test.

If you want to checkout the full code for the series renamer you can clone it on github.

Advertisements

TV Series Renamer Written in Node

I found myself in the situation the other day where I had a season of a TV show that needed renaming sequentially to clean up the file names.  The files were located in sub folders making it quite a laborious manual task to clean this up.  Step forward a node coding challenge.

I wrote the renamer using a fully TDD practice in node.  The finished program is blazingly fast.  The whole set of unit tests (37 at time of writing) including a test which creates dummy files, moves them, checks them and then cleans everything up (integration test) takes 51ms!

I used mocha for my unit tests.  Mocha using the same syntax as Jasmine (describe and it) giving a nice clean syntax.  There are a few extra features you get with mocha out of the box that you don’t get with Jasmine making it my preferred unit testing framework.  Before anyone shouts I know you can extend Jasmine to do the same thing but it’s nice just having all of these features there from the get go.

The first nice feature of mocha is being able to run a single unit test.  To do this simply change describe(‘when this happens…  to describe.only(‘when this happens…  By adding the ‘.only’ you are telling mocha to only run this describe block and any child tests of this block.  This can come in handy when you are working on getting one test passing on a big project and you dont want the overhead of running every test.  You can also use ‘.only’ on it statements which will only run a single test.

The next cool feature of mocha is the way that it handles testing asynchronous code.  Mocha gives you a done callback you can pass into a beforeEach or it statement.  You call done when you are done.  Mocha will automatically wait for done to be called before failing the test or time out if you never call done.

beforeEach(function(done) {
    seriesPath = __dirname + '/testSeries1/';
    fileFetcher.fetchFiles(seriesPath).then(function(data){
        result = data;
        done();
    });
});

The above code shows an example beforeEach block from the fileFetcher tests. The code fetches the files and in the ‘then’ handler it saves the result and then calls done. This is a very neat way of handling asynchronous testing.

Another awesome library that deserves a mention is Q. This library is renowned for being the best implementation of a promise library and makes dealing with asynchronous code much easier. Q allows you to store a promise when you call an asynchronous function rather like Task in .net. The promise allows the program to continue and then handle that result or error when it comes in.

There are two common ways that promises are handled. Either straight away by using a ‘then’ block or by storing the promise and then examining it later.

fileRenamer.generateRename('Great_Show', seriesPath, outputDir)
    .then(function(data){
        results = data;            
    });

var x = 11;

Above is an example of using a then handler. The function will be called by Q when the generateRename function returns something. The value that is returned by generateRename will be passed in to the data parameter in the anonymous function below. Note the line ‘x=11’ may execute before the line ‘results=data’. It is saying execute this then execute this.

The other way promises are normally used is by storing them and waiting on them later. This is especially useful if you have several tasks that can all run in parallel. You can set them all going storing the promise they give you back. When they all finish you can collect the results and continue on.

var promises = [];
promises.push(processDir(dir));

for(var i=0; i<files.length; i++){
    if (fs.lstatSync(dir + files[i]).isDirectory()){
        promises.push(processDir(dir + files[i]));
    }
}		

$q.all(promises).then(function(promiseResults){								
    deferred.resolve({episodes: _.flatten(promiseResults)});
});

The above code is a snippet from the fileFetcher.js file. processDir is being called upon each iteration of the for loop setting in motion a task to move a file. The promise returned by processDir is stored in an array of promises. Q gives us a method call ‘all’ which takes an array of promises and then you can use the same ‘then’ statement as shown above. This time the parameter passed into the handler will be an array with the results of all of the promises. The beauty is this array will be in the same order that you kicked them off in. Allowing you to marry up your results pretty cool!

So you might be wondering how to I return a promise from a function. Good question. You simply use the following pattern:

function Square(x)
{
    var deferred = $q.defer();
    deferred.resolve(x*x);
    return deferred.promise;
}

The function above will return a promise that will give you the result of squaring the parameter passed in. You would use it like this:

Square(3).then(function(result){
    console.log(result);
});

The last part of Q I want to talk about that is awesome is it’s ability to wrap up node functions and make them return promises. For example the rename file is fs.rename(oldPath, newPath, callback). Normally you would pass in a callback to call when the file is renamed. But then if you wanted to call another asynchronous function after and pass in another callback quickly your code would become a nested mess that would be hard to follow. Step up denodeify. Denodeify tasks a node function that calls a callback and makes it return a promise instead:

// instead of having a callback like this
fs.rename('test.txt', 'test2.txt', function(){
    console.log('done');
});

// you can wrap the function with q like this
var moveFile = $q.denodeify(fs.rename);

// then call it as you would any function that returns a promise
moveFile('test.txt', 'test2.txt').
    then(function(){ console.log('done'); });

This comes in especially handy when you are doing many operations in a loop. Like is being done in my fileRenamer.js file. I will leave the reader to examine the code below which combines all of the ideas explained thus far.


var performRename = function(showName, fromDir, toDir) {
    var deferred = $q.defer();

    generateRename(showName, fromDir, toDir)
        .then(function(files){				
            var promises = [];
            var moveFile = $q.denodeify(fs.rename);
            
			for(var i=0; i<files.length; i++){
                promises.push(moveFile(files[i].from, files[i].to));
            }
            $q.all(promises).then(function(){
                deferred.resolve(files);
            });
				
        });

    return deferred.promise;
};

If you want to read up more on q please see the documentation.

Please feel free to download the full source code of the series renamer from github. If you have any questions or comments feel free to give me a shout I will be happy to help.