Unit testing in F# is a breath of fresh air

I want to start this post off by saying that I was simply blown away by unit testing in F#. Before I go on I want to credit the excellent F# for fun and profit blog for getting me started.

Unit testing and F# are a match made in heaven. I think the best thing to do before I go on is to show you a unit test in C#:


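using System.Linq;
using FluentAssertions;
using NUnit.Framework;

public class AllIndexOfTests
{
    [Test]
    public void AllIndexOfhellInHelloHelloIsCorrect()
    {
        var allIndexOf = new AllIndexOf();
        var result = allIndexOf("hello hello", "hell");
        // SequenceEqual needs a sequence to compare against, not two separate ints
        result.SequenceEqual(new[] { 0, 6 }).Should().BeTrue();
    }
}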

Now in F#:

[<Test>]
let ``the string 'hello hello' contains the string 'hell' at indexes 0 and 6``() =
    allIndexOf "hello hello" "hell" |> should equal [0; 6]

Check out how self-describing the F# code is. In F# you can put whitespace in a function name by wrapping it in double backticks, so the test name can be a plain sentence, which makes it very readable. Getting the C# code to look that clean takes quite a few libraries and careful design of your code, and even then you can’t get that close. The F# just reads like English. I would go as far as to say that someone who did not understand code could read that test and understand what the function should do.

The testing framework I am using to get this syntax is FsUnit, which is built on top of NUnit. Here is the complete set of tests for the GridSearch solution to the HackerRank problem I went through last time:

module allIndexOf =
    [<Test>]
    let ``the string 'hello hello' contains the string 'hell' at indexes 0 and 6``() =
        allIndexOf "hello hello" "hell" |> should equal [0; 6]

    [<Test>]
    let ``the string 'hello world' does not contain the string 'kevin' ``() =
        allIndexOf "hello world" "kevin" |> should equal []

module find =
    [<Test>]
    let ``the string '123' is in the grid '444 123 444' at [(0,1)]``() =
        find ["444"; "123"; "444"] "123" 0 |> should equal [(0,1)]

    [<Test>]
    let ``the string '123' is in the grid '444123 981231 123122' at [(3,0); (2,1); (0,2)]``() =
        find ["444123"; "981231"; "123122"] "123" 0 |> should equal [(3,0); (2,1); (0, 2)]

    [<Test>]
    let ``the string '123' is not in the grid '123 777 444' when the first row is skipped ``() =
        find ["123"; "777"; "444"] "123" 1 |> should equal []

module findExact =
    [<Test>]
    let ``the string '123' is in the grid '444 123 444' is exactly at 0, 1``() =
        findExact ["444"; "123"; "444"] "123" 0 1 |> should equal true

    [<Test>]
    let ``the string '77' is in the grid '9999 9977 0987 0987' is exactly at 2, 1``() =
        findExact ["9999"; "9977"; "0987"; "0987"] "77" 2 1 |> should equal true

    [<Test>]
    let ``the string '865' is not in the grid '9999 9977 0987 0987' is exactly at 2, 1``() =
        findExact ["9999"; "9977"; "0987"; "0987"] "865" 2 1 |> should equal false

    [<Test>]
    let ``the string '77' is not in the grid '9999 9977 0987 0987' is exactly at 3, 3``() =
        findExact ["9999"; "9977"; "0987"; "0987"] "77" 3 3 |> should equal false

module gridContains =
    [<Test>]
    let ``the grid '444 123 444 555' contains the grid '444 123 444'``() =
        gridContains ["444"; "123"; "444"; "555"] ["444"; "123"; "444"] |> should equal true

    [<Test>]
    let ``the grid '44488 12311 44490 55598 778899' contains the grid '44 55'``() =
        gridContains ["44488"; "12311"; "44490"; "55598"; "778899"] ["44"; "55"] |> should equal true

    [<Test>]
    let ``the grid '44488 12311 44490 55598 778899' does not contain the grid '88 88'``() =
        gridContains ["44488"; "12311"; "44490"; "55598"; "778899"] ["88"; "88"] |> should equal false

The fact that the whole suite fits in such a small space and stays succinct and to the point is what makes F# such a good fit for unit testing. I would argue that even if you use C# for your main application, writing your tests in F# could save you time and make your tests much more readable.

If you look at the full code listing you will see that I have created all of the tests inside the module GridSearchTests. I have then grouped each set of tests under a module named after the function they are testing. This means that in the ReSharper test runner you get the following:

[Screenshot: the ReSharper test runner showing the tests grouped by module]

Which I think is really sweet. Each function is listed along with its test cases, which can be read in normal English. In C# you often name your test cases using camel casing, which means that when the names become long they can be hard to read.

To check out the code from this post, clone my HackerRankInFSharp repository on GitHub. Please share your F# unit testing experiences in the comments.

Grid Search Problem Solved Using F#

As mentioned in my last post I’ve recently got into solving HackerRank problems in F#. I’m trying to write F# in a functional style rather than writing imperative code, as I would in C#, that merely uses F# syntax (which is possible).

I want to go through how I solved the Grid Search Problem, so if you don’t want to know a solution look away now!

The problem is that you are given two grids of numbers and you have to write “YES” if the second grid is contained anywhere in the first grid and “NO” if it isn’t.

I’m quite new to functional programming but the way I approached this was to break the problem up into smaller functions and then the answer would be a composition of those smaller functions.

If you clone my HackerRankInFSharp repository from GitHub it contains all of the code, so you can follow along. To view the code for this problem open the file GridSearch.fs in the HackerRank project.

At the bottom of the GridSearch.fs file you will see a solution function defined. This function runs the test case from the input file and is where you should start reading the file from.

When you run your code on HackerRank your program needs to interact with the console. To save typing the inputs in every time I wanted to test my program, I copied the test cases into a text file and then simply mapped a read function to reading a line from that file like so:

open System
open System.IO
let reader = new StreamReader("TextFile1.txt")
let read = reader.ReadLine

This is one of the neat things about F#. As it’s a functional language, functions are first class citizens, so you can bind them to names and pass them around. By writing my code this way I could simply copy my whole solution into HackerRank once it was complete, change the read function to point at the console (let read = Console.ReadLine) and leave the rest of my code the same.

The next three functions that I wrote are to help read the input data:

let t = read() |> Convert.ToInt32

let num = fun() ->
    read().Split(' ')
    |> Seq.head
    |> Convert.ToInt32

let fetch = fun() -> [for _ in 1..num() do yield read()]

The first two simply read the values for t (the number of test cases) and num, which I use as the number of rows in the grid. The input data also gives you the length of each row, but as the rows are separated by new lines you don’t need it. The last function, fetch, returns a list of strings of length num (the number of rows in the grid). fetch will be used to read a grid into a variable and is generic, so it can be used for both the reference and search grids. The terminology I’m going to use is reference grid for the bigger grid and search grid for what we are looking for inside the reference grid.

Now that I’ve got the input, my idea was to write a function to get all of the indexes of a smaller string in a bigger string. The reason is that to search for the search grid inside the reference grid we first start with just the first line of the search grid. We want to look at each line of the reference grid and know whether the first line of the search grid is contained within it. If it is, we want a list of the indexes where it occurs.

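let allIndexOf (str:string) (c:string) =
    let rec inner (s:string) l offset =
        match (s.IndexOf(c), (s.IndexOf(c)+1) = s.Length) with
        | (-1, _) -> l
        | (x, true) -> (x+offset)::l
        // s.Substring(x+1) starts offset+x+1 characters into str, so carry the running total
        | (x, false) -> inner(s.Substring(x+1)) ((x+offset)::l) (offset+x+1)
    // indexes are added to the front of the list, so reverse to get ascending order
    inner str [] 0 |> List.rev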

In my last post I talked about the awesome help I got from the community with this function. It works recursively, which is one of the staples of functional programming. allIndexOf takes two strings: the first is the reference string (str) and the second is the one you are looking for (c). The first line of the body declares a recursive function that takes a string (s), a list of int (l) and an int (offset). There are a few interesting points on this line. Firstly, the keyword “rec” is not optional: it tells the compiler that the function is recursive. Secondly, the parameters l and offset do not have any type annotations. This is one of the things that makes F# very clean: the compiler can work out the types for you nearly all of the time, so most type annotations are optional, which removes a lot of clutter.

The next line (match) uses another very cool feature of F#, pattern matching. It creates a tuple with the values (s.IndexOf(c), (s.IndexOf(c)+1) = s.Length), which is of type (int, bool), and then matches that tuple against the three cases that follow. If s.IndexOf(c) returns -1 it doesn’t matter what the second part of the tuple is, as we match it with the wildcard _; in that case we return l, the list that was passed in. If s.IndexOf(c) evaluates to anything other than -1 we match the second case when the bool part is true and the third case when it is false. The last line of the function then starts the recursion with the string you have passed in, an empty list and an offset of 0.

To see how this function works let’s look at a simple example. If we pass in “hello hell” and “hell” we would expect the indexes returned to be 0 and 6. We call the function inner with “hello hell” [] 0. The match expression produces the tuple (0, false), which matches the third case, so inner is called again with “ello hell” [0] 1. This time the match evaluates to (5, false), and adding the offset of 1 gives the real index 6, so inner is called with “ell”, a list that now holds 6 and 0, and an offset of 6. This time the match returns the tuple (-1, false), so it matches the first case and the recursion stops with the indexes 0 and 6 in the list, which is the correct answer.

The way you can combine functions together to perform the tasks you need will become clear as I go through the rest of the solution. The next function uses the allIndexOf function to return the co-ordinates of every occurrence of a string in a list of strings (aka a grid).

let find g str start = g  |> Seq.skip start
                          |> Seq.mapi (fun i x -> (allIndexOf x str) |> Seq.map (fun u -> (u, i+start))) 
                          |> Seq.collect (fun x -> x)

You can read the |> operator as piping data from left to right. Here we first skip as many rows of the grid as are passed in via the start parameter (this is an optimisation, as further on we can reuse this function to search just the remainder of the grid). We then pipe the result into a map function that returns a tuple where the first element is the index of the match within the row (the x co-ordinate) and the second is the row number (the y co-ordinate). The y co-ordinate is known because we are looping through each string in the grid using the mapi function, which gives you both the element (x) and its index (i). Because rows were skipped, start is added back on to i so that the y co-ordinate still refers to the row in the original grid. We then have to use the Seq.collect function to flatten the sequence of sequences, which is the equivalent of SelectMany in LINQ.

let findExact g str x y = find g str y
                              |> Seq.exists(fun (a, b) -> x = a && y = b)

The findExact function then builds on the find function to return true if and only if the string is found in the grid at the co-ordinates specified. The reason for this function is that my idea for solving the problem is to call the find function for the first row of the search grid in the reference grid, giving you all of the co-ordinates where it appears. Then all you need to do to tell whether the whole search grid is there is loop through the subsequent rows and add one to the y co-ordinate each time. For example, say we called find and got back (2, 3) and (5, 6), and the search grid was 3 strings deep. To tell whether the search grid is in the reference grid we would simply need to know whether the second string was at (2, 4) and the third at (2, 5), or whether the second string was at (5, 7) and the third at (5, 8). If either pair checks out the answer is yes; if neither does the answer is no.

let gridContains g (s: string list) = 
    let matches = find g s.[0] 0 
    let numStrs = s |> Seq.skip 1
        |> Seq.mapi(fun i line -> (i, line))
    matches |> Seq.map(fun (a, b) -> numStrs |> Seq.map(fun (i, line) -> findExact g line a (b+i+1)))
        |> Seq.exists(fun b -> b |> Seq.forall(fun x -> x))

The function above does exactly that. The first line assigns all of the co-ordinates where the first row of the search grid is found in the reference grid to matches. The next line numbers each remaining string in the search grid (everything but the first row), making it easier to write the last part of the function. The last part then pipes matches into a function that, for each co-ordinate in matches, adds i (the current loop value) plus 1 (because i starts at 0) to the y value and returns a bool saying whether that string is found at that position. That gives a sequence of sequences of bool. The final step is to see whether any of those sequences is all true. It even reads how I’ve just described it: “does a sequence exist for which all of the values are true”.

To run the solution, clone the HackerRank repo on GitHub. Then in the program.fs file uncomment the line //GridSearch.solution.

I’m still new to functional programming so if there is anything you would improve or do differently please feel free to leave a comment below.

The StackOverflow community rocks!

Recently I stumbled upon a site called HackerRank. Before I go into why this relates to StackOverflow I want to briefly explain HackerRank. HackerRank is a site where you are challenged to solve a problem in pretty much any language you want. You have to take a program that gets an input and gives a desired output based on a problem. The site will then run your code against some test cases and determine if your code passes that problem.

HackerRank gets very addictive quickly. The problems start off easy such as add two numbers to get you used to how it all works but they ramp up pretty fast.

For a while now I’ve had a passing interest in F#. I’ve used the language to solve Project Euler problems in the past. HackerRank in my view is a better Project Euler. The reason I think this is that many of the Project Euler problems can be brute forced on today’s hardware, meaning you can solve them in a way that the author didn’t quite intend. The other issue with some of the problems is that they are deep in maths, so deep that I sometimes needed to spend quite a bit of time reading up on advanced algebra just to get to the point where I could start the problem. As much as I enjoyed this it detracted from the problem solving. Anyway I digress; I decided that HackerRank would be an excellent way to learn F#.

I was trying to write a function allIndexOf that takes two strings and returns a list of integers giving the starting positions of all of the occurrences of the second string in the first string. I couldn’t get my function to compile in F# so I posted this question on StackOverflow. Within 2 minutes someone had posted the answer: I was missing a set of parentheses due to the way that F# binds its function arguments. Now it is cool that you can get an answer within 2 minutes. What was even more amazing, and blew me away, was that someone then took the time in the comments to point out a bug in my function.

Here is my original function. See if you can spot the bug:

let allIndexOf (str:string) (c:string) =
    let rec inner (s:string) l =
        match (s.IndexOf(c), (s.IndexOf(c)+1) = s.Length) with
        | (-1, _) -> l
        | (x, true) -> x::l
        | (x, false) -> inner(s.Substring(x+1) x::l)
    inner str []

Not only that, but they made this F# Fiddle showing my function with some inputs and why it was wrong, and also giving 3 ways to write the function: using a comprehension, recursion with an accumulator and recursion with a continuation.

The bug was that after a recursive call to “inner”, the value bound to x on a match was the index of the next occurrence in the truncated string, not in the original string. To fix this in my version of the code I just needed to pass along the amount of the string that had been cut off so that it could be added on to x.

Fixed code:

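let allIndexOf (str:string) (c:string) =
    let rec inner (s:string) l offset =
        match (s.IndexOf(c), (s.IndexOf(c)+1) = s.Length) with
        | (-1, _) -> l
        | (x, true) -> (x+offset)::l
        // s.Substring(x+1) starts offset+x+1 characters into str, so carry the running total
        | (x, false) -> inner(s.Substring(x+1)) ((x+offset)::l) (offset+x+1)
    // indexes are added to the front of the list, so reverse to get ascending order
    inner str [] 0 |> List.rev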

Now this code is actually inefficient because I am creating a new string on each recursive call. This is something Sehnsucht points out on StackOverflow. I think it says a lot about the dev community that people are willing not only to give up their time to answer your questions but also to go the extra mile to point out where you are going wrong and how you can improve.

If you have some spare time why not get on StackOverflow and pay it back…

An easy way to test custom configuration sections in .Net

Due to the way the ConfigurationManager class works in the .Net framework it doesn’t lend itself very well to testing custom configuration sections. I’ve found a neat solution to this problem that I want to share. You can see all of the code in the EasyConfigurationTesting repository on GitHub.

Most of the problems stem from the fact that the ConfigurationManager class is static. A good takeaway from this is that static classes are hard to test.

One approach to testing your own custom config section would be to put your config section in an app.config inside your test project.  This would work to the extent you could read the configuration section from it and test each value but it would be hard, error prone and hacky to test anything other than what was in the app.config file when the test ran.

In the config demo code that is on GitHub, the config section we are trying to test has a range element allowing you to set the min and max. We want to test that we can successfully read the min and max values from the config, and what happens in different scenarios, such as omitting the max value from the config section. Take a look at how readable the following test is:

TestAppConfigContext testContext = BuildA.NewTestAppConfig(@"<?xml version=""1.0""?>
            <configuration>
              <configSections>
                <section name=""customFilter"" type=""Config.Sections.FilterConfigurationSection, Config""/>
              </configSections>
              <customFilter>
                <range min=""1"" max=""500"" />
              </customFilter>
            </configuration>");


var configurationProvider = new ConfigurationProvider(testContext.ConfigFilePath);
var filterConfigSection = configurationProvider.Read<FilterConfigurationSection>(FilterConfigurationSection.SectionName);

filterConfigSection.Range.Min.Should().Be(1);
filterConfigSection.Range.Max.Should().Be(500);            

testContext.Destroy();

Notice how we can pass in a string to represent the configuration we want to test, get the section based upon that string, assert the values from the section and then destroy the test context.

Under the covers this works by creating a temporary text file based upon the string you pass in. That temporary text file is then set as the config file to use for ConfigurationManager. The location of the temporary file is stored in the TestAppConfigContext class which also has a helpful destroy method to clean up the temporary file after the test is complete.

Using the builder pattern makes the test very readable. If you read the test out loud from the top it reads “build a new test app config”. Code that describes itself in this way is easier to understand and maintain.

A cool trick that I use inside the builder is to implement the implicit operator to convert from TestAppConfigBuilder to TestAppConfigContext. This means that instead of writing:

var testContext = BuildA.NewTestAppConfig(@"...").Build();

You can write:

TestAppConfigContext testContext = BuildA.NewTestAppConfig(@"...");

Notice that in the second version you can omit the call to Build() because the implicit operator takes care of the conversion (which incidentally is implemented as a call to Build). You could argue that you prefer the explicit call to Build() as it means you can use var rather than the type, but I personally prefer the implicit conversion.

With this pattern we can easily add more tests with different config values:

TestAppConfigContext testContext = BuildA.NewTestAppConfig(@"<?xml version=""1.0""?>
            <configuration>
              <configSections>
                <section name=""customFilter"" type=""Config.Sections.FilterConfigurationSection, Config""/>
              </configSections>
              <customFilter>
                <range min=""1"" />
              </customFilter>
            </configuration>");


var configurationProvider = new ConfigurationProvider(testContext.ConfigFilePath);
var filterConfigSection = configurationProvider.Read<FilterConfigurationSection>(FilterConfigurationSection.SectionName);

filterConfigSection.Range.Min.Should().Be(1);
filterConfigSection.Range.Max.Should().Be(100);

testContext.Destroy();

In the test above we are making sure we get a max value of 100 if one is not provided in the configuration. This gives us the safety net of a failing unit test should someone update this code.

I think this pattern is a really neat way to test custom configuration sections. Feel free to clone the GitHub repository with the sample code and give feedback.

Exception Caught TV – Introduction to Git

At work I was asked to give a presentation on git, an intro presentation aimed at someone who has never used git before.  Whilst preparing for the presentation I was doing a lot of practicing.  Whilst practicing I had the idea of recording the practice sessions and turning them into a video series and sharing them online.

The benefits of this are twofold: firstly it gives me a great way to practice my presentation material, and secondly it leaves a permanent record that will hopefully help people who want to learn git. If there is good feedback from this series I may well make more videos, so please give me any feedback you have, good and bad.

You can access all of the videos on the new exception caught TV page. For convenience here are the links:

Accessing localhost from the internet

On the latest edition of the excellent JavaScript Jabber podcast one of the guests suggested an excellent tool in the picks section. The tool is called ngrok. Ngrok makes your localhost available on the internet! Very handy if you want to test your local site on a mobile or tablet, or share it with someone outside of your network.

If you are interested you can download the tool from:

https://ngrok.com/download

There is a version available for Mac, Windows, Linux and ARM. The program is really simple to use. On the Mac you simply unzip and run by running ./ngrok from the terminal. This then gives the following output:


ngrok                                                                          (Ctrl+C to quit)

Tunnel Status                 online
Version                       1.7/1.6
Forwarding                    http://<address>.ngrok.com -> 127.0.0.1:80
Forwarding                    https://<address>.ngrok.com -> 127.0.0.1:80
Web Interface                 127.0.0.1:4040
# Conn                        0
Avg Conn Time                 0.00ms

Then you simply go to the address listed and you will be forwarded on to your machine on port 80. Note I have removed the address above.

From reading the documentation ngrok gets much cooler. It will give you a full detailed log of everything it’s captured on http://localhost:4040, allow you to replay requests, capture requests, forward requests on to another machine on the network and much more.

You can check out the source on GitHub. It looks like the main contributor was Alan Shreve; great work Alan. Check out his website. I love the fact that the community produces cool tools like this!

Handy powershell script for deploying nuget packages

I was recently presented with the problem of automating the deployment of a nuget package to a server. Now before you guys say it, I know that Octopus Deploy is the best way to go and I completely agree. At the moment, at the organisation I’m at, using a product like Octopus has to go through an “approval” process. Faced with the choice between carrying on deploying using file explorer (which is what the guys here currently did) and automating it, I voted for the latter option.

Now Team City is not really geared towards doing deployments. After all it’s designed to be a build server and that’s what it does extremely well. However we can make it work.

The first thing you need to do is package up your built web site into a nuget package and deploy it to a nuget server (either a local one or the public one). For those that don’t know, a nuget package is literally a zip file; you can even rename it to .zip and it will extract. The reason for using a nuget package is that it gives you a nice way to place it onto a server, version it and, in the future, plug in Octopus Deploy.

So onto the script:


param([string]$destinationDir, [string]$packName, [string]$packVersion)

Write-Host "destination dir: $destinationDir"
Write-Host "package name: $packName"
Write-Host "package version: $packVersion"

$deployScript = {
param($dest, $packageName, $version)

$basePath = "d:\packages\"

$packageNameAndVersion = "$packageName.$version"
$extractDir = $basePath + $packageNameAndVersion

Write-Host "deploying to: $dest"
Write-Host "extracting to: $extractDir"
Write-Host "package: $packageName"
Write-Host "version: $version"

$zipDownload = $packageNameAndVersion + ".zip"

$packageUrl = "http://nuget.org/api/v2/package/$packageName/$version"

if (-Not (Test-Path $extractDir))
{
New-Item -ItemType directory -Path $extractDir
}else{
Get-ChildItem -Path ($extractDir + "\*") -Recurse | Remove-Item -Recurse -Force -Confirm:$false
}

$Path = $basePath + $zipDownload

Write-Host "Downloading package..."
$WebClient = New-Object System.Net.WebClient
$WebClient.DownloadFile($packageUrl, $path )

Write-Host "Extracting package..."

$shell = new-object -com shell.application
$zip = $shell.NameSpace($path)

foreach($item in $zip.items())
{
$shell.Namespace($extractDir).Copyhere($item)
}

Write-Host "Cleaning destination..."

Get-ChildItem -Path $dest -Recurse | Remove-Item -Recurse -Force -Confirm:$false

Write-Host "Deploying..."

Copy-Item ($extractDir + "\*") $dest -Recurse

Write-Host "Removing downloaded package from $path"
Remove-Item $path -Recurse -Force -Confirm:$false

Write-Host "Removing extracted package from $extractDir"
Remove-Item $extractDir -Recurse -Force -Confirm:$false

Write-Host "Done."

}

$output = Invoke-Command -ScriptBlock $deployScript -ArgumentList $destinationDir, $packName, $packVersion

Now when you configure your Team City build you need to pass in the following 3 parameters: $destinationDir, $packName and $packVersion. The destination dir is the directory you wish to deploy to, e.g. \\192.168.1.1\WebSites\MyWebSite. The package name is the package id of the nuget package and packVersion is the version number of the nuget package. That’s it! Simple, huh?

You will need to make sure that the Team City user account has write access to the folder you are deploying to.

[Screenshot: the Team City build step configured to run the powershell script]

In the screenshot above you can see how to set up the powershell script to run. You can either check it into your source control system and then enter the path to the file in the script file dialog above, or you can copy and paste the source directly into Team City. I would recommend keeping the powershell script in source control, as then if you have multiple deploy builds, changing the script once means every build gets the new version.

In the script arguments dialog you need to pass in the parameters to the script described above.  You can either hard code these (not recommended) or you can use Team City parameters and then present the user with a dialog when they go to run the build.  To do this I use the following parameters:

%website-deployment-dir% %package-name-to-deploy% %package-version-to-deploy%

After you have saved this, Team City will prompt you to set up these parameters. Go to the parameters section of your build configuration and click on one of the parameters. Click spec, then in the dialog that appears, next to display, select “prompt”. You can enter a default value if you wish, or even give the user a dropdown listing, for example, every website they can deploy. With the prompt option, when the user clicks to run the build they will be presented with a dialog that forces them to enter these parameters.

So there you have it, a quick and simple build that can deploy any nuget package to any server. Simples.

As always if anyone has any questions or wants to go through anything feel free to contact me. Happy deployments!

Don’t just throw it over the wall

Something that really gets to me as a developer is when I hear developers say “throw it over the wall to the testers”. A good software team should be just that… a team. You are one team, all helping to achieve the goal, i.e. delivering the software.

Most companies now use Agile Scrum methodology and have cross functional teams that consist of a number of developers, testers, BAs and a scrum master. It annoys me when developers and testers sometimes consider each other enemies. Some developers treat testers like an annoying teacher marking their homework. Testers can be just as bad.

We all need to remember that we are part of one team. If you think of it like an F1 team where would the driver be without the mechanics to change the tyres, build the car and test it? No one skill can complete the task on their own so let’s all work together to get the job done.

Importance of TDD

Last post I wrote about a series renamer that I wrote in node using TDD. The other day I went to use the renamer on a bunch of files and found that it threw the following error:

TypeError: Cannot read property '1' of null
      at Object.getShowDetails (nameParser.js:25:47)

From looking at the show titles I was renaming I realised there was a filename format that I had not taken into account, namely one like this:

cool_show_107.mp4

This title means that it is episode 7 of series 1. Previously the only formats accounted for were those where the season and episode were prefixed with an ‘s’ and ‘e’ respectively. Which brings me round to today’s topic: how important it is to build your software using a TDD approach and the advantages it gives you.

Upon seeing this error I could tell straight away from the stack trace that the error was in nameParser.js. So the first thing I did was write a test to reproduce this error.


describe('and the series and episode number are in 3 digit format', function(){
    it('should return the information for the series and episode correctly', function(){
        var result = nameParser.getShowDetails('cool_show_102');
        result.seriesNumber.must.be(1);
        result.episodeNumber.must.be(2);
    });
});

When I run the tests after adding this unit test I get the same error, so I know I have reproduced the defect. I now have a failing test which, once passing, will stop the defect coming back unnoticed. Also note how easy it was for me to track down the defect, because the code is decoupled and already heavily tested.

Once the test is written all that is left is to get it to pass by adding another regex to the nameParser to detect 3 consecutive numbers. Once this was passing I ran the renamer again on the list of files and found a different error. Some of the files were in this format:

cool_show_S01E07_encoding_x264.mp4

This caused an issue because the nameParser picked up the 3 consecutive numbers in ‘x264’, and because that regex ran before the one looking for seasons and episodes prefixed with ‘S’ and ‘E’, it incorrectly gave season 2 and episode 64 for the above filename.

Fixing this using TDD is again very simple. Add another test reproducing the defect:


describe('and the series and episode number are specified and a 3 digit number appears after them', function(){
    it('should return the information for the series and episode correctly', function(){
        var result = nameParser.getShowDetails('cool_show_S01E02_cool_x264');
        result.seriesNumber.must.be(1);
        result.episodeNumber.must.be(2);
    });
});

To fix it, all I had to do was rearrange the order in which the regex statements run in the nameParser. 39 tests now passing. I then ran the series renamer on the files in question and it worked perfectly.
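To make the ordering issue concrete, here is a minimal sketch of a parser that tries its patterns in order. It is not the real nameParser.js: the two regexes and the structure of getShowDetails are assumptions made purely for illustration.

```javascript
// Hypothetical parser: the patterns below are assumptions, not the renamer's real ones.
// Order matters: the most specific pattern must be tried first.
const patterns = [
    // S01E07 style (case-insensitive), e.g. cool_show_S01E07_encoding_x264.mp4
    /s(\d{1,2})e(\d{1,2})/i,
    // three consecutive digits, e.g. cool_show_107.mp4 is series 1, episode 07
    /(\d)(\d{2})/
];

function getShowDetails(fileName) {
    for (const pattern of patterns) {
        const match = pattern.exec(fileName);
        // exec returns null when nothing matches, so check before indexing into it
        if (match !== null) {
            return {
                seriesNumber: parseInt(match[1], 10),
                episodeNumber: parseInt(match[2], 10)
            };
        }
    }
    return null; // no known format found
}

console.log(getShowDetails('cool_show_107.mp4'));
console.log(getShowDetails('cool_show_S01E07_encoding_x264.mp4'));
```

Both calls give series 1, episode 7. Swapping the two entries in the patterns array brings back the defect: the three digit pattern then matches the 264 in x264 and reports season 2, episode 64.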

The lesson here is that building software using a TDD approach forces you to write decoupled components, each with its own job. This makes it easy to isolate defects when they occur and to write tests that reproduce them. Once the test is passing, that defect cannot reoccur without failing a test.

If you want to check out the full code for the series renamer you can clone it on github.

TV Series Renamer Written in Node

I found myself in the situation the other day where I had a season of a TV show that needed renaming sequentially to clean up the file names.  The files were located in sub folders making it quite a laborious manual task to clean this up.  Step forward a node coding challenge.

I wrote the renamer using a fully TDD practice in node.  The finished program is blazingly fast.  The whole set of unit tests (37 at time of writing) including a test which creates dummy files, moves them, checks them and then cleans everything up (integration test) takes 51ms!

I used mocha for my unit tests. Mocha uses the same describe and it syntax as Jasmine, which keeps the tests nice and clean. There are a few extra features you get with mocha out of the box that you don’t get with Jasmine, making it my preferred unit testing framework. Before anyone shouts, I know you can extend Jasmine to do the same thing, but it’s nice just having all of these features there from the get go.

The first nice feature of mocha is being able to run a single unit test. To do this simply change describe('when this happens… to describe.only('when this happens… By adding the ‘.only’ you are telling mocha to only run this describe block and any child tests of this block. This can come in handy when you are working on getting one test passing on a big project and you don’t want the overhead of running every test. You can also use ‘.only’ on it statements, which will only run a single test.

The next cool feature of mocha is the way that it handles testing asynchronous code. Mocha gives you a done callback that you can accept as a parameter in a beforeEach or it function. You call done when you are done. Mocha will wait for done to be called before moving on, and will fail the test with a timeout if you never call it.

beforeEach(function(done) {
    seriesPath = __dirname + '/testSeries1/';
    fileFetcher.fetchFiles(seriesPath).then(function(data){
        result = data;
        done();
    });
});

The above code shows an example beforeEach block from the fileFetcher tests. The code fetches the files and in the ‘then’ handler it saves the result and then calls done. This is a very neat way of handling asynchronous testing.
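If the done mechanism feels like magic, a toy version of it fits in a few lines. The sketch below is for illustration only and is not how Mocha is implemented; runWithDone and its timeoutMs parameter are names I have made up.

```javascript
// Toy runner for illustration only: this is NOT how Mocha is implemented.
// runWithDone and its timeoutMs parameter are hypothetical names.
function runWithDone(testFn, timeoutMs) {
    return new Promise((resolve, reject) => {
        // fail the test if done is never called in time
        const timer = setTimeout(() => {
            reject(new Error('timeout of ' + timeoutMs + 'ms exceeded'));
        }, timeoutMs);

        // the callback handed to the test; passing an error fails the test
        function done(err) {
            clearTimeout(timer);
            if (err) {
                reject(err);
            } else {
                resolve('passed');
            }
        }

        testFn(done);
    });
}

// a test that finishes asynchronously and then signals completion
runWithDone(done => setTimeout(done, 10), 1000)
    .then(result => console.log(result));

// a test that forgets to call done is reported as a timeout
runWithDone(() => {}, 50)
    .catch(err => console.log(err.message));
```

The runner hands the test a callback and starts a timer. Whichever happens first, the call to done or the timer firing, decides whether the test passes or fails.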

Another awesome library that deserves a mention is Q. Q is one of the best known promise libraries and makes dealing with asynchronous code much easier. Q allows you to store a promise when you call an asynchronous function, rather like Task in .NET. The promise allows the program to continue and then handle the result or error when it comes in.

There are two common ways that promises are handled: either straight away by using a ‘then’ block, or by storing the promise and examining it later.

fileRenamer.generateRename('Great_Show', seriesPath, outputDir)
    .then(function(data){
        results = data;            
    });

var x = 11;

Above is an example of using a then handler. Q will call the function once the promise returned by generateRename is resolved, and the resolved value will be passed in as the data parameter of the anonymous function. Note that the line ‘var x = 11’ will execute before the line ‘results = data’, because Q never runs a then handler until the code that is currently running has finished. The then block is saying: execute this, then, once the result is in, execute this.

The other way promises are normally used is by storing them and waiting on them later. This is especially useful if you have several tasks that can all run in parallel. You can set them all going, storing the promise each one gives you back. When they all finish you can collect the results and continue on.
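You can see the ordering for yourself with a small sketch. It uses the native Promise rather than Q so that it runs without any dependencies, and generateRename here is a hypothetical stub, not the real function from the renamer.

```javascript
// Uses the native Promise rather than Q; generateRename here is a hypothetical stub.
const order = [];

function generateRename() {
    // already resolved, yet the then handler below still runs later
    return Promise.resolve(['episode1.mp4']);
}

let results;
generateRename().then(function (data) {
    results = data;
    order.push('results = data');
    console.log(order);
});

var x = 11;
order.push('x = 11');
```

Even though the promise is resolved immediately, 'x = 11' is recorded first and 'results = data' second.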

var promises = [];
promises.push(processDir(dir));

for(var i=0; i<files.length; i++){
    if (fs.lstatSync(dir + files[i]).isDirectory()){
        promises.push(processDir(dir + files[i]));
    }
}		

$q.all(promises).then(function(promiseResults){								
    deferred.resolve({episodes: _.flatten(promiseResults)});
});

The above code is a snippet from the fileFetcher.js file. processDir is called once for the top-level directory and once for each subdirectory found by the for loop, each call setting in motion a task to process that directory. The promise returned by each call to processDir is stored in an array of promises. Q gives us a method called ‘all’ which takes an array of promises and returns a single promise, so you can use the same ‘then’ statement as shown above. This time the parameter passed into the handler will be an array with the results of all of the promises. The beauty is that this array will be in the same order that you kicked them off in, allowing you to marry up your results. Pretty cool!

So you might be wondering: how do I return a promise from a function? Good question. You simply use the following pattern:
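Here is a small sketch of that ordering guarantee. The native Promise.all stands in for Q's all so the example has no dependencies, and this processDir is a hypothetical stub with made-up directory names and delays, not the real one from fileFetcher.js.

```javascript
// Native Promise.all stands in for Q's all; processDir is a hypothetical stub
// that "finishes" after a delay and returns the files found in that directory.
function processDir(dir, delayMs, files) {
    return new Promise(resolve => {
        setTimeout(() => resolve(files), delayMs);
    });
}

const promises = [
    processDir('season1/', 30, ['a.mp4', 'b.mp4']), // slowest, started first
    processDir('season1/extras/', 10, ['c.mp4']),
    processDir('season1/specials/', 20, [])
];

const fetchAll = Promise.all(promises).then(promiseResults => {
    // results line up with the order of the promises array, not finish order
    return { episodes: promiseResults.flat() };
});

fetchAll.then(result => console.log(result.episodes));
```

The first directory takes the longest to finish, yet its files still come first in the flattened list.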

function Square(x)
{
    var deferred = $q.defer();
    deferred.resolve(x*x);
    return deferred.promise;
}

The function above will return a promise that will give you the result of squaring the parameter passed in. You would use it like this:

Square(3).then(function(result){
    console.log(result);
});
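For comparison, the same idea can be sketched with the native Promise constructor instead of Q's defer. The type check is my own addition, included only to show what the failure path looks like.

```javascript
// Sketch using the native Promise constructor instead of Q's defer.
// The type check is an added assumption to show the rejection path.
function square(x) {
    return new Promise((resolve, reject) => {
        if (typeof x !== 'number') {
            reject(new TypeError('square expects a number'));
            return;
        }
        resolve(x * x);
    });
}

square(3).then(
    result => console.log(result),     // called on success
    err => console.error(err.message)  // called on failure
);
```

The second function passed to then is only called if the promise is rejected.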

The last part of Q I want to talk about that is awesome is its ability to wrap up node functions and make them return promises. For example, the function to rename a file is fs.rename(oldPath, newPath, callback). Normally you would pass in a callback to call when the file is renamed. But if you then wanted to call another asynchronous function afterwards and pass in another callback, your code would quickly become a nested mess that is hard to follow. Step up denodeify. Denodeify takes a node function that calls a callback and makes it return a promise instead:

// instead of having a callback like this
fs.rename('test.txt', 'test2.txt', function(){
    console.log('done');
});

// you can wrap the function with q like this
var moveFile = $q.denodeify(fs.rename);

// then call it as you would any function that returns a promise
moveFile('test.txt', 'test2.txt').
    then(function(){ console.log('done'); });
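To show roughly what a wrapper like this does under the hood, here is a simplified sketch. It is not Q's source code, and addLater is a hypothetical node-style function used only for the demonstration. Node's built-in util.promisify plays a similar role.

```javascript
// Simplified illustration of what a denodeify-style wrapper does.
// This is not Q's source; it only sketches the idea.
function denodeify(nodeFunction) {
    return function (...args) {
        return new Promise((resolve, reject) => {
            // append a node-style callback: error first, then the result
            nodeFunction(...args, (err, value) => {
                if (err) {
                    reject(err);
                } else {
                    resolve(value);
                }
            });
        });
    };
}

// hypothetical node-style function used only for the demonstration
function addLater(a, b, callback) {
    setTimeout(() => callback(null, a + b), 10);
}

const add = denodeify(addLater);
add(2, 3).then(sum => console.log(sum));
```

The wrapper adds the callback for you and turns the error-first convention into a rejected or resolved promise.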

This comes in especially handy when you are doing many operations in a loop, as is done in my fileRenamer.js file. I will leave the reader to examine the code below, which combines all of the ideas explained thus far.


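var performRename = function(showName, fromDir, toDir) {
    var deferred = $q.defer();

    // forward any failure to the caller so the returned promise always settles
    var fail = function(err) {
        deferred.reject(err);
    };

    generateRename(showName, fromDir, toDir)
        .then(function(files){
            var promises = [];
            var moveFile = $q.denodeify(fs.rename);

            // start every rename without waiting for the previous one to finish
            for(var i=0; i<files.length; i++){
                promises.push(moveFile(files[i].from, files[i].to));
            }

            // resolve with the list of renames once every file has been moved
            $q.all(promises).then(function(){
                deferred.resolve(files);
            }, fail);
        }, fail);

    return deferred.promise;
};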

If you want to read up more on Q please see the documentation.

Please feel free to download the full source code of the series renamer from github. If you have any questions or comments feel free to give me a shout, I will be happy to help.