Converting decimal numbers to Roman Numerals in C#

I decided to create a little project to implement converting decimal numbers to Roman Numerals in C#. You can solve this problem in quite a few different ways; I wanted to talk through the pattern-based approach that I went for.

The Roman Numerals from 0 to 9 are as follows:

  • 0 = “” (empty string)
  • 1 = I
  • 2 = II
  • 3 = III
  • 4 = IV
  • 5 = V
  • 6 = VI
  • 7 = VII
  • 8 = VIII
  • 9 = IX

To start the project I wrote a set of tests that checked these first 10 cases. I like taking this approach as it allows you to solve the simple base case first and then refactor your solution to be more generic. Our first implementation, which solves for the first ten numbers, is:

public static string ToRomanNumeral(this int integer)
{
    var mapping = new Dictionary<int, string>
    {
        {0, ""},
        {1, "I"},
        {2, "II"},
        {3, "III"},
        {4, "IV"},
        {5, "V"},
        {6, "VI"},
        {7, "VII"},
        {8, "VIII"},
        {9, "IX"},
    };

    return mapping[integer];
}

Obviously there is no error checking yet; we are just solving for the first 10 cases. I decided to implement the method as an extension method on int, as it makes the calling code look neat, allowing you to write:

    var romanNumeral = 9.ToRomanNumeral();
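
For reference, the tests I mention above are just simple case checks. Here is a minimal sketch of what one could look like, assuming NUnit as the test framework and that the extension method lives in a static class; the tests in the repository may well be structured differently:

using NUnit.Framework;

[TestFixture]
public class ToRomanNumeralTests
{
    // A handful of the first ten cases; NUnit's TestCase attribute keeps this concise.
    [TestCase(0, "")]
    [TestCase(1, "I")]
    [TestCase(4, "IV")]
    [TestCase(5, "V")]
    [TestCase(8, "VIII")]
    [TestCase(9, "IX")]
    public void Converts_the_first_ten_numbers(int number, string expected)
    {
        Assert.That(number.ToRomanNumeral(), Is.EqualTo(expected));
    }
}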

The next step is to look at the Roman Numerals up to 100 and see if we can spot any patterns. We don’t want to have to manually type out a dictionary entry for every Roman Numeral! If we look at the Roman Numeral representations for the tens column (0 to 90), we find they are:

  • 0 = “” (empty string)
  • 10 = X
  • 20 = XX
  • 30 = XXX
  • 40 = XL
  • 50 = L
  • 60 = LX
  • 70 = LXX
  • 80 = LXXX
  • 90 = XC

We can see straight away that it is exactly the same pattern as the numbers from 0 to 9, except you replace I with X, V with L and X with C. So let’s pull that pattern out into something that can create a mapping dictionary given the 3 symbols. Doing this gives you the following method:

private static Dictionary<int, string> CreateMapping(string baseSymbol, string midSymbol, string upperSymbol)
{
    return new Dictionary<int, string>
    {
        {0, ""},
        {1, baseSymbol},
        {2, baseSymbol + baseSymbol},
        {3, baseSymbol + baseSymbol + baseSymbol},
        {4, baseSymbol + midSymbol},
        {5, midSymbol},
        {6, midSymbol + baseSymbol},
        {7, midSymbol + baseSymbol + baseSymbol},
        {8, midSymbol + baseSymbol + baseSymbol + baseSymbol},
        {9, baseSymbol + upperSymbol},
    };
}

We can now call the above method with the correct symbols for the column we want to calculate. So for the units column we would write:

    var unitMapping = CreateMapping("I", "V", "X");

Now we have this, it is straightforward to create the mapping for the hundreds column. To complete our implementation we want to add some error checks, as we are only going to support Roman Numerals from 0 to 4000. The full solution is now quite trivial: we check the input is in our valid range (between 0 and 4000), then we loop through each column, look up the symbol for that column in the mapping dictionary generated by the function above, concatenate the symbols together using a StringBuilder and return the result.
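
To give an idea of how those pieces fit together, here is a minimal sketch of the full method, reusing the CreateMapping function above (the exact code in the repository may differ, for example in how the range check or the thousands column is handled):

using System;
using System.Collections.Generic;
using System.Text;

public static class RomanNumeralExtensions
{
    // CreateMapping from earlier in the post is assumed to live in this same static class.

    public static string ToRomanNumeral(this int integer)
    {
        // We are only supporting values from 0 to 4000.
        if (integer < 0 || integer > 4000)
            throw new ArgumentOutOfRangeException(nameof(integer));

        // One mapping per column, all generated from the same pattern.
        var hundreds = CreateMapping("C", "D", "M");
        var tens = CreateMapping("X", "L", "C");
        var units = CreateMapping("I", "V", "X");

        var result = new StringBuilder();

        // Thousands are simply repeated Ms (e.g. 3000 => "MMM").
        result.Append(new string('M', integer / 1000));

        var remainder = integer % 1000;
        result.Append(hundreds[remainder / 100]);
        result.Append(tens[remainder / 10 % 10]);
        result.Append(units[remainder % 10]);

        return result.ToString();
    }
}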

The full solution with all of the tests is available on my GitHub repository: https://github.com/kevholditch/RomanNumerals.

Importance of TDD

In my last post I wrote about a series renamer that I wrote in Node using TDD. The other day I went to use the renamer on a bunch of files and found that it threw the following error:

TypeError: Cannot read property '1' of null
      at Object.getShowDetails (nameParser.js:25:47)

From looking at the show titles I was renaming, I realised there was a filename format that I had not taken into account, namely one like this:

cool_show_107.mp4

This filename means that it is episode 7 of series 1. Previously the only formats accounted for were those where the season and episode were prefixed with an ‘s’ and ‘e’ respectively. Which brings me round to today’s topic: how important it is to build your software using a TDD approach, and the advantages it gives you.

Upon seeing this error I could tell straight away from the stack trace that the problem was in nameParser.js, so the first thing I did was write a test to reproduce it.


describe('and the series and episode number are in 3 digit format', function(){
    it('should return the information for the series and episode correctly', function(){
        var result = nameParser.getShowDetails('cool_show_102');
        result.seriesNumber.must.be(1);
        result.episodeNumber.must.be(2);
    });
});

When I ran the tests after adding this unit test I got the same error, so I knew I had reproduced the defect. Now I have a failing test that, once passing, means the defect can never come back without a test failing. Also note how easy it was to track down the defect, because the code is decoupled and already heavily tested.

Once the test is written, all that is left is to get it to pass by adding another regex to the nameParser to detect 3 consecutive digits. Once this was passing I ran the renamer again on the list of files and found a different error. Some of the files were in this format:

cool_show_S01E07_encoding_x264.mp4

This caused an issue because the 3 consecutive digits in ‘x264’ were being picked up by the nameParser, and because that regex was running before the one looking for seasons and episodes prefixed with ‘S’ and ‘E’, it was incorrectly giving season 2 and episode 64 for the above filename.

The fix, again using TDD, is very simple: add another test reproducing the defect:


describe('and the series and episode number are specified and a 3 digit number appears after them', function(){
    it('should return the information for the series and episode correctly', function(){
        var result = nameParser.getShowDetails('cool_show_S01E02_cool_x264');
        result.seriesNumber.must.be(1);
        result.episodeNumber.must.be(2);
    });
});

To fix it, all I had to do was rearrange the order in which the regex statements run in the nameParser. All 39 tests now pass. I then ran the series renamer on the files in question and it worked perfectly.

The lesson here is that building software using a TDD approach forces you to write decoupled components, each with their own job, which means it is easy to isolate defects when they occur and write tests that reproduce them. Once the test is passing, that defect cannot reoccur without failing a test.

If you want to check out the full code for the series renamer you can clone it on GitHub.

TV Series Renamer Written in Node

I found myself in the situation the other day where I had a season of a TV show that needed renaming sequentially to clean up the file names. The files were located in subfolders, making it quite a laborious manual task to clean up. Step forward a Node coding challenge.

I wrote the renamer in Node using a fully test-driven approach. The finished program is blazingly fast. The whole set of unit tests (37 at the time of writing), including one that creates dummy files, moves them, checks them and then cleans everything up (an integration test), takes 51ms!

I used Mocha for my unit tests. Mocha uses the same describe and it syntax as Jasmine, which reads nice and cleanly. There are a few extra features you get with Mocha out of the box that you don’t get with Jasmine, making it my preferred unit testing framework. Before anyone shouts, I know you can extend Jasmine to do the same things, but it’s nice having all of these features there from the get-go.

The first nice feature of Mocha is being able to run a single unit test. To do this, simply change describe(‘when this happens… to describe.only(‘when this happens… By adding the ‘.only’ you are telling Mocha to run only that describe block and any child tests of that block. This comes in handy when you are working on getting one test passing in a big project and you don’t want the overhead of running every test. You can also use ‘.only’ on it statements, which will run only a single test.

The next cool feature of Mocha is the way it handles testing asynchronous code. Mocha gives you a done callback you can take as a parameter in a beforeEach or it block; you call done when your asynchronous work has finished. Mocha will wait for done to be called before moving on, and will fail the test with a timeout if you never call it.

beforeEach(function(done) {
    seriesPath = __dirname + '/testSeries1/';
    fileFetcher.fetchFiles(seriesPath).then(function(data){
        result = data;
        done();
    });
});

The above code shows an example beforeEach block from the fileFetcher tests. The code fetches the files and, in the ‘then’ handler, saves the result and calls done. This is a very neat way of handling asynchronous testing.

Another awesome library that deserves a mention is Q. It has a reputation as one of the best promise implementations around and makes dealing with asynchronous code much easier. Q gives you back a promise when you call an asynchronous function, rather like Task in .NET. The promise allows the program to continue, and you handle the result or error when it comes in.

There are two common ways that promises are handled: either straight away using a ‘then’ block, or by storing the promise and examining it later.

fileRenamer.generateRename('Great_Show', seriesPath, outputDir)
    .then(function(data){
        results = data;            
    });

var x = 11;

Above is an example of using a then handler. The anonymous function will be called by Q when the promise returned by generateRename resolves, and the resolved value is passed in as the data parameter. Note that the line ‘x = 11’ may execute before the line ‘results = data’; a then block says ‘when this finishes, then execute this’.

The other way promises are normally used is by storing them and waiting on them later. This is especially useful if you have several tasks that can all run in parallel: you can set them all going, storing the promise each one gives you back, and when they all finish you can collect the results and continue on.

var promises = [];
promises.push(processDir(dir));

for(var i=0; i<files.length; i++){
    if (fs.lstatSync(dir + files[i]).isDirectory()){
        promises.push(processDir(dir + files[i]));
    }
}		

$q.all(promises).then(function(promiseResults){								
    deferred.resolve({episodes: _.flatten(promiseResults)});
});

The above code is a snippet from the fileFetcher.js file. processDir is called for the directory found on each iteration of the for loop, setting in motion a task to process that directory, and the promise returned by processDir is stored in an array of promises. Q gives us a method called ‘all’ which takes an array of promises, and you can then use the same ‘then’ pattern as shown above. This time the parameter passed into the handler will be an array containing the results of all of the promises. The beauty is that this array will be in the same order that you kicked the promises off in, allowing you to marry up your results. Pretty cool!

So you might be wondering how you return a promise from your own function. Good question. You simply use the following pattern:

function Square(x)
{
    var deferred = $q.defer();
    deferred.resolve(x*x);
    return deferred.promise;
}

The function above will return a promise that will give you the result of squaring the parameter passed in. You would use it like this:

Square(3).then(function(result){
    console.log(result);
});

The last part of Q I want to talk about is its awesome ability to wrap up Node functions and make them return promises. For example, the rename function is fs.rename(oldPath, newPath, callback). Normally you would pass in a callback to be called when the file has been renamed, but if you then wanted to call another asynchronous function afterwards and pass in another callback, your code would quickly become a nested mess that is hard to follow. Step up denodeify. Denodeify takes a Node function that calls a callback and makes it return a promise instead:

// instead of having a callback like this
fs.rename('test.txt', 'test2.txt', function(){
    console.log('done');
});

// you can wrap the function with q like this
var moveFile = $q.denodeify(fs.rename);

// then call it as you would any function that returns a promise
moveFile('test.txt', 'test2.txt').
    then(function(){ console.log('done'); });

This comes in especially handy when you are doing many operations in a loop, as is being done in my fileRenamer.js file. I will leave the reader to examine the code below, which combines all of the ideas explained so far.


var performRename = function(showName, fromDir, toDir) {
    var deferred = $q.defer();

    generateRename(showName, fromDir, toDir)
        .then(function(files){
            var promises = [];
            var moveFile = $q.denodeify(fs.rename);

            for(var i=0; i<files.length; i++){
                promises.push(moveFile(files[i].from, files[i].to));
            }

            $q.all(promises).then(function(){
                deferred.resolve(files);
            });
        });

    return deferred.promise;
};

If you want to read up more on Q, please see the documentation.

Please feel free to download the full source code of the series renamer from GitHub. If you have any questions or comments feel free to give me a shout; I will be happy to help.

BuilderFramework – Dependent steps

Last time I started to describe a new open source builder framework that I am writing. Today I want to talk about dependent steps.

It is important for the builder to support dependent steps. For example, you might have one step to create an order and another step to create a customer. Obviously the step to create the customer will need to run before the step to create the order. When rolling back, these steps will need to run in the opposite order.

To manage this I have written an attribute, DependsOnAttribute. This attribute takes a type as its constructor parameter and allows you to annotate which steps your step depends on. For example:


public class CreateCustomerStep : IBuildStep { ...

[DependsOn(typeof(CreateCustomerStep))]
public class CreateOrderStep : IBuildStep { ...
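
For completeness, here is a minimal sketch of what the attribute itself could look like. The key part is the DependedOnStep property, which the dependency sorter reads later on; the AllowMultiple setting is my assumption, so that a step can depend on more than one other step, and the version in the repository may differ.

using System;

// Allows a build step class to declare one or more steps it depends on.
[AttributeUsage(AttributeTargets.Class, AllowMultiple = true)]
public class DependsOnAttribute : Attribute
{
    public DependsOnAttribute(Type dependedOnStep)
    {
        DependedOnStep = dependedOnStep;
    }

    public Type DependedOnStep { get; private set; }
}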

To support this, the commit method needs to sort the steps into dependency order. It also needs to check for a circular dependency and throw an exception if one is found. We are going to need a separate class for managing the sorting of the steps (remember single responsibility!). The interface is detailed as follows:

public interface IBuildStepDependencySorter
{
    IEnumerable<IBuildStep> Sort(IEnumerable<IBuildStep> buildSteps);
}

Before we implement anything we need a set of tests that cover all of the use cases of the dependency sorter. That way, when all of the tests pass, we know that our code is good. I always like to work in a TDD style. (The tests I have come up with can be seen in depth on the GitHub source page or by cloning the source.)

At a high level these are the tests we need:

  • A simple case where we have 2 steps where one depends on the other (a sketch of a test for this case follows the list)
  • A simple circular reference with 3 steps throws an exception
  • A complex circular reference with 5 steps throws an exception
  • A more complex but valid 4 step dependency hierarchy gets sorted correctly
  • A multiple dependency (one step depending on more than one other step) gets sorted correctly
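
As an illustration, here is a minimal sketch of the first test case. It assumes NUnit and two trivial stub steps, and it assumes IBuildStep exposes just Commit and Rollback (as used by the builder later in this post), so the real tests in the repository will look somewhat different:

using System.Linq;
using NUnit.Framework;

// Two trivial stub steps: StepB depends on StepA.
public class StepA : IBuildStep
{
    public void Commit() { }
    public void Rollback() { }
}

[DependsOn(typeof(StepA))]
public class StepB : IBuildStep
{
    public void Commit() { }
    public void Rollback() { }
}

[TestFixture]
public class BuildStepDependencySorterTests
{
    [Test]
    public void Step_with_no_dependencies_comes_before_the_step_that_depends_on_it()
    {
        var stepA = new StepA();
        var stepB = new StepB();

        var sorted = new BuildStepDependencySorter()
            .Sort(new IBuildStep[] { stepB, stepA })
            .ToList();

        Assert.That(sorted.IndexOf(stepA), Is.LessThan(sorted.IndexOf(stepB)));
    }
}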

It is so important to spend a decent amount of time writing meaningful tests that cover all of the use cases of your code. Once you have done this it makes it so much easier to write the code. I see so many people writing the code first and then retrofitting the tests. Some developers also claim that they haven’t got time to write tests; I can’t see this logic. When you write tests first your code is quicker to write, because you know when the code is working. If you write your tests last you are just caught in a horrible debugger cycle, trying to work out what’s going wrong and why. You should rarely, if ever, need the debugger.

To implement dependency sorting we need a topological sort, as detailed on Wikipedia. I have decided to implement the algorithm first described by Kahn (1962).

Here is the pseudocode for the algorithm:

L ← Empty list that will contain the sorted elements
S ← Set of all nodes with no incoming edges
while S is non-empty do
    remove a node n from S
    add n to tail of L
    for each node m with an edge e from n to m do
        remove edge e from the graph
        if m has no other incoming edges then
            insert m into S
if graph has edges then
    return error (graph has at least one cycle)
else
    return L (a topologically sorted order)

Here is that code in C#:

public class BuildStepDependencySorter : IBuildStepDependencySorter
{
    private class Node
    {
        public Node(IBuildStep buildStep)
        {
            BuildStep = buildStep;
            IncomingEdges = new List<Edge>();
            OutgoingEdges = new List<Edge>();
        }

        public IBuildStep BuildStep { get; private set; }
        public List<Edge> IncomingEdges { get; private set; }
        public List<Edge> OutgoingEdges { get; private set; }
    }

    private class Edge
    {
        public Edge(Node sourceNode, Node destinationNode)
        {
            SourceNode = sourceNode;
            DestinationNode = destinationNode;
        }

        public Node SourceNode { get; private set; }
        public Node DestinationNode { get; private set; }

        public void Remove()
        {
            SourceNode.OutgoingEdges.Remove(this);
            DestinationNode.IncomingEdges.Remove(this);
        }
    }

    public IEnumerable<IBuildStep> Sort(IEnumerable<IBuildStep> buildSteps)
    {
        List<Node> nodeGraph = buildSteps.Select(buildStep => new Node(buildStep)).ToList();

        foreach (var node in nodeGraph)
        {
            var depends = (DependsOnAttribute[])Attribute.GetCustomAttributes(node.BuildStep.GetType(), typeof(DependsOnAttribute));
            var dependNodes = nodeGraph.Where(n => depends.Any(d => d.DependedOnStep == n.BuildStep.GetType()));

            var edges = dependNodes.Select(n => new Edge(node, n)).ToArray();
            node.OutgoingEdges.AddRange(edges);

            foreach (var edge in edges)
                edge.DestinationNode.IncomingEdges.Add(edge);
        }

        var result = new Stack<Node>();
        var sourceNodes = new Stack<Node>(nodeGraph.Where(n => !n.IncomingEdges.Any()));
        while (sourceNodes.Count > 0)
        {
            var sourceNode = sourceNodes.Pop();
            result.Push(sourceNode);

            for (int i = sourceNode.OutgoingEdges.Count - 1; i >= 0; i--)
            {
                var edge = sourceNode.OutgoingEdges[i];
                edge.Remove();

                if (!edge.DestinationNode.IncomingEdges.Any())
                    sourceNodes.Push(edge.DestinationNode);
            }
        }

        if (nodeGraph.SelectMany(n => n.IncomingEdges).Any())
            throw new CircularDependencyException();

        return result.Select(n => n.BuildStep);
    }

}

Imagine how hard this code would’ve been to get right with no unit tests! When you have unit tests and NCrunch, an indicator simply goes green when the code works! If you haven’t seen or heard of NCrunch before, definitely check it out; it is a fantastic tool.

Now that we have the dependency sorter in place, all we need to do is add some more tests to the builder class. These tests ensure that the steps are sorted into dependency order before they are committed and into reverse dependency order when they are rolled back. With those tests in place it is quite trivial to update the builder to sort the steps for commit and rollback (see the snippet from the builder below):

private void Execute(IEnumerable<IBuildStep> buildSteps, Action<IBuildStep> action)
{
    foreach (var buildStep in buildSteps)
    {
        action(buildStep);
    }
}

public void Commit()
{
    Execute(_buildStepDependencySorter.Sort(BuildSteps),
                buildStep => buildStep.Commit());
}

public void Rollback()
{
    Execute(_buildStepDependencySorter.Sort(BuildSteps).Reverse(),
                buildStep => buildStep.Rollback());
}

I love how clean that code is. When your code is short and to the point like this it is so much easier to read, maintain and test. That is the importance of following SOLID principles.

As always I welcome your feedback so feel free to tweet or email me.