SqlJuxt – Speeding up the tests

SqlJuxt is coming along nicely. I have a great test pack in the project that tests through the top level API. I have spoken before about the value of this (gives you freedom to change your implementation etc) so I won’t harp on about it again. The problem is that the tests take 2 minutes 40 seconds to run, clearly too long. Especially when you consider that number is going to keep on growing as I add more tests.

The first step to debugging why the tests were taking so long was to turn on timings inside the Resharper test runner. This gives you a break down of how long each test takes. From that dialog I could see that all of the tests that created 2 databases for comparison were taking around 6 seconds. I then put timing around each line in one of these tests and found that the whole test ran in around 0.3 seconds and the test clean up took the other 5.7 seconds. This lead me to realise that it was the dropping of the database that was causing the slowdown.

To prove this theory I commented out the drop database code (which if you remember is in the dispose method in the DisposableDatabase type). I ran the whole test pack again and it ran in 12 seconds. Wowzer! Clearly this is not a solution as each test run will leave behind lots of databases.

I noticed that after the tests have finished if you drop a database then it happens pretty much instantaneously, but it takes ages when it happens as the last step of the test. This lead me to realise that something in my test was hanging on to the database connection causing it to have to rip the connection away. I examined my test code and I definitely close down the SqlConnection so this left me stumped for a little while. Then I got on to reading about connection pooling. Then it twigged that ADO .Net turns on connection pooling by default. What this means is that when you close a sql connection the connection doesn’t get completely destroyed. Instead it gets put into a sleeping state. This is to speed up the process of getting a connection next time you want one. In nearly all applications this is what you want as you are going to be continually connecting to your database to make connections and you do not want to keep on going through the overhead of the handshake to set up a connection. However in my scenario this is not what I want. I am making a database connection, running in a script to setup a schema, comparing that schema and then dropping it again and at that point I no longer need the database. The solution to my issue was to simply turn off connection pooling. This means that when you close the connection it is completely gone so when you drop the database it drops pretty much instantly as it doesn’t have to rip away any existing connections. To turn off connection pooling you simply add “pooling=false;” to the connection string. I did that and ran the tests and all of the tests now run in 12 seconds. Quite an improvement, its like I found the turbo button!!

I just want to reiterate that I am not advocating turning off connection pooling for everyone. It is defaulted to on for a reason (its probably what you want), its just in my specialised scenario it was causing a massive performance overhead.

It is quite a win having found that as now I can run my whole integration test pack in 12 seconds it makes it much easier to refactor.

Check out the full source code at SqlJuxt GitHub repository.

SqlJuxt – Active patterns for the win

F# Active Patterns are awesome!! I wanted to start this blog post with that statement. I was not truly aware of F# active patterns until I read the article on F Sharp for Fun and Profit when I was looking at a better way to use the .Net Int32.Parse function.

Active patterns are a function that you can define that can match on an expression. For example:

let (|Integer|_|) (s:string) = 
    match Int32.TryParse(s) with
        | (true, i) -> Some i
        | _ -> None 

To explain what each line is doing the | characters are called banana clips (not sure why) and here we are defining a partial active pattern. This means the pattern has to return Option of some type. So be definition the pattern may return a value or it may not so it is a partial match. This pattern takes a string and then returns Some int if the pattern matches or None if it does not. Once this pattern is defined it can be used in a normal match expression as follows:

let printNumber (str:string) =
	match str with
		| Integer i -> printfn "Integer: %d" i
		| _ -> "%s is not a number" %s

Using this technique it allows us to define a really cool active pattern for matching regular expressions and parsing out the groups that are matched:

let (|ParseRegex|_|) regex str =
    let m = Regex(regex).Match(str)
    match m with 
        | m when m.Success -> Some (List.tail [ for x in m.Groups -> x.Value] )
        | _ -> None

This regex active pattern returns all of the matched groups if the regex is a match or None if it is not a match. Note the reason List.tail is used is to skip the first element as that is the fully matched string which we don’t want, we only want the groups.

The reason why all of this came up is that the getNextAvailableName function that I wrote about in my last blog post is very long winded. For those who haven’t read my last post this function generates a unique name by taking a candidate name and a list of names that have been taken. If the name passed in has been taken then the function will add a number on to the end of the name and keep incrementing it until it finds a name that is not taken. The getNextAvailableName function was defined as:

let rec getNextAvailableName (name:string) (names: string list) =

    let getNumber (chr:char) =
        match Int32.TryParse(chr.ToString()) with
            | (true, i) -> Some i
            | _ -> None

    let grabLastChar (str:string) =
        str.[str.Length-1]

    let pruneLastChar (str:string) =
        str.Substring(0, str.Length - 1)

    let pruneNumber (str:string) i =
        str.Substring(0, str.Length - i.ToString().Length)

    let getNumberFromEndOfString (s:string)  =

        let rec getNumberFromEndOfStringInner (s1:string) (n: int option) =
            match s1 |> String.IsNullOrWhiteSpace with
                | true -> n
                | false -> match s1 |> grabLastChar |> getNumber with
                            | None -> n
                            | Some m ->  let newS = s1 |> pruneLastChar
                                         match n with 
                                            | Some n1 -> let newN = m.ToString() + n1.ToString() |> Convert.ToInt32 |> Some
                                                         getNumberFromEndOfStringInner newS newN
                                            | None -> getNumberFromEndOfStringInner newS (Some m) 
        let num = getNumberFromEndOfStringInner s None
        match num with
            | Some num' -> (s |> pruneNumber <| num', num)
            | None -> (s, num)
        

    let result = names |> List.tryFind(fun x -> x = name)
    match result with
        | Some r -> let (n, r) = getNumberFromEndOfString name
                    match r with 
                        | Some r' -> getNextAvailableName (n + (r'+1).ToString()) names
                        | None -> getNextAvailableName (n + "2") names
                    
        | None -> name

With the Integer and Regex active patterns defined as explained above the new version of the getNextAvailableName function is:

let rec getNextAvailableName (name:string) (names: string list) =

    let result = names |> List.tryFind(fun x -> x = name)
    match result with
        | Some r -> let (n, r) = match name with
                                    | ParseRegex "(.*)(\d+)$" [s; Integer i] -> (s, Some i)
                                    | _ -> (name, None)
                    match r with 
                        | Some r' -> getNextAvailableName (n + (r'+1).ToString()) names
                        | None -> getNextAvailableName (n + "2") names
                    
        | None -> name

I think it is pretty incredible how much simpler this version of the function is. It does exactly the same job (all of my tests still pass:) ). It really shows the power of active patterns and how they simplify your code. I think they also make the code more readable as even if you didn’t know the definition of the ParseRegex active pattern you could guess from the code.

Check out the full source code at SqlJuxt GitHub repository.

SqlJuxt – Making index names unique

As part of the fluent database builder for SqlJuxt I automatically generate names for objects that you create on the database such as primary keys and indexes. For primary keys this is an easy task as you can only have one primary key on a table so I can simply use the string “PK_

<

table>”. As a table name has to be unique I’m assured that the primary key name will be unique.

The problem comes when you want to create unique names for indexes. On Sql Server it is legal to have more than one non clustered index on a table on the same columns. So I am left with two options I either get the user to specify the index name which feels a bit clunky and unnecessary or I have to check for duplicate names and generate unique names. I chose the latter approach.

The naming convention for the indexes I have gone with is “IDX_

<

table>_” where column_names is a list of column names separated by underscores.

To keep the names unique I decided that if there was already an index with the name that is generated using the formula above then I would append a number on the end. If that name was taken then I would increment the number until I found a name that wasn’t taken.

To do the work of finding a unique name I thought it was best to abstract this out into its own function that could work with any name. The function signature I came up with was:

getNextAvailableName: string -> string list -> string

The function takes a string which is the name and a list of string which are the names that have been taken it then gives you back a new string which will be unique.

To write this function I needed a set of tests (test first remember) to prove out my test cases, these are:

[<Test>]
let ``should return name passed in when name is not in collection as collection is empty``() =
    getNextAvailableName "my_index" []
        |> should equal "my_index"

[<Test>]
let ``should return name passed in when name is not in collection``() =
    getNextAvailableName "my_index" ["some_index"; "some_other"]
        |> should equal "my_index"

[<Test>]
let ``should return my_index2 when my_index is in collection``() =
    getNextAvailableName "my_index" ["my_index"]
        |> should equal "my_index2"


[<Test>]
let ``should return my_index3 when my_index and my_index2 are in collection``() =
    getNextAvailableName "my_index" ["my_index"; "my_index2"]
        |> should equal "my_index3"

[<Test>]
let ``should return my_index3 when my_index and my_index2 and my_index33 are in collection``() =
    getNextAvailableName "my_index" ["my_index"; "my_index2"; "my_index33"]
        |> should equal "my_index3"

[<Test>]
let ``should return my_index2 when my_index and my_index22 are in collection``() =
    getNextAvailableName "my_index" ["my_index"; "my_index22"]
        |> should equal "my_index2"

[<Test>]
let ``should return my_index10 when my_index 2-9 are already in collection``() =
    getNextAvailableName "my_index" ["my_index"; "my_index2"; "my_index3"; "my_index4"; "my_index5"; "my_index6"; "my_index7"; "my_index8"; "my_index9"]
        |> should equal "my_index10"

I love how readable tests are in F# when you use FsUnit!! Now we have our tests defined we can go ahead an implement the function. I am sure that I didn’t do this in the most functional and efficient way. If anyone could help tidy this up then I would greatly appreciate it. My implementation is:

let rec getNextAvailableName (name:string) (names: string list) =

        let getNumber (chr:char) =
            match Int32.TryParse(chr.ToString()) with
                | (true, i) -> Some i
                | _ -> None

        let grabLastChar (str:string) =
            str.[str.Length-1]

        let pruneLastChar (str:string) =
            str.Substring(0, str.Length - 1)

        let pruneNumber (str:string) i =
            str.Substring(0, str.Length - i.ToString().Length)

        let getNumberFromEndOfString (s:string)  =

            let rec getNumberFromEndOfStringInner (s1:string) (n: int option) =
                match s1 |> String.IsNullOrWhiteSpace with
                    | true -> n
                    | false -> match s1 |> grabLastChar |> getNumber with
                                | None -> n
                                | Some m ->  let newS = s1 |> pruneLastChar
                                             match n with 
                                                | Some n1 -> let newN = m.ToString() + n1.ToString() |> Convert.ToInt32 |> Some
                                                             getNumberFromEndOfStringInner newS newN
                                                | None -> getNumberFromEndOfStringInner newS (Some m) 
            let num = getNumberFromEndOfStringInner s None
            match num with
                | Some num' -> (s |> pruneNumber <| num', num)
                | None -> (s, num)
            

        let result = names |> List.tryFind(fun x -> x = name)
        match result with
            | Some r -> let (n, r) = getNumberFromEndOfString name
                        match r with 
                            | Some r' -> getNextAvailableName (n + (r'+1).ToString()) names
                            | None -> getNextAvailableName (n + "2") names
                        
            | None -> name

I’m sure there are some tricks you can do with pattern matching to shorten this down. Now that we have this function and all of the tests pass it is trivial to plug it in to our database builder. All we have to do is generate the index name using the formula above and then call the getNextAvailableName function with the generated index name and a list of all of the index names on the table. The function will then give us back a unique name to use. This gets proved out by the following test:

[<Test>]
let ``should name indexes sequentially when there are multiple indexes defined that would generate the same name``() =
    CreateTable "MyIndexedTable"
        |> WithInt "MyKeyColumn"
        |> WithInt "SecondKeyColumn"
        |> WithNonClusteredIndex UNIQUE [("MyKeyColumn", ASC); ("SecondKeyColumn", DESC)]
        |> WithNonClusteredIndex UNIQUE [("MyKeyColumn", ASC); ("SecondKeyColumn", DESC)]
        |> WithNonClusteredIndex NONUNIQUE [("MyKeyColumn", ASC) ; ("SecondKeyColumn", DESC)]
        |> ScriptTable
        |> should equal @"CREATE TABLE [dbo].[MyIndexedTable]( [MyKeyColumn] [int] NOT NULL, [SecondKeyColumn] [int] NOT NULL )
GO

CREATE UNIQUE NONCLUSTERED INDEX IDX_MyIndexedTable_MyKeyColumn_SecondKeyColumn ON [dbo].[MyIndexedTable] ([MyKeyColumn] ASC, [SecondKeyColumn] DESC)
GO


CREATE UNIQUE NONCLUSTERED INDEX IDX_MyIndexedTable_MyKeyColumn_SecondKeyColumn2 ON [dbo].[MyIndexedTable] ([MyKeyColumn] ASC, [SecondKeyColumn] DESC)
GO


CREATE NONCLUSTERED INDEX IDX_MyIndexedTable_MyKeyColumn_SecondKeyColumn3 ON [dbo].[MyIndexedTable] ([MyKeyColumn] ASC, [SecondKeyColumn] DESC)
GO"

We can see how the generated index names are the same so they have been numbered.

Check out the full source code at SqlJuxt GitHub repository.

SqlJuxt – Defining indexes on a table

I have just finished implementing the first implementation of table indexes. Both scripting out an index on the table using the fluent builder syntax and comparing indexes on tables. When writing this feature I had some interesting design decisions to make…

My first design for the type to represent index is shown below:

type Constraint = {name: string; columns: (Column * SortDirection)  list; clustering: Clustering}
type Index = {name: string; columns: (Column * SortDirection) list; clustering: Clustering; uniqueness: Uniqueness}

type Table = {schema: string; name: string; columns: Column list; primaryKey: Constraint option; indexes: Index list}

As you can see the primary key on the table is defined as Constraint of Option and the indexes are defined as Index list. When I started writing the code to use these types I noticed a lot of duplication. Then I realised that an index and a primary key are both really constraints just with slightly different properties. Those being that a primary key is always unique that’s what makes it a key!

I decided to extend the Constraint type by adding the uniqueness property to it. Then it was a simple job of extending the primary key methods to always set the uniqueness to unique. Now the type for a table looks like:

type Table = {schema: string; name: string; columns: Column list; primaryKey: Constraint option; indexes: Constraint list}

So a table has a list of indexes which could of course be empty and it may or may not have a primary key which we can represent by using a Constraint of option. The other advantage of modelling both primary keys and indexes using the constraint type is that we can select them out of the database when loading up to build the schema for comparison all at the same time. We simply have to extend the query to bring back the additional information of whether the constraint is unique and if it is a primary key or not so we know whether to put it on the primary key property of the table.

I did toy with the idea of having all constraints in a list on the table called constraints. That list would’ve the primary key if there was one and all of the indexes for the table. I decided against that approach as it feels a bit clunky to have to go through a list to find the primary key of a table. Also the reason I didn’t like that approach is that where possible you should use the type system and thus the compiler to enforce correctness in your program and make it impossible to model illegal state. If I had a list of constraints on a table I would have to manually check to make sure there was only one primary key in the list. Whereas if I have the primary key defined on the table as option of Constraint then there can only ever be one primary key.

If you want to check out the full source code and delve deeper feel free to check out the SqlJuxt GitHub repository.

XBehave – compiling the test model using the Razor Engine

In the last post I left off describing how I implemented the parsing of the XUnit console runner xml in the XUnit.Reporter console app. In this post I want to talk through how you can take advantage of the excellent RazorEngine to render the model to html.

I am going to talk through the console app line by line. The first line of the app:

var reporterArgs = Args.Parse<ReporterArgs>(args);

Here we are using the excellent PowerArgs library to parse the command line arguments out into a model. I love how the API for PowerArgs has been designed. It has a slew of features which I won’t go into here for example it supports prompting for missing arguments all out of the box.

Engine.Razor.Compile(AssemblyResource.InThisAssembly("TestView.cshtml").GetText(), "testTemplate", typeof(TestPageModel));

This line uses the RazorEngine to compile the view containing my html, it gives the view the key name “testTemplate” in the RazorEngine’s internal cache. What is neat about this is that we can deploy the TestView.cshtml as an embedded resource so it becomes part of our assembly. We can then use the AssemblyResource class to grab the text from the embedded resource to pass to the razor engine.

var model = TestPageModelBuilder.Create()
                                .WithPageTitle(reporterArgs.PageTitle)
                                .WithTestXmlFromPath(reporterArgs.Xml)
                                .Build();

We then create the TestPageModel using the TestPageModelBuilder. Using a builder here gives you very readable code. Inside the builder we are using the XmlParser from earlier to parse the xml and generate the List of TestAssemblyModels. We also take the optional PageTitle argument here to place in the page title in our view template.

var output = new StringWriter();
Engine.Razor.Run("testTemplate", output, typeof(TestPageModel), model);
File.WriteAllText(reporterArgs.Html, output.ToString());

The last 3 lines simply create a StringWriter which is used by the engine to write the compiled template to. Calling Engine.Razor.Run runs the template we compiled earlier using the key we set “testTemplate”. After this line fires our html will have been written to the StringWriter so all we have to do is extract it and then write it out to the html file path that was parsed in.

That’s all there is to it. We now have a neat way to extract the Given, When, Then gherkin syntax from our XBehave texts and export it to html in whatever shape we chose. From there you could post to an internal wiki or email the file to someone, that could all be done automatically as part of a CI build.

If anyone has any feedback on any of the code then that is always welcomed. Please check out the XUnit.Reporter repository for all of the source code.

XBehave – Exporting your Given, Then, When to html

For a project in my day job we have been using the excellent XBehave for our integration tests. I love XBehave in that it lets you use a Given, When, Then syntax over the top of XUnit. There are a couple of issues with XBehave that I have yet to find a neat solution for (unless I am missing something please tell me if this is the case). The issues are

1) There is not a nice way to extract the Given, When, Then gherkin syntax out of the C# assembly for reporting
2) The runner treats each step in the scenario as a separate test

To solve these to problems I am writing an open source parser that takes the xml produced by the XUnit console runner and parses it to a C# model. I can then use razor to render these models as html and spit out the resultant html.

This will mean that I can take the html and post it up to a wiki so every time a new build runs the tests it would be able to update the wiki with the latest set of tests that are in the code and even say whether the tests pass or fail. Allowing a business/product owner to review the tests, see which pieces of functionality are covered and which features have been completed.

To this end I have created the XUnit.Reporter github repository. This article will cover the parsing of the xml into a C# model.

A neat class that I am using inside the XUnit.Reporter is the AssemblyResource class. This class allows easy access to embedded assembly resources. Which means that I can run the XUnit console runner for a test, take the resultant output and add it to the test assembly as an embedded resource. I can then use the AssemblyResource class to load back the text from the xml file by using the following line of code:

AssemblyResource.InAssembly(typeof(ParserScenarios).Assembly, "singlepassingscenario.xml").GetText())

To produce the test xml files for the tests I simply set up a console app, added XBehave and then created a test in the state I wanted for example a single scenario that passes. I then ran the XUnit console runner with the -xml flag set to produce the xml output. I then copied the xml output to a test file and named it accordingly.

The statistics for the assembly model and test collection model are not aligned to what I think you would want from an XBehave test. For example if you have this single XBehave test:


public class MyTest
{
[Scenario]
public void MyScenario()
{
"Given something"
._(() => { });

"When something"
._(() => { });

"Then something should be true"
._(() => { });

"And then another thing"
._(() => { });
}
}

Then the resultant xml produced by the console runner is:

<?xml version="1.0" encoding="utf-8"?>
<assemblies>
<assembly name="C:\projects\CommandScratchpad\CommandScratchpad\bin\Debug\CommandScratchpad.EXE" environment="64-bit .NET 4.0.30319.42000 [collection-per-class, parallel (2 threads)]" test-framework="xUnit.net 2.1.0.3179" run-date="2017-01-27" run-time="17:16:00" config-file="C:\projects\CommandScratchpad\CommandScratchpad\bin\Debug\CommandScratchpad.exe.config" total="4" passed="4" failed="0" skipped="0" time="0.161" errors="0">
<errors />
<collection total="4" passed="4" failed="0" skipped="0" name="Test collection for RandomNamespace.MyTest" time="0.010">
<test name="RandomNamespace.MyTest.MyScenario() [01] Given something" type="RandomNamespace.MyTest" method="MyScenario" time="0.0023842" result="Pass" />
<test name="RandomNamespace.MyTest.MyScenario() [02] When something" type="RandomNamespace.MyTest" method="MyScenario" time="0.0000648" result="Pass" />
<test name="RandomNamespace.MyTest.MyScenario() [03] Then something should be true" type="RandomNamespace.MyTest" method="MyScenario" time="0.0000365" result="Pass" />
<test name="RandomNamespace.MyTest.MyScenario() [04] And then another thing" type="RandomNamespace.MyTest" method="MyScenario" time="0.000032" result="Pass" />
</collection>
</assembly>
</assemblies>

If you look carefully at the xml you will notice a number of things which are counter-intuative. Firstly look at the total in the assembly element it says 4, when we only had a single test. This is because the runner is considering each step to be a separate test. The same goes for the other totals and the totals in the collection element. The next thing that you will notice is that the step names in the original test have had a load of junk added to the front of them.

In the parser I produce a model with the results that I would expect. So for the above xml it I produce an assembly model with a total of 1 test, 1 passed, 0 failed, 0 skipped and 0 errors. Which I think makes much more sense for XBehave tests.

Feel free to clone the repository and look through the parsing code (warning it is a big ugly). Next time I will be talking through the remainder of this app which is rendering the test results to html using razor.

SqlJuxt – Using partial function application to improve the API

After I got all of the tests passing for creating and comparing primary keys on a table I looked back at the code and decided that it had a bit of a smell to it.  Consider the following code snippet:

let private withPrimaryKey columns table isClustered =
    let cs = getColumnsByNames columns table
    {table with primaryKey = Some {name = sprintf "PK_%s" table.name; columns = cs; isClustered = isClustered}}
            
let WithClusteredPrimaryKey columns table =
    withPrimaryKey columns table true

let WithNonClusteredPrimaryKey columns table =
    withPrimaryKey columns table false

The isClustered parameter is of type bool. This in itself does not feel quite right. You can see what I mean when you look at the implementation of WithClusteredPrimaryKey or WithNonClusteredPrimaryKey. The line reads “withPrimaryKey columns table false”. It is obvious what all of those parameters are except the bool at the end. What does false mean?

Clearly this would be much better if it was self describing which we can easily do in F# using a discriminate union. By defining one like so:

type Clustering = CLUSTERED | NONCLUSTERED

The other part that was causing the code to smell but was perhaps less obvious was the order of the parameters. As described by the excellent post on F# for fun and profit on partial function application the order of your parameters is very important. In the builder functions shown above table is always placed last this means you do not need to mention it when using the script builder API, for example:

let rightTable = CreateTable "DifferentKeyTable"
                        |> WithInt "Column1"
                        |> WithInt "Column2" 
                        |> WithInt "Column3" 
                        |> WithClusteredPrimaryKey [("Column1", ASC); ("Column2", ASC)]
                        |> Build 

Notice how the table parameter does not need to be explicitly passed around. You would have to pass this if it was not the last parameter.

So we know that the order is important if we look at the with primary key functions we can see that the isClustered parameter being last stops us from using partial application so when we define the two methods to create a non clustered and clustered primary key we have to explicitly pass all of the parameters.

Here is the redesigned code of the WithPrimaryKey methods on the TableBuilder API:

let WithPrimaryKey clustering columns table =
    let cs = getColumnsByNames columns table
    {table with primaryKey = Some {name = sprintf "PK_%s" table.name; columns = cs; Clustering = clustering}}
              
let WithClusteredPrimaryKey = WithPrimaryKey CLUSTERED
let WithNonClusteredPrimaryKey = WithPrimaryKey NONCLUSTERED

Notice how much cleaner these three methods are now. By moving the clustering parameter to the start and using a discriminate union instead of a bool we have achieved two things. Firstly, the code is now self documenting as we are not passing true or false but instead passing CLUSTERED or NONCLUSTERED. Secondly because the clustering parameter is now first we can use the magic of partial function application to define WithClusteredPrimaryKey and WithNonClusteredPrimaryKey. Those two methods simply bake in whether the key is clustered or not and then leave you to fill in the rest of the parameters.

I really love how F# allows you to write beautiful code like this. I’m still learning to write functional code and am really enjoying the experience. Any feedback comments are welcome so please keep them coming.

If you want to check out the full source code and delve deeper feel free to check out the SqlJuxt GitHub repository.