Curing Trust Issues with Benchmarking

“You can trust me. I’m sure it works this way. This approach looks more like it.” Never trust these phrases. Keep in mind that programming is based on science. It has become cool and trendy in recent years, but hard facts still play the main role. Thank God for that.
In this article I want to talk about a way to find out which solution is better: benchmarking. With this technique we can compare a number of approaches and see how they behave performance-wise.

The “test” case

In this scenario I want to test two methods that are a source of discussion; some people strongly prefer one over the other. I’m talking about the LINQ methods FirstOrDefault and SingleOrDefault. Each of them returns one element from the collection, or a default value if the element can’t be found, but they do it in different ways. The SingleOrDefault description says: “Returns the only element of a sequence that satisfies a specified condition or a default value if no such element exists; this method throws an exception if more than one element satisfies the condition” and the FirstOrDefault one reads: “Returns the first element of the sequence that satisfies a condition or a default value if no such element is found”. The main difference is that SingleOrDefault makes sure there is only one matching element, while FirstOrDefault simply returns the first result matching the predicate.
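To make the difference in behavior concrete, here is a minimal sketch that is not part of the article's project. The class name FirstVsSingleDemo and the sample lists are made up for illustration; only FirstOrDefault, SingleOrDefault and InvalidOperationException come from .NET itself.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FirstVsSingleDemo
{
    // Describes what both methods do for the same list and the same wanted value.
    public static string Describe(List<int> ids, int wanted)
    {
        // FirstOrDefault returns the first match and ignores any later ones.
        int first = ids.FirstOrDefault(x => x == wanted);
        try
        {
            // SingleOrDefault throws when more than one element matches.
            int single = ids.SingleOrDefault(x => x == wanted);
            return $"First={first}, Single={single}";
        }
        catch (InvalidOperationException)
        {
            return $"First={first}, Single threw";
        }
    }

    public static void Main()
    {
        Console.WriteLine(Describe(new List<int> { 1, 2, 3 }, 2));    // First=2, Single=2
        Console.WriteLine(Describe(new List<int> { 1, 2, 2, 3 }, 2)); // First=2, Single threw
        Console.WriteLine(Describe(new List<int> { 1, 3 }, 2));       // First=0, Single=0 (default of int)
    }
}
```

With a duplicate in the list, FirstOrDefault quietly returns the first hit, while SingleOrDefault reports the problem by throwing.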

I’m not going to tell you that you should use one over the other, because there are many factors in play when making such a decision. At least there should be. What I’m going to do is show you how to measure the performance of these two, because performance should always be a factor.

The application

When it comes to the application, I want to keep it simple. I have created a .NET 6 C# console application. In this application we will be querying for people. The Person.cs class looks like this:

public class Person
{
    public int Id { get; set; }
    public string Name { get; set; } = String.Empty;
    public bool? Male { get; set; }
    public int YearOfBirth { get; set; }
}

It is a straightforward class with four properties. Nothing fancy.
The more interesting code is located in the PersonLogic.cs file. This file contains the logic that is going to be used during the benchmarking.

using Bogus;

public class PersonLogic
{
    private readonly List<Person> _people;
    public PersonLogic()
    {
        var personBogus = new Faker<Person>()
            .RuleFor(x => x.Id, f => f.IndexFaker)
            .RuleFor(x => x.Name, f => f.Person.FullName)
            .RuleFor(x => x.Male, f => f.Random.Bool(0.5f))
            .RuleFor(x => x.YearOfBirth, f => f.Random.Int(1950, 2020));
        _people = personBogus.Generate(25000);
    }

    public Person? SingleOrDefaultById(int id) => _people.SingleOrDefault(x => x.Id == id);
    public Person? FirstOrDefaultById(int id) => _people.FirstOrDefault(x => x.Id == id);
}

As we can see, the data is being mocked in the constructor. This is because the database part is outside the scope of this article. To create the mock data, I have used an excellent solution called Bogus. If you haven’t heard about it, check it out at https://github.com/bchavez/Bogus. It is my go-to solution in such situations.
Below the constructor there are two methods, SingleOrDefaultById and FirstOrDefaultById. Both look for one object based on a unique Id, but one uses Single and the other uses First. Other than that, the methods are the same. Now let’s see how to benchmark them and what the results could be.

The benchmarking

First, we need to know what benchmarking is. If we look at the Wikipedia definition, we will learn that: “In computing, a benchmark is the act of running a computer program, a set of programs, or other operations, to assess the relative performance of an object, normally by running a number of standard tests and trials against it.
The term benchmark is also commonly utilized for the purposes of elaborately designed benchmarking programs themselves.
Benchmarking is usually associated with assessing performance characteristics of computer hardware, for example, the floating-point operation performance of a CPU, but there are circumstances when the technique is also applicable to software. Software benchmarks are, for example, run against compilers or database management systems (DBMS).
Benchmarks provide a method of comparing the performance of various subsystems across different chip/system architectures.”
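The idea of “running a number of standard tests and trials” can be sketched by hand with a stopwatch. The snippet below is only an illustration under my own assumptions: the NaiveBenchmark class, its AverageNanoseconds helper and the measured workload are hypothetical and are not part of the article's project. A naive loop like this skips the statistical analysis and process isolation that a dedicated benchmarking library provides, so treat it as a picture of the concept and not as a measuring tool.

```csharp
using System;
using System.Diagnostics;
using System.Linq;

public static class NaiveBenchmark
{
    // Runs the action `iterations` times and returns the average time per call in nanoseconds.
    public static double AverageNanoseconds(Action action, int iterations)
    {
        // One warm-up call so that JIT compilation is not counted in the measurement.
        action();

        var stopwatch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            action();
        }
        stopwatch.Stop();

        // Milliseconds to nanoseconds, then averaged over all iterations.
        return stopwatch.Elapsed.TotalMilliseconds * 1_000_000.0 / iterations;
    }

    public static void Main()
    {
        int[] numbers = Enumerable.Range(0, 25000).ToArray();
        int sink = 0; // accumulates results so the measured work has an observable effect

        double ns = AverageNanoseconds(() => sink += numbers.Count(n => n % 2 == 0), 1000);
        Console.WriteLine($"Average: {ns:F0} ns per call (sink={sink})");
    }
}
```

The printed number will differ from machine to machine and from run to run, which is exactly why repeated trials matter.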

The main takeaway in this case is that benchmarking helps us measure performance by running a number of repetitive tasks. This is exactly what we are going to do. In the project we are going to use a NuGet package called BenchmarkDotNet. If you want to read more about it, please do so at https://github.com/dotnet/BenchmarkDotNet. The benchmarking code will be placed in the PersonBenchmark.cs file. The code inside the file looks like this:

using BenchmarkDotNet.Attributes;

[MemoryDiagnoser(false)]
public class PersonBenchmark
{
    private readonly PersonLogic _personLogic = new();

    [Params(0, 5000, 10000, 15000, 20000, 24999)]
    public int Id { get; set; }

    [Benchmark]
    public Person First()
    {
        return _personLogic.FirstOrDefaultById(Id)!;
    }
    
    [Benchmark]
    public Person Single()
    {
        return _personLogic.SingleOrDefaultById(Id)!;
    }
}

Its construction is based around attributes. The Params attribute tells us which values the Id property will take during the benchmarking process. As you can see, we pick one Id roughly every five thousand person objects, to keep them evenly distributed across the collection.
Below the property there are two methods marked with the Benchmark attribute. That means they will be run as part of the benchmark.
Now all we need to do is write the code to run the benchmark process. We will place it inside the Program.cs file.

using BenchmarkDotNet.Running;

BenchmarkRunner.Run<PersonBenchmark>();

The code is super simple. All we are doing is telling the runner to run a class with Benchmark methods, in this case PersonBenchmark. To run the benchmark, the program must be built in its Release configuration (for example with dotnet run -c Release). In my case the benchmarking process took around 3 minutes. At the end of it you will be presented with a table of results.

The results are in, and they are clear. In most cases First is much faster. If the object is at the start of the collection, we see an enormous time difference, because FirstOrDefault stops at the first match while SingleOrDefault has to scan the whole collection to make sure there is no second one. The values are similar near the end. Does that mean the First method is better? Performance-wise you can say so. But Single has built-in validation against unwanted duplicates. So, the right answer is… “It depends”.
What I wanted to do was show you a method of checking performance, not place one method over the other. I believe I succeeded in doing that.
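One way to see where the difference comes from, without any timing at all, is to count how many elements each method inspects. The sketch below is my own illustration and not part of the article's project: the PredicateCallCounter class is hypothetical, and a plain list of integers from 0 to 24999 stands in for the generated Person objects.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PredicateCallCounter
{
    // Returns how many times the predicate ran for FirstOrDefault and for SingleOrDefault.
    public static (int First, int Single) Count(List<int> ids, int wanted)
    {
        int firstCalls = 0;
        ids.FirstOrDefault(x => { firstCalls++; return x == wanted; });

        int singleCalls = 0;
        ids.SingleOrDefault(x => { singleCalls++; return x == wanted; });

        return (firstCalls, singleCalls);
    }

    public static void Main()
    {
        // Hypothetical stand-in for the article's data: unique ids 0..24999.
        List<int> ids = Enumerable.Range(0, 25000).ToList();

        foreach (int wanted in new[] { 0, 10000, 24999 })
        {
            var (first, single) = Count(ids, wanted);
            Console.WriteLine($"Id {wanted}: First checked {first}, Single checked {single}");
        }
    }
}
```

For Id 0, First checks a single element while Single checks all 25000. For Id 24999 both check all 25000, which matches the observation that the two methods perform similarly near the end of the collection.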

Summary

We have just defined what benchmarking is and how to use it to compare two or more approaches based on actual performance metrics. Plus, we took a short trip to mock city. Both mocking and benchmarking are very useful skills. I hope you found this article interesting.
If you have any questions, please drop me a line at karol.rogowski@softwarehut.com.
Till next time. Keep coding.
