Using HierarchyID in Entity Framework

Back in September, was the .NET Conf, and as .NET developers, we received a lot of new tools to play with.

One of the things of which we officially received a new version was Entity Framework (6.3 version). Along with it, support for new data types appeared.

This article is about using the HierarchyID data type in entity framework, along with an example console application that uses Entity Framework 6.3. The blog entry at the outset briefly describes the HierarchyID and how to use it. Later, I will go to the essence of the article and describe how to use the HierarchyID data type from the entity framework.

Parent/Child Approach

Before going into details of using HierarchyID, let’s do a little recap of the most popular way of handling hierarchal data. This most commonly used method is the Parent/Child approach. I think every developer is familiar with it.

In this approach, the table refers to itself – in that every row has a reference to its parent. I created a sample database diagram with a hierarchy of some product categories to discuss disadvantages of the Parent/Child approach.

The image below shows a simple database diagram.

This approach presents some advantages:

It’s simple and obvious to handle
It’s very easy to add a child processes and change parents
Simple queries are easy to handle

However, some more complex queries get slightly more complicated. For example, receiving all products for a given category, or children only on the first level is easy. Real-world applications are more complicated than that, however.

Let’s imagine we need to get for example, a list of all products in all child categories of a specific parent category. Getting more complicated data is not rocket science, but it gets more and more complicated, and becomes more resource-demanding.

Using the table referencing itself to create a tree is still a good method for storing hierarchal data. But it is worth knowing alternatives when building applications.

What is HierarchyID, Exactly?

HierarchyID is an SQL SERVER data type for storing positions in the hierarchy. It does only that, nothing more, nothing less. It allows you to create hierarchical structures like trees in a database, but nothing is created automatically, and it`s up to the programmer to create and keep data consistent.

Below, you can see an example visual representation of the tree build with HierarchyID.

Best way to describe it, is by using examples. In this case, we’ll use the structure categories in an online shop. Every category can have an unlimited number of subcategories that can proceed over a few levels. We want to be able to travel with relative ease through this tree. For this scenario, I have created an example table containing an HierarchyID column:

1
2
3
4
CREATE TABLE [dbo].[Category]( [Id] [int]
      IDENTITY(1,1) NOT NULL, [Name] [nvarchar](256) NOT NULL, [Level] [hierarchyid] NULL, CONSTRAINT
      [PK_ProductCategoryHId] PRIMARY KEY CLUSTERED ( [Id] ASC )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF,
      IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY] ) ON [PRIMARY]

As the HierarchyID itself will not ensure uniqueness of the data, the application instead must ensure that. Following the code will be executed successfully on the SQL server.

1
2
3
INSERT dbo.[Category] ([Name], [Level])
      VALUES (N'First Categoiry', N'/1/') INSERT dbo.[Category] ([Name], [Level]) VALUES (N'Second Category',
      N'/1/')

It is therefore possible to create a tree appearing as in the image below.

Sometimes, having duplicates of a position in the tree is a desired practice. But in most cases, trees like this make no sense. By default, HierarchyID does not check if data is unique.

To prevent this situation, in addition to watching unique data from the application level, we can set unique constraints on to the column. In this case, the decision must always be thought out. Sometimes, the ability to duplicate a hierarchy can be useful. Below is the code for adding a unique constraint to the table:

1
2
ALTER TABLE [dbo].[Category] ADD CONSTRAINT
      [LevelUnique] UNIQUE NONCLUSTERED ( [Level] ASC )

As previously mentioned, HierarchyID is a type representing positions in the hierarchy. Using it does not guarantee that the data in the table will take the structure of a tree. There is a possibility to enforce a tree structure in two simple steps:

Create a computed column with item ancestor
Create foreign key on computed column, referencing parent row

1
2
ALTER TABLE
      [Category] ADD ParentLevel as [Level].GetAncestor(1) PERSISTED REFERENCES [Category]([Level])

Of course, to be able to add nodes, the root node needs to exist in the table:

1
2
INSERT dbo.[Category] ([Name], [Level])
      VALUES (N'root', N'/')

HierarchyID methods

Before going to entity framework, let`s talk about hierarchyid from t-sql perspective. For that purpose I created brief discussion of few methods, with examples of their usage. For those examples I used this test data. It`s same tree as show in first image of this blog post.

1
2
3
4
5
6
7
8
9
INSERT dbo.[Category] ([Name], [Level])
      VALUES (N'First Main Categoiry', N'/1/') INSERT dbo.[Category] ([Name], [Level]) VALUES (N'Second Main Categoiry',
      N'/2/') INSERT dbo.[Category] ([Name], [Level]) VALUES (N'Third Main Categoiry', N'/3/') INSERT dbo.[Category]
      ([Name], [Level]) VALUES (N'First Subcategory', N'/1/1/') INSERT dbo.[Category] ([Name], [Level]) VALUES (N'Second
      Subcategory', N'/1/2/') INSERT dbo.[Category] ([Name], [Level]) VALUES (N'Third Subcategory', N'/1/3/') INSERT
      dbo.[Category] ([Name], [Level]) VALUES (N'First Subcategory', N'/3/1/') INSERT dbo.[Category] ([Name], [Level])
      VALUES (N'Subcategory', N'/3/1/1/') INSERT dbo.[Category] ([Name], [Level]) VALUES (N'Subcategory', N'/3/1/2/')
      INSERT dbo.[Category] ([Name], [Level]) VALUES (N'Subcategory', N'/3/1/3/') INSERT dbo.[Category] ([Name],
      [Level]) VALUES (N'Second Subcategory', N'/3/2/')

ToString

ToString() returns a string representation of HierarchyID. It provides data in a human-readable form. I used it in almost every example. Below is a simple example of usage:

1
2
SELECT [Name] ,[Level].ToString() FROM
      [dbo].[Category]

And as a result, we can see human-readable data:

Parse

Parse() converts string to the HierarchyID data type. Below is a simple example of its usage:

1
2
SELECT hierarchyid::Parse(
      '/3/1/3/').ToString()

And the result is obvious:

The Parse() method is useful when writing queries, so we can specify or pass as a variable desired ID.

1
2
select [Name] ,[Level].ToString() as
      [Level] from [TestHId2].[dbo].[Category] where [Level] = hierarchyid::Parse( '/3/1/3/')

And as a result, we get:

GetLevel

Returns an integer that represents how the depth node is in the tree. This is a simple query, getting all levels in the table.

1
2
select [Name] ,[Level].ToString() as
      [Level] ,[Level].GetLevel() as [GetLevel] from [dbo].[Category]

The result explains it best:

We can also use it to where the clause appears, for example taking only items in which the depth equals to ‘2’.

1
2
select [Name] ,[Level].ToString() as
      [Level] from [dbo].[Category] where [Level].GetLevel() = 2

GetAncestor

Returns the item nth parent. The sample query best describing it looks like this:

1
2
3
select [Name] ,[Level].ToString() as
      [Level] ,[Level].GetAncestor(1).ToString() as [LevelGetAncestor1] ,[Level].GetAncestor(2).ToString() as
      [LevelGetAncestor2] from [dbo].[Category]

The result explains it best:

We can also use it where the clause, for example, only takes items from first-level parents:

1
2
select [Name] ,[Level].ToString() as
      [Level] from [dbo].[Category] where [Level].GetAncestor(1) = hierarchyid::Parse( '/3/1/')

As result:

GetDescendant

This segment does what it describes; by returning the child node.

The desired method is taking two parameters; child1, and child2. When those parameters are null, it returns the first child of a parent. If those parameters are not nullable, the return value is greater than child1, and lower than child2.

This is very useful when adding new nodes to parents, because it can produce an ID of a next-added node. To get new a child ID added to a parent, we just need the last-added child. We’ll now pass-off the last known child as child1, and null as child2.

In this case, the method will return next child id.

The first example is showing a simple use of the GetDescend method with null parameters:

1
2
DECLARE @Parent hierarchyid SET @Parent =
      CAST('/999/' AS hierarchyid); select @Parent.GetDescendant(null , null).ToString()

The produced result looks like this:

In my second example, I passed two child parameters to the GetDescendant method.

1
2
3
4
DECLARE @Parent hierarchyid, @Child1
      hierarchyid, @Child2 hierarchyid; SET @Parent = CAST('/999/' AS hierarchyid); SET @Child1 = CAST('/999/1/' AS
      hierarchyid); SET @Child2 = CAST('/999/2/' AS hierarchyid); select @Parent.GetDescendant(@Child1,
      @Child2).ToString()

As expected, the result is HierarchyID between “/999/1/” and “/999/2/”:

The third example shows how to use GetDescendant to get a proper hierarchy ID when adding a new child to a parent. For the parent, I used the root of the tree ‘/’.

1
2
3
declare @root hierarchyid, @previusElement
      hierarchyid SET @root = hierarchyid::GetRoot ( ) SET @previusElement = CAST('/1/' AS hierarchyid); select
      @root.GetDescendant(@previusElement , null).ToString()

IsDescendantOf

This is the method of checking if the current node is a child of a given element. It’s most useful in WHERE Clauses. Its use is described in the example below:

1
2
3
declare @parent hierarchyid SET @parent =
      CAST('/3/1/' AS hierarchyid); select [Name] ,[Level].ToString() as [Level] from [dbo].[Category] where
      [Level].IsDescendantOf (@parent) = 1

Below is a table, showing on which Microsoft databases HierarchyID is available. The SQL server supports it from the 2008 version, however the good news is, that we have support on Azure with the “Azure SQL” database.

How to use HierarchyID with Entity Framework

To start, we need to add references to the Entity Framework in version 6.3. The best way of course is by getting in as a NuGet package.

1
2
Install-Package EntityFramework -Version
      6.3.0

If while using HierarchyID you will have a problem with the SQL server types, it is a good idea to install it from NuGet:

1
2
Install-Package Microsoft.SqlServer.Types
      -Version 14.0.1016.290

And after installing them, add assembly binding in application settings. In most cases it should resolve the problem.

1
2
3
4
5
<runtime> <assemblyBinding
      xmlns="urn:schemas-microsoft-com:asm.v1"> <dependentAssembly> <assemblyIdentity
      name="Microsoft.SqlServer.Types" publicKeyToken="89845dcd8080cc91" culture="neutral"/> <bindingRedirect
      oldVersion="0.0.0.0-14.0.0.0" newVersion="14.0.0.0"/> </dependentAssembly> </assemblyBinding>
      </runtime>

For EF examples, I created a sample database. Based on the scheme below, what is interesting now is the Category table in the scheme’s ‘Hierarchy’. It just has an int-based primary key, name, and column with HierarchyID.

Create the Data Model

At the moment of writing this article, there is no way of automatically creating code first from the database, or the database from the model. To use HierarchyID, mapping must be added manually. There is a special CLR type “HierarchyID” for handling HierarchyID columns. Below is an example model:

1
2
3
[Table("Category", Schema = "Hierarchy")]
      public partial class HierarchyCategory { public int Id { get; set; } [Required] [StringLength(256)] public string
      Name { get; set; } public HierarchyId Level { get; set; } }

Using LINQ for creating queries is supported. All HeirarchyID methods are available via LINQ. Below is an example of a query asking for items in the first level:

1
2
var topLevelQuery =
      Context.HierarchyCategory.Where(x => x.Level.GetLevel() == 1);

SQL created by the entity framework based on this query, looks like this:

1
2
3
SELECT [Extent1].[Id] AS [Id],
      [Extent1].[Name] AS [Name], [Extent1].[Level] AS [Level] FROM [Hierarchy].[Category] AS [Extent1] WHERE 1 =
      ([Extent1].[Level].GetLevel())

This is of course not the only method of getting first-level items by LINQ. We can simply ask for every first ancestor of an item and compare it to its root folder.

The second example below does basically the same thing as the first example.

1
2
var root = HierarchyId.GetRoot(); var
      topLevelQuery = Context.HierarchyCategory .Where(x => x.Level.GetAncestor(1) == root);

SQL created by the entity framework based on this query, looks like this:

1
2
3
4
SELECT [Extent1].[Id] AS [Id],
      [Extent1].[Name] AS [Name], [Extent1].[Level] AS [Level] FROM [Hierarchy].[Category] AS [Extent1] WHERE
      (([Extent1].[Level].GetAncestor(1)) = @p__linq__0) OR (([Extent1].[Level].GetAncestor(1) IS NULL) AND (@p__linq__0
      IS NULL))

Adding unique items is more complicated. HierarchyID does not have any kind of auto incrimination feature. It is not something like an auto-incremented primary key. We cannot accept any auto-generated values from the database, nor does it support handling duplicates. So, making sure that new values are correct, concurrency scenarios, duplicate checks and checks on others are on the application side.

The first example can be viewed in the sample application. In this code, a new node is added by:

Getting node parent
Getting latest child node of parent
Executing GetDescendent on parent node, passing last child as first argument

A more detailed explanation of GetDescendent method is explained in this article, in the section about HierarchyID in the SQL server.

The downside of this method is that it does not support any concurrency scenarios. They can be handled in one of the following methods:

Using serialisable transaction
Retry policy on unique key violation

1
2
3
4
5
6
7
8
public void Add(ProductCategoryDto add,
      int? parentId) { var entity = Context.HierarchyCategory.Create(); entity.Name = add.Name; HierarchyId parentItem;
      HierarchyCategory lastItemInCurrentLevel; if (!parentId.HasValue) { parentItem = HierarchyId.GetRoot(); } else {
      parentItem = Context.HierarchyCategory.FirstOrDefault(x => x.Id == parentId).Level; } lastItemInCurrentLevel =
      Context.HierarchyCategory .Where(x => x.Level.GetAncestor(1) == parentItem) .OrderByDescending(x => x.Level)
      .FirstOrDefault(); var child1Level = lastItemInCurrentLevel != null ? lastItemInCurrentLevel.Level : null; var
      newLevel = parentItem.GetDescendant(child1Level, null); entity.Level = newLevel;
      Context.HierarchyCategory.Add(entity); Context.SaveChanges(); }

There is no rule stating that numbers in nodes must be sequential. So alternative method is to just depend on table’s primary integer key, when creating nodes. It is possible but I would not recommend that solution. See the example below:

1
2
3
4
5
6
public void Add(ProductCategoryDto add,
      int? parentId) { var entity = Context.HierarchyCategory.Create(); entity.Name = add.Name;
      Context.HierarchyCategory.Add(entity); Context.SaveChanges(); if (!parentId.HasValue) { entity.Level =
      HierarchyId.Parse(string.Format("/{0}/", entity.Id)); } else { var parentEntity = Context.HierarchyCategory
      .FirstOrDefault(x => x.Id == parentId); entity.Level = HierarchyId.Parse( string.Format("{0}{1}/",
      parentEntity.Level.ToString(), entity.Id)); Context.SaveChanges(); } }