Monday, 9 December 2013

Shortcomings of Entity Framework

The first time I used Entity Framework, I was blown away by how simple it was to use, and how quickly I could get up and running. Until recently, most of my work has revolved around NHibernate, which in comparison is far more complex. For some reason, it did not occur to me that this simplicity would bring with it some inflexibility.

One issue that has bitten me recently is the lack of an equivalent to IUserType. NHibernate understands how to map a simple type from a database field to a property in an entity, but what if it is a more complex type stored in an XML column, or what if the data is coming from a web service? The way to achieve this in NHibernate is explained very well in this post.
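For readers unfamiliar with the extension point, a custom IUserType has roughly the following shape (the member signatures below match the NHibernate versions current at the time of writing, around 3.x; the XML handling bodies and the XmlObjectSerializer helper are illustrative assumptions, not the code from the linked post):

```csharp
using System;
using System.Data;
using NHibernate;
using NHibernate.SqlTypes;
using NHibernate.UserTypes;

public class XmlChildUserType : IUserType
{
    // The column is mapped as a single XML-typed field.
    public SqlType[] SqlTypes
    {
        get { return new[] { new SqlType(DbType.Xml) }; }
    }

    public Type ReturnedType { get { return typeof(Child); } }
    public bool IsMutable { get { return true; } }

    public object NullSafeGet(IDataReader rs, string[] names, object owner)
    {
        // Read the raw XML string and deserialize it into a Child.
        var xml = NHibernateUtil.String.NullSafeGet(rs, names[0]) as string;
        return xml == null ? null : XmlObjectSerializer.Deserialize(xml);
    }

    public void NullSafeSet(IDbCommand cmd, object value, int index)
    {
        // Serialize the Child back to XML before it is written.
        var xml = value == null ? null : XmlObjectSerializer.Serialize((Child)value);
        NHibernateUtil.String.NullSafeSet(cmd, xml, index);
    }

    public object DeepCopy(object value) { return value; }
    public object Replace(object original, object target, object owner) { return original; }
    public object Assemble(object cached, object owner) { return cached; }
    public object Disassemble(object value) { return value; }

    public new bool Equals(object x, object y) { return object.Equals(x, y); }
    public int GetHashCode(object x) { return x == null ? 0 : x.GetHashCode(); }
}
```

The mapping then refers to this type on the property, and the domain model never needs to know the child is stored as XML.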

This is a useful feature (that should be used sparingly) that I just assumed would be available in Entity Framework. Strictly speaking it isn't, although there is a workaround.

Take the following objects:
public class Parent
{
    public virtual Guid Id {get;set;}
    public virtual string Description {get;set;}
    public virtual Child Child {get;set;}
}

public class Child
{
    public virtual Guid Id {get;set;}
    public virtual string Description {get;set;}
    // And other stuff...
}
I want to persist the Parent in a row in a SQL Server table, and persist the Child in an XML column in that row:
CREATE TABLE [Parent](
[Id] [uniqueidentifier] NOT NULL PRIMARY KEY,
[Description] [nvarchar](50) NULL,
[Child] [xml] NULL
)
This would have been possible in NHibernate with IUserType, but with Entity Framework we have to do things differently.

Firstly, unfortunately, this impacts on our domain model - never a good thing for an O/RM to inflict. It will have to look like this:
public class Parent
{
    public virtual Guid Id {get;set;}
    public virtual string Description {get;set;}
    public virtual string ChildXml {get;set;}
    public virtual Child Child {get;set;}
}

public class Child
{
    public virtual Guid Id {get;set;}
    public virtual string Description {get;set;}
    // And other stuff...
}
And I have renamed the column in the database:
CREATE TABLE [Parent](
[Id] [uniqueidentifier] NOT NULL PRIMARY KEY,
[Description] [nvarchar](50) NULL,
[ChildXml] [xml] NULL
)
Now, in our class that overrides DbContext, we need to intercept the creation and saving of this entity. Intercepting the creation is done by handling the ObjectMaterialized event of the ObjectContext, and in here we construct our child entity from the XML:
public class Context : DbContext
{
    public Context()
    {
         //...
         ObjectContext.ObjectMaterialized += new ObjectMaterializedEventHandler(ObjectContext_ObjectMaterialized);
    }

    //...

    public ObjectContext ObjectContext
    {
        get { return ((IObjectContextAdapter)this).ObjectContext; }
    }

    public void ObjectContext_ObjectMaterialized(object sender, ObjectMaterializedEventArgs e)
    {
        var parent = e.Entity as Parent;

        if (parent != null)
            parent.Child = XmlObjectSerializer.Deserialize(parent.ChildXml);
    }

    //...
}
And for saving, we need to override the SaveChanges() method of DbContext, as described by Chris McKenzie in this post.
public class Context : DbContext
{
    private void InterceptBefore(ObjectStateEntry item)
    {
        var parent = item.Entity as Parent;

        if (parent != null)
            parent.ChildXml = XmlObjectSerializer.Serialize(parent.Child);
    }

    public override int SaveChanges()
    {
        const EntityState entitiesToTrack = EntityState.Added | EntityState.Modified | EntityState.Deleted;

        var elementsToSave =
            this.ObjectContext
                .ObjectStateManager
                .GetObjectStateEntries(entitiesToTrack)
                .ToList();

        elementsToSave.ForEach(InterceptBefore);
        var result = base.SaveChanges();
        return result;
    }
}
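The snippets above call an XmlObjectSerializer helper that is never shown. A minimal sketch using System.Xml.Serialization might look like this (the generic implementation is my assumption; the original presumably had Child-specific overloads, since it calls Deserialize without a type argument):

```csharp
using System.IO;
using System.Xml.Serialization;

public static class XmlObjectSerializer
{
    // Serialize any public type to an XML string for storage in an xml column.
    public static string Serialize<T>(T value) where T : class
    {
        if (value == null) return null;
        using (var writer = new StringWriter())
        {
            new XmlSerializer(typeof(T)).Serialize(writer, value);
            return writer.ToString();
        }
    }

    // Deserialize the stored XML string back into an object.
    public static T Deserialize<T>(string xml) where T : class
    {
        if (string.IsNullOrEmpty(xml)) return null;
        using (var reader = new StringReader(xml))
        {
            return (T)new XmlSerializer(typeof(T)).Deserialize(reader);
        }
    }
}
```

Note that XmlSerializer requires the serialized type to be public with a parameterless constructor, which our Child entity satisfies.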
Now if I want to display a list of Parent entities with just their description, this could all become very inefficient. What is needed is some way of lazy loading the child. This could mean the child is a separate entity mapped to a table with its own XML field, but what about in this scenario?
public class Parent
{
    public virtual Guid Id {get;set;}
    public virtual string Description {get;set;}
    public virtual string ChildrenXml {get;set;}
    public virtual ICollection<Child> Children {get;set;}
}

public class Child
{
    public virtual Guid Id {get;set;}
    public virtual string Description {get;set;}
    // And other stuff...
}
And I have renamed the column in the database:
CREATE TABLE [Parent](
[Id] [uniqueidentifier] NOT NULL PRIMARY KEY,
[Description] [nvarchar](50) NULL,
[ChildrenXml] [xml] NULL
)
This rules out the previous option. So surely it's just a simple case of setting the ChildrenXml property to be lazy loaded? Again (and I think more justifiably) I just assumed this would be possible in Entity Framework. I was somewhat surprised to learn that Entity Framework doesn't support lazy loading of properties.

Again, the solution is to change our domain model to handle this:
public class Parent
{
    public virtual Guid Id {get;set;}
    public virtual string Description {get;set;}
    public virtual Guid DetailsId {get;set;}
    public virtual ParentDetails Details {get;set;}
}

public class ParentDetails
{
    public virtual Guid Id {get;set;}
    public virtual string ChildrenXml {get;set;}
    public virtual ICollection<Child> Children {get;set;}
}

public class Child
{
    public virtual Guid Id {get;set;}
    public virtual string Description {get;set;}
    // And other stuff...
}
And I have split the data into two tables in the database:
CREATE TABLE [Parent](
[Id] [uniqueidentifier] NOT NULL PRIMARY KEY,
[Description] [nvarchar](50) NULL,
[DetailsId] [uniqueidentifier] NOT NULL
)

CREATE TABLE [ParentDetails](
[Id] [uniqueidentifier] NOT NULL PRIMARY KEY,
[ChildrenXml] [xml] NULL
)
This seems to be the officially sanctioned way of doing things. If you have a large field - a BLOB, a VARBINARY, a VARCHAR(MAX) - it has to go in a 'Details' table. Surely an O/RM on version 6 should have this functionality?

Tuesday, 26 November 2013

Entities Shouldn't Have Getters

This post follows on from Entities Shouldn't Have Setters.

The principle of Tell, Don't Ask warns that when an object exposes getters, there is often the temptation for another object to read the values from those getters, make a decision based on them, and then update the object using its setters (as covered in the last post). A more Object-Oriented design would be to simply 'tell' the object what you want to do, and allow it to make the decision based on its internal state.

However, as acknowledged by Martin Fowler, it is very often the case that we need to display the state of an entity to a UI. The use of getters in this case can be avoided by displaying a persistent view model instead of the entity. The persistent view model is what is displayed to the user, and it is updated whenever the entity changes. This is the basic principle behind CQRS.

Imagine our Post object from our blog example in the last post. If we were to have an Edit use case for this, the traditional series of events might be:

  • Call Edit on the Post entity with the new content of the post.
  • Save the Post entity.
  • Load the new Post entity from the database.
  • Display the various properties of the Post entity by accessing its getters.

With CQRS, it might look more like this:

  • Call Edit on the Post entity with the new content of the post.
  • The Post entity raises a PostEdited domain event containing the new values.
  • The application layer handles the domain event, saves the entity and publishes an event on NServiceBus containing the new values.
  • The read model subscribes to this event, and updates the persistent read model accordingly.
  • When the UI is refreshed, it displays the properties of the persistent view model by accessing its getters.

This way we have avoided using the getters of the entity.
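To make the first two steps concrete, here is a minimal sketch in the style of Udi Dahan's static DomainEvents dispatcher (the event shape and the dispatcher itself are my assumptions, not code from a real project):

```csharp
using System;
using System.Collections.Generic;

// Hypothetical domain event carrying the new values.
public class PostEdited
{
    public Guid PostId { get; set; }
    public string NewText { get; set; }
}

// Minimal static dispatcher; the application layer registers handlers here.
public static class DomainEvents
{
    private static readonly List<Action<object>> Handlers = new List<Action<object>>();

    public static void Register(Action<object> handler)
    {
        Handlers.Add(handler);
    }

    public static void Raise(object domainEvent)
    {
        foreach (var handler in Handlers)
            handler(domainEvent);
    }
}

public class Post
{
    public Guid Id { get; private set; }
    private string _text;

    // The entity is told to edit itself; it raises an event rather than
    // exposing its new state through getters.
    public void Edit(string newText)
    {
        _text = newText;
        DomainEvents.Raise(new PostEdited { PostId = Id, NewText = newText });
    }
}
```

The application layer's registered handler would then save the entity and publish the corresponding NServiceBus event for the read model.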

For more information on publishing domain events, read this post by Udi Dahan.

So I should immediately stop using getters and setters on my entities?


No, this is somewhat ideological. The first issue I have come across is that it is often desirable to use getters and setters when testing. Getters are useful for the asserts, and setters are useful because for many tests, we do not need to set up the whole entity. Maybe that is breaking some testing ideology, but in the real world, sometimes we need to be able to quickly set up entities without using their use cases.

And of course, CQRS is not always appropriate for every application.

So what was the point of these posts? The point is that you should understand that it can be done, and why and when you may choose to do it. This way you can apply these rules where it makes sense.

So it is unlikely I will be writing entities without getters or setters for now, but there are some rules to take from this:

  • If you are calling getters or setters of an entity from another entity in your domain, you are probably doing something wrong.
  • If you are calling getters or setters of an entity in your application layer you are probably doing something wrong (unless you are mapping to a DTO).
  • If you are calling setters of an entity in your UI you are almost certainly doing something wrong.

Sunday, 27 October 2013

Entities Shouldn't Have Setters

Setters on entities are redundant and a code smell. Take the following example of a blog post:
public class Post : Entity<Guid>
{
    private DateTime _postedOn;
    private Blog _blog;
    private string _text;

    public virtual DateTime PostedOn
    {
        get { return _postedOn; }
        set { _postedOn = value; }
    }

    public virtual Blog Blog 
    {
        get { return _blog; }
        set { _blog = value; }
    }

    public virtual string Text
    {
        get { return _text; }
        set { _text = value; }
    }
}
The problem here is that it leaves open the possibility that the entity will be modified from another entity, or even from another layer. Take this example in the application layer:
public class PostService : IPostService
{ 
    public void Compose(ComposePostRequest request)
    {
        var post = new Post();
        post.PostedOn = DateTime.Now;
        post.Blog = _blogRepository.GetById(request.BlogId);
        post.Text = request.Text;
        _postRepository.Save(post);
    }
}
In this example, the Post entity is being constructed in the application layer. The main problem with this is that inevitably business logic seeps into the application layer. What if we had to set an expiry date for the post based on a window set in the blog? Where does this logic go? The result may look like this:
public class PostService : IPostService
{ 
    public void Compose(ComposePostRequest request)
    {
        var blog = _blogRepository.GetById(request.BlogId);
        var post = new Post();
        post.PostedOn = DateTime.Now;
        post.Blog = blog;
        post.Text = request.Text;
        post.ExpiryDate = post.PostedOn.AddDays(blog.PostExpiryWindow);
        _postRepository.Save(post);
    }
}
As more logic comes in, more will be added to the application layer. The logic of the domain cannot be tested in isolation. The only way to test this logic is to mock out the PostRepository and capture the Post that is being saved. A far better way is to do this:
public class Post : Entity<Guid>
{
    private DateTime _postedOn;
    private Blog _blog;
    private string _text;
    private DateTime _expiryDate;

    public virtual DateTime PostedOn
    {
        get { return _postedOn; }
    }

    public virtual Blog Blog 
    {
        get { return _blog; }
    }

    public virtual string Text
    {
        get { return _text; }
    }

    public virtual DateTime ExpiryDate
    {
        get { return _expiryDate; }
    }

    public static Post Compose(Blog blog, string text)
    {
        var post = new Post();
        post._postedOn = DateTime.Now;
        post._blog = blog;
        post._text = text;
        post._expiryDate = post._postedOn.AddDays(blog.PostExpiryWindow);
        return post;
    }
}

public class PostService : IPostService
{ 
    public void Compose(ComposePostRequest request)
    {
        var blog = _blogRepository.GetById(request.BlogId);
        var post = Post.Compose(blog, request.Text);
        _postRepository.Save(post);
    }
}
Now new business logic can be added to the Compose method in Post, and the logic of Post can be tested in isolation. If you think about it, this all makes perfect sense: Compose is a use case of Post, and why would you want to change the state of an entity outside of a use case?
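As a sketch of what testing in isolation might look like (NUnit-style; Blog.Create is a hypothetical factory in the same spirit as Post.Compose, and the 7-day window is arbitrary):

```csharp
[Test]
public void Compose_SetsExpiryDateBasedOnBlogWindow()
{
    // Assumes Blog has a factory method or test builder; not shown in this post.
    var blog = Blog.Create("My Blog", postExpiryWindow: 7);

    var post = Post.Compose(blog, "Hello, world");

    // No repository mocking needed: the domain logic is exercised directly.
    Assert.AreEqual("Hello, world", post.Text);
    Assert.AreEqual(post.PostedOn.AddDays(7), post.ExpiryDate);
}
```

Contrast this with the earlier version, where asserting on ExpiryDate meant mocking the PostRepository and capturing the saved Post.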

Persistence


Again, my persistence example is with NHibernate. NHibernate allows us to specify the way it accesses properties, using the 'access' attribute of property:
<?xml version="1.0" encoding="utf-8" ?>
<hibernate-mapping xmlns="urn:nhibernate-mapping-2.2"
                   namespace="Lucid.Domain.Entities"
                   assembly="Lucid.Domain">
  <class name="Post" table="`Post`">
    <id name="Id" column="Id" type="guid">
      <generator class="assigned"/>
    </id>
    <property name="_postedOn" column="`PostedOn`" access="field" />
    <many-to-one name="_blog" class="Blog" column="`BlogId`" cascade="save-update" access="field" />
    <property name="_text" column="`Text`" access="field" />
    <property name="_expiryDate" column="`ExpiryDate`" access="field" />
  </class>
</hibernate-mapping>
The blog and post example may not have been the best example here, as Blog would be an ideal candidate for an aggregate root. In this case, the Blog entity would have a ComposePost method: the Blog would be loaded, the Post would be added and the Blog would be saved. However, aggregate roots are for another post, and for now this serves as a good example.

Monday, 1 July 2013

Mapping the Decorator Pattern in NHibernate

The Decorator Pattern is a useful way of avoiding multiple inheritance class-explosion madness, but in domain modelled enterprise applications, it's not much use unless you can persist it. Neither I, nor anyone on Stack Overflow, could figure out a way to do it, until now.

My working code can be found on Github, and a brief overview is described here.

Sorry to use a contrived example, but I could hardly use a production example, and didn't have the time to think up anything else, so pizzas it is. At least it's not coffee.

Here is a typical implementation of the pattern:

public interface IPizza
{
    Guid? Id { get; set; }
    int Size { get; set; }
    Quantity Cheese { get; set; }
    Quantity Tomato { get; set; }
    decimal Cost { get; }
    Order Order { get; set; }
}

public class Pizza : Entity, IPizza
{
    public virtual int Size { get; set; }
    public virtual Quantity Cheese { get; set; }
    public virtual Quantity Tomato { get; set; }
    public virtual Order Order { get; set; }

    public static IPizza Create(int size, Quantity cheese, Quantity tomato)
    {
        // Create code...
    }

    public virtual decimal Cost
    {
        // Calculate cost...
    }
}

public class ToppingDecorator : Entity, IPizza
{
    public virtual IPizza BasePizza { get; set; }
    public virtual Order Order { get; set; }

    public ToppingDecorator(IPizza basePizza)
    {
        Id = Guid.NewGuid();
        BasePizza = basePizza;
    }

    public virtual int Size
    {
        get { return BasePizza.Size; }
        set { BasePizza.Size = value; }
    }

    public virtual Quantity Cheese
    {
        get { return BasePizza.Cheese; }
        set { BasePizza.Cheese = value; }
    }

    public virtual Quantity Tomato
    {
        get { return BasePizza.Tomato; }
        set { BasePizza.Tomato = value; }
    }

    public virtual decimal Cost
    {
        get { return BasePizza.Cost; }
    }
}

public class PepperoniDecorator : ToppingDecorator
{
    public virtual bool ExtraSpicy { get; set; }

    public PepperoniDecorator(IPizza basePizza, bool extraSpicy)
        : base(basePizza)
    {
        ExtraSpicy = extraSpicy;
    }

    public override decimal Cost
    {
        get
        {
            // Add to cost...
        }
    }
}

public class OliveDecorator : ToppingDecorator
{
    public virtual OliveColour Colour { get; set; }

    public OliveDecorator(IPizza basePizza, OliveColour colour) : base(basePizza)
    {
        Colour = colour;
    }

    public override decimal Cost
    {
        get
        {
            // Add to cost...
        }
    }
}

public class Order : Entity
{
    public virtual string CustomerName { get; set; }
    public virtual string DeliveryAddress { get; set; }
    public virtual IList<IPizza> Items { get; set; }

    //Create/Add methods etc...
}
When it came to the database, it was always pretty clear that there would be a Pizza table which would contain all the properties specified in the interface, and then there would be tables for each decorator which contained the particular fields they added, and also a foreign key to either a Pizza or another decorator:
USE [Master]

IF EXISTS (SELECT * FROM sys.databases WHERE NAME = 'Decorator')
BEGIN
 EXEC msdb.dbo.sp_delete_database_backuphistory @database_name = N'Decorator';
 ALTER DATABASE [Decorator] SET SINGLE_USER WITH ROLLBACK IMMEDIATE;
 DROP DATABASE [Decorator];
END
GO

CREATE DATABASE [Decorator]
GO

USE [Decorator]
GO

IF NOT EXISTS(SELECT * FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_NAME = 'Pizza')
BEGIN
 CREATE TABLE [dbo].[Pizza](
  [Id] uniqueidentifier NOT NULL PRIMARY KEY,
  [Size] int NULL,
  [Cheese] int NULL,
  [Tomato] int NULL,
  [OrderId] uniqueidentifier NULL
 ); 
END
GO

IF NOT EXISTS(SELECT * FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_NAME = 'PepperoniDecorator')
BEGIN
 CREATE TABLE [dbo].[PepperoniDecorator](
  [Id] uniqueidentifier NOT NULL PRIMARY KEY,
  [BasePizzaId] uniqueidentifier NULL,
  [ExtraSpicy] bit NULL,
  [OrderId] uniqueidentifier NULL
 );
END
GO

IF NOT EXISTS(SELECT * FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_NAME = 'OliveDecorator')
BEGIN
 CREATE TABLE [dbo].[OliveDecorator](
  [Id] uniqueidentifier NOT NULL PRIMARY KEY,
  [BasePizzaId] uniqueidentifier NULL,
  [Colour] int NULL,
  [OrderId] uniqueidentifier NULL
 );
END

IF NOT EXISTS(SELECT * FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_NAME = 'Order')
BEGIN
 CREATE TABLE [dbo].[Order](
  [Id] uniqueidentifier NOT NULL PRIMARY KEY,
  [CustomerName] nvarchar(100) NULL,
  [DeliveryAddress] nvarchar(200) NULL
 );
END
GO
The tricky bit was mapping between them. After several failed attempts at using table per class hierarchy and table per subclass, I came to the conclusion that it wasn't the way to go.

I experimented with table per concrete class using implicit polymorphism but found the limitations of that to be a major issue. Eventually the solution was found using table per concrete class using union-subclass.

Here is how the mappings look:
<hibernate-mapping xmlns="urn:nhibernate-mapping-2.2"
                   namespace="Decorator.Domain.Entities"
                   assembly="Decorator.Domain">
  <class name="IPizza" abstract="true">
    <id name="Id" column="Id" type="guid">
      <generator class="assigned"/>
    </id>
    <many-to-one name="Order" class="Order" column="`OrderId`" cascade="save-update" />
    
    <union-subclass name="Pizza" table ="`Pizza`" >
      <property name="Size" column="`Size`" />
      <property name="Cheese" />
      <property name="Tomato" />
    </union-subclass>

    <union-subclass name="PepperoniDecorator" table ="`PepperoniDecorator`" >
      <many-to-one name="BasePizza" class="IPizza" column="`BasePizzaId`" cascade="all" />
      <property name="ExtraSpicy" column="`ExtraSpicy`" />
    </union-subclass>

    <union-subclass name="OliveDecorator" table ="`OliveDecorator`" >
      <many-to-one name="BasePizza" class="IPizza" column="`BasePizzaId`" cascade="all" />
      <property name="Colour" column="`Colour`" />
    </union-subclass>
  </class>
</hibernate-mapping>

<hibernate-mapping xmlns="urn:nhibernate-mapping-2.2"
                   namespace="Decorator.Domain.Entities"
                   assembly="Decorator.Domain">
  <class name="Order" table="`Order`">
    <id name="Id" column="Id" type="guid">
      <generator class="assigned"/>
    </id>
    <property name="CustomerName" />
    <property name="DeliveryAddress" />

    <bag name="Items" inverse="true" cascade="save-update">
      <key column="`OrderId`"></key>
      <one-to-many class="IPizza" />
    </bag>
  </class>
</hibernate-mapping>
I have included the Order entity for a good reason here: If you create a Pizza, decorate it with pepperoni, then decorate it with olives and save it, when you get all pizzas, it will actually return 3 pizzas! NHibernate has no way of knowing which pizza is the top level one. This could be avoided by having an IsTopLevel flag, but as pizzas will always be created in the context of an order, it makes sense to only have the orderId on the top level. A similar solution will apply to most scenarios.

Thursday, 27 June 2013

Learning the Hard Way: NHibernate Collections

Here's one that's bitten our team recently: the issue of managing collections of child entities on a parent entity. A number of our records were going missing from the database. How could this be? We don't really delete anything, we just 'soft delete' - setting a flag to mark something as deleted. Take the following entities:
public class Employee
{
    public virtual Guid Id { get; set; }
    public virtual string Name { get; set; }
    public virtual bool Deleted { get; set; }

    public static Employee Create(string name)
    {
        return new Employee
                   {
                       Id = Guid.NewGuid(),
                       Name = name,
                       Deleted = false
                   };
    }
}

public class Team 
{
    public virtual Guid Id { get; set; }
    public virtual string Name { get; set; }
    public virtual IList<TeamEmployee> TeamEmployees { get; set; }
    public virtual bool Deleted { get; set; }

    public static Team Create(string name)
    {
        return new Team
                    {
                        Id = Guid.NewGuid(),
                        Name = name,
                        TeamEmployees = new List<TeamEmployee>(),
                        Deleted = false
                    };
    }

    public virtual void UpdateEmployees(IList<Employee> employees)
    {
        foreach(var teamEmployee in TeamEmployees.Where(x => !employees.Contains(x.Employee)).Reverse())
        {
            TeamEmployees.Remove(teamEmployee);
        }

        foreach(var employee in employees.Where(x => !TeamEmployees.Select(y => y.Employee).Contains(x)))
        {
            TeamEmployees.Add(TeamEmployee.Create(employee, this));
        }
    }
}

public class TeamEmployee
{
    public virtual Guid Id { get; set; }
    public virtual Employee Employee { get; set; }
    public virtual Team Team { get; set; }
    public virtual bool Deleted { get; set; }

    public static TeamEmployee Create(Employee employee, Team team)
    {
        return new TeamEmployee
                    {
                        Id = Guid.NewGuid(),
                        Employee = employee,
                        Team = team,
                        Deleted = false
                    };
    }
}
The problem here is when you load a team, update the employees and save it - you can be deleting records without realising it. The mapping on Team for TeamEmployees was set to 'all-delete-orphans', so when the association between a Team and an Employee was removed, the record that it had ever existed was also lost. Even if the cascade had just been 'all', the foreign key to Team would have been nullified and the history would have been lost.

There are a few ways to limit these problems, such as revoking delete access for the database login, and setting all cascades to 'save-update', but it also pays to be cleverer about how collections are handled.

Instead of removing the TeamEmployee record, it is flagged as deleted:
public class Team 
{
    public virtual Guid Id { get; set; }
    public virtual string Name { get; set; }
    public virtual IList<TeamEmployee> TeamEmployees { get; set; }
    public virtual bool Deleted { get; set; }

    public static Team Create(string name)
    {
        return new Team
                    {
                        Id = Guid.NewGuid(),
                        Name = name,
                        TeamEmployees = new List<TeamEmployee>(),
                        Deleted = false
                    };
    }

    public virtual void UpdateEmployees(IList<Employee> employees)
    {
        foreach(var teamEmployee in TeamEmployees.Where(x => !employees.Contains(x.Employee)))
        {
            teamEmployee.Deleted = true;
        }

        foreach(var employee in employees.Where(x => !TeamEmployees.Where(y => !y.Deleted).Select(y => y.Employee).Contains(x)))
        {
            TeamEmployees.Add(TeamEmployee.Create(employee, this));
        }
    }
}
And the mapping file has a where clause added to it, so that it only loads the undeleted records:
<?xml version="1.0" encoding="utf-8" ?>
<hibernate-mapping xmlns="urn:nhibernate-mapping-2.2"
                   namespace="Collections.Domain"
                   assembly="Collections.Domain">
  <class name="Team" table="`Team`">
    <id name="Id" column="Id" type="Guid">
      <generator class="assigned"/>
    </id>
    <property name="Name" column="`Name`" />
    <property name="Deleted" />
    <bag name="TeamEmployees" cascade="save-update" where="Deleted = 0" >
      <key column="TeamId"/>
      <one-to-many class="TeamEmployee" />
    </bag>
  </class>
</hibernate-mapping>
Care must be taken when accessing the collection in the same session after deleting a record, as it will be present but marked as deleted. A LINQ filter such as .Where(x => !x.Deleted) should be used.

Another problem would be if the Team and Employee relationship was mapped as many-to-many. There may be a way to soft delete the relationships, but I am not currently aware of it.

Really, this whole scenario is another argument in favour of breaking down all many-to-many relationships with an extra entity. There are others, such as having somewhere to store information about the relationship. Many times I have found it necessary to break down a many-to-many, but never to go the other way. Therefore I am favouring breaking down these relationships as a default.

Fortunately, due to our rightfully paranoid auditing and event logging, all customer records were retrieved and the data was returned to its expected state.

Friday, 14 June 2013

Overcoming the n+1 issue with one-to-one mapping in NHibernate.

This recently caught me out. In our business domain, we have Task entities. Each task can have a Task that precedes it, and a Task that follows it. This is modelled like this:
 
namespace OneToOneIssue.Domain
{
    public class Task
    {
        public virtual int Id { get; set; }
        public virtual string Description { get; set; }
        public virtual Task FollowingTask { get; set; }
        public virtual Task PrecedingTask { get; set; }
    }
}
And the database table looks like this:
CREATE TABLE [dbo].[Task](
    [Id] int NOT NULL,
    [Description] nvarchar(100) NULL,
    [FollowingTaskId] int NULL,
    CONSTRAINT [PK_Task] PRIMARY KEY CLUSTERED 
    (
        [Id] ASC
    )WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
) ON [PRIMARY]
and the mapping like this:
<hibernate-mapping xmlns="urn:nhibernate-mapping-2.2"
                   namespace="OneToOneIssue.Domain"
                   assembly="OneToOneIssue.Domain">
  <class name="Task" table="`Task`">
    <id name="Id" column="Id" type="int">
      <generator class="assigned"/>
    </id>
    <property name="Description" column="`Description`" />
    <many-to-one name="FollowingTask" class="Task" column="FollowingTaskId"  />
    <one-to-one name="PrecedingTask" class="Task" property-ref="FollowingTask" lazy="proxy" />
  </class>
</hibernate-mapping>
This all worked fine, but created a serious performance issue when loading a collection of tasks. Take this set of data for example:
INSERT INTO [Task] ([Id], [Description], [FollowingTaskId]) VALUES (1, 'Task 1', 2)
INSERT INTO [Task] ([Id], [Description], [FollowingTaskId]) VALUES (2, 'Task 2', 3)
INSERT INTO [Task] ([Id], [Description], [FollowingTaskId]) VALUES (3, 'Task 3', NULL)
INSERT INTO [Task] ([Id], [Description], [FollowingTaskId]) VALUES (4, 'Task 4', 5)
INSERT INTO [Task] ([Id], [Description], [FollowingTaskId]) VALUES (5, 'Task 5', 6)
INSERT INTO [Task] ([Id], [Description], [FollowingTaskId]) VALUES (6, 'Task 6', NULL)
INSERT INTO [Task] ([Id], [Description], [FollowingTaskId]) VALUES (7, 'Task 7', 8)
INSERT INTO [Task] ([Id], [Description], [FollowingTaskId]) VALUES (8, 'Task 8', 9)
INSERT INTO [Task] ([Id], [Description], [FollowingTaskId]) VALUES (9, 'Task 9', NULL)
Running a query to get records 2, 5 and 8 (the ones with both preceding and following tasks) like this:
using (var session = sessionFactory.OpenSession())
{
    var tasks = session
        .CreateCriteria<Task>()
        .Add(Restrictions.In("Id", new[] { 2, 5, 8 }))
        .List<Task>();
}
In this instance, NHibernate would create a query to get the list of tasks, but then perform a query for each task to retrieve its preceding task. You do not need to access the preceding task to make this happen, and setting lazy="false" on the preceding task does not resolve it.

This is a known bug in NHibernate and has been discussed in StackOverflow posts such as this.

One option would be for each record in the Task table to have both a FollowingTaskId and a PrecedingTaskId, but this would increase the chance of data inconsistencies and would mean major, high-risk updates to our current data set.

A less invasive way would be to pretend that the preceding task is part of a collection that only ever has one record. A Chinese Collection if you like (it enforces a 'single child policy'). From the point of view of the rest of the code, nothing has changed, and the database can remain the same.

So the entity now looks like this:
namespace OneToOneIssue.Domain
{
    public class Task
    {
        public virtual int Id { get; set; }
        public virtual string Description { get; set; }
        public virtual Task FollowingTask { get; set; }
        private IList<Task> _precedingTasks = new List<Task>();

        public virtual Task PrecedingTask
        {
            get 
            { 
                return _precedingTasks.FirstOrDefault(); 
            }
            set
            {
                _precedingTasks.Clear();
                _precedingTasks.Add(value);
            }
        }
    }
}
And the mapping changes to this:
<hibernate-mapping xmlns="urn:nhibernate-mapping-2.2"
                   namespace="OneToOneIssue.Domain"
                   assembly="OneToOneIssue.Domain">
  <class name="Task" table="`Task`">
    <id name="Id" column="Id" type="int">
      <generator class="assigned"/>
    </id>
    <property name="Description" column="`Description`" />
    <many-to-one name="FollowingTask" class="Task" column="FollowingTaskId"  />
    <bag name="_precedingTasks" access="field">
      <key column="FollowingTaskId"/>
      <one-to-many class="Task" />
    </bag>
  </class>
</hibernate-mapping>
Now tasks can be queried as before, and preceding tasks will not be loaded until the code accesses them. Of course, if you need the preceding tasks in the collection, you will need to eager load them.

Preceding tasks can be set and saved in the usual way.

Cheers to Radim Köhler for suggesting this solution here.

Thursday, 30 May 2013

Instant Feedback With NServiceBus

The majority of content for this post comes from a thread I started on the NServiceBus Yahoo Group, and this and subsequent posts are really just a summary of it. You can view the original thread here.

One of the problems I have encountered with NServiceBus is getting the UI to react instantly to a user's request. The asynchronous nature of NServiceBus is somewhat in conflict with that goal.

Take the following example: a grid shows a list of records, each with a delete 'X' on them. The user clicks the X, which sets a 'Deleted' flag and publishes an NServiceBus event so other systems are informed about the deleted record. (There will possibly be other actions, like adding to an audit trail, updating other entities that were dependent on that record, etc.)

Conventional architecture in NServiceBus dictates that when the 'X' is clicked, a command is sent from the controller, and the handler for this command performs all the actions, including publishing the event.

But how do we update the grid? We can't just requery the data as we can't be certain the command has been processed. Common practice in NServiceBus is to do one of the following:

  1. We forward the user to a new view which says something like 'Your request is being processed; it may take a moment for the grid to be updated', with a link back to the list of records.
  2. We manually remove the record from the existing grid using JavaScript.
The first option is fine if that is acceptable to the client, but often it is not. Udi says that we should move away from grids, but the fact is that they are often part of the specification. The second option could possibly lead to inconsistency between the business state and the displayed data, and can cause serious headaches when combined with paged & sorted grids.


Option 1 - The 'In Progress' Flag


This involves immediately setting a 'deleting in progress' flag, and then sending the command to carry out the rest of the work:
 
public ActionResult DeleteRecord(Guid recordId)
{
    using(var transactionScope = new TransactionScope())  
    {
        var record = _recordRepository.GetById(recordId);
        record.MarkAsDeletingInProgress();
        _recordRepository.Save(record);
        _bus.Send(new DeleteRecord { RecordId = recordId });
        transactionScope.Complete();
    }

    return RedirectToAction("Index");
}

And the message handler would look like this:
 
public void Handle(DeleteRecord message)
{
    var record = _recordRepository.GetById(message.RecordId);
    record.Delete();
    _recordRepository.Save(record);
    _bus.Publish(new RecordDeleted { RecordId = message.RecordId });
}

This way, we can return the grid and the record will either not be present or will be displayed as 'deleting in progress', so the user will have some definite feedback.

It is important that the flag is set and the command is sent within the same transaction to avoid inconsistencies creeping in. The 'using' statement above may not be needed if the request is within a transaction.
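Because the flag is committed before the redirect, the grid query sees it immediately. A minimal sketch of the Index action (the repository method, view model, and flag names are assumptions for illustration):

```csharp
// Hypothetical Index action: the query sees the freshly set flag, so the
// row can be hidden or rendered greyed-out as 'deleting in progress'.
public ActionResult Index()
{
    var records = _recordRepository.GetAll()
        .Where(r => !r.IsDeleted)
        .Select(r => new RecordViewModel
        {
            Id = r.Id,
            Description = r.Description,
            // The view can grey this row out or show a spinner on it
            InProgress = r.IsDeletingInProgress
        })
        .ToList();

    return View(records);
}
```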

Option 2 - Request/Response


Synchronous communication is generally frowned upon by the NServiceBus community, but it is an included feature and can be a useful option. The command is sent, and the message handler updates the database and publishes the event. Because the command is handled synchronously, by the time the call has returned we can be sure the data has been updated, and we can therefore query it.
 
public void DeleteRecordAsync(Guid recordId)
{
    AsyncManager.OutstandingOperations.Increment();
    _bus.Send(new DeleteRecord { RecordId = recordId })
        .Register<ReturnCode>(returnCode =>
        {
            AsyncManager.Parameters["returnCode"] = returnCode;
            AsyncManager.OutstandingOperations.Decrement();
        });
}

public ActionResult DeleteRecordCompleted(ReturnCode returnCode)
{
    return RedirectToAction("Index");
}

And the message handler would look like this:
 
public void Handle(DeleteRecord message)
{
    var record = _recordRepository.GetById(message.RecordId);
    record.Delete();
    _recordRepository.Save(record);
    _bus.Publish(new RecordDeleted { RecordId = message.RecordId });
    _bus.Return(ReturnCode.OK);
}

This way, everything in our local domain is handled synchronously, while everything in other services/domains is handled asynchronously. There is even the option that the event can be handled in the local domain, and work can be done asynchronously there.

This may lead to some inconsistencies if the UI is gathering some of that asynchronously handled data, so this technique should be used with caution. However, in the right circumstances, this can be a good way of separating things that NEED to be synchronous from those that CAN be asynchronous.

There is no need for the TransactionScope 'using' statement in this case, as NServiceBus message handlers are always run within a transaction by default.
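Handling the event in the local domain, as mentioned above, might be sketched like this (the audit-trail repository and entry type are assumptions for illustration):

```csharp
// Hypothetical local subscriber to the published event. It runs
// asynchronously, after the synchronous DeleteRecord command has returned,
// so it should only do work the UI does not need to wait for.
public class RecordDeletedHandler : IHandleMessages<RecordDeleted>
{
    private readonly IAuditTrailRepository _auditTrailRepository;

    public RecordDeletedHandler(IAuditTrailRepository auditTrailRepository)
    {
        _auditTrailRepository = auditTrailRepository;
    }

    public void Handle(RecordDeleted message)
    {
        // Non-critical follow-up work, e.g. writing to the audit trail
        _auditTrailRepository.Add(new AuditEntry
        {
            RecordId = message.RecordId,
            Action = "Deleted"
        });
    }
}
```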


Option 3 - Continuous Polling


Poll for completion and update the UI when the command has been completed. Don't do it.


Option 4 - SignalR


A technology I have not yet investigated. This could be interesting but without knowing more about it I can't comment further.

Option 5 - Publish Events from the Web Application


Another suggestion that raises eyebrows. The main reason for sending the command in the first place was so that we could publish the event, so why not just do all the database work in the web application (or another directly referenced assembly) and publish the event from there? I won't go into detail here because I intend to cover this approach and its problems in a future post; for now I will just list it as an option.


Thank you to Udi Dahan, Andreas Öhlund and Jimmy Bogard for posting on the thread, as well as the many other contributors. My particular favourite is the interaction Jerdrosenberg described here. I think there are a lot of us who have been through this scenario and it is the kind of thing that prompted me to start the thread and write this post.