Thursday 27 June 2013

Learning the Hard Way: NHibernate Collections

Here's one that's bitten our team recently: managing collections of child entities on a parent entity. A number of our records were going missing from the database. How could this be? We don't really delete anything, we just 'soft delete', setting a flag to mark something as deleted. Take the following entities:
using System;
using System.Collections.Generic;
using System.Linq;

public class Employee
{
    public virtual Guid Id { get; set; }
    public virtual string Name { get; set; }
    public virtual bool Deleted { get; set; }

    public static Employee Create(string name)
    {
        return new Employee
                   {
                       Id = Guid.NewGuid(),
                       Name = name,
                       Deleted = false
                   };
    }
}

public class Team 
{
    public virtual Guid Id { get; set; }
    public virtual string Name { get; set; }
    public virtual IList<TeamEmployee> TeamEmployees { get; set; }
    public virtual bool Deleted { get; set; }

    public static Team Create(string name)
    {
        return new Team
                    {
                        Id = Guid.NewGuid(),
                        Name = name,
                        TeamEmployees = new List<TeamEmployee>(),
                        Deleted = false
                    };
    }

    public virtual void UpdateEmployees(IList<Employee> employees)
    {
        foreach(var teamEmployee in TeamEmployees.Where(x => !employees.Contains(x.Employee)).Reverse())
        {
            TeamEmployees.Remove(teamEmployee);
        }

        foreach(var employee in employees.Where(x => !TeamEmployees.Select(y => y.Employee).Contains(x)))
        {
            TeamEmployees.Add(TeamEmployee.Create(employee, this));
        }
    }
}

public class TeamEmployee
{
    public virtual Guid Id { get; set; }
    public virtual Employee Employee { get; set; }
    public virtual Team Team { get; set; }
    public virtual bool Deleted { get; set; }

    public static TeamEmployee Create(Employee employee, Team team)
    {
        return new TeamEmployee
                    {
                        Id = Guid.NewGuid(),
                        Employee = employee,
                        Team = team,
                        Deleted = false
                    };
    }
}
The problem here is that when you load a team, update the employees and save it, you can be deleting records without realising it. The mapping on Team for TeamEmployees was set to 'all-delete-orphan', so when the association between a Team and an Employee was removed, the TeamEmployee row was deleted and any record that the association had ever existed was lost. Even if the cascade had just been 'all', the foreign key to Team would have been nullified and the history would still have been lost.

There are a few ways to limit these problems, such as revoking delete access for the database login, and setting all cascades to 'save-update', but it also pays to be cleverer about how collections are handled.

Instead of removing the TeamEmployee record, it is flagged as deleted:
public class Team 
{
    public virtual Guid Id { get; set; }
    public virtual string Name { get; set; }
    public virtual IList<TeamEmployee> TeamEmployees { get; set; }
    public virtual bool Deleted { get; set; }

    public static Team Create(string name)
    {
        return new Team
                    {
                        Id = Guid.NewGuid(),
                        Name = name,
                        TeamEmployees = new List<TeamEmployee>(),
                        Deleted = false
                    };
    }

    public virtual void UpdateEmployees(IList<Employee> employees)
    {
        // Flag links to employees no longer on the team instead of removing them.
        foreach(var teamEmployee in TeamEmployees.Where(x => !x.Deleted && !employees.Contains(x.Employee)))
        {
            teamEmployee.Deleted = true;
        }

        // Add links for new employees, ignoring links already flagged as deleted.
        foreach(var employee in employees.Where(x => !TeamEmployees.Where(y => !y.Deleted).Select(y => y.Employee).Contains(x)))
        {
            TeamEmployees.Add(TeamEmployee.Create(employee, this));
        }
    }
}
And the mapping file has a where clause added to it, so that it only loads the undeleted records:
<?xml version="1.0" encoding="utf-8" ?>
<hibernate-mapping xmlns="urn:nhibernate-mapping-2.2"
                   namespace="Collections.Domain"
                   assembly="Collections.Domain">
  <class name="Team" table="`Team`">
    <id name="Id" column="Id" type="Guid">
      <generator class="assigned"/>
    </id>
    <property name="Name" column="`Name`" />
    <property name="Deleted" />
    <bag name="TeamEmployees" cascade="save-update" where="Deleted = 0">
      <key column="TeamId"/>
      <one-to-many class="TeamEmployee" />
    </bag>
  </class>
</hibernate-mapping>
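Care must be taken when accessing the collection in the same session after soft deleting a record: the where clause is only applied when the collection is loaded from the database, so the record will still be present in memory, just marked as deleted. A LINQ clause such as '.Where(x => !x.Deleted)' should be used.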
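
To illustrate the in-session behaviour, here is a minimal sketch that needs no NHibernate at all. The TeamEmployee class is a simplified stand-in for the entity above, and the Active helper and SoftDeleteDemo class are hypothetical names introduced only for this example:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Simplified stand-in for the TeamEmployee entity (assumption: no mapping needed here).
public class TeamEmployee
{
    public Guid Id { get; set; }
    public string EmployeeName { get; set; }
    public bool Deleted { get; set; }
}

public static class SoftDeleteDemo
{
    // Hypothetical helper: returns only the links that have not been soft deleted.
    public static IList<TeamEmployee> Active(IEnumerable<TeamEmployee> teamEmployees)
    {
        return teamEmployees.Where(x => !x.Deleted).ToList();
    }

    public static void Main()
    {
        var teamEmployees = new List<TeamEmployee>
        {
            new TeamEmployee { Id = Guid.NewGuid(), EmployeeName = "Alice" },
            new TeamEmployee { Id = Guid.NewGuid(), EmployeeName = "Bob" }
        };

        // Soft delete: the item stays in the in-memory collection.
        teamEmployees[1].Deleted = true;

        Console.WriteLine(teamEmployees.Count);          // prints 2
        Console.WriteLine(Active(teamEmployees).Count);  // prints 1
    }
}
```

Wrapping the filter in one helper (or a read-only property on Team) means callers can't forget it, which seems safer than repeating the Where clause at every call site.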

Another problem would be if the Team and Employee relationship was mapped as many-to-many. There may be a way to soft delete the relationships, but I am not currently aware of it.

Really, this whole scenario is another argument in favour of breaking down all many-to-many relationships with an extra entity. There are others, such as having somewhere to store information about the relationship. Many times I have found it necessary to break down a many-to-many, but never to go the other way. Therefore I am favouring breaking down these relationships as a default.
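
As a rough sketch of what "somewhere to store information about the relationship" might look like, the linking entity can carry its own fields. The JoinedOn and Role properties, and the JoinEntityDemo class, are hypothetical additions for illustration and are not part of our actual model:

```csharp
using System;

public class Employee
{
    public Guid Id { get; set; }
    public string Name { get; set; }
}

public class Team
{
    public Guid Id { get; set; }
    public string Name { get; set; }
}

// The linking entity: one row per Team/Employee association.
public class TeamEmployee
{
    public Guid Id { get; set; }
    public Employee Employee { get; set; }
    public Team Team { get; set; }
    public bool Deleted { get; set; }

    // Hypothetical facts about the relationship itself.
    public DateTime JoinedOn { get; set; }
    public string Role { get; set; }

    public static TeamEmployee Create(Employee employee, Team team, DateTime joinedOn, string role)
    {
        return new TeamEmployee
        {
            Id = Guid.NewGuid(),
            Employee = employee,
            Team = team,
            Deleted = false,
            JoinedOn = joinedOn,
            Role = role
        };
    }
}

public static class JoinEntityDemo
{
    public static void Main()
    {
        var employee = new Employee { Id = Guid.NewGuid(), Name = "Alice" };
        var team = new Team { Id = Guid.NewGuid(), Name = "Support" };

        var link = TeamEmployee.Create(employee, team, new DateTime(2013, 6, 27), "Lead");

        Console.WriteLine(link.Employee.Name + " joined " + link.Team.Name + " as " + link.Role);
    }
}
```

A plain many-to-many mapping has nowhere to put fields like these, nor the Deleted flag itself.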

Fortunately, due to our rightfully paranoid auditing and event logging, all customer records were retrieved and the data was returned to its expected state.
