Alica's dev blog
Using Distinct operator on owned entities

EF Core’s owned entities don’t always work as you might expect - for example with Distinct operator.

Introduction

If you first want an intro to owned entities, you can start with my previous article.

Here, we will reuse the example from that article.

We have an owned entity Version:

[Owned]
public class Version
{
    public int Major { get; set; }

    public int Minor { get; set; }

    public int Patch { get; set; }
}

And an interface IEntityWithVersion that all entities owning Version implement:

public interface IEntityWithVersion
{
    Version Version { get; set; }
}

Problem

We have a table containing records of type T that implements IEntityWithVersion. It has an Id and also can have other columns, that is symbolized by ....

 Id | Version_Major | Version_Minor | Version_Patch | ...
----|---------------|---------------|---------------|-----
  1 |       1       |       0       |       0       | ...
  2 |       1       |       0       |       1       | ...
  3 |       1       |       0       |       1       | ...

We would like to write a query that returns List<T> and the items in the list are distinct by Version.

For example, a query on the following table should return { Id: 1, Version: 1.0.0 }, { Id: 2, Version: 1.0.1 }.

But it also could be { Id: 1, Version: 1.0.0 }, { Id: 3, Version: 1.0.1 } – we don’t care which row with Version : 1.0.1 is returned. We will take the first one by default.

What doesn’t work

All of the following attempts have the same principle:

  1. Select a distinct list of Versions.
  2. Join it with the list of rows.
  3. Take the first row for every Version.

Attempt 1: Start simply

Let’s start with the simplest version possible:

public static IQueryable<T> DistinctByVersion<T>(this IQueryable<T> items) where T : IEntityWithVersion
{
    return items
        .Select(x => x.Version)
        .Distinct()
        .Select(v => items
            .FirstOrDefault(x => x.Version == v));
}

Unfortunately, the execution of this query throws an exception and complains that it cannot be translated into SQL.

Attempt 2: Improve comparison

If we suspect that the previous attempt failed because of the x.Version == v comparison, then we can try to replace it with a more “primitive” one:

public static IQueryable<T> DistinctByVersion<T>(this IQueryable<T> items) where T : IEntityWithVersion
{
    return items
        .Select(x => x.Version)
        .Distinct()
        .Select(v => items
            .FirstOrDefault(x => x.Version.Major == v.Major
                                 && x.Version.Minor == v.Minor
                                 && x.Version.Patch == v.Patch));
}

This query is successfully executed, but doesn’t return the expected result – instead of two rows, it returns three, even though there are only two distinct Versions in our table.

A quick look on the generated SQL reveals the source of the problem:

SELECT DISTINCT "t"."Id", "t"."Version_Major", "t"."Version_Minor", "t"."Version_Patch"

There is the Id column in the query – that means that it will always return all rows because Id is a unique identifier. I am not exactly sure why this happens – but it does happen.

Attempt 3: Override Equals()

Maybe the Distinct function includes the Id column because a default Equals method is somehow different to what we would expect?

Let’s try to override it:

public bool Equals([AllowNull] Version version)
{
    return version != null &&
            Major == version.Major &&
            Minor == version.Minor &&
            Patch == version.Patch;
}

That doesn’t help either – the result is the same as in the previous attempt.

Attempt 4: Create a Comparer

Ok, so maybe we could try to explicitly pass a comparer to the Distinct function (it’s using the same logic as Equals above):

public static IQueryable<T> DistinctByVersion<T>(this IQueryable<T> items) where T : IEntityWithVersion
{
    return items
        .Select(x => x.Version)
        .Distinct(new VersionComparer())
        .Select(v => items
            .FirstOrDefault(x => x.Version.Major == v.Major
                                 && x.Version.Minor == v.Minor
                                 && x.Version.Patch == v.Patch));
}

Again, no success. EF Core doesn’t know how to translate the comparer’s logic to SQL, and we get System.InvalidOperationException.

What does work

Anonymous projection!

By calling Distinct on the collection of anonymous objects, we overcome the issue with Id added to the DISTINCT part of the SQL query.

public static IQueryable<T> DistinctByVersion<T>(this IQueryable<T> items) where T : IEntityWithVersion
{
    return items
        .Select(x => x.Version)
        .Select(v => new { v.Major, v.Minor, v.Patch })
        .Distinct()
        .Select(v => items
            .FirstOrDefault(x => x.Version.Major == v.Major
                                 && x.Version.Minor == v.Minor
                                 && x.Version.Patch == v.Patch));
}

Not the prettiest solution, but a solution :-)


Last modified on 2021-05-21