EF Core 6 – Handling JSON data in SQL Server

We have seen a few projects where the client insists on using SQL Server, but their workload generates a bunch of semi-structured JSON data for us to store. Now we need to bridge the gap and efficiently query it. One realistic case study where this situation may pop up would be building a browser extension that helps capture and organise wish lists from various platforms such as Amazon or eBay.

Let’s build a wish list

Our journey would likely start with a web browser. We may have a browser extension of some sort that would send information about an item we’re currently watching to an API. That API will have to make sense of all different attributes each website provides and try store it in a way that makes querying as easy and efficient as possible.

The catch is, we don’t know full list of attributes. And, probably, we should not care – perfect data model for frequently changing e-commerce websites would be hard to deduce and likely very painful to maintain. We could just lump it all together and chuck it into a schema-less data store like Cosmos DB or Mongo, but for the sake of argument we’ll imagine the client was adamant they needed a SQL Server because it’s a business decision they made a long time ago.

Not a huge drama

SQL Server has functions to work with JSON data – all we have to do is put it into a sufficiently big column. We can save heaps of time creating one simple data model and calling it a day.

Moving on to coding, let us focus on data retrieval and skip the API part for now – it’s a whole different can or worms and will distract us from our point. From here on we’ll be exploring ways to query the data

Test Bench

We’ve created a small EF Core project where DBContext looks something like this:

class SHDbContext : DbContext
{
	public DbSet<WishlistEntry> WishlistEntries { get; set; }

	protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
	{
		base.OnConfiguring(optionsBuilder);
		optionsBuilder.UseSqlServer(Environment.GetEnvironmentVariable("DB_CONNSTR")).LogTo(Console.WriteLine);
	}
}

where WishlistEntry is defined like so:

class WishlistEntry
{

	public int Id { get; set; }
	public int CustomerId { get; set; }
	[MaxLength(300)]
	public string ItemSource { get; set; }
	public string ItemData { get; set; }

	[DatabaseGenerated(DatabaseGeneratedOption.Computed)]
	public string ItemCategory { get; set; }
}

class ItemData
{
	public string Sku { get; set; }
	public string Name { get; set; }
	public List<Category> Category { get; set; }
}

class Category
{
	public string Id { get; set; }
	public string Name { get; set; }
}

with setup out of the way, let us define a sample entry and write a query to pick all entries that contain a category “Connected Home & Housewares”:

{
  "sku": 185267,
  "name": "Duracell - D Batteries (4-Pack)",
  "type": "HardGood",
  "price": 9.99,
  "upc": "041333430010",
  "category": [
    {
      "id": "pcmcat312300050015",
      "name": "Connected Home & Housewares"
    },
    {
      "id": "pcmcat248700050021",
      "name": "Housewares"
    },
    {
      "id": "pcmcat303600050001",
      "name": "Household Batteries"
    },
    {
      "id": "abcat0208002",
      "name": "Alkaline Batteries"
    }
  ],
  "shipping": 5.99,
  "description": "Compatible with select electronic devices; D size; DURALOCK Power Preserve technology; 4-pack",
  "manufacturer": "Duracell",
  "model": "MN1300R4Z",
  "url": "http://www.bestbuy.com/site/duracell-d-batteries-4-pack/185267.p?id=1051384046551&skuId=185267&cmp=RMXCC",
  "image": "http://img.bbystatic.com/BestBuy_US/images/products/1852/185267_sa.jpg"
}

Option one – JsonConvert.Deserialize

The naive approach to the desired outcome would be to simply try and deserialise the field and pray Entity Framework would work its magic:

var context = new SHDbContext();
//var sql = context.WishlistEntries.Select(wle => JsonConvert.DeserializeObject<ItemData>(wle.ItemData)
//                                                .Category.Any(c => c.Name == "Connected Home & Housewares")); //The LINQ expression 'c => c.Name == "Connected Home & Housewares"' could not be translated. 

Unfortunately, this would not even compile as EF has no idea what to do with JsonConvert. We’ll end up having to materialise the query and handle filtering on the client side:

var data = context.WishlistEntries.Select(wle => JsonConvert.DeserializeObject<ItemData>(wle.ItemData))
                                .AsEnumerable()
                                .Where(d => d.Category.Any(c => c.Name == "Connected Home & Housewares"))
                                .ToList();

This will work as expected, and some developers will probably stop here, but this is in fact a time bomb. When our wish lists get big enough, the clients will put a serious strain on SQL Server and the network fetching the whole table in its entirety. Not nice.

Option two – write a Value Converter

Technically this is a variation of the previous option but leveraging EF Core Value Conversion capability for nicer looking query. To enable this, we’ll override OnModelCreating and add a few more lines to DB Context:

class SHDbContext : DbContext
{
	public DbSet<WishlistEntry> WishlistEntries { get; set; }

	protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
	{
		base.OnConfiguring(optionsBuilder);
		optionsBuilder.UseSqlServer(Environment.GetEnvironmentVariable("DB_CONNSTR")).LogTo(Console.WriteLine);
	}

	protected override void OnModelCreating(ModelBuilder modelBuilder)
	{
		base.OnModelCreating(modelBuilder);
		modelBuilder.Entity<WishlistEntry>()
		            .Property(e => e.ItemData)
		            .HasConversion((itemData) => JsonConvert.SerializeObject(itemData), str => JsonConvert.DeserializeObject<ItemData>(str));
	}
}
// ...
// finally, use the new converter like so:
var data = context.WishlistEntries.AsEnumerable().Where(d => d.ItemData.Category.Any(c => c.Name == "Connected Home & Housewares")).ToList();
 // also not ideal as we still have to filter on client side, effectively scanning all tabl

This simply shifts the JSON conversion code up the food chain, so developers won’t see it often. Unfortunately, this “out of sight – out of mind” approach won’t improve our query plan.

Option three – JSON_VALUE function

It gets pretty clear at this stage that we need to be able to filter by json fields on SQL Server side. Luckily, Microsoft provides a few json-handling capabilities. JSON_VALUE function that allows extracting one value out of the blob via given path selector. This is going to eliminate the need to land our query and should hugely improve performance. Entity Framework does not support this function by default, but it should be very easy to add:

// in the context class
 // adding static function to the DB Context
[DbFunction("JSON_VALUE", IsBuiltIn = true, IsNullable = false)]
public static string JsonValue(string expression, string path) => throw new NotImplementedException();

our query then becomes something like this:

var data = context.WishlistEntries.Where(d => SHDbContext.JsonValue(d.ItemData, "$.category[0].name").Contains("Connected Home & Housewares")).ToList();

And this is the correct answer. However, we can further improve performance by adding an index. Indexing the whole json blob will not help, but we can create a computed column with the value we care about and index that instead:

-- add computed column
ALTER TABLE WishlistEntries ADD ItemCategory AS JSON_VALUE(ItemData, '$.category[0].name')
-- index the value
CREATE INDEX IX_ItemCategory ON WishlistEntries(ItemCategory) INCLUDE(CustomerId, ItemSource, ItemData);
-- finally, test if the query picks up index by running the following and checking the Actual Execution Plan:
/* test queries */
SELECT * FROM [WishlistEntries] WHERE JSON_VALUE(ItemData, '$.category[0].name') = 'Connected Home & Housewares'

This enhancement also happens to simplify our code a bit as we can now rely on computed column value when writing queries in EF:

// add computed column to WishlistEntry
class WishlistEntry
{

	public int Id { get; set; }
	public int CustomerId { get; set; }
	[MaxLength(300)]
	public string ItemSource { get; set; }
	public string ItemData { get; set; } 

	[DatabaseGenerated(DatabaseGeneratedOption.Computed)]
	public string ItemCategory { get; set; }
}

// finally the query will be like:
var data = context.WishlistEntries.Where(wle => wle.ItemCategory == "Connected Home & Housewares").ToList();

Conclusion and area for improvement

Keen eyed readers may have noticed that we hardcoded query to first array item. This was a deliberate simplification we opted for to illustrate the point. It should be possible to query an array using OPENJSON … CROSS APPLY like so, but we haven’t explored this avenue far enough to see if we’d ultimately succeed:

var data = context.WishlistEntries.FromSqlRaw(@"SELECT TOP 1 wishlist.*
      FROM [WishlistEntries] as wishlist
CROSS APPLY OPENJSON(ItemData, '$.category') WITH (CatName VARCHAR(200) '$.name') as categories
WHERE categories.CatName = {0}", "Connected Home & Housewares").ToList();

Even though our application is far from complete we are in a good place. Equipped with these techniques, we can handle almost anything users throw at us and are still able to query data fast enough. The code for this exercise is available on GitHub.

EF Core 6 – custom functions with DbFunction Attribute

We’ve already looked at way to implement SQL functions via method translation. That went reasonably well, but next time we had to do something similar we discovered that our code is broken with newer versions of EF Core. We fixed it again.

Not anymore

Looking through changelogs, we noticed that EF Core 2.0 came with support for mapping scalar functions. It is remarkably simple to set up:

public class MyDbContext : DbContext
    {

        [DbFunction("DECRYPTBYPASSPHRASE", IsBuiltIn = true, IsNullable = false)]
        public static byte[] DecryptByPassphrase(string pass, byte[] ciphertext) => throw new NotImplementedException();

        [DbFunction("DECRYPTBYKEY", IsBuiltIn = true, IsNullable = false)]
        public static byte[] DecryptByKey(byte[] ciphertext) => throw new NotImplementedException();
...

and even easier to use:

var filteredSet = Set
                .Select(m => new Model
                {
                    Id = m.Id,
                    Decrypted = MyDbContext.DecryptByPassphrase("TestPassword", m.Encrypted).ToString(),
                    Decrypted2 = MyDbContext.DecryptByKey(m.Encrypted2).ToString(), // since the key's opened for session scope - just relying on it should do the trick
                }).ToList();

Initially the attribute was offering limited configuration options, but starting EF Core 5.0, this is not an issue.

One gotcha with DECRYPT* functions is they return varbinary. Trying to use our own EF.Functions.ConvertToVarChar is not going to work since we disabled custom plugins. We want to get rid of this code after all. But Apparently .ToString() works as intended:

SELECT [m].[Id], CONVERT(varchar(100), DECRYPTBYPASSPHRASE(N'TestPassword', [m].[Encrypted])) AS [Decrypted], CONVERT(varchar(100), DECRYPTBYKEY([m].[Encrypted2])) AS [Decrypted2], [t].[Id], [t].[IsSomething], [m].[Encrypted], [m].[Encrypted2]...

Full example source is in GitHub, along with other takes we decided to leave in place for history.

Conclusion

Defining custom EF functions was one of the biggest articles we wrote here. And finding out how to fit it together probably was the most challenging and time-consuming project we undertook in recorded history. One can say we totally wasted our time, but I’d like to draw a different conclusion. We had fun, learned something new and were able to appreciate the complexity behind Entity Framework – it is not just an engineering marvel – it is also a magical beast!

Azure Functions – OpenAPI + EF Core = 💥

Creating Swagger-enabled Azure Functions is not that hard to do. Visual Studio literally comes with a template for that:

Inspecting the newly created project we see that it comes down to one NuGet package. It magically hooks into IWebJobsStartup and registers additional routes for Swagger UI and OpenAPI document. When run, it reflects upon suitable entry points in the assembly and builds required responses on the fly. Elegant indeed.

Installing Entity Framework

Now, suppose, we need to talk to Azure SQL. So, we’d like to add EF Core to the mix. As much as we love to go for latest and greatest, unfortunately it’s a bit messy at the moment. Instead let’s get a bit more conservative and stick to EFCore 3.1.

We did not expect that, did we?

The error message is pretty clear: the assembly somehow did not get copied to the output location. And indeed, the file was missing:

Apparently when VS builds the function, it makes a second copy of the libraries it thinks are required. And in our case, it decided it’s not picking up the dependency. Adding <_FunctionsSkipCleanOutput>true</_FunctionsSkipCleanOutput> to the project file will fix that:

Are there yet?

Probably, but there’s a catch: our deployment package just got bigger. Alternatively, we could downgrade EF Core to 3.1.13 which happens to use the same version of Microsoft.Extensions.Logging.Abstractions. This way we’d avoid having to hack project files at expense or limiting ourselves to an older version of EF Core. Ultimately, we hope OpenAPI extension picks up the slack and goes GA soon. For now, looks like we’ll have to stick to it.

EF Core 6 – does our IMethodCallTranslator still work?

A year and a half ago we posted an article on how we were able to plug into EF Core pipeline and inject our own IMethodCallTranslator. That let us leverage SQL-native encryption functionality with EF Core 3.1 and was ultimately a win. A lot has changed in the ecosystem, we’ve got .NET 5 and and .NET 6 coming up soon. So, we could not help but wonder…

Will it work with EF Core 6?

Apparently, EF6 is mostly an evolutionary step over EF5. That said, we totally missed previous version. So it is unclear to what extent the EF team has reworked their internal APIs. Most of the extensibility points we used were internal and clearly marked as “not for public consumption”. With that in mind, our concerns seemed valid.

Turns out, the code needs change…

The first issue we needed to rectify was implementing ShouldUseSameServiceProvider: from what I can tell, it’s needed to cache services more efficiently, but in our case setting it to default value seems to make sense.

But that’s where things really went sideways:

Apparently adding our custom IDbContextOptionsExtension resets the cache and by the time EF arrives at Model initialisation, instance of DI container gets wiped, leaving us with a bunch of null references (including the one above).

One line fix

I am still unsure why EF so upset when we add new extension. Stepping through the code would likely provide me with the answer but I feel it’s not worth the effort. Playing around with service scopes I however noticed that many built-in services get registered using different extension method with Scoped lifecycle. This prompted me to try change my registration method signature and voila:

And as usual, fully functioning code sits on GitHub.

EF core 3.1: dynamic GroupBy clause

Expanding on many ways we can build dynamic clauses for EF execution, let us look at another example. Suppose we’d like to give our users ability to group items based on a property of their choice.

As usual, our first choice would be to build a LINQ expression as we did with the WHERE case. There’s however a problem: GroupBy needs an object to use as a grouping key. Usually, we’d just make an anonymous type, but it is a compile-time luxury we don’t get with LINQ:

dbSet.GroupBy(s => new {s.Col1, s.Col2}); // not going to fly :/

IL Emit it is then

So, it seems we are left with no other choice but to go through TypeBuilder ordeal (which isn’t too bad really). One thing to point out here – we want to create properties that EF will later use for grouping key. This is where ability to interrogate EF as-built model comes very handy. Another important point – creating a property in fact means creating a private backing field and two special methods for each. We certainly started to appreciate how much the language and runtime do for us:

private void CreateProperty(TypeBuilder typeBuilder, string propertyName, Type propertyType)
{
 // really, just generating "public PropertyType propertyName {get;set;}"
	FieldBuilder fieldBuilder = typeBuilder.DefineField("_" + propertyName, propertyType, FieldAttributes.Private);

	PropertyBuilder propertyBuilder = typeBuilder.DefineProperty(propertyName, PropertyAttributes.HasDefault, propertyType, null);
	MethodBuilder getPropMthdBldr = typeBuilder.DefineMethod("get_" + propertyName, MethodAttributes.Public | MethodAttributes.SpecialName | MethodAttributes.HideBySig, propertyType, Type.EmptyTypes);
	ILGenerator getIl = getPropMthdBldr.GetILGenerator();

	getIl.Emit(OpCodes.Ldarg_0);
	getIl.Emit(OpCodes.Ldfld, fieldBuilder);
	getIl.Emit(OpCodes.Ret);

	MethodBuilder setPropMthdBldr = typeBuilder.DefineMethod("set_" + propertyName,
		  MethodAttributes.Public |
		  MethodAttributes.SpecialName |
		  MethodAttributes.HideBySig,
		  null, new[] { propertyType });

	ILGenerator setIl = setPropMthdBldr.GetILGenerator();
	Label modifyProperty = setIl.DefineLabel();
	Label exitSet = setIl.DefineLabel();

	setIl.MarkLabel(modifyProperty);
	setIl.Emit(OpCodes.Ldarg_0);
	setIl.Emit(OpCodes.Ldarg_1);
	setIl.Emit(OpCodes.Stfld, fieldBuilder);

	setIl.Emit(OpCodes.Nop);
	setIl.MarkLabel(exitSet);
	setIl.Emit(OpCodes.Ret);

	propertyBuilder.SetGetMethod(getPropMthdBldr);
	propertyBuilder.SetSetMethod(setPropMthdBldr);
}

after we’ve sorted this out – it’s pretty much the same approach as with any other dynamic LINQ expression: we need to build something like DbSet.GroupBy(s => new dynamicType {col1 = s.q1, col2 = s.q2}). There’s a slight issue with this lambda however – it returns IGrouping<dynamicType, TElement> – and since outside code has no idea of the dynamic type – there’s no easy way to work with it (unless we want to keep reflecting). I thought it might be easier to build a Select as well and return a Count against each instance of dynamic type. Luckily, we only needed a count, but other aggregations work in similar fashion.

Finally

I arrived at the following code to generate required expressions:

public static IQueryable<Tuple<object, int>> BuildExpression<TElement>(this IQueryable<TElement> source, DbContext context, List<string> columnNames)
{
	var entityParameter = Expression.Parameter(typeof(TElement));
	var sourceParameter = Expression.Parameter(typeof(IQueryable<TElement>));

	var model = context.Model.FindEntityType(typeof(TElement)); // start with our own entity
	var props = model.GetPropertyAccessors(entityParameter); // get all available field names including navigations			

	var objectProps = new List<Tuple<string, Type>>();
	var accessorProps = new List<Tuple<string, Expression>>();
	var groupKeyDictionary = new Dictionary<object, string>();
	foreach (var prop in props.Where(p => columnNames.Contains(p.Item3)))
	{
		var propName = prop.Item3.Replace(".", "_"); // we need some form of cross-reference, this seems to be good enough
		objectProps.Add(new Tuple<string, Type>(propName, (prop.Item2 as MemberExpression).Type));
		accessorProps.Add(new Tuple<string, Expression>(propName, prop.Item2));
	}

	var groupingType = BuildGroupingType(objectProps); // build new type we'll use for grouping. think `new Test() { A=, B=, C= }`

	// finally, we're ready to build our expressions
	var groupbyCall = BuildGroupBy<TElement>(sourceParameter, entityParameter, accessorProps, groupingType); // source.GroupBy(s => new Test(A = s.Field1, B = s.Field2 ... ))
	var selectCall = groupbyCall.BuildSelect<TElement>(groupingType); // .Select(g => new Tuple<object, int> (g.Key, g.Count()))
	
	var lambda = Expression.Lambda<Func<IQueryable<TElement>, IQueryable<Tuple<object, int>>>>(selectCall, sourceParameter);
	return lambda.Compile()(source);
}

private static MethodCallExpression BuildSelect<TElement>(this MethodCallExpression groupbyCall, Type groupingAnonType) 
{	
	var groupingType = typeof(IGrouping<,>).MakeGenericType(groupingAnonType, typeof(TElement));
	var selectMethod = QueryableMethods.Select.MakeGenericMethod(groupingType, typeof(Tuple<object, int>));
	var resultParameter = Expression.Parameter(groupingType);

	var countCall = BuildCount<TElement>(resultParameter);
	var resultSelector = Expression.New(typeof(Tuple<object, int>).GetConstructors().First(), Expression.PropertyOrField(resultParameter, "Key"), countCall);

	return Expression.Call(selectMethod, groupbyCall, Expression.Lambda(resultSelector, resultParameter));
}

private static MethodCallExpression BuildGroupBy<TElement>(ParameterExpression sourceParameter, ParameterExpression entityParameter, List<Tuple<string, Expression>> accessorProps, Type groupingAnonType) 
{
	var groupByMethod = QueryableMethods.GroupByWithKeySelector.MakeGenericMethod(typeof(TElement), groupingAnonType);
	var groupBySelector = Expression.Lambda(Expression.MemberInit(Expression.New(groupingAnonType.GetConstructors().First()),
			accessorProps.Select(op => Expression.Bind(groupingAnonType.GetMember(op.Item1)[0], op.Item2))
		), entityParameter);

	return Expression.Call(groupByMethod, sourceParameter, groupBySelector);
}

private static MethodCallExpression BuildCount<TElement>(ParameterExpression resultParameter)
{
	var asQueryableMethod = QueryableMethods.AsQueryable.MakeGenericMethod(typeof(TElement));
	var countMethod = QueryableMethods.CountWithoutPredicate.MakeGenericMethod(typeof(TElement));

	return Expression.Call(countMethod, Expression.Call(asQueryableMethod, resultParameter));
}

And the full working version is on my GitHub

Entity Framework Core 3.1 – dynamic WHERE clause

Every now and then we get tasked with building a backend for filtering arbitrary queries. Usually clients would like to have a method of sending over an array of fields, values, and comparisons operations in order to retrieve their data. For simplicity we’ll assume that all conditions are joining each other with an AND operator.

public class QueryableFilter {
    public string Name { get; set; }
    public string Value { get; set; }
    public QueryableFilterCompareEnum? Compare { get; set; }
}

With a twist

There’s however one slight complication to this problem – filters must apply to fields on dependent entities (possible multiple levels of nesting as well). This can become a problem not only because we’d have to traverse model hierarchy (we’ll touch on that later), but also because of ambiguity this requirement introduces. Sometimes we’re lucky to only have unique column names across the hierarchy. However more often than not this needs to be resolved one way or another. We can, for example, require filter fields to use dot notation so we know which entity each field relates to. For example, Name -eq "ACME Ltd" AND Name -eq "Cloud Solutions" becomes company.Name -eq "ACME Ltd" AND team.Name -eq "Cloud Solutions"

Building an expression

It is pretty common that clients already have some sort of data querying service with EF Core doing the actual database comms. And since EF relies on LINQ Expressions a lot – we can build required filters dynamically.

public static IQueryable<T> BuildExpression<T>(this IQueryable<T> source, DbContext context, string columnName, string value, QueryableFilterCompareEnum? compare = QueryableFilterCompareEnum.Equal)
{
	var param = Expression.Parameter(typeof(T));

	// Get the field/column from the Entity that matches the supplied columnName value
	// If the field/column does not exists on the Entity, throw an exception; There is nothing more that can be done
	MemberExpression dataField;
	
	var model = context.Model.FindEntityType(typeof(T)); // start with our own entity
	var props = model.GetPropertyAccessors(param); // get all available field names including navigations
	var reference = props.First(p => RelationalPropertyExtensions.GetColumnName(p.Item1) == columnName); // find the filtered column - you might need to handle cases where column does not exist

	dataField = reference.Item2 as MemberExpression; // we happen to already have correct property accessors in our Tuples	

	ConstantExpression constant = !string.IsNullOrWhiteSpace(value)
		? Expression.Constant(value.Trim(), typeof(string))
		: Expression.Constant(value, typeof(string));

	BinaryExpression binary = GetBinaryExpression(dataField, constant, compare);
	Expression<Func<T, bool>> lambda = (Expression<Func<T, bool>>)Expression.Lambda(binary, param);
	return source.Where(lambda);
}

Most of the code above is pretty standard for building property accessor lambdas, but GetPropertyAccessors is the key:

private static IEnumerable<Tuple<IProperty, Expression>> GetPropertyAccessors(this IEntityType model, Expression param)
{
	var result = new List<Tuple<IProperty, Expression>>();

	result.AddRange(model.GetProperties()
								.Where(p => !p.IsShadowProperty()) // this is your chance to ensure property is actually declared on the type before you attempt building Expression
								.Select(p => new Tuple<IProperty, Expression>(p, Expression.Property(param, p.Name)))); // Tuple is a bit clunky but hopefully conveys the idea

	foreach (var nav in model.GetNavigations().Where(p => p is Navigation))
	{
		var parentAccessor = Expression.Property(param, nav.Name); // define a starting point so following properties would hang off there
		result.AddRange(GetPropertyAccessors(nav.ForeignKey.PrincipalEntityType, parentAccessor)); //recursively call ourselves to travel up the navigation hierarchy
	}

	return result;
}

this is where we interrogate EF as-built data model, traverse navigation properties and recursively build a list of all properties we can ever filter on!

Testing it out

Talk is cheap, let’s run a complete example here:

public class Entity
{
	public int Id { get; set; }
}
class Company : Entity
{
	public string CompanyName { get; set; }
}

class Team : Entity
{
	public string TeamName { get; set; }
	public Company Company { get; set; }
}

class Employee : Entity
{
	public string EmployeeName { get; set; }
	public Team Team { get; set; }
}

class DynamicFilters<T> where T : Entity
{
	private readonly DbContext _context;

	public DynamicFilters(DbContext context)
	{
		_context = context;
	}

	public IEnumerable<T> Filter(IEnumerable<QueryableFilter> queryableFilters = null)
	{
		IQueryable<T> mainQuery = _context.Set<T>().AsQueryable().AsNoTracking();
		// Loop through the supplied queryable filters (if any) to construct a dynamic LINQ-to-SQL queryable
		foreach (var filter in queryableFilters ?? new List<QueryableFilter>())
		{
			mainQuery = mainQuery.BuildExpression(_context, filter.Name, filter.Value, filter.Compare);
		}

		mainQuery = mainQuery.OrderBy(x => x.Id);

		return mainQuery.ToList();
	}
}
// --- DbContext
class MyDbContext : DbContext
{
	public DbSet<Company> Companies { get; set; }
	public DbSet<Team> Teams { get; set; }
	public DbSet<Employee> Employees { get; set; }

	protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
	{
		optionsBuilder.UseSqlServer("Server=.\\SQLEXPRESS;Database=test;Trusted_Connection=true");
		base.OnConfiguring(optionsBuilder);
	}
}
// ---
static void Main(string[] args)
{
	var context = new MyDbContext();
	var someTableData = new DynamicFilters<Employee>(context).Filter(new
	List<QueryableFilter> { new QueryableFilter { Name = "CompanyName", Value = "ACME Ltd" }, new QueryableFilter { Name = "TeamName", Value = "Cloud Solutions" } });
}

The above block should produce following SQL:

SELECT [e].[Id], [e].[EmployeeName], [e].[TeamId]
FROM [Employees] AS [e]
LEFT JOIN [Teams] AS [t] ON [e].[TeamId] = [t].[Id]
LEFT JOIN [Companies] AS [c] ON [t].[CompanyId] = [c].[Id]
WHERE [c].[CompanyName] = N'ACME Ltd'
 AND [t].[TeamName] = N'Cloud Solutions'
ORDER BY [e].[Id]

Entity Framework Core 3.1 – Peeking Into Generated SQL

Writing LINQ that produces optimal SQL can be even harder as developers often don’t have visibility into the process. It becomes even more confusing when the application is designed to run against different databases.

We often find ourselves questioning whether this particular query will fall in line with our expectations. And until not so long ago our tool of choice was a SQL Profiler, that ships with SQL Server. It’s plenty powerful but has one flaw – it pretty much requires the SQL Server installation. This might be a deal breaker for some clients using other DBs, like Postgres or MySQL (which are all supported by the way).

EF to the resque

Instead of firing off the profiler and fishing out the batches, we could have Entity Framework itself pass us the result. After all, it needs to build SQL before sending it off the the database, so all we have to do it to ask nicely. Stack Overflow is quite helpful here:

public static class IQueryableExtensions // this is the EF Core 3.1 version.
    {
        public static string ToSql<TEntity>(this IQueryable<TEntity> query) where TEntity : class
        {
            var enumerator = query.Provider.Execute<IEnumerable<TEntity>>(query.Expression).GetEnumerator();
            var relationalCommandCache = enumerator.Private("_relationalCommandCache");
            var selectExpression = relationalCommandCache.Private<SelectExpression>("_selectExpression");
            var factory = relationalCommandCache.Private<IQuerySqlGeneratorFactory>("_querySqlGeneratorFactory");

            var sqlGenerator = factory.Create();
            var command = sqlGenerator.GetCommand(selectExpression);

            string sql = command.CommandText;
            return sql;
        }

        private static object Private(this object obj, string privateField) => obj?.GetType().GetField(privateField, BindingFlags.Instance | BindingFlags.NonPublic)?.GetValue(obj);
        private static T Private<T>(this object obj, string privateField) => (T)obj?.GetType().GetField(privateField, BindingFlags.Instance | BindingFlags.NonPublic)?.GetValue(obj);
    }

The usage is simple

Suppose we’ve got the following inputs: One simple table, that we’d like to group by one field and total by another. Database Context is also pretty much boilerplate. One thing to note here is a couple of database providers we are going to try the query against.

public class SomeTable
{
    public int Id { get; set; }
    public int Foobar { get; set; }
    public int Quantity { get; set; }
}

class MyDbContext : DbContext
{
    public DbSet<SomeTable> SomeTables { get; set; }
    public static readonly LoggerFactory DbCommandConsoleLoggerFactory
        = new LoggerFactory(new[] {
        new ConsoleLoggerProvider ((category, level) =>
            category == DbLoggerCategory.Database.Command.Name &&
            level == LogLevel.Trace, true)
        });
    protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
    {
        //run with SQL Server provider to get T-SQL
        optionsBuilder.UseNpgsql("Server=localhost;Port=5432;Database=test;User Id=;Password=;")
        //alternatively use other supported provider
        //optionsBuilder.UseSqlServer("Server=.\\SQLEXPRESS;Database=test;Trusted_Connection=true")
        ;
        base.OnConfiguring(optionsBuilder);
    }
}

The test bench would look something like so

class Program
{
    static void Main(string[] args)
    {

        var context = new MyDbContext();
        var someTableData = context.SomeTables
                .GroupBy(x => x.Foobar)
                .Select(x => new { Foobar = x.Key, Quantity = x.Sum(y => y.Quantity) })
                .OrderByDescending(x => x.Quantity)
                .Take(10) // we've built our query as per normal
                .ToSql(); // this is the magic
        Console.Write(someTableData);
        Console.ReadKey();
    }
}

And depending on our choice of provider the output would show ef core generated sql for SQL Server and Postgres

        -- MSSQL
        SELECT TOP(@__p_0) [s].[Foobar], SUM([s].[Quantity]) AS [Quantity]
        FROM [SomeTables] AS [s]
        GROUP BY [s].[Foobar]
        ORDER BY SUM([s].[Quantity]) DESC

        -- PG SQL
         SELECT s."Foobar", SUM(s."Quantity")::INT AS "Quantity"
        FROM "SomeTables" AS s
        GROUP BY s."Foobar"
        ORDER BY SUM(s."Quantity")::INT DESC
        LIMIT @__p_0

Entity Framework Core 3 – Custom Functions (Using IMethodCallTranslator)

Every now and then Stack Overflow provides fantastic opportunities to learn something new. One user asked whether SQL Server’s DECRYPTBYPASSPHRASE can be implemented with Entity Framework Core 2.2 so they can fetch encrypted strings in SQL.

Continue reading “Entity Framework Core 3 – Custom Functions (Using IMethodCallTranslator)”

EF Core 3: Getting model metadata from dynamically loaded assembly with IL Emit

Yet another Stack Overflow question has sparked a heated discussion and got us thinking whether we can do better.

In a nutshell, the question was about finding a way to query EF Core model metadata without directly referencing the assembly that defines it. Think MsBuild Task that needs to check if your model is following your company standards. Or a test of some sort.

First stab at it

We were able to help the OP by quickly whipping up the following loader code:

var assembly = Assembly.LoadFrom(@"C:\OnlineShoppingStore\bin\Debug\netcoreapp2.2\OnlineShoppingStore.dll");
var contextType = assembly.GetTypes().First(d => d.Name == "OnlineStoreDbContext");
var ctx = Activator.CreateInstance(contextType) as DbContext; // instantiate your context. this will effectively build your model, so you must have all required EF references in your project
var p = ctx.Model.FindEntityType(assembly.GetTypes().First(d => d.Name == "Product")); // get the type from loaded assembly
//var p = ctx.Model.FindEntityType("OnlineStoreDbContext.Product"); // querying model by type name also works, but you'd need to correctly qualify your type names
var pk = p.FindPrimaryKey().Properties.First().Name; // your PK property name as built by EF model

The answer ended up being accepted, but the OP had a bit of an issue with instantiating the Context:

System.InvalidOperationException: 'No database provider has been configured for this DbContext. 
A provider can be configured by overriding the DbContext.OnConfiguring method or by using AddDbContext on the application service provider. 
If AddDbContext is used, then also ensure that your DbContext type accepts a DbContextOptions object in its constructor and passes it to the base constructor for DbContext.

This is kind of expected: when EF creates the context it will invoke OnConfiguring override and set up DB provider with connection strings and so on and so forth. It all is necessary for the actual thing to run, but for the OP it meant having to drag all DB providers into the test harness. Not ideal.

The idea

After a bit back and forth I’ve got an idea. What if we subclass the Context yet again and override the OnConfiguring with a predefined Provider (say, InMemory)?

IL Emit all things

We don’t get to use IL Emit often – it’s meant for pretty specific use cases and I think this is one. The key to getting it right in our case was finding the correct overload of UseInMemoryDatabase. There’s a chance however, that you might need to tweak it to suit your needs. It is pretty trivial once you know what you’re looking for.

public static MethodBuilder OverrideOnConfiguring(this TypeBuilder tb)
        {
            MethodBuilder onConfiguringMethod = tb.DefineMethod("OnConfiguring",
                MethodAttributes.Public
                | MethodAttributes.HideBySig
                | MethodAttributes.NewSlot
                | MethodAttributes.Virtual,
                CallingConventions.HasThis,
                null,
                new[] { typeof(DbContextOptionsBuilder) });

            // the easiest method to pick will be .UseInMemoryDatabase(this DbContextOptionsBuilder optionsBuilder, string databaseName, Action<InMemoryDbContextOptionsBuilder> inMemoryOptionsAction = null)
            // but since constructing generic delegate seems a bit too much effort we'd rather filter everything else out
            var useInMemoryDatabaseMethodSignature = typeof(InMemoryDbContextOptionsExtensions)
                .GetMethods()
                .Where(m => m.Name == "UseInMemoryDatabase")
                .Where(m => m.GetParameters().Length == 3)
                .Where(m => m.GetParameters().Select(p => p.ParameterType).Contains(typeof(DbContextOptionsBuilder)))
                .Where(m => m.GetParameters().Select(p => p.ParameterType).Contains(typeof(string)))
                .Single();
            
            // emits the equivalent of optionsBuilder.UseInMemoryDatabase("test");
            var gen = onConfiguringMethod.GetILGenerator();
            gen.Emit(OpCodes.Ldarg_1);
            gen.Emit(OpCodes.Ldstr, Guid.NewGuid().ToString());
            gen.Emit(OpCodes.Ldnull);
            gen.Emit(OpCodes.Call, useInMemoryDatabaseMethodSignature);
            gen.Emit(OpCodes.Pop);
            gen.Emit(OpCodes.Ret);

            return onConfiguringMethod;
        }

with the above out of the way we now can build our dynamic type and plug it into our test harness!

class Program
    {
        static void Main(string[] args)
        {
            // load assembly under test
            var assembly = Assembly.LoadFrom(@"..\ef-metadata-query\OnlineShoppingStore\bin\Debug\netcoreapp3.1\OnlineShoppingStore.dll");
            var contextType = assembly.GetTypes().First(d => d.Name == "OnlineStoreDbContext");

            // create yet another assembly that will hold our dynamically generated type
            var typeBuilder = AssemblyBuilder
                                .DefineDynamicAssembly(new AssemblyName(Guid.NewGuid().ToString()), AssemblyBuilderAccess.RunAndCollect)
                                .DefineDynamicModule(Guid.NewGuid() + ".dll")
                                .DefineType("InheritedDbContext", TypeAttributes.Public, contextType); // make new type inherit from DbContext under test!

            // this is the key here! now our dummy implementation will kick in!
            var onConfiguringMethod = typeBuilder.OverrideOnConfiguring();
            typeBuilder.DefineMethodOverride(onConfiguringMethod, typeof(DbContext).GetMethod("OnConfiguring", BindingFlags.Instance | BindingFlags.NonPublic));
            
            var inheritedDbContext = typeBuilder.CreateType(); // enough config, let's get the type and roll with it

            // instantiate inheritedDbContext with default OnConfiguring implementation
            var context = Activator.CreateInstance(inheritedDbContext) as DbContext; // instantiate your context. this will effectively build your model, so you must have all required EF references in your project
            var p = context?.Model.FindEntityType(assembly.GetTypes().First(d => d.Name == "Product")); // get the type from loaded assembly
            
            //query the as-built model
            //var p = ctx.Model.FindEntityType("OnlineStoreDbContext.Product"); // querying model by type name also works, but you'd need to correctly qualify your type names
            var pk = p.FindPrimaryKey().Properties.First().Name; // your PK property name as built by EF model
            
            Console.WriteLine(pk);
        }
    }

This is runnable

Source code is available on GitHub in case you want to check it out and play a bit