EF core 3.1: dynamic GroupBy clause

Expanding on many ways we can build dynamic clauses for EF execution, let us look at another example. Suppose we’d like to give our users ability to group items based on a property of their choice.

As usual, our first choice would be to build a LINQ expression as we did with the WHERE case. There’s however a problem: GroupBy needs an object to use as a grouping key. Usually, we’d just make an anonymous type, but it is a compile-time luxury we don’t get with LINQ:

dbSet.GroupBy(s => new {s.Col1, s.Col2}); // not going to fly :/

IL Emit it is then

So, it seems we are left with no other choice but to go through TypeBuilder ordeal (which isn’t too bad really). One thing to point out here – we want to create properties that EF will later use for grouping key. This is where ability to interrogate EF as-built model comes very handy. Another important point – creating a property in fact means creating a private backing field and two special methods for each. We certainly started to appreciate how much the language and runtime do for us:

private void CreateProperty(TypeBuilder typeBuilder, string propertyName, Type propertyType)
{
 // really, just generating "public PropertyType propertyName {get;set;}"
	FieldBuilder fieldBuilder = typeBuilder.DefineField("_" + propertyName, propertyType, FieldAttributes.Private);

	PropertyBuilder propertyBuilder = typeBuilder.DefineProperty(propertyName, PropertyAttributes.HasDefault, propertyType, null);
	MethodBuilder getPropMthdBldr = typeBuilder.DefineMethod("get_" + propertyName, MethodAttributes.Public | MethodAttributes.SpecialName | MethodAttributes.HideBySig, propertyType, Type.EmptyTypes);
	ILGenerator getIl = getPropMthdBldr.GetILGenerator();

	getIl.Emit(OpCodes.Ldarg_0);
	getIl.Emit(OpCodes.Ldfld, fieldBuilder);
	getIl.Emit(OpCodes.Ret);

	MethodBuilder setPropMthdBldr = typeBuilder.DefineMethod("set_" + propertyName,
		  MethodAttributes.Public |
		  MethodAttributes.SpecialName |
		  MethodAttributes.HideBySig,
		  null, new[] { propertyType });

	ILGenerator setIl = setPropMthdBldr.GetILGenerator();
	Label modifyProperty = setIl.DefineLabel();
	Label exitSet = setIl.DefineLabel();

	setIl.MarkLabel(modifyProperty);
	setIl.Emit(OpCodes.Ldarg_0);
	setIl.Emit(OpCodes.Ldarg_1);
	setIl.Emit(OpCodes.Stfld, fieldBuilder);

	setIl.Emit(OpCodes.Nop);
	setIl.MarkLabel(exitSet);
	setIl.Emit(OpCodes.Ret);

	propertyBuilder.SetGetMethod(getPropMthdBldr);
	propertyBuilder.SetSetMethod(setPropMthdBldr);
}

after we’ve sorted this out – it’s pretty much the same approach as with any other dynamic LINQ expression: we need to build something like DbSet.GroupBy(s => new dynamicType {col1 = s.q1, col2 = s.q2}). There’s a slight issue with this lambda however – it returns IGrouping<dynamicType, TElement> – and since outside code has no idea of the dynamic type – there’s no easy way to work with it (unless we want to keep reflecting). I thought it might be easier to build a Select as well and return a Count against each instance of dynamic type. Luckily, we only needed a count, but other aggregations work in similar fashion.

Finally

I arrived at the following code to generate required expressions:

public static IQueryable<Tuple<object, int>> BuildExpression<TElement>(this IQueryable<TElement> source, DbContext context, List<string> columnNames)
{
	var entityParameter = Expression.Parameter(typeof(TElement));
	var sourceParameter = Expression.Parameter(typeof(IQueryable<TElement>));

	var model = context.Model.FindEntityType(typeof(TElement)); // start with our own entity
	var props = model.GetPropertyAccessors(entityParameter); // get all available field names including navigations			

	var objectProps = new List<Tuple<string, Type>>();
	var accessorProps = new List<Tuple<string, Expression>>();
	var groupKeyDictionary = new Dictionary<object, string>();
	foreach (var prop in props.Where(p => columnNames.Contains(p.Item3)))
	{
		var propName = prop.Item3.Replace(".", "_"); // we need some form of cross-reference, this seems to be good enough
		objectProps.Add(new Tuple<string, Type>(propName, (prop.Item2 as MemberExpression).Type));
		accessorProps.Add(new Tuple<string, Expression>(propName, prop.Item2));
	}

	var groupingType = BuildGroupingType(objectProps); // build new type we'll use for grouping. think `new Test() { A=, B=, C= }`

	// finally, we're ready to build our expressions
	var groupbyCall = BuildGroupBy<TElement>(sourceParameter, entityParameter, accessorProps, groupingType); // source.GroupBy(s => new Test(A = s.Field1, B = s.Field2 ... ))
	var selectCall = groupbyCall.BuildSelect<TElement>(groupingType); // .Select(g => new Tuple<object, int> (g.Key, g.Count()))
	
	var lambda = Expression.Lambda<Func<IQueryable<TElement>, IQueryable<Tuple<object, int>>>>(selectCall, sourceParameter);
	return lambda.Compile()(source);
}

private static MethodCallExpression BuildSelect<TElement>(this MethodCallExpression groupbyCall, Type groupingAnonType) 
{	
	var groupingType = typeof(IGrouping<,>).MakeGenericType(groupingAnonType, typeof(TElement));
	var selectMethod = QueryableMethods.Select.MakeGenericMethod(groupingType, typeof(Tuple<object, int>));
	var resultParameter = Expression.Parameter(groupingType);

	var countCall = BuildCount<TElement>(resultParameter);
	var resultSelector = Expression.New(typeof(Tuple<object, int>).GetConstructors().First(), Expression.PropertyOrField(resultParameter, "Key"), countCall);

	return Expression.Call(selectMethod, groupbyCall, Expression.Lambda(resultSelector, resultParameter));
}

private static MethodCallExpression BuildGroupBy<TElement>(ParameterExpression sourceParameter, ParameterExpression entityParameter, List<Tuple<string, Expression>> accessorProps, Type groupingAnonType) 
{
	var groupByMethod = QueryableMethods.GroupByWithKeySelector.MakeGenericMethod(typeof(TElement), groupingAnonType);
	var groupBySelector = Expression.Lambda(Expression.MemberInit(Expression.New(groupingAnonType.GetConstructors().First()),
			accessorProps.Select(op => Expression.Bind(groupingAnonType.GetMember(op.Item1)[0], op.Item2))
		), entityParameter);

	return Expression.Call(groupByMethod, sourceParameter, groupBySelector);
}

private static MethodCallExpression BuildCount<TElement>(ParameterExpression resultParameter)
{
	var asQueryableMethod = QueryableMethods.AsQueryable.MakeGenericMethod(typeof(TElement));
	var countMethod = QueryableMethods.CountWithoutPredicate.MakeGenericMethod(typeof(TElement));

	return Expression.Call(countMethod, Expression.Call(asQueryableMethod, resultParameter));
}

And the full working version is on my GitHub