EF Core 6 – Handling JSON data in SQL Server

We have seen a few projects where the client insists on using SQL Server, but their workload generates a bunch of semi-structured JSON data for us to store. Now we need to bridge the gap and efficiently query it. One realistic case study where this situation may pop up would be building a browser extension that helps capture and organise wish lists from various platforms such as Amazon or eBay.

Let’s build a wish list

Our journey would likely start with a web browser. We may have a browser extension of some sort that would send information about an item we’re currently watching to an API. That API will have to make sense of all different attributes each website provides and try store it in a way that makes querying as easy and efficient as possible.

The catch is, we don’t know full list of attributes. And, probably, we should not care – perfect data model for frequently changing e-commerce websites would be hard to deduce and likely very painful to maintain. We could just lump it all together and chuck it into a schema-less data store like Cosmos DB or Mongo, but for the sake of argument we’ll imagine the client was adamant they needed a SQL Server because it’s a business decision they made a long time ago.

Not a huge drama

SQL Server has functions to work with JSON data – all we have to do is put it into a sufficiently big column. We can save heaps of time creating one simple data model and calling it a day.

Moving on to coding, let us focus on data retrieval and skip the API part for now – it’s a whole different can or worms and will distract us from our point. From here on we’ll be exploring ways to query the data

Test Bench

We’ve created a small EF Core project where DBContext looks something like this:

class SHDbContext : DbContext
{
	public DbSet<WishlistEntry> WishlistEntries { get; set; }

	protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
	{
		base.OnConfiguring(optionsBuilder);
		optionsBuilder.UseSqlServer(Environment.GetEnvironmentVariable("DB_CONNSTR")).LogTo(Console.WriteLine);
	}
}

where WishlistEntry is defined like so:

class WishlistEntry
{

	public int Id { get; set; }
	public int CustomerId { get; set; }
	[MaxLength(300)]
	public string ItemSource { get; set; }
	public string ItemData { get; set; }

	[DatabaseGenerated(DatabaseGeneratedOption.Computed)]
	public string ItemCategory { get; set; }
}

class ItemData
{
	public string Sku { get; set; }
	public string Name { get; set; }
	public List<Category> Category { get; set; }
}

class Category
{
	public string Id { get; set; }
	public string Name { get; set; }
}

with setup out of the way, let us define a sample entry and write a query to pick all entries that contain a category “Connected Home & Housewares”:

{
  "sku": 185267,
  "name": "Duracell - D Batteries (4-Pack)",
  "type": "HardGood",
  "price": 9.99,
  "upc": "041333430010",
  "category": [
    {
      "id": "pcmcat312300050015",
      "name": "Connected Home & Housewares"
    },
    {
      "id": "pcmcat248700050021",
      "name": "Housewares"
    },
    {
      "id": "pcmcat303600050001",
      "name": "Household Batteries"
    },
    {
      "id": "abcat0208002",
      "name": "Alkaline Batteries"
    }
  ],
  "shipping": 5.99,
  "description": "Compatible with select electronic devices; D size; DURALOCK Power Preserve technology; 4-pack",
  "manufacturer": "Duracell",
  "model": "MN1300R4Z",
  "url": "http://www.bestbuy.com/site/duracell-d-batteries-4-pack/185267.p?id=1051384046551&skuId=185267&cmp=RMXCC",
  "image": "http://img.bbystatic.com/BestBuy_US/images/products/1852/185267_sa.jpg"
}

Option one – JsonConvert.Deserialize

The naive approach to the desired outcome would be to simply try and deserialise the field and pray Entity Framework would work its magic:

var context = new SHDbContext();
//var sql = context.WishlistEntries.Select(wle => JsonConvert.DeserializeObject<ItemData>(wle.ItemData)
//                                                .Category.Any(c => c.Name == "Connected Home & Housewares")); //The LINQ expression 'c => c.Name == "Connected Home & Housewares"' could not be translated. 

Unfortunately, this would not even compile as EF has no idea what to do with JsonConvert. We’ll end up having to materialise the query and handle filtering on the client side:

var data = context.WishlistEntries.Select(wle => JsonConvert.DeserializeObject<ItemData>(wle.ItemData))
                                .AsEnumerable()
                                .Where(d => d.Category.Any(c => c.Name == "Connected Home & Housewares"))
                                .ToList();

This will work as expected, and some developers will probably stop here, but this is in fact a time bomb. When our wish lists get big enough, the clients will put a serious strain on SQL Server and the network fetching the whole table in its entirety. Not nice.

Option two – write a Value Converter

Technically this is a variation of the previous option but leveraging EF Core Value Conversion capability for nicer looking query. To enable this, we’ll override OnModelCreating and add a few more lines to DB Context:

class SHDbContext : DbContext
{
	public DbSet<WishlistEntry> WishlistEntries { get; set; }

	protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
	{
		base.OnConfiguring(optionsBuilder);
		optionsBuilder.UseSqlServer(Environment.GetEnvironmentVariable("DB_CONNSTR")).LogTo(Console.WriteLine);
	}

	protected override void OnModelCreating(ModelBuilder modelBuilder)
	{
		base.OnModelCreating(modelBuilder);
		modelBuilder.Entity<WishlistEntry>()
		            .Property(e => e.ItemData)
		            .HasConversion((itemData) => JsonConvert.SerializeObject(itemData), str => JsonConvert.DeserializeObject<ItemData>(str));
	}
}
// ...
// finally, use the new converter like so:
var data = context.WishlistEntries.AsEnumerable().Where(d => d.ItemData.Category.Any(c => c.Name == "Connected Home & Housewares")).ToList();
 // also not ideal as we still have to filter on client side, effectively scanning all tabl

This simply shifts the JSON conversion code up the food chain, so developers won’t see it often. Unfortunately, this “out of sight – out of mind” approach won’t improve our query plan.

Option three – JSON_VALUE function

It gets pretty clear at this stage that we need to be able to filter by json fields on SQL Server side. Luckily, Microsoft provides a few json-handling capabilities. JSON_VALUE function that allows extracting one value out of the blob via given path selector. This is going to eliminate the need to land our query and should hugely improve performance. Entity Framework does not support this function by default, but it should be very easy to add:

// in the context class
 // adding static function to the DB Context
[DbFunction("JSON_VALUE", IsBuiltIn = true, IsNullable = false)]
public static string JsonValue(string expression, string path) => throw new NotImplementedException();

our query then becomes something like this:

var data = context.WishlistEntries.Where(d => SHDbContext.JsonValue(d.ItemData, "$.category[0].name").Contains("Connected Home & Housewares")).ToList();

And this is the correct answer. However, we can further improve performance by adding an index. Indexing the whole json blob will not help, but we can create a computed column with the value we care about and index that instead:

-- add computed column
ALTER TABLE WishlistEntries ADD ItemCategory AS JSON_VALUE(ItemData, '$.category[0].name')
-- index the value
CREATE INDEX IX_ItemCategory ON WishlistEntries(ItemCategory) INCLUDE(CustomerId, ItemSource, ItemData);
-- finally, test if the query picks up index by running the following and checking the Actual Execution Plan:
/* test queries */
SELECT * FROM [WishlistEntries] WHERE JSON_VALUE(ItemData, '$.category[0].name') = 'Connected Home & Housewares'

This enhancement also happens to simplify our code a bit as we can now rely on computed column value when writing queries in EF:

// add computed column to WishlistEntry
class WishlistEntry
{

	public int Id { get; set; }
	public int CustomerId { get; set; }
	[MaxLength(300)]
	public string ItemSource { get; set; }
	public string ItemData { get; set; } 

	[DatabaseGenerated(DatabaseGeneratedOption.Computed)]
	public string ItemCategory { get; set; }
}

// finally the query will be like:
var data = context.WishlistEntries.Where(wle => wle.ItemCategory == "Connected Home & Housewares").ToList();

Conclusion and area for improvement

Keen eyed readers may have noticed that we hardcoded query to first array item. This was a deliberate simplification we opted for to illustrate the point. It should be possible to query an array using OPENJSON … CROSS APPLY like so, but we haven’t explored this avenue far enough to see if we’d ultimately succeed:

var data = context.WishlistEntries.FromSqlRaw(@"SELECT TOP 1 wishlist.*
      FROM [WishlistEntries] as wishlist
CROSS APPLY OPENJSON(ItemData, '$.category') WITH (CatName VARCHAR(200) '$.name') as categories
WHERE categories.CatName = {0}", "Connected Home & Housewares").ToList();

Even though our application is far from complete we are in a good place. Equipped with these techniques, we can handle almost anything users throw at us and are still able to query data fast enough. The code for this exercise is available on GitHub.

Writing Prometheus exporters – the Lazy Dev way

This article is part of the C# Advent Series. Christmas has a special place in our hearts and this event is also a wonderful way to help build up the C# community. Do check out awesome content from other authors!

There’s a couple of things about Christmas in Southern Hemisphere that tends to hit us pretty hard each year: first, the fact that it is summer and it’s scorching hot outside. And second – is a customary closedown of almost all businesses (which probably started as response to the first point). Some businesses, however, keep boxing on.

One of our clients is into cryptocurrency mining and they could not care less about staff wanting time off to spend with family. Their only workforce are GPUs, and these devices can work 24/7. However, with temperatures creeping up, efficiency takes a hit. Also, other sad things can happen:

Solution design

Our first suggestion was to use our trusty ELK+G and poll extra data from NVIDIA SMI tool, but we soon figured out that this problem has already been solved for us. Mining software nowadays got extremely sophisticated (and obfuscated) – it now comes with own webserver and API. So, we simplified a bit:

All we have to do here would be to stand up an exporter and set up a few dashboards. Easy.

Hosted Services

We essentially need to run two services: poll underlying API and expose metrics in Prometheus-friendly format. We felt .NET Core Generic host infrastructure would fit very well here. It allows us to bootstrap an app, add Hosted Services and leave plumbing to Docker. The program ended up looking like so:

class Program
    {
        private static async Task Main(string[] args)
        {
            using IHost host = CreatHostBuilder(args).Build();
            
            await host.RunAsync();
        }
        static IHostBuilder CreateHostBuilder(string[] args) =>
            Host.CreateDefaultBuilder(args)
                .ConfigureAppConfiguration((configuration) =>
                {
                    configuration.AddEnvironmentVariables("TREX")
; // can add more sources such as command line
                })
                .ConfigureServices(c =>
                {
                    c.AddSingleton<MetricCollection>(); // This is where we will keep all metrics state. hence singleton
                    c.AddHostedService<PrometheusExporter>(); // exposes MetricCollection
                    c.AddHostedService<TRexPoller>(); // periodically GETs status and updates MetricCollection
                });
    }

Defining services

The two parts of our applicatgion are TRexPoller and PrometheusExporter. Writing both is trivial and we won’t spend much time on the code there. Feel free to check it out on GitHub. The point to make here is it has never been easier to focus on business logic and leave heavy lifting to respective NuGet packages.

Crafting the models

The most important part of our application is of course telemetry. We grabbed a sample json response from the API and used an online tool to convert that into C# classes:

// generated code looks like this. A set of POCOs with each property decorated with JsonProperty that maps to api response
public partial class Gpu
{
    [JsonProperty("device_id")]
    public int DeviceId { get; set; }
    [JsonProperty("hashrate")]
    public int Hashrate { get; set; }
    [JsonProperty("hashrate_day")]
    public int HashrateDay { get; set; }
    [JsonProperty("hashrate_hour")]
    public int HashrateHour { get; set; }
...
}

Now we need to define metrics that Prometheus.Net can later discover and serve up:

// example taken from https://github.com/prometheus-net/prometheus-net#quick-start
private static readonly Counter ProcessedJobCount = Metrics
    .CreateCounter("myapp_jobs_processed_total", "Number of processed jobs.");
...
ProcessJob();
ProcessedJobCount.Inc();

Turning on lazy mode

This is where we’ve got so inspired by our “low code” solution that we didn’t want to get down to hand-crafting a bunch of class fields to describe every single value the API serves. Luckily, C#9 has a new feature just for us: Source Code Generators to the rescue! We’ve covered the basic setup before, so we’ll skip this part here and move on to the Christmas magic part.

Let Code Generators do the work for us

Before we hand everything over to robots, we need to set some basic rules to control the process. Custom attributes looked like a sensible way to keep all configuration local with the model POCOs:

[AddInstrumentation("gpus")] // the first attribute prompts the generator to loop through the properties and search for metrics 
public partial class Gpu
{
    [JsonProperty("device_id")]
    public int DeviceId { get; set; }
    [JsonProperty("hashrate")]
    /*
     * the second attribute controls which type the metric will have as well as what labels we want to store with it.
     * In this example, it's a Gauge with gpu_id, vendor and name being labels for grouping in Prometheus
     */
    [Metric("Gauge", "gpu_id", "vendor", "name")]
    public int Hashrate { get; set; }
    [JsonProperty("hashrate_day")]
    [Metric("Gauge", "gpu_id", "vendor", "name")]
    public int HashrateDay { get; set; }
    [JsonProperty("hashrate_hour")]
    [Metric("Gauge", "gpu_id", "vendor", "name")]
    public int HashrateHour { get; set; }

Finally, the generator itself hooks into ClassDeclarationSyntax and looks for well-known attributes:

public void OnVisitSyntaxNode(SyntaxNode syntaxNode)
    {
        if (syntaxNode is ClassDeclarationSyntax cds && cds.AttributeLists
            .SelectMany(al => al.Attributes)
            .Any(a => (a.Name as IdentifierNameSyntax)?.Identifier.ValueText == "AddInstrumentation"))
        {
            ClassesToProcess.Add(cds);
        }
    }

Once we’ve got our list, we loop through each property and generate a dictionary of Collector objects.

var text = new StringBuilder(@"public static Dictionary<string, Collector> GetMetrics(string prefix)
    {
        var result = new Dictionary<string, Collector>
        {").AppendLine();
foreach (PropertyDeclarationSyntax p in properties)
{
    var jsonPropertyAttr = p.GetAttr("JsonProperty");
    var metricAttr = p.GetAttr("Metric");
    if (metricAttr == null) continue;
    var propName = jsonPropertyAttr.GetFirstParameterValue();
    var metricName = metricAttr.GetFirstParameterValue(); // determine metric type
    if (metricAttr.ArgumentList.Arguments.Count > 1)
    {
        var labels = metricAttr.GetTailParameterValues(); // if we have extra labels to process - here's our chance 
        text.AppendLine(
            $"{{$\"{{prefix}}{attrPrefix}_{propName}\", Metrics.Create{metricName}($\"{{prefix}}{attrPrefix}_{propName}\", \"{propName}\", {commonLabels}, {labels}) }},");
    }
    else
    {
        text.AppendLine(
            $"{{$\"{{prefix}}{attrPrefix}_{propName}\", Metrics.Create{metricName}($\"{{prefix}}{attrPrefix}_{propName}\", \"{propName}\", {commonLabels}) }},");
    }
}
text.AppendLine(@"};
                return result;
            }");

In parallel to defining storage for metrics, we also need to generate code that will update values as soon as we’ve heard back from the API:

private StringBuilder UpdateMetrics(List<MemberDeclarationSyntax> properties, SyntaxToken classToProcess, string attrPrefix)
{
    var text = new StringBuilder($"public static void UpdateMetrics(string prefix, Dictionary<string, Collector> metrics, {classToProcess} data, string host, string slot, string algo, List<string> extraLabels = null) {{");
    text.AppendLine();
    text.AppendLine(@"if(extraLabels == null) { 
                            extraLabels = new List<string> {host, slot, algo};
                        }
                        else {
                            extraLabels.Insert(0, algo);
                            extraLabels.Insert(0, slot);
                            extraLabels.Insert(0, host);
                        }");
    foreach (PropertyDeclarationSyntax p in properties)
    {
        var jsonPropertyAttr = p.GetAttr("JsonProperty");
        var metricAttr = p.GetAttr("Metric");
        if (metricAttr == null) continue;
        var propName = jsonPropertyAttr.GetFirstParameterValue();
        var metricName = metricAttr.GetFirstParameterValue();
        var newValue = $"data.{p.Identifier.ValueText}";
        text.Append(
            $"(metrics[$\"{{prefix}}{attrPrefix}_{propName}\"] as {metricName}).WithLabels(extraLabels.ToArray())");
        switch (metricName)
        {
            case "Counter": text.AppendLine($".IncTo({newValue});"); break;
            case "Gauge": text.AppendLine($".Set({newValue});"); break;
        }
    }
    text.AppendLine("}").AppendLine();
    return text;
}

Bringing it all together with MetricCollection

Finally, we can use the generated code to bootstrap metrics on per-model basis and ensure we correctly handle updates:

internal class MetricCollection
{
    private readonly Dictionary<string, Collector> _metrics;
    private readonly string _prefix;
    private readonly string _host;
    public MetricCollection(IConfiguration configuration)
    {
        _prefix = configuration.GetValue<string>("exporterPrefix", "trex");
        _metrics = new Dictionary<string, Collector>();
        // this is where declaring particl classes and generating extra methods makes for seamless development experience
        foreach (var (key, value) in TRexResponse.GetMetrics(_prefix)) _metrics.Add(key, value);
        foreach (var (key, value) in DualStat.GetMetrics(_prefix)) _metrics.Add(key, value);
        foreach (var (key, value) in Gpu.GetMetrics(_prefix)) _metrics.Add(key, value);
        foreach (var (key, value) in Shares.GetMetrics(_prefix)) _metrics.Add(key, value);
    }
    public void Update(TRexResponse data)
    {
        TRexResponse.UpdateMetrics(_prefix, _metrics, data, _host, "main", data.Algorithm);
        DualStat.UpdateMetrics(_prefix, _metrics, data.DualStat, _host, "dual", data.DualStat.Algorithm);
        
        foreach (var dataGpu in data.Gpus)
        {
            Gpu.UpdateMetrics(_prefix, _metrics, dataGpu, _host, "main", data.Algorithm, new List<string>
            {
                dataGpu.DeviceId.ToString(),
                dataGpu.Vendor,
                dataGpu.Name
            });
            Shares.UpdateMetrics(_prefix, _metrics, dataGpu.Shares, _host, "main", data.Algorithm, new List<string>
            {
                dataGpu.GpuId.ToString(),
                dataGpu.Vendor,
                dataGpu.Name
            });
        }
    }
}

Peeking into generated code

Just to make sure we’re on the right track, we looked at generated code. It ain’t pretty but it’s honest work:

public partial class Shares {
public static Dictionary<string, Collector> GetMetrics(string prefix)
                {
                    var result = new Dictionary<string, Collector>
                    {
{$"{prefix}_shares_accepted_count", Metrics.CreateCounter($"{prefix}_shares_accepted_count", "accepted_count", "host", "slot", "algo", "gpu_id", "vendor", "name") },
{$"{prefix}_shares_invalid_count", Metrics.CreateCounter($"{prefix}_shares_invalid_count", "invalid_count", "host", "slot", "algo", "gpu_id", "vendor", "name") },
{$"{prefix}_shares_last_share_diff", Metrics.CreateGauge($"{prefix}_shares_last_share_diff", "last_share_diff", "host", "slot", "algo", "gpu_id", "vendor", "name") },
...
};
                            return result;
                        }
public static void UpdateMetrics(string prefix, Dictionary<string, Collector> metrics, Shares data, string host, string slot, string algo, List<string> extraLabels = null) {
if(extraLabels == null) { 
                                    extraLabels = new List<string> {host, slot, algo};
                                }
                                else {
                                    extraLabels.Insert(0, algo);
                                    extraLabels.Insert(0, slot);
                                    extraLabels.Insert(0, host);
                                }
(metrics[$"{prefix}_shares_accepted_count"] as Counter).WithLabels(extraLabels.ToArray()).IncTo(data.AcceptedCount);
(metrics[$"{prefix}_shares_invalid_count"] as Counter).WithLabels(extraLabels.ToArray()).IncTo(data.InvalidCount);
(metrics[$"{prefix}_shares_last_share_diff"] as Gauge).WithLabels(extraLabels.ToArray()).Set(data.LastShareDiff);
...
}
}

Conclusion

This example barely scratches the surface of what’s possible with this feature. Source code generators are extremely helpful when we deal with tedious and repetitive development tasks. It also helps reduce maintenance overheads by enabling us to switch to declarative approach. I’m sure we will see more projects coming up where this feature will become central to the solution.

If not already, do check out the source code in GitHub. And as for us, we would like to sign off with warmest greetings of this festive season and best wishes for happiness in the New Year.

ASP.NET XML serialisation issues: observations on DataContractSerializer

A client reached out to us with a weird problem. Their ASP.NET WebAPI project (somewhat legacy tech) needed to communicate with an application running on a mainframe (dinosaur-grade legacy tech). But they were having XML serialisation issues…

They had a test XML payload, but only half of that kept coming across the wire:

broken serialisation – missing fields

first thing we suspected was missing DataContract/DataMember attributes, but everything seemed to be okay:

[DataContract(Name = "ComplexObject", Namespace = "")]
public class ComplexObject
{
    [DataMember]
    public int Id { get; set; }
    [DataMember]
    public string Name { get; set; }
    [DataMember]
    public string Location { get; set; }
    [DataMember]
    public string Reference { get; set; }
    [DataMember]
    public double Rate { get; set; }
    [DataMember]
    public double DiscountedRate { get; set; }
}

After scratching our heads for a little while and trying different solution from across the entirety of StackOverflow, we managed to dig up a piece of documentation that explained this behaviour:

  1. Data members of base classes (assuming serialiser will apply these rules upwards recursively);
  2. Data members in alphabetical order (bingo!);
  3. Data members specifically numbered in decorating attribute;

With the above in mind, we got the following payload to serialise successfully:

successful serialisation

There are other options

.NET comes with at least two XML serialisers: XmlSerializer and DataContractSerializer. A lot has been written about the two. We find this article written by Dan Rigsby to probably be the best source of information on the topic.

Key difference for us was the fact that XmlSerializer does not require any decorations and works out of the box. While DataContractSerializer needs us to make code changes. In our project everything was already set up with DataContract, so we did not have to change anything.

By default, WebAPI projects come configured to leverage DataContractSerializer. It however pays to know that in case of any issues we can switch to use XMLSerializer:

public static class WebApiConfig
{
    public static void Register(HttpConfiguration config)
    {
        config.Formatters.XmlFormatter.UseXmlSerializer = true; // global setting for all types
        config.Formatters.XmlFormatter.SetSerializer<ComplexObject>(new XmlSerializer(typeof(ComplexObject))); // overriding just for one type

Setting order

Yet another option to deal with ASP.NET XML serialisation issues would be to define property order explicitly:

[DataContract(Name = "ComplexObject", Namespace = "")]
public class ComplexObject
{
    [DataMember(Order = 1)]
    public int Id { get; set; }
    [DataMember(Order = 2)]
    public string Name { get; set; }
    [DataMember(Order = 3)]
    public string Location { get; set; }
    [DataMember(Order = 4)]
    public string Reference { get; set; }
    [DataMember(Order = 5)]
    public double Rate { get; set; }
    [DataMember(Order = 6)]
    public double DiscountedRate { get; set; }
}

Conclusion

XML serialisation has been around since the beginning of .NET. And even though it may seem that JSON has taken over, XML isn’t going anywhere any time soon. It is good to know we have many ways to deal with it should we ever need to.

EF Core 6 – custom functions with DbFunction Attribute

We’ve already looked at way to implement SQL functions via method translation. That went reasonably well, but next time we had to do something similar we discovered that our code is broken with newer versions of EF Core. We fixed it again.

Not anymore

Looking through changelogs, we noticed that EF Core 2.0 came with support for mapping scalar functions. It is remarkably simple to set up:

public class MyDbContext : DbContext
    {

        [DbFunction("DECRYPTBYPASSPHRASE", IsBuiltIn = true, IsNullable = false)]
        public static byte[] DecryptByPassphrase(string pass, byte[] ciphertext) => throw new NotImplementedException();

        [DbFunction("DECRYPTBYKEY", IsBuiltIn = true, IsNullable = false)]
        public static byte[] DecryptByKey(byte[] ciphertext) => throw new NotImplementedException();
...

and even easier to use:

var filteredSet = Set
                .Select(m => new Model
                {
                    Id = m.Id,
                    Decrypted = MyDbContext.DecryptByPassphrase("TestPassword", m.Encrypted).ToString(),
                    Decrypted2 = MyDbContext.DecryptByKey(m.Encrypted2).ToString(), // since the key's opened for session scope - just relying on it should do the trick
                }).ToList();

Initially the attribute was offering limited configuration options, but starting EF Core 5.0, this is not an issue.

One gotcha with DECRYPT* functions is they return varbinary. Trying to use our own EF.Functions.ConvertToVarChar is not going to work since we disabled custom plugins. We want to get rid of this code after all. But Apparently .ToString() works as intended:

SELECT [m].[Id], CONVERT(varchar(100), DECRYPTBYPASSPHRASE(N'TestPassword', [m].[Encrypted])) AS [Decrypted], CONVERT(varchar(100), DECRYPTBYKEY([m].[Encrypted2])) AS [Decrypted2], [t].[Id], [t].[IsSomething], [m].[Encrypted], [m].[Encrypted2]...

Full example source is in GitHub, along with other takes we decided to leave in place for history.

Conclusion

Defining custom EF functions was one of the biggest articles we wrote here. And finding out how to fit it together probably was the most challenging and time-consuming project we undertook in recorded history. One can say we totally wasted our time, but I’d like to draw a different conclusion. We had fun, learned something new and were able to appreciate the complexity behind Entity Framework – it is not just an engineering marvel – it is also a magical beast!

Azure Functions – OpenAPI + EF Core = 💥

Creating Swagger-enabled Azure Functions is not that hard to do. Visual Studio literally comes with a template for that:

Inspecting the newly created project we see that it comes down to one NuGet package. It magically hooks into IWebJobsStartup and registers additional routes for Swagger UI and OpenAPI document. When run, it reflects upon suitable entry points in the assembly and builds required responses on the fly. Elegant indeed.

Installing Entity Framework

Now, suppose, we need to talk to Azure SQL. So, we’d like to add EF Core to the mix. As much as we love to go for latest and greatest, unfortunately it’s a bit messy at the moment. Instead let’s get a bit more conservative and stick to EFCore 3.1.

We did not expect that, did we?

The error message is pretty clear: the assembly somehow did not get copied to the output location. And indeed, the file was missing:

Apparently when VS builds the function, it makes a second copy of the libraries it thinks are required. And in our case, it decided it’s not picking up the dependency. Adding <_FunctionsSkipCleanOutput>true</_FunctionsSkipCleanOutput> to the project file will fix that:

Are there yet?

Probably, but there’s a catch: our deployment package just got bigger. Alternatively, we could downgrade EF Core to 3.1.13 which happens to use the same version of Microsoft.Extensions.Logging.Abstractions. This way we’d avoid having to hack project files at expense or limiting ourselves to an older version of EF Core. Ultimately, we hope OpenAPI extension picks up the slack and goes GA soon. For now, looks like we’ll have to stick to it.

EF Core 6 – does our IMethodCallTranslator still work?

A year and a half ago we posted an article on how we were able to plug into EF Core pipeline and inject our own IMethodCallTranslator. That let us leverage SQL-native encryption functionality with EF Core 3.1 and was ultimately a win. A lot has changed in the ecosystem, we’ve got .NET 5 and and .NET 6 coming up soon. So, we could not help but wonder…

Will it work with EF Core 6?

Apparently, EF6 is mostly an evolutionary step over EF5. That said, we totally missed previous version. So it is unclear to what extent the EF team has reworked their internal APIs. Most of the extensibility points we used were internal and clearly marked as “not for public consumption”. With that in mind, our concerns seemed valid.

Turns out, the code needs change…

The first issue we needed to rectify was implementing ShouldUseSameServiceProvider: from what I can tell, it’s needed to cache services more efficiently, but in our case setting it to default value seems to make sense.

But that’s where things really went sideways:

Apparently adding our custom IDbContextOptionsExtension resets the cache and by the time EF arrives at Model initialisation, instance of DI container gets wiped, leaving us with a bunch of null references (including the one above).

One line fix

I am still unsure why EF so upset when we add new extension. Stepping through the code would likely provide me with the answer but I feel it’s not worth the effort. Playing around with service scopes I however noticed that many built-in services get registered using different extension method with Scoped lifecycle. This prompted me to try change my registration method signature and voila:

And as usual, fully functioning code sits on GitHub.

Setting up Basic Auth for Swashbuckle

Let us put the keywords up front: Swashbuckle Basic authentication setup that works for Swashbuckle.WebApi installed on WebAPI projects (.NET 4.x). It’s also much less common to need Basic auth nowadays where proper mechanisms are easily available in all shapes and sizes. However, it may just become useful for internal tools or existing integrations.

Why?

Most tutorials online would cover the next gen library aimed specifically at ASP.Net Core, which is great (we’re all in for tech advancement), but sometimes legacy needs a prop up. If you are looking to add Basic Auth to ASP.NET Core project – look up AddSecurityDefinition and go from there. If legacy is indeed the in trouble – read on.

Setting it up

The prescribed solution is two steps:
1. add authentication scheme on SwaggerDocsConfig class
2. attach authentication to endpoints as required with IDocumentFilter or IOperationFilter

Assuming the installer left us with App_Start/SwaggerConfig.cs file, we’d arrive at:

public class SwaggerConfig
{
	public static void Register()
	{
		var thisAssembly = typeof(SwaggerConfig).Assembly;

		GlobalConfiguration.Configuration
			.EnableSwagger(c =>
				{
					...
					c.BasicAuth("basic").Description("Basic HTTP Authentication");
					c.OperationFilter<BasicAuthOpFilter>();
					...
				});
	}
}

and the most basic IOperationFilter would look like so:

public class BasicAuthOpFilter : IOperationFilter
{
	public void Apply(Operation operation, SchemaRegistry schemaRegistry, ApiDescription apiDescription)
	{
		operation.security = operation.security ?? new List<IDictionary<string, IEnumerable<string>>>();
		operation.security.Add(new Dictionary<string, IEnumerable<string>>()
							   {
								   { "basic", new List<string>() }
							   });
	}
}

Approaches to handling simple expressions in C#

Every now and then we get asked if there’s an easy way to parse user input into filter conditions. Say, for example, we have a viewmodel of type DataThing:

public class DataThing
 {
     public string Name;
     public float Value;
     public int Count;
 }

From here we’d like to check if a given property of this class satisfies a certain condition. For example we’ll look at “Value is greater than 15”. But of course we’d like to be flexible.

The issue

The main issue here is we don’t know the type of property before hand, so we can’t use generics even if we try to be smart:

public class DataThing
 {
     public string Name;
     public float Value;
     public int Count;
 }
 public static void Main()
 {
     var data = new DataThing() {Value=10, Name="test", Count = 1};
     var values = new List {
         new ValueGetter(x => x.Value),
         new ValueGetter(x => x.Name)
     };
     (values[0].Run(data) > 15).Dump();
 }
 public abstract class ValueGetter
 {
     public abstract T Run<T>(DataThing d);
 }
 public class ValueGetter<T> : ValueGetter
 {
     public Func<DataThing, T> TestFunc;
     public ValueGetter(Func<DataThing, T> blah)
     {
         TestFunc = blah;
     }
     public override T Run(DataThing d) => TestFunc.Invoke(d); // CS0029 Cannot implicitly convert type…
 }

Even if we figured it out it’s obviously way too dependant on DataThing layout to be used everywhere.

LINQ Expression trees

One way to solve this issue is with the help of LINQ expression trees. This way we wrap everything into one delegate with predictable signature and figure out types at runtime:

 bool BuildComparer(DataThing data, string field, string op, T value) {    
     var p1 = Expression.Parameter(typeof(DataThing));
     var p2 = Expression.Parameter(typeof(T));
     if (op == ">")
     {
         var expr = Expression.Lambda>(
             Expression.MakeBinary(ExpressionType.GreaterThan
                                 , Expression.PropertyOrField(p1, field)
                                 , Expression.Convert(p2, typeof(T))), p1, p2);
         var f = expr.Compile();
         return f(data, value);
      } 
      return false;
 }

Code DOM CSharpScript

Another way to approach the same problem is to generate C# code that we can compile and run .We’d need Microsoft.CodeAnalysis.CSharp.Scripting package for this to work:

bool BuildScript(DataThing data, string field, string op, T value)
 {
     var code = $"return {field} {op} {value};";
     var script = CSharpScript.Create(code, globalsType: typeof(DataThing), options: ScriptOptions.Default);
     var scriptRunner = script.CreateDelegate();
     return scriptRunner(data).Result;
 }

.NET 5 Code Generator

This is a new .NET 5 feature, that allows us to plug into compilation process and generate classes as we see fit. For example we’d generate extension methods that would all return correct values from DataThing:

[Generator] // see https://github.com/dotnet/roslyn/blob/main/docs/features/source-generators.cookbook.md for even more cool stuff
 class AccessorGenerator: ISourceGenerator {
     public void Execute(GeneratorExecutionContext context) {
       var syntaxReceiver = (CustomSyntaxReceiver) context.SyntaxReceiver;
       ClassDeclarationSyntax userClass = syntaxReceiver.ClassToAugment;
       SourceText sourceText = SourceText.From($ @ "
         public static class DataThingExtensions {
           { 
           // This is where we'd reflect over type members and generate code dynamically. Following code is oversimplification
             public static string GetValue<string>(this DataThing d) => d.Name;
             public static string GetValue<float>(this DataThing d) => d.Value;
             public static string GetValue<int>(this DataThing d) => d.Count;
           }
         }
         ", Encoding.UTF8);
         context.AddSource("DataThingExtensions.cs", sourceText);
       }
       public void Initialize(GeneratorInitializationContext context) {
         context.RegisterForSyntaxNotifications(() => new CustomSyntaxReceiver());
       }
       class CustomSyntaxReceiver: ISyntaxReceiver {
         public ClassDeclarationSyntax ClassToAugment {
           get;
           private set;
         }
         public void OnVisitSyntaxNode(SyntaxNode syntaxNode) {
           // Business logic to decide what we're interested in goes here
           if (syntaxNode is ClassDeclarationSyntax cds &&
             cds.Identifier.ValueText == "DataThing") {
             ClassToAugment = cds;
           }
         }
       }
     }

Running this should be as easy as calling extension methods on the class instance: data.GreaterThan(15f).Dump();

LINQ: Dynamic Join

Suppose we’ve got two lists that we would like to join based on, say, common Id property. With LINQ the code would look something along these lines:

var list1 = new List<MyItem> {};
var list2 = new List<MyItem> {};
var joined = list1.Join(list2, i => i.Id, j => j.Id, (k,l) => new {List1Item=k, List2Item=l});

resulting list of anonymous objects would have a property for each source object in the join. This is nothing new and has been documented pretty well.

But what if

We don’t know how many lists we’d have to join? Now we’ve got a list of lists of our entities (List Inception?!): List<List<MyItem>>. It becomes pretty obvious that we’d need to generate this code dynamically. We’ll run with LINQ Expression Trees – surely there’s a way. Generally speaking, we’ll have to build an object (anonymous type would be ideal) with fields like so:

{
  i0: items[0] // added on first run - we need to have at least two lists to join so it's safe to assume we'd
  i1: items[1] // added on first run - you need to have at least two lists in your join array 
  ... 
  iN: items[N] // added on each pass an joined with items[0] 
}

It is safe to assume that we need at least two lists for join to make sense, so we’d build the object above in two stages – first join two MyItem instances and get the structure going, Each subsequent join should append more MyItem instances to the resulting object until we’d get our result.

Picking types for the result

Now the problem is how we best define this object. The way anonymous types are declared, requires a type initialiser and a new keyword. We don’t have either of these at design time, so this method unfortunately will not work for us.

ExpandoObject

Another way to achieve decent developer experience with named object properties would be to use dynamic keyword – this is less than ideal as it effectively disables compiler static type checks. But we can keep going – so it’s an option here. To allow us to add properties at run time, we will use ExpandoObject:

static List<ExpandoObject> Join<TSource, TDest>(List<List<TSource>> items, Expression<Func<TSource, int>> srcAccessor, Expression<Func<ExpandoObject, int>> intermediaryAccessor, Expression<Func<TSource, TSource, ExpandoObject>> outerResultSelector)
{
	var joinLambdaType = typeof(ExpandoObject);            
	Expression<Func<ExpandoObject, TSource, ExpandoObject>> innerResultSelector = (expando, item) => expando.AddValue(item);
	
	var joinMethod = typeof(Enumerable).GetMethods().Where(m => m.Name == "Join").First().MakeGenericMethod(typeof(TSource), typeof(TSource), typeof(int), joinLambdaType);
	var toListMethod = typeof(Enumerable).GetMethods().Where(m => m.Name == "ToList").First().MakeGenericMethod(typeof(TDest));

	var joinCall = Expression.Call(joinMethod,
							Expression.Constant(items[0]),
							Expression.Constant(items[1]),
							srcAccessor,
							srcAccessor,
							outerResultSelector);
	joinMethod = typeof(Enumerable).GetMethods().Where(m => m.Name == "Join").First().MakeGenericMethod(typeof(TDest), typeof(TSource), typeof(int), joinLambdaType); // from now on we'll be joining ExpandoObject with MyEntity
	for (int i = 2; i < items.Count; i++) // skip the first two
	{
		joinCall =
			Expression.Call(joinMethod,
							joinCall,
							Expression.Constant(items[i]),
							intermediaryAccessor,
							srcAccessor,
							innerResultSelector);
	}

	var lambda = Expression.Lambda<Func<List<ExpandoObject>>>(Expression.Call(toListMethod, joinCall));
	return lambda.Compile()();
}

The above block references two extension methods so that we can easier manupulate the ExpandoObjects:

public static class Extensions 
 {
     public static ExpandoObject AddValue(this ExpandoObject expando, object value)
     {
         var dict = (IDictionary)expando;
         var key = $"i{dict.Count}"; // that was the easiest way to keep track of what's already in. You would probably find a way to do it better
         dict.Add(key, value);
         return expando;
     }
     public static ExpandoObject NewObject<T>(this ExpandoObject expando, T value1, T value2) 
     {
          var dict = (IDictionary<string, object>)expando;
          dict.Add("i0", value1);
          dict.Add("i1", value2);
          return expando; 
     }
 }

And with that, we should have no issue running a simple test like so:

class Program
{
    class MyEntity
    {
        public int Id { get; set; }
        public string Name { get; set; }

        public MyEntity(int id, string name)
        {
            Id = id; Name = name;
        }
    }

    static void Main()
    {
        List<List<MyEntity>> items = new List<List<MyEntity>> {
            new List<MyEntity> {new MyEntity(1,"test1_1"), new MyEntity(2,"test1_2")},
            new List<MyEntity> {new MyEntity(1,"test2_1"), new MyEntity(2,"test2_2")},
            new List<MyEntity> {new MyEntity(1,"test3_1"), new MyEntity(2,"test3_2")},
            new List<MyEntity> {new MyEntity(1,"test4_1"), new MyEntity(2,"test4_2")}
        };

        Expression<Func<MyEntity, MyEntity, ExpandoObject>> outerResultSelector = (i, j) => new ExpandoObject().NewObject(i, j); // we create a new ExpandoObject and populate it with first two items we join
        Expression<Func<ExpandoObject, int>> intermediaryAccessor = (expando) => ((MyEntity)((IDictionary<string, object>)expando)["i0"]).Id; // you could probably get rid of hardcoding this by, say, examining the first key in the dictionary
        
        dynamic cc = Join<MyEntity, ExpandoObject>(items, i => i.Id, intermediaryAccessor, outerResultSelector);

        var test1_1 = cc[0].i1;
        var test1_2 = cc[0].i2;

        var test2_1 = cc[1].i1;
        var test2_2 = cc[1].i2;
    }
}

ASP.Net Core – Resolving types from dynamic assemblies

It is not a secret that ASP.NET core comes with dependency injection support out of the box. And we don’t remember ever feeling it lacks features. All we have to do is register a type in Startup.cs and it’s ready to be consumed in our controllers:

public class Startup
{
    public void ConfigureServices(IServiceCollection services)
    {
        ...
        services.AddScoped<IDBLogger, IdbLogger>();
    }
}
//
public class HomeController : Controller
{
    private readonly IDBLogger _idbLogger;
    public HomeController(IDBLogger idbLogger)     
    {
        _idbLogger = idbLogger; // all good here!
    }
...
}

What if it’s a plugin?

Now imagine we’ve got a Order type that we for whatever strange reason load at runtime dynamically.

public class Order
{
     private readonly IDBLogger _logger; // suppose we've got the reference from common assembly that both our main application and this plugin are allowed to reference
     public Order(IDBLogger logger)
     {
         _logger = logger; // will it resolve?
     }
     public void GetOrderDetail()
     {
        _logger.Log("Inside GetOrderDetail"); // do we get a NRE here?
     }
}

Load it in the controller

External assembly being external kind of implies that we want to load it at the very last moment – right in our controller where we presumably need it. If we try explore this avenue, we immediately see the issue:

public HomeController(IDBLogger idbLogger)
 {
     _idbLogger = idbLogger;
     var assembly = Assembly.LoadFrom(Path.Combine("..\Controllers\bin\Debug\netcoreapp3.1", "Orders.dll"));
     var orderType = assembly.ExportedTypes.First(t => t.Name == "Order");
     var order = Activator.CreateInstance(orderType); //throws System.MissingMethodException: 'No parameterless constructor defined for type 'Orders.Order'.'
     orderType.InvokeMember("GetOrderDetail", BindingFlags.Public | BindingFlags.Instance|BindingFlags.InvokeMethod, null, order, new object[] { });
 }

The exception makes perfect sense – we need to inject dependencies! Making it so:

public HomeController(IDBLogger idbLogger)
 {
     _idbLogger = idbLogger;
     var assembly = Assembly.LoadFrom(Path.Combine("..\Controllers\bin\Debug\netcoreapp3.1", "Orders.dll"));
     var orderType = assembly.ExportedTypes.First(t => t.Name == "Order");
     var order = Activator.CreateInstance(orderType, new object[] { _idbLogger }); // we happen to know what the constructor is expecting
     orderType.InvokeMember("GetOrderDetail", BindingFlags.Public | BindingFlags.Instance|BindingFlags.InvokeMethod, null, order, new object[] { });
 }

Victory! or is it?

The above exercise is nothing new of exceptional – the point we are making here is – dependency injection frameworks were invented so we don’t have to do this manually. In this case it was pretty easy but more compex constructors can many dependencies. What’s worse – we may not be able to guarantee we even know all dependencies we need. If only there was a way to register dynamic types with system DI container…

Yes we can

The most naive solution would be to load our assembly on Startup.cs and register needed types along with our own:

public void ConfigureServices(IServiceCollection services)
 {
     services.AddControllersWithViews();
     services.AddScoped();
     // load assembly and register with DI  
     var assembly = Assembly.LoadFrom(Path.Combine("..\\Controllers\\bin\\Debug\\netcoreapp3.1", "Orders.dll")); 
    var orderType = assembly.ExportedTypes.First(t => t.Name == "Order");
    services.AddScoped(orderType); // this is where we would make our type known to the DI container  
    var loadedTypesCache = new LoadedTypesCache(); // this step is optional - i chose to leverage the same DI mechanism to avoid having to load assembly in my controller for type definition.  
    loadedTypesCache.LoadedTypes.Add("order", orderType);
    services.AddSingleton(loadedTypesCache); // singleton seems like a good fit here            
 }

And that’s it – literally no difference where the type is coming from! In controller, we’d inject IServiceProvider and ask to hand us an instance of type we cached earlier:

public HomeController(IServiceProvider serviceProvider, LoadedTypesCache cache)
{
     var order = serviceProvider.GetService(cache.LoadedTypes["order"]); // leveraging that same loaded type cache to avoid having to load assembly again
 // following two lines are just to call the method 
    var m = cache.LoadedTypes["order"].GetMethod("GetOrderDetail", BindingFlags.Public | BindingFlags.Instance); 
    m.Invoke(order, new object[] { }); // Victory!
}