Robust architecture with EF Core

https://softwareengineering.stackexchange.com/questions/398511

02-03-2021

Question

I'm trying to figure out how it's possible to robustly code against EF Core in a large, multi-tiered codebase. We have been experiencing several issues, and most of the example project architectures online are very simplistic and don't provide much insight.

The most common questions we come up against are the following. In a service class (e.g. performing business logic), where the caller (often other service classes) passes Entity Framework entities:

Who is responsible for calling SaveChanges?
Should (and if so how) the service class ensure that all necessary dependent relationships have been loaded?
If dependencies have not been loaded, should the service class load them - or should the service class throw an exception? Is there any better way of communicating these dependencies to the caller?
How should the lifecycle of the database context be managed?
Who is responsible for transactions?
Should service classes have a reference to the database context?

For reference, our application has components that are traditional REST APIs, other components that are GraphQL APIs, cron-based scheduled task components, long running batch calculation processes, etc. that all use EF to access data in the database. Most example architectures for EF Core are a simple web app.

Architectural solutions I see suggested online are the following:

CQRS

It is commonly suggested that CQRS obviates the need to handle a lot of these difficult problems. However, I feel like it introduces a lot of even more difficult problems (e.g. eventual consistency). Am I wrong?

EF Core as a persistence layer

Don't ever directly touch EF entities from business/domain services and domain objects: implement a mapping layer between EF entities and domain objects using the repository and unit of work patterns.

This seems to eliminate most of the advantages of using Entity Framework in the first place. If one takes this approach, why not just use a micro-ORM?

No correct solution

OTHER TIPS

I've always liked NorthWindTraders as a clear example of how to properly use EF in a multi-layered application, without needlessly resorting to rolling your own UoW or persistence layer (which is a bigger hassle than it is a solution).

Who is responsible for calling SaveChanges?

That depends on who is doing the data operations. If your service calls underlying repository methods/query objects/command objects, then it's those repository methods/query objects/command objects that call SaveChanges().

However, in some cases your service layer might itself be the command/query objects, at which point the services handle the context and call SaveChanges() themselves. It very much depends on your architecture.

Should (and if so how) the service class ensure that all necessary dependent relationships have been loaded?

I assume by "dependent relationships" you mean "related entities". If your service needs them (or is tasked with ensuring their existence when the consumer needs them), and doesn't have them yet, fetching them is the obvious logical result, assuming you have the necessary information to do so.
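
As a rough illustration, a service could check whether a navigation has been loaded and fetch it only when needed, using EF Core's explicit loading API (Entry(...).Collection(...).IsLoaded and Load()). This is a minimal sketch, not a recommendation for every case: the Order, OrderLine, ShopContext and OrderTotalService names are hypothetical, and it assumes the entity passed in is tracked by the same context instance that is injected into the service.

```csharp
using System.Collections.Generic;
using System.Linq;
using Microsoft.EntityFrameworkCore;

// Hypothetical entities, for illustration only.
public class Order
{
    public int Id { get; set; }
    public List<OrderLine> Lines { get; set; } = new List<OrderLine>();
}

public class OrderLine
{
    public int Id { get; set; }
    public int OrderId { get; set; }
    public decimal Price { get; set; }
    public int Quantity { get; set; }
}

// Hypothetical context.
public class ShopContext : DbContext
{
    public ShopContext(DbContextOptions<ShopContext> options) : base(options) { }

    public DbSet<Order> Orders => Set<Order>();
    public DbSet<OrderLine> OrderLines => Set<OrderLine>();
}

public class OrderTotalService
{
    private readonly ShopContext context;

    public OrderTotalService(ShopContext context)
    {
        this.context = context;
    }

    // Assumption: 'order' is tracked by this.context.
    public decimal CalculateTotal(Order order)
    {
        var lines = context.Entry(order).Collection(o => o.Lines);

        // Only hit the database when the caller did not already load the lines.
        if (!lines.IsLoaded)
        {
            lines.Load();
        }

        return order.Lines.Sum(line => line.Price * line.Quantity);
    }
}
```

Whether the service should load the data itself like this, or reject the call instead, is the contextual decision discussed in the next question.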

If dependencies have not been loaded, should the service class load them - or should the service class throw an exception? Is there any better way of communicating these dependencies to the caller?

This is not related to EF, but to exception management. Whether to throw an exception, return a negative result object, or simply (attempt to) fix the problem is contextual, based on your expectations and those of your consumer.

I suggest browsing SO/SE/Google on exception handling, as this has been discussed many times over.

How should the lifecycle of the database context be managed?

For web applications, a scoped dependency is usually the better way to go about it. Most web requests execute a single atomic operation, and thus scoping the context to the web request is the most appropriate.

For non-web applications, you'll require more active scope management. In a Windows service for example, you shouldn't be injecting your context directly, but rather a context factory, since otherwise you'll keep using the same context for the entire runtime of the Windows service. From experience, that is a source of bugs you do not want to tackle.
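
To make the factory idea concrete, here is a minimal sketch that assumes EF Core 5.0 or later, which ships an IDbContextFactory<TContext> interface (registered with services.AddDbContextFactory<JobContext>(...)). The Job, JobContext and JobProcessor names are hypothetical; a hand-rolled factory would follow the same shape.

```csharp
using System.Threading;
using System.Threading.Tasks;
using Microsoft.EntityFrameworkCore;

// Hypothetical entity and context, for illustration only.
public class Job
{
    public int Id { get; set; }
    public bool Done { get; set; }
}

public class JobContext : DbContext
{
    public JobContext(DbContextOptions<JobContext> options) : base(options) { }

    public DbSet<Job> Jobs => Set<Job>();
}

// Long-lived component (e.g. hosted in a Windows service).
// It holds the factory, never a context.
public class JobProcessor
{
    private readonly IDbContextFactory<JobContext> contextFactory;

    public JobProcessor(IDbContextFactory<JobContext> contextFactory)
    {
        this.contextFactory = contextFactory;
    }

    public async Task ProcessNextAsync(CancellationToken cancellationToken)
    {
        // One short-lived context per unit of work, disposed when the work is done.
        using (var context = contextFactory.CreateDbContext())
        {
            var job = await context.Jobs
                .FirstOrDefaultAsync(j => !j.Done, cancellationToken);

            if (job == null)
            {
                return;
            }

            job.Done = true;
            await context.SaveChangesAsync(cancellationToken);
        }
    }
}
```

Because each call creates and disposes its own context, tracked entities from one iteration cannot leak into the next, which is the class of bug the long-lived context tends to cause.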

Who is responsible for transactions?

The service layer essentially scopes to the atomic operation, and transactions are the database representation of an atomic operation. Therefore, the service calls the shots on when to use a transaction - if needed.

Having your command objects (or repository methods) call SaveChanges() and having your service layer wrap them in a transaction creates an interesting dynamic that you can change at will:

If the service layer defines no transaction, the command object will be allowed to immediately commit the data to the data store.
If the service layer defines a transaction, that same command object will not be able to commit the data to the data store, instead being held back until the transaction is committed.

This means that your service layer is completely free to mix and match which operations (command objects) are part of a single atomic operation, without you needing to change or complicate your command objects.

A small example:

public class CommandObject
{
    private readonly MyContext myContext;

    public CommandObject(MyContext myContext)
    {
        this.myContext = myContext;
    }

    public void CreateFoo()
    {
        myContext.Foos.Add(new Foo());

        // Commits immediately, unless the caller has opened a transaction on this same context.
        myContext.SaveChanges();
    }
}

public class SeparateFooService
{
    private readonly CommandObject commandObject;

    public SeparateFooService(CommandObject commandObject)
    {
        this.commandObject = commandObject;
    }

    public void CreateTwoFoos()
    {
        commandObject.CreateFoo();
        commandObject.CreateFoo();
    }
} 

public class TransactionalFooService
{
    private readonly MyContext myContext;
    private readonly CommandObject commandObject;

    public TransactionalFooService(MyContext myContext, CommandObject commandObject)
    {
        this.myContext = myContext;
        this.commandObject = commandObject;
    }

    public void CreateTwoFoos()
    {
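        // Both SaveChanges() calls inside CreateFoo() run within this transaction.
        // This relies on the service and the command object sharing one context instance.
        using (var transaction = myContext.Database.BeginTransaction())
        {
            commandObject.CreateFoo();
            commandObject.CreateFoo();

            // If an exception is thrown before this line, disposing the
            // uncommitted transaction rolls back both inserts.
            transaction.Commit();
        }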
    }
}

SeparateFooService takes two separate actions. If the second fails, the first will still have succeeded. TransactionalFooService will either create two or zero Foo entities, never just one.

The interesting part is that the same command object is being used in either case, which means that the service layer has sole control over transactional behavior without any dependency needing to be aware of it.

Should service classes have a reference to the database context?

The nice thing about scoped DI is that you can opt in to dependencies.

Does every service class have a reference to the database context? Usually not. It could just rely on other dependencies such as query/command objects or repositories and leave the context handling to them.

But if a service does require a context, e.g. for the transactions as shown above, then it can simply add a dependency and the DI framework (when set up correctly) will provide access to the exact context that the underlying query/command objects or repositories use.
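
The "exact same context" behaviour comes from the scoped lifetime. The sketch below assumes the Microsoft.Extensions.DependencyInjection container; the ScopedContextSketch class and its two methods are hypothetical helpers written for this illustration. AddDbContext registers the context as scoped by default, so every resolution inside one scope yields the same instance.

```csharp
using System;
using Microsoft.EntityFrameworkCore;
using Microsoft.Extensions.DependencyInjection;

// Hypothetical helper class, for illustration only.
public static class ScopedContextSketch
{
    // Registers TContext with AddDbContext, whose default lifetime is scoped.
    // Assumption: TContext has a constructor accepting DbContextOptions<TContext>.
    public static IServiceCollection AddScopedContext<TContext>(
        this IServiceCollection services,
        Action<DbContextOptionsBuilder> configure)
        where TContext : DbContext
    {
        return services.AddDbContext<TContext>(configure);
    }

    // Resolves the context twice within one scope and reports whether
    // both resolutions produced the same instance.
    public static bool SharesContextWithinScope<TContext>(IServiceProvider rootProvider)
        where TContext : DbContext
    {
        using (var scope = rootProvider.CreateScope())
        {
            var first = scope.ServiceProvider.GetRequiredService<TContext>();
            var second = scope.ServiceProvider.GetRequiredService<TContext>();

            return ReferenceEquals(first, second);
        }
    }
}
```

With a scoped registration the check should report that both resolutions are the same object, which is why a transaction opened by the service also covers the SaveChanges() calls made by its command objects.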

CQRS: However, I feel like it introduces a lot of even more difficult problems (e.g. eventual consistency). Am I wrong?

When using scoped contexts in a web application, the same context is persisted throughout the web request. In a non-web application, it requires a bit more active handling, but you can similarly ensure that the same context is reused by your DI framework exactly where you need it to be (without making it a singleton, which you wouldn't want).

I'm unsure if your worry over eventual consistency stems from the interpretation that CQRS leads to asynchronous handling. It doesn't (inherently). CQRS is still linear, synchronous code, and the data store won't be treated any differently compared to e.g. a monolithic method which does everything (not that that's a good idea, but it is a clear example of linear synchronous code).

That being said, it is possible to do this asynchronously, but that would also be an option (with the same potential problem) even if you weren't using CQRS, so CQRS doesn't factor into this.

Don't ever directly touch EF entities from business/domain services and domain objects: implement a mapping layer between EF entities and domain objects using the repository and unit of work patterns.

This seems to eliminate most of the advantages of using Entity Framework in the first place.

I wholeheartedly agree. I've written previous answers detailing exactly why rolling your own UoW/repository is just a needless mirrored wrapper around EF. To sum it up, it's called Entity Framework, not Entity Library, specifically because it can't and shouldn't just be abstracted away. The benefit does not even remotely outweigh the cost.

However, separating your entities from your domain objects (or even just DTOs if your backend is just a dumb resource store) is still a good idea. That doesn't mean that your entity and domain model/DTO have to be different, but they do need to be separate.

How you map between these separate classes is a matter of finding the right library or writing it by hand.
One possible approach here is AutoMapper's ProjectTo<TDomainModel>(), which applies the mapping as part of the EF query itself. The query then returns non-entity classes directly, so your codebase doesn't handle your entity classes any more than it needs to.
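
A minimal sketch of that approach, assuming the AutoMapper package and its AutoMapper.QueryableExtensions namespace. The CustomerEntity, CustomerModel, CustomerProfile, CustomerContext and CustomerQueries names are hypothetical, and the IMapper is assumed to be configured elsewhere with the profile shown.

```csharp
using System.Collections.Generic;
using System.Linq;
using AutoMapper;
using AutoMapper.QueryableExtensions;
using Microsoft.EntityFrameworkCore;

// Hypothetical EF entity.
public class CustomerEntity
{
    public int Id { get; set; }
    public string Name { get; set; } = "";
    public bool IsActive { get; set; }
}

// Hypothetical domain model / DTO, kept separate from the entity.
public class CustomerModel
{
    public int Id { get; set; }
    public string Name { get; set; } = "";
}

// Mapping configuration; assumed to be added to the AutoMapper configuration at startup.
public class CustomerProfile : Profile
{
    public CustomerProfile()
    {
        CreateMap<CustomerEntity, CustomerModel>();
    }
}

public class CustomerContext : DbContext
{
    public CustomerContext(DbContextOptions<CustomerContext> options) : base(options) { }

    public DbSet<CustomerEntity> Customers => Set<CustomerEntity>();
}

public class CustomerQueries
{
    private readonly CustomerContext context;
    private readonly IMapper mapper;

    public CustomerQueries(CustomerContext context, IMapper mapper)
    {
        this.context = context;
        this.mapper = mapper;
    }

    public List<CustomerModel> GetActiveCustomers()
    {
        // ProjectTo adds the mapping as a projection on the IQueryable,
        // so callers only ever receive CustomerModel instances.
        return context.Customers
            .Where(c => c.IsActive)
            .ProjectTo<CustomerModel>(mapper.ConfigurationProvider)
            .ToList();
    }
}
```

The entity class stays confined to the query object; everything above it works with CustomerModel.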

We have been experiencing several issues, and most of the example project architectures online are very simplistic and don't provide much insight.

The problem with EF discussions is that different people draw the line in different places, and in an online conversation that's not always apparent.

My answer is tailored to working in a large, multi-tiered codebase. Had you been talking about a small scale project, I would've advocated a much simpler approach.

I think you should develop a Service Layer.

EF --> Service Layer --> Domain Aggregates

Define your domain Aggregates in terms of the business operations you want. For example, the method GetInvoice will return an aggregate containing the customer information, shipping address, product line items and quantities.

If you do this, I think you will find that your other questions will be naturally answered. How is the life cycle of the EF DataContext handled? One DataContext object per invoice retrieval. Who is responsible for calling SaveChanges? Your service layer. Who handles transactions? Your service layer.
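
A rough sketch of what such a service method might look like. All type names here (InvoiceEntity, InvoiceLineEntity, InvoiceContext, Invoice, InvoiceLine, InvoiceService) are hypothetical, and the use of EF Core's IDbContextFactory<TContext> to get one context per retrieval is an assumption of this sketch, not something the answer prescribes.

```csharp
using System.Collections.Generic;
using System.Linq;
using Microsoft.EntityFrameworkCore;

// Hypothetical EF entities.
public class InvoiceEntity
{
    public int Id { get; set; }
    public string CustomerName { get; set; } = "";
    public string ShippingAddress { get; set; } = "";
    public List<InvoiceLineEntity> Lines { get; set; } = new List<InvoiceLineEntity>();
}

public class InvoiceLineEntity
{
    public int Id { get; set; }
    public string Product { get; set; } = "";
    public int Quantity { get; set; }
}

public class InvoiceContext : DbContext
{
    public InvoiceContext(DbContextOptions<InvoiceContext> options) : base(options) { }

    public DbSet<InvoiceEntity> Invoices => Set<InvoiceEntity>();
}

// Hypothetical domain aggregate returned to callers.
public class InvoiceLine
{
    public InvoiceLine(string product, int quantity)
    {
        Product = product;
        Quantity = quantity;
    }

    public string Product { get; }
    public int Quantity { get; }
}

public class Invoice
{
    public Invoice(string customerName, string shippingAddress, IReadOnlyList<InvoiceLine> lines)
    {
        CustomerName = customerName;
        ShippingAddress = shippingAddress;
        Lines = lines;
    }

    public string CustomerName { get; }
    public string ShippingAddress { get; }
    public IReadOnlyList<InvoiceLine> Lines { get; }
}

public class InvoiceService
{
    private readonly IDbContextFactory<InvoiceContext> contextFactory;

    public InvoiceService(IDbContextFactory<InvoiceContext> contextFactory)
    {
        this.contextFactory = contextFactory;
    }

    public Invoice GetInvoice(int invoiceId)
    {
        // One context per invoice retrieval.
        using (var context = contextFactory.CreateDbContext())
        {
            // The service decides what to load; Single throws if no invoice matches.
            var entity = context.Invoices
                .AsNoTracking()
                .Include(i => i.Lines)
                .Single(i => i.Id == invoiceId);

            var lines = entity.Lines
                .Select(l => new InvoiceLine(l.Product, l.Quantity))
                .ToList();

            return new Invoice(entity.CustomerName, entity.ShippingAddress, lines);
        }
    }
}
```

Since the service owns the query, callers never need to know which related entities must be loaded; they only see the aggregate.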

I think people have a tendency to view things like CQRS and Onion Architecture as recipes, rather than as organizing principles. CQRS is just another form of Single Responsibility; CRUD is just another form of CQRS. If you organize your architecture in ways that make sense, your application will naturally follow these organizing principles.

Licensed under: CC-BY-SA with attribution

Not affiliated with softwareengineering.stackexchange