Aren't the guidelines of async/await usage in C# contradicting the concepts of good architecture and abstraction layering?

https://softwareengineering.stackexchange.com/questions/382486

16-02-2021

Question

This question concerns the C# language, but I expect it to cover other languages such as Java or TypeScript.

Microsoft recommends best practices on using asynchronous calls in .NET. Among these recommendations, let's pick two:

change the signature of the async methods so that they return Task or Task<> (in TypeScript, that'd be a Promise<>)
change the names of the async methods to end with xxxAsync()

Now, when replacing a low-level, synchronous component with an async one, the change impacts the full stack of the application. Since async/await has a positive impact only if used "all the way up", the signature and method name of every layer in the application must change.

A good architecture often involves placing abstractions between layers, so that replacing low-level components with others goes unseen by the upper-level components. In C#, abstractions take the form of interfaces. If we introduce a new, low-level, async component, each interface in the call stack needs to be either modified or replaced by a new interface. The way a problem is solved (async or sync) in an implementing class is no longer hidden (abstracted) from the callers. The callers have to know whether it's sync or async.
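To make the ripple concrete, here is a minimal sketch of the situation described above. All names (IUserStore, UserService, InMemoryUserStore and their async counterparts) are hypothetical and exist only for illustration; Task.Yield stands in for real asynchronous I/O.

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;

// Before: a synchronous low-level abstraction and a caller built on it.
public interface IUserStore
{
    string GetName(int id);
}

public class UserService
{
    private readonly IUserStore _store;
    public UserService(IUserStore store) { _store = store; }

    public string Greet(int id) => "Hello, " + _store.GetName(id);
}

// After: the low-level component becomes async, so the interface changes...
public interface IAsyncUserStore
{
    Task<string> GetNameAsync(int id);
}

// ...and so does the caller above it: new return type, new name, new body.
public class AsyncUserService
{
    private readonly IAsyncUserStore _store;
    public AsyncUserService(IAsyncUserStore store) { _store = store; }

    public async Task<string> GreetAsync(int id) =>
        "Hello, " + await _store.GetNameAsync(id);
}

// Hypothetical in-memory implementation, only so the sketch can be run.
public class InMemoryUserStore : IUserStore, IAsyncUserStore
{
    private readonly Dictionary<int, string> _names =
        new Dictionary<int, string> { { 1, "Ada" } };

    public string GetName(int id) => _names[id];

    public async Task<string> GetNameAsync(int id)
    {
        await Task.Yield(); // stands in for real asynchronous I/O
        return _names[id];
    }
}
```

Nothing about the greeting logic changed, yet both the interface and the service had to be rewritten, which is exactly the leak in the abstraction that the question is about.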

Don't async/await best practices contradict "good architecture" principles?

Does it mean that each interface (say IEnumerable, IDataAccessLayer) needs its async counterpart (IAsyncEnumerable, IAsyncDataAccessLayer) so that it can be replaced in the stack when switching to async dependencies?

If we push the problem a little further, wouldn't it be simpler to assume every method to be async (to return a Task<> or Promise<>), and for the methods to synchronize the async calls when they're not actually async? Is this something to be expected from future programming languages?
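As a rough sketch of what "assume every method to be async" could look like in today's C#: the abstraction always returns a Task, and an implementation that is not actually asynchronous wraps its result with Task.FromResult. The names IConfigSource, RemoteConfigSource, InMemoryConfigSource and ConfigReader are hypothetical, and Task.Delay merely stands in for I/O.

```csharp
using System.Threading.Tasks;

// A single abstraction that always returns a Task, whatever the implementation does.
public interface IConfigSource
{
    Task<string> GetValueAsync(string key);
}

// Genuinely asynchronous implementation (Task.Delay stands in for I/O).
public class RemoteConfigSource : IConfigSource
{
    public async Task<string> GetValueAsync(string key)
    {
        await Task.Delay(10);
        return "remote:" + key;
    }
}

// Synchronous implementation: wraps its result in an already-completed Task.
public class InMemoryConfigSource : IConfigSource
{
    public Task<string> GetValueAsync(string key) =>
        Task.FromResult("memory:" + key);
}

public static class ConfigReader
{
    // The caller is written once and works with either implementation.
    public static async Task<string> DescribeAsync(IConfigSource source) =>
        "value=" + await source.GetValueAsync("timeout");
}
```

With this shape the callers no longer care which kind of implementation sits underneath, at the cost of every signature in the stack carrying Task<>.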

Solution

What Color Is Your Function?

You may be interested in Bob Nystrom's What Color Is Your Function¹.

In this article, he describes a fictional language where:

Each function has a color: blue or red.
A red function may call either blue or red functions, no issue.
A blue function may only call blue functions.

While fictitious, this happens quite regularly in programming languages:

In C++, a "const" method may only call other "const" methods on this.
In Haskell, a non-IO function may only call non-IO functions.
In C#, a sync function may only call sync functions².

As you have realized, because of these rules, red functions tend to spread around the code base. You insert one, and little by little it colonizes the whole code base.

¹ Bob Nystrom, apart from blogging, is also part of the Dart team and has written the Crafting Interpreters series; highly recommended for any programming language/compiler aficionado.

² Not quite true, as you may call an async function and block until it returns, but...
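A minimal sketch of the escape hatch mentioned in footnote 2, with hypothetical names (Footnote, ComputeAsync, ComputeBlocking). The sync method blocks its thread until the async one completes; in environments with a single-threaded synchronization context (UI applications, for instance) this pattern is known to risk deadlocks, which is part of the "but...".

```csharp
using System.Threading.Tasks;

public static class Footnote
{
    // A "red" (async) function; Task.Delay stands in for real asynchronous work.
    public static async Task<int> ComputeAsync()
    {
        await Task.Delay(10);
        return 42;
    }

    // A "blue" (sync) function calling the red one by blocking the current thread.
    public static int ComputeBlocking() =>
        ComputeAsync().GetAwaiter().GetResult();
}
```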

Language Limitation

This is, essentially, a language/run-time limitation.

Languages with M:N threading, such as Erlang and Go, do not have async functions: each function is potentially async, and its "fiber" will simply be suspended, swapped out, and swapped back in when it's ready again.

C# went with a 1:1 threading model, and therefore decided to surface synchronicity in the language to avoid accidentally blocking threads.

In the presence of language limitations, coding guidelines have to adapt.

Other tips

You are right that there is a contradiction here, but it is not because the "best practices" are bad. It is because an asynchronous function does an essentially different thing than a synchronous one. Instead of waiting for the result from its dependencies (usually some I/O), it creates a task to be handled by the main event loop. This is not a difference that can be well hidden under an abstraction.

An asynchronous method behaves differently from a synchronous one, as I'm sure you're aware. At runtime, converting an async call to a synchronous one is trivial, but the opposite cannot be said. So the logic becomes: why not make every method that may require it async, and let the caller "convert" it to a synchronous call as necessary?
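The harder direction can be sketched as follows, using hypothetical names (Conversions, LoadSync, LoadSyncOffloaded). Wrapping a synchronous method in Task.Run gives it a Task-returning signature, but the blocking work is only moved to a thread-pool thread; it does not become non-blocking.

```csharp
using System.Threading;
using System.Threading.Tasks;

public static class Conversions
{
    // A synchronous method; Thread.Sleep stands in for blocking I/O.
    public static int LoadSync()
    {
        Thread.Sleep(10); // blocks whichever thread runs it
        return 7;
    }

    // Sync -> "async": the caller gets a Task, but a thread-pool thread
    // still sits blocked inside LoadSync for the whole duration of the call.
    public static Task<int> LoadSyncOffloaded() =>
        Task.Run(() => LoadSync());
}
```

The caller can now await LoadSyncOffloaded(), yet none of the scalability benefit of true asynchronous I/O is gained, which is one reading of why the conversion is not trivial in this direction.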

In a sense it is like having one method that throws exceptions and another that is "safe" and won't throw even in case of error. At what point is the coder being excessive in providing both methods when one can be converted into the other?

In this there are two schools of thought: one is to create multiple methods, each one calling another possibly private method allowing for the possibility of providing optional parameters or minor alterations to behavior such as being asynchronous. The other is to minimize interface methods to bare essentials leaving it up to the caller to perform the necessary modifications himself/herself.

If you're of the first school, there's a certain logic to dedicating a class towards synchronous and asynchronous calls in order to avoid doubling every call. Microsoft tends to favor this school of thought, and by convention, to remain consistent with the style favored by Microsoft, you too would have to have an Async version, in much the same way that interfaces almost always start with an "I". Let me stress that it isn't wrong, per se, because it is better to keep a consistent style in a project rather than do it "the right way" and radically change style for the development that you add to a project.

That said, I tend to favor the second school, which is to minimize interface methods. If I think a method may be called in an asynchronous way, the method for me is asynchronous. The caller can decide whether or not to wait for that task to finish before proceeding. If this interface is an interface to a library, it is more reasonable to do it this way to minimize the number of methods you'd need to deprecate or adjust. If the interface is for internal use in my project, I will add a method for every needed call throughout my project for the parameters provided and no "extra" methods, and even then, only if the behavior of the method isn't already covered by an existing method.

However, like many things in this field, it's largely subjective. Both approaches have their pros and cons. Microsoft also started the convention of adding letters indicative of type at the beginning of the variable name, and "m_" to indicate it is a member, leading to variable names like m_pUser. My point is that not even Microsoft is infallible; it can make mistakes too.

That said, if your project is following this Async convention, I would advise you to respect it and continue the style. And only once you're given a project of your own, you can write it in the best way you see fit.

Let's imagine there is a way to enable you to call functions in an async way without changing their signature.

That would be really cool and no-one would recommend you change their names.

But actual asynchronous functions, meaning not just the ones that await another async function but those at the lowest level, have some structure specific to their async nature, e.g.

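using System;

public class HTTPResponse { } //placeholder type, assumed for illustration

public class HTTPClient
{
    //the response is delivered through the GotResponse event, so nothing is returned
    public void GET()
    {
        //send data
        DateTime deadline = DateTime.UtcNow.AddSeconds(30); //timeout value is an assumption
        while (DateTime.UtcNow < deadline)
        {
            //check for response
            HTTPResponse response = TryReadResponse();
            if (response != null)
            {
                GotResponse?.Invoke(response);
                return;
            }
            YouCanWait?.Invoke();
        }
    }

    //placeholder for a non-blocking read; returns null until a response has arrived
    private HTTPResponse TryReadResponse() => null;

    //tell calling code that they should watch for this event
    public event Action<HTTPResponse> GotResponse;
    //indicate to calling code that they can go and do something else for a bit
    public event Action YouCanWait;
}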

It's those two bits of information, which the calling code needs in order to run the code in an async way, that things like Task and async encapsulate.

There is more than one way to do asynchronous functions; async Task is just one pattern, built into the compiler via return types so that you don't have to link up the events manually.
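As an illustration of that manual linking, here is a small sketch that adapts an event-based component to a Task using TaskCompletionSource. EventBasedClient and TaskAdapter are hypothetical names, written in the spirit of the example above rather than taken from any real library, and the "response" is simulated.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical event-based component.
public class EventBasedClient
{
    public event Action<string> GotResponse;

    // Simulates a response arriving later on another thread.
    public void Send()
    {
        Task.Run(() => { GotResponse?.Invoke("200 OK"); });
    }
}

public static class TaskAdapter
{
    // Links the event to a Task by hand; this is roughly the plumbing
    // that Task-returning APIs spare the caller from writing.
    public static Task<string> GetAsync(EventBasedClient client)
    {
        var tcs = new TaskCompletionSource<string>();
        client.GotResponse += response => tcs.TrySetResult(response);
        client.Send(); // subscribe first, then send, so no response is missed
        return tcs.Task;
    }
}
```

A caller can then simply write `string status = await TaskAdapter.GetAsync(client);` without ever seeing the event.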

I will address the main point in a more generic, less C#-specific way:

Don't async/await best practices contradict "good architecture" principles?

I would say that it just depends on the choices you make in the design of your API and on what you leave to the user.

If you want a function of your API to be async only, there is little point in following the naming convention. Just always return Task<>/Promise<>/Future<>/... as the return type; it is self-documenting. If the user wants a synchronous answer, they can still get one by waiting, but if they always do that, it adds a bit of boilerplate.

However, if you make your API sync only, a user who wants it to be async will have to manage the async part themselves.

This can mean quite a lot of extra work, but it can also give the user more control over how many concurrent calls they allow, where to place timeouts, retries and so on.

In a large system with a huge API, implementing most of it to be sync by default might be easier and more efficient than managing each part of your API independently, especially if the parts share resources (filesystem, CPU, database, ...).

In fact, for the most complex parts, you could perfectly well have two implementations of the same part of your API: a synchronous one doing the actual work, and an asynchronous one relying on the synchronous one to do that work and only managing concurrency, load, timeouts, and retries.
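A rough sketch of that two-implementation idea follows, under stated assumptions: SyncReportBuilder and AsyncReportBuilder are hypothetical names, and the concurrency limit, attempt count and timeout are arbitrary values chosen for illustration. Note that the timeout only stops waiting; it does not cancel the synchronous work already running.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical synchronous implementation that does the actual work.
public class SyncReportBuilder
{
    public string Build(int id) => "report-" + id;
}

// Asynchronous facade: delegates to the synchronous one and only manages
// concurrency, timeouts and retries.
public class AsyncReportBuilder
{
    private readonly SyncReportBuilder _inner = new SyncReportBuilder();
    private readonly SemaphoreSlim _gate = new SemaphoreSlim(2); // at most 2 concurrent calls (arbitrary)
    private const int MaxAttempts = 3;                           // arbitrary
    private static readonly TimeSpan CallTimeout = TimeSpan.FromSeconds(5); // arbitrary

    public async Task<string> BuildAsync(int id)
    {
        await _gate.WaitAsync();
        try
        {
            for (int attempt = 1; ; attempt++)
            {
                try
                {
                    Task<string> work = Task.Run(() => _inner.Build(id));
                    Task finished = await Task.WhenAny(work, Task.Delay(CallTimeout));
                    if (finished != work) throw new TimeoutException();
                    return await work;
                }
                catch (Exception) when (attempt < MaxAttempts)
                {
                    // Swallow and retry; a real system would likely log and back off.
                }
            }
        }
        finally
        {
            _gate.Release();
        }
    }
}
```

The synchronous class stays free of any concurrency concerns, while the policy decisions live in one place in the facade.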

Maybe someone else can share his experience with that because I lack experience with such systems.

Licensed under: CC-BY-SA with attribution

Not affiliated with softwareengineering.stackexchange