Can syntactic `await` always be elided?

https://softwareengineering.stackexchange.com/questions/411482

12-03-2021

Question

(This question isn't a duplicate of Why do we need the async keyword? - it's more of the opposite: I'm not questioning the async keyword - I'm asking if compilers could elide the use of await completely behind the scenes, making async code syntactically identical to synchronous code)

The await keyword in many languages provides a succinct way to describe a continuation or for constructing coroutines - but I've wondered if it was necessary at all, as in situations where I've used it, the compiler should be smart enough to know when a task/promise/future should be awaited or not: by deferring any await until the Task is consumed as though it were awaited.

As an example, consider this async C# code that runs two Tasks concurrently:

    Task<Foo> fooTask = GetFooAsync();
    Task<Bar> barTask = GetBarAsync();

    DoSomethingElseSynchronously();

    Foo foo = await fooTask;
    Bar bar = await barTask;

    Foo foo2 = foo.Clone();
    DoSomething( foo2, foo, bar );

I was thinking that the compiler (or rather, some static-analysis code rewriter that runs before the real C#-to-IL compiler) could allow it to be written like so:

    Foo foo = GetFooAsync();
    Bar bar = GetBarAsync();

    DoSomethingElseSynchronously();

    Foo foo2 = foo.Clone();
    DoSomething( foo2, foo, bar );

The deferred-await would result in the above code being the same as though it were written like this:

    Task<Foo> fooTask = GetFooAsync();
    Task<Bar> barTask = GetBarAsync();

    DoSomethingElseSynchronously();

    Foo foo2 = (await fooTask).Clone();
    DoSomething( foo2, (await fooTask), await barTask );

Of course this only works if the re-inserted await is both idempotent (which it ostensibly is, at least with the stock Task<T> in .NET) and side-effect free (so reordering the await statements should not affect the correctness of the program).
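The idempotence assumption can be checked with a minimal sketch. Here `AwaitTwiceDemo`, its `Invocations` counter, and this `GetFooAsync` are hypothetical stand-ins (not part of any real API); the point is only that awaiting the same `Task<T>` a second time returns the stored result rather than re-running the operation.

```csharp
using System;
using System.Threading.Tasks;

public static class AwaitTwiceDemo
{
    // Counts how many times the underlying operation actually starts.
    public static int Invocations;

    // Hypothetical stand-in for GetFooAsync(): simulates I/O and returns a value.
    public static async Task<string> GetFooAsync()
    {
        Invocations++;
        await Task.Delay(10);   // simulated asynchronous work
        return "foo";
    }

    public static async Task<(string First, string Second, int Invocations)> RunAsync()
    {
        Invocations = 0;
        Task<string> fooTask = GetFooAsync();

        string first = await fooTask;    // waits for the task to complete
        string second = await fooTask;   // already completed: yields the stored result

        return (first, second, Invocations);
    }
}
```

Calling `await AwaitTwiceDemo.RunAsync()` should yield the same value twice with a single invocation. This holds for `Task<T>`; other awaitables, such as `ValueTask<T>`, are documented as not safe to await more than once, so a rewriter could not assume idempotence for every awaitable type.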

I imagine most async C# code tends to immediately await a Task because most async APIs do not support concurrent operations on the same resource, so you must await one operation before starting another on the same object. For example, DbContext doesn't support multiple concurrent queries, and FileStream requires each async read or write operation to complete before another is started (though I might be wrong if FileStream fully supports Windows' Overlapped IO functionality). There's no built-in way in .NET for an asynchronous API to declare support for concurrent operations, but a simple addition to the TaskCompletionSource and Task API could enable this.

Another example of concurrent async operations is firing off a batch of HTTP requests using a single HttpClient instance - for example a web-crawler might work like this:

    List<Uri> uris = ...
    HttpClient httpClient = ...

    List<Task<List<Uri>>> tasks = uris
        .Select( u => httpClient.GetAsync( u ) /* Returns Task<HttpResponseMessage> */ )
        .Select( responseTask => ReadPageUrisAsync( responseTask ) /* Returns Task<List<Uri>> */ )
        .ToList();

    List<Uri> foundUris = ( await Task.WhenAll( tasks ) )
        .SelectMany( pageUris => pageUris )
        .Distinct()
        .ToList();

If the await were syntactically elided, the compiler should be smart enough to infer that the Task.WhenAll( tasks ) call-site expects an awaited return value, because the following LINQ expression only works if the IEnumerable<T> source parameter for .SelectMany is a sequence of List<Uri> (the awaited List<Uri>[]) and not a Task<List<Uri>[]> (I do appreciate this kind of type-inference is a hard problem - I'm using it as a contrived example).
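To make the types in the crawler example concrete, here is a small sketch that avoids real network calls. `WhenAllDemo`, `CrawlAsync`, and this version of `ReadPageUrisAsync` (which takes a `Uri` directly and fabricates its results) are assumptions for illustration only. The explicit local types show what an await-eliding compiler would have to infer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class WhenAllDemo
{
    // Hypothetical stand-in for "fetch a page and extract its links".
    public static async Task<List<Uri>> ReadPageUrisAsync(Uri page)
    {
        await Task.Delay(10);   // simulated I/O
        return new List<Uri>
        {
            new Uri(page, "a"),                     // a link unique to this page
            new Uri("https://example.com/shared"),  // a link every page has
        };
    }

    public static async Task<List<Uri>> CrawlAsync(IEnumerable<Uri> pages)
    {
        // All operations start here and run concurrently.
        List<Task<List<Uri>>> tasks = pages.Select(p => ReadPageUrisAsync(p)).ToList();

        // Unawaited, WhenAll yields a Task wrapping an array of results...
        Task<List<Uri>[]> all = Task.WhenAll(tasks);

        // ...and only the awaited value is a sequence that SelectMany can flatten.
        List<Uri>[] results = await all;

        return results.SelectMany(pageUris => pageUris).Distinct().ToList();
    }
}
```

Since `Task<List<Uri>[]>` is not an `IEnumerable<T>`, the call to `.SelectMany` would not type-check without the await, which is the data dependency the question proposes to exploit.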

So - assuming that a program's asynchronous operations can be safely awaited out-of-order, is there a reason or situation where await couldn't be syntactically elided?

Solution

Of course this only works if the re-inserted await is both idempotent (…) and side-effect free (so reordering the await statements should not affect the correctness of the program).

But await is used precisely in order to enforce a particular ordering. Await lets you enforce that an async operation has happened before you continue with other stuff. In many cases, the compiler would be able to enforce a sensible ordering simply by considering data dependencies. E.g. if a function wants an int, the Task<int> object must be awaited first.

However, this is not the case for async tasks that produce side effects.

Let's say we have a resource LightSwitch that we can turn on and off. If the light is already on, turning it on has no effect (and likewise for turning it off when it is already off). This example is well-ordered, and we know that the switch will end in the off state, assuming that we have sole control over the switch:

    LightSwitch light = ...;

    await light.TurnOnAsync();
    light.TurnOffSync();

In contrast, the result of this ordering is indeterminate:

    var turnOnTask = light.TurnOnAsync();

    light.TurnOffSync();

    await turnOnTask;

Perhaps turning the light on happens immediately and the light ends in the off state; perhaps it happens later and the light ends in the on state.
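The two orderings can be compared in a small sketch. The `LightSwitch` class below is a hypothetical implementation (the answer does not define one), and the fixed `Task.Delay` is an assumption that forces the "turn on" effect to land late; a real device could complete at any time, which is exactly why the second ordering is indeterminate in general.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical LightSwitch, assumed here only to make the example runnable.
public sealed class LightSwitch
{
    public bool IsOn { get; private set; }

    // The state change happens only after a simulated delay.
    public async Task TurnOnAsync()
    {
        await Task.Delay(50);
        IsOn = true;
    }

    public void TurnOffSync() => IsOn = false;
}

public static class OrderingDemo
{
    // Well-ordered: "on" is known to have finished before "off" runs.
    public static async Task<bool> AwaitThenOffAsync()
    {
        var light = new LightSwitch();
        await light.TurnOnAsync();
        light.TurnOffSync();
        return light.IsOn;
    }

    // Deferred await: "off" runs while "on" is still in flight.
    public static async Task<bool> OffThenAwaitAsync()
    {
        var light = new LightSwitch();
        Task turnOnTask = light.TurnOnAsync();
        light.TurnOffSync();
        await turnOnTask;
        return light.IsOn;
    }
}
```

With this particular delay, `AwaitThenOffAsync` leaves the light off and `OffThenAwaitAsync` leaves it on. Neither method has a data dependency between the two calls, so a compiler that places awaits purely by data flow has no basis for choosing between them.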

So the point at which a task is awaited can have fundamental consequences for the correctness of the code. There are two approaches to resolve this: either data dependencies encode all dependencies (as in pure functional programming), or we are able to explicitly specify an ordering (as with the imperative await approach).

In the pure functional approach, effects must be represented as values so there are no side effects. This could be approximated in C# with a fluent approach, where each operation returns a token representing the next state:

    initialLightState()
      .TurnOnAsync()
      /* implicit await here? */
      .TurnOff();

The difference between sync and async more or less disappears here. However, this approach cannot prevent ambiguous effect orderings unless the type system can enforce that any token is used at most once. For example, Rust provides such a type system, but has no mechanism for implicit awaits.
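A rough sketch of the token idea in today's C#, with the await written out explicitly. The `LightOff` and `LightOn` types and `TokenDemo` are hypothetical names invented for this illustration. Each operation consumes one state and returns the next, so the ordering is carried by the values themselves.

```csharp
using System.Threading.Tasks;

// Hypothetical token types: each value represents one state of the light.
public sealed class LightOff
{
    public async Task<LightOn> TurnOnAsync()
    {
        await Task.Delay(10);   // simulated asynchronous effect
        return new LightOn();
    }
}

public sealed class LightOn
{
    public LightOff TurnOff() => new LightOff();
}

public static class TokenDemo
{
    public static LightOff InitialLightState() => new LightOff();

    public static async Task<LightOff> RunAsync()
    {
        LightOff initial = InitialLightState();

        // TurnOff() exists only on LightOn, so the task must be awaited first:
        // the data dependency fixes the ordering.
        LightOn on = await initial.TurnOnAsync();
        LightOff off = on.TurnOff();

        // Nothing stops the stale token from being used again, though;
        // C# cannot express "use at most once".
        Task<LightOn> again = initial.TurnOnAsync();
        await again;

        return off;
    }
}
```

The second `initial.TurnOnAsync()` call compiles without complaint, which illustrates the limitation described above: without a type system that enforces single use of a token, the ordering of effects can still become ambiguous.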

Licensed under: CC-BY-SA with attribution

Not affiliated with softwareengineering.stackexchange