Would a “downcast if block” be a reasonable language feature?

https://softwareengineering.stackexchange.com/questions/236333

03-10-2020

Question

Consider the following "if cast":

class A     { public void f1() { ... } }
class B : A { public void f2() { ... } }

A a = foo(); // might return A or B

if ( a is B ) {
    //  Inside block, implicitly "redeclare" a as type B
    //  It's B, go for it
    a.f2();
    //  Wouldn't compile.
    a = new A();
}

We don't enter the block unless a is a B, therefore within the block, the compiler treats a exactly as if it had been declared as type B. You could assign null to it in the block if it's a reference type, but it would be a compile-time error to assign new A() to it, for example (thanks @ThomasEding for bringing up that question).

Anything that could break the assumption that a is B could just as easily break the following legal C# as well:

if ( a is B ) {
    B b = a as B;

    b.f2();
}

The former looks to me like syntactic sugar for the latter.

Even if this feature makes sense, I'm sure Anders Hejlsberg's lads have more useful features to implement. I'm just wondering if it's a worse idea than I think it is.

UPDATE: A serious objection would be that the feature as described adds semantics without adding syntax: from one version of the language to the next, the meaning of the above block would change significantly. It would be more tolerable with a new keyword. But then you'd be adding a new keyword.

Update

A more readable version of this feature was implemented in C# 7:

object object2 = new List<Test>();

if (object2 is Dictionary<String, int> foo)
{
    Console.WriteLine(foo.Keys);
}

Solution

Let's assume features start with 0 points, rather than with -100 points (due to costs associated with implementation, design, etc.).

This feature strikes me as confusing. I've already declared that a is an A. If I want to know the compile-time type of a, I only have to look in one place, at the declaration. While I acknowledge that the implementation is relatively simple (use the code substitution you describe in your question), this also argues in favor of not implementing the feature; it's easy for coders to do it, too.

The truly horrifying thing is that the type of a variable can magically change immediately after a }. C# very deliberately prevents this from happening (see Simple names are not so simple). So, I absolutely hate any version of this feature that fails to introduce a new variable during the cast.

If you find yourself constantly casting your types so as to access members of derived classes, it makes me suspect you are misusing inheritance. Personally, usually in code where I'm using such a conditional, I'm going to break out of my function if the cast fails. So, my code would look more like this:

B myB = myA as B;
if (myB == null) return;

Which would, under some variant of your proposal, become this:

if (!(myA is B)) return;

or (introducing a new name)

if (!(myA to B myB)) return;

or, using declarative expressions (this is a variant on Jimmy Hoffa's answer)

if (!a.To(out B myB)) return;

With the exception of the approach using declarative expressions, all of these features require learning new C# syntax for a feature that is not at all painful to deal with. The declarative expression approach is nice in that it is the general case of a specific feature; i.e. allowing coders to make use of an assignment and conditional using a single expression.

The only cost to the declarative expression approach is that it won't work universally; it only works if your special To method is implemented and visible in your current scope. However, that's a price I'm willing to pay to avoid adding yet more syntax to the C# language; I won't bother using such an extension method unless I'm dealing with code which constantly requires it (and probably not even then).

In summary: This feature isn't needed quite often enough to justify adding new syntax to the language. However, the more general feature, declarative expressions, may be worth the cost (which is why Microsoft is probably going to pay it). It not only handles this case, but also handles the annoying two-statement TryParse case (and, like your proposal, makes for clean scope; the type is only in scope on the inside of the conditional).

Update: Declarative Expressions were cancelled.

Update 2: C# 7.0 is adding "out variables". This might be enough for my final example (if (!a.To(out B myB)) return;). C# 7.0 is also adding "Is-expressions with patterns", which allow if (!(o is int i)) return;. This is basically the feature you asked for.
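
As a rough illustration of how those two C# 7.0 features read in an early-return style, here is a minimal sketch. The names EarlyReturnDemo, DescribeInt and ParseOrDefault are made up for this example; only the is-pattern and the out variable are actual language features.

```csharp
using System;

public static class EarlyReturnDemo
{
    // Early return with an is-pattern: after the guard, 'i' is in scope
    // and definitely assigned, so no second cast is needed.
    public static string DescribeInt(object o)
    {
        if (!(o is int i)) return "not an int";
        return "int: " + (i + 1);
    }

    // Out variable: the declaration of 'n' and the TryParse call
    // share a single expression instead of two statements.
    public static int ParseOrDefault(string s, int fallback)
    {
        if (!int.TryParse(s, out int n)) return fallback;
        return n;
    }
}
```

In both methods the new variable is introduced by the condition itself, so the declared type of the original variable never changes.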

OTHER TIPS

As a data point, the Ceylon language allows¹ you to do this.

In Ceylon, you would write:

A a = ...
if (is B a) {
   // refer to 'a' as a 'B'
}

if (is B b = some_expression_returning_A ) {
   // refer to 'b' as a 'B'
}

There is also a variation of the switch statement that allows you to switch on types, in much the same way. For example:

void switchOnEnumTypes(Foo|Bar|Baz var) {
    switch(var)
    case (is Foo) {
        print("FOO");
    }
    case (is Bar) {
        print("BAR");
    }
    case (is Baz) {
        print("BAZ");
    }
}

¹ Actually, it requires this, since Ceylon doesn't have an explicit type cast. The goal is to avoid core language constructs that can throw exceptions. They do something similar to design out NPEs.

It's not that it's a bad idea; there's just no point in elevating this to a language feature, not even as syntactic sugar, because you can very easily implement it yourself in C#:

public static class ExtensionMethods
{
    public static void AsIf<T>(this object target, Action<T> todo)
    {
        if (target is T) todo((T)target);
    }
}

// The type argument must be explicit: T cannot be inferred from an untyped lambda.
a.AsIf<B>(aAsB => { /* your code */ });
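
A self-contained sketch of how this helper behaves, repeating the extension method so the snippet compiles on its own. The classes A and B mirror the question; AsIfDemo and CallF2IfB are hypothetical names, and the type argument is written out because the compiler cannot infer T from an untyped lambda.

```csharp
using System;

public class A { public string F1() => "f1"; }
public class B : A { public string F2() => "f2"; }

public static class ExtensionMethods
{
    // Runs 'todo' only when 'target' really is a T.
    public static void AsIf<T>(this object target, Action<T> todo)
    {
        if (target is T) todo((T)target);
    }
}

public static class AsIfDemo
{
    // Returns the result of F2 when 'a' is a B, otherwise null.
    public static string CallF2IfB(A a)
    {
        string result = null;
        a.AsIf<B>(b => result = b.F2());
        return result;
    }
}
```

The lambda parameter plays the role of the new variable, so a keeps its declared type A throughout.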

tl;dr: Concurrency Issues

When you write A a; you are effectively making the statement "a is a variable that can contain objects representable by the type A". By changing the variable's type to something lower in the food chain, you would potentially introduce concurrency issues.

For example

using System.Threading;

namespace AsIf
{
    class A { public void FA() {} }
    class B : A { public void FB() {} }
    class C : A { public void FC() {} }

    class Program
    {
        delegate void ActionOnA(ref A a);
        static void AToC(ref A a) { Thread.Sleep(5000); a = new C(); }

        static void Main()
        {
            A a = new B();

            // Wait 5 seconds, change a to new C()
            ActionOnA action = AToC;
            action.BeginInvoke(ref a, (ar) => { action.EndInvoke(ref a, ar); }, null);

            if (a is B)
            {
                // a is implicitly retyped as B for the rest of this block
                // (the hypothetical feature; this does not compile in real C#)

                // B.FB gets called on a (an instance of B)
                a.FB();
                // Sleep 6 seconds
                Thread.Sleep(6000);
                // B.FB gets called on a (an instance of C)
                a.FB();
                // 'Undefined behaviour' most likely causing all kinds of crazy
            }
        }
    }
}

This is slightly less likely to happen when you introduce b:

// assume everything else is as before

static void Main()
{
    A a = new B();

    // Wait 5 seconds, change a to new C()
    ActionOnA action = AToC;
    action.BeginInvoke(ref a, (ar) => { action.EndInvoke(ref a, ar); }, null);

    if (a is B)
    {
        // introduce b
        B b = (B)a;

        // B.FB gets called on b (an instance of B)
        b.FB();
        // Sleep 6 seconds
        Thread.Sleep(6000);
        // B.FB gets called on b (an instance of B)
        b.FB();
        // Safe! (for now)
    }
}

This is because b is an entirely different variable being used to hold the same object. What the concurrent function is actually doing is assigning to the variable a, which does not directly affect the object a was holding. Because b now holds a reference to the object that was in a, a can get reassigned and nobody would care.

Of course this is an implementation detail. If the compiler were to secretly create a second variable for the asif block rather than just reinterpreting a as being of the type B, the result would be effectively the same as my second example. Whether such behaviour is expected, obvious or desired is a different matter entirely.
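
To make the effect of a second variable concrete without any threads, the sketch below replaces the concurrent writer with a plain reassignment inside the block. It uses the C# 7 is-pattern to introduce b; SnapshotDemo and Run are hypothetical names, and the string return values exist only so the result can be checked.

```csharp
public class A { }
public class B : A { public string FB() => "B.FB"; }
public class C : A { }

public static class SnapshotDemo
{
    public static string Run()
    {
        A a = new B();
        if (a is B b)
        {
            // Stands in for the concurrent reassignment of 'a'.
            a = new C();
            // 'b' is a separate variable that still refers to the original
            // B instance, so this call is unaffected by the line above.
            return b.FB();
        }
        return "not a B";
    }
}
```

This only shows that reassigning a leaves b untouched; it says nothing about other thread-safety concerns, such as mutation of the object itself.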

It's not an impossible feature, it's just one that's more work than it's worth really.

This can be done with the new is-expressions coming in C# 7:

class A     { public void f1() { ... } }
class B : A { public void f2() { ... } }

A a = foo(); // might return A or B

if ( a is B b ) {
    b.f2();
}

I like the idea. It's also not dangerous, provided a few edge cases are taken into consideration.

Consider

if (a is B) {
    a.BMethod(); // OK
    a = new B(); // OK
    a.BMethod(); // OK
    a = null;    // Disallow? (Probably OK)
    a.BMethod(); // Disallow? (Probably OK)
    a = new A(); // Disallow? (Probably bad)
    a.BMethod(); // Disallow?
}

The extension could certainly be specified so that such cases fail to compile, but you would want clean error messages and would need to define further restrictions (for example, that only a B can be assigned to a inside the if block), or else fall back to throwing a runtime exception or something similar.

Consider multi-threaded code using a, and a gets reassigned inside the as-if block. This would be very hard to do properly in a direct C# translation (it's possible, but not very nice *). This feature wants to be done at the byte code level, where it is trivial to implement.

(*): You would need to assign to a temporary, then assign to the original a, then assign to the a_as_b compiler generated variable. Even still, I may be missing some important edge cases.

I've seen an AsIf extension method implemented as such:

private static void AsIf<TBase, TChild>(this TBase source, Action<TChild> action) where TChild : TBase
{
    if ((source is TChild) && (action != null))
    {
        action((TChild)source);
    }
}

The generic constraints make sure you're attempting to "as-if" something hierarchically allowable.
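
A sketch of what that constraint buys at the call site. The method is made public here so it can be called from another class, which is a deviation from the snippet above; A, B, ConstrainedExtensions and ConstrainedDemo are placeholder names.

```csharp
using System;

public class A { }
public class B : A { public string F2() => "f2"; }

public static class ConstrainedExtensions
{
    public static void AsIf<TBase, TChild>(this TBase source, Action<TChild> action)
        where TChild : TBase
    {
        if ((source is TChild) && (action != null))
        {
            action((TChild)source);
        }
    }
}

public static class ConstrainedDemo
{
    public static string CallF2IfB(A a)
    {
        string result = "skipped";
        // Both type arguments are spelled out; B satisfies 'TChild : TBase'.
        a.AsIf<A, B>(b => result = b.F2());
        // a.AsIf<A, string>(s => { });  // rejected at compile time: string does not derive from A
        return result;
    }
}
```

The price of the extra safety is that callers have to name both types explicitly.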

This is already a common feature in many programming languages. They are called sum types. Classes are an awkward way to simulate sum types in languages that lack them.

Sum types (also called variants and unions) are the dual to product types (commonly called tuples). Just as product types allow you to stick two types together to create a compound type that is composed of type 1 and type 2, sum types allow you to create a compound type that is either type 1 or type 2.

Here's how it works. Let's say we want to represent internet addresses. There are two ways to address a computer on the internet: using its IP address or its fully-qualified domain name (FQDN). The FQDN is a human-readable string, while the IP address (assuming IPv4) is an integer between 0 and 2^32 - 1.

Since we want functions that can operate on either of these schemes, we create a sum type

type InternetAddress = IP of int | FQDN of string

Note that InternetAddress is really just a type alias for int + string. The identifiers IP and FQDN serve as labels (also called constructors or injections) for each "branch".

To create a value of type InternetAddress, we inject a value into the sum type

val my_address: InternetAddress = FQDN "mydomain.com"

To use a value of a sum type, we perform case analysis. The instance type of the value determines which "branch" of the expression we take:

case my_address of
     IP ip_address -> *branch 1*
     | FQDN host_name -> *branch 2*

In either branch, the underlying value of my_address binds to the declared variable. So in branch 1, we can use ip_address which has type int, and in branch 2 we can use host_name which has type string.

In my opinion, sum types are an indispensable feature of programming languages, and the fact that they are so scarce in today's mainstream languages has caused considerable confusion and errors over the years.
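
For comparison with the C# discussed in the question, here is one way the InternetAddress example might be approximated with a closed class hierarchy and a C# 7 type switch. This is an emulation under assumptions, not a real sum type: the nested-class layout, the uint representation of the IP address, and the names AddressDemo and Describe are all choices made for this sketch.

```csharp
using System;

public abstract class InternetAddress
{
    // Private constructor: only the nested classes below can derive,
    // which keeps the set of cases closed.
    private InternetAddress() { }

    public sealed class IP : InternetAddress
    {
        public uint Value { get; }
        public IP(uint value) { Value = value; }
    }

    public sealed class FQDN : InternetAddress
    {
        public string Host { get; }
        public FQDN(string host) { Host = host; }
    }
}

public static class AddressDemo
{
    // Case analysis: each branch binds a new variable of the narrower type.
    public static string Describe(InternetAddress address)
    {
        switch (address)
        {
            case InternetAddress.IP ip:
                return "ip " + ip.Value;
            case InternetAddress.FQDN fqdn:
                return "host " + fqdn.Host;
            default:
                // The compiler does not check exhaustiveness here.
                throw new ArgumentException("unknown case", nameof(address));
        }
    }
}
```

Unlike the case expression above, the switch needs a default branch, because the compiler does not know that the two nested classes are the only possibilities.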

In Swift you write:

if let b = a as? B {
    ....
}

"a as! B" (spelled "a as B" before Swift 1.2) would try to convert a to type B. For object types this succeeds if a actually belongs to class B, which would need to be a subclass or superclass of A to ever succeed. Of course this would only change the type of the reference, not the type of the object behind it. And it would crash on purpose if the conversion fails.

"a as? B" is similar except it returns an "optional B"; the result is nil if the conversion fails.

"if let b = a as? B" evaluates (a as? B). If the result is nil then the if is not executed. If the result is not nil then it assigns the reference to a non-optional b of type B and executes the if.

Licensed under: CC-BY-SA with attribution
