Question

Which programming languages support type inference from variable names?

By 'type inference' I mean, for example, how in Swift, if you write let x = 39, the compiler knows x is an Int, because 39 is an integer literal.

Now, it's conventional in Swift to prefix Boolean variable names with is, for example let isCool = true. However, if we write let isCool = 0, the compiler will infer the type of isCool to be Int instead of Bool, even though any human programmer would know it was supposed to be a Bool from the is prefix.

So, it occurred to me, I wonder if there are any programming languages where you can give the compiler a clue to help it infer the type by using keywords in the variable name itself? Has anyone ever seen something like that?

Side note:

This was not intended to be a question about the merits of the idea. For context, I asked because I thought such a feature might allow for cleaner code in webservice integration layers, where you have to parse a lot of JSON into variables, and strictly typed languages like Swift make that a pain with all their enforced type safety.

So here is an example of how this might improve some code in Swift.

Suppose customersJSON is some data that came from a webservice response. If the server has a bug, the contents might not be the types we expect, so a strictly type-safe language like Swift 3 currently enforces code that looks like the following:

    import Foundation

    let customersJSON = "[{\"customerName\":\"Alice\",\"customerAge\":42},{\"customerName\":808,\"customerAge\":88}]".data(using: .utf8)!
    var customersArray: [Any]!
    do {
        // jsonObject(with:) returns Any, so it has to be cast explicitly.
        customersArray = try JSONSerialization.jsonObject(with: customersJSON, options: []) as? [Any]
    }
    catch {
        print(error)
        return
    }
    for customer in customersArray {
        // Every value pulled out of the JSON needs its own conditional cast.
        guard let customerDict = customer as? [String: Any],
            let customerName = customerDict["customerName"] as? String,
            let customerAge = customerDict["customerAge"] as? Int
            else { break }
        print("customerName: \(customerName), customerAge: \(customerAge)")
    }

Note that when we assign customerDict["customerName"] to customerName, we have to tell the compiler that customerName should be a String with the cast as? String.

Of course, human beings know that names are always strings; as a programmer it's just annoying to have to tell the compiler this every time when, for example, I have several different things like productName, itemName, discountName, etc. that must be handled in various response handlers. I'd much rather tell the compiler one single time that any variable whose name ends in "Name" is a String.

I'd much rather set a global preference in my app, using regex matching patterns to give the compiler a heads up, like:

Infer ".*Array":[Any]!
Infer ".*Name":String
Infer ".*Age":Int
Infer ".*Dict":[String:Any]

Then we can do:

    let customersJSON = "[{\"customerName\":\"Alice\",\"customerAge\":42},{\"customerName\":808,\"customerAge\":88}]".data(using: .utf8)!

    // Hypothetical: the Infer rules above supply all the types and casts.
    var customersArray
    do {
        customersArray = try JSONSerialization.jsonObject(with: customersJSON, options: [])
    }
    catch {
        print(error)
        return
    }
    for customer in customersArray {
        guard let customerDict = customer,
            let customerName = customerDict["customerName"],
            let customerAge = customerDict["customerAge"]
            else { break }
        print("customerName: \(customerName), customerAge: \(customerAge)")
    }

... and still have Swift's total type safety, without all the ugliness. Now it reads more like code in a dynamically typed language. I'm not saying it's "better" this way, but I do think it makes for more elegant, more readable code. YMMV
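For what it's worth, the closest I can get in today's Swift without compiler support is to centralize the casts in a generic helper, so that the declared type of each constant drives the cast. A rough sketch (the typed(_:_:) helper is made up for illustration, not a real API):

    import Foundation

    // Hypothetical helper: the `as?` casts live in one place; the expected
    // type T is inferred from the declaration at each call site.
    func typed<T>(_ dict: [String: Any], _ key: String) -> T? {
        return dict[key] as? T
    }

    let json = "[{\"customerName\":\"Alice\",\"customerAge\":42}]".data(using: .utf8)!
    if let customers = (try? JSONSerialization.jsonObject(with: json)) as? [[String: Any]] {
        for customer in customers {
            // The casts are still there, but written once, in the helper.
            guard let customerName: String = typed(customer, "customerName"),
                  let customerAge: Int = typed(customer, "customerAge")
                else { break }
            print("customerName: \(customerName), customerAge: \(customerAge)")
        }
    }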


Solution

Very early versions of FORTRAN used the first letter of the variable name to denote the type, for the basic types INTEGER and REAL: names beginning with I through N were integers, everything else was real. Double precision (extended-precision reals) had to be declared explicitly.

As of at least FORTRAN IV, at least some versions of FORTRAN allowed explicit declarations to override the first-letter rule.

FORTRAN 77 allowed one to specify one's own first-letter rules with the IMPLICIT statement (for example, IMPLICIT INTEGER (A-Z)); later Fortran standards (and many FORTRAN 77 compilers, as an extension) let you disable implicit typing altogether with IMPLICIT NONE.

It should be mentioned that it was generally accepted in the computing community by at least the mid-1960s that default type rules and implicit variable declarations were a very bad idea. Tony Hoare reports having his ears soundly boxed by the ALGOL Committee when he proposed such a modification to ALGOL. (He mentioned that this was before the probably-apocryphal story of the Venus probe, lost to a typographical error that nevertheless yielded a syntactically legal FORTRAN program, had become widely known.)

OTHER TIPS

I've never seen anything like that, but wouldn't it be a cool feature to be able to set up some keywords to mean certain things, so that you could avoid repetitive boilerplate and have more readable code, to boot?

Sorry, no. This is not a good idea.

Variable declaration is a meaningful step, not "boilerplate".

A feature of good language design is that variable declarations must be explicit. Allowing the use of variables that have not been declared can turn a minor typo into a significant bug, and collisions between variable names in different scopes can cause subtle hidden problems.
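Swift itself illustrates the compile-time half of this argument: a misspelled name is an error, not a silently created new variable (a minimal illustration):

    var totalAmount = 0
    // A one-character typo is caught at compile time:
    // totalAmout = 10   // error: cannot find 'totalAmout' in scope
    // In a language with implicit declarations, the same typo would quietly
    // create a second variable, and totalAmount would never be updated.
    totalAmount = 10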

While there are some languages that allow variables to be used without declaring them (Perl, JavaScript, etc.), this is discouraged except in quick hacks and scripts, and language features were usually added later to turn the behavior off (Perl's use strict, JavaScript's "use strict").

So, that means that even in your theoretical language, good practice would still require a declaration line for any significant development. Something like this:

    var intNum;

Does not save code compared to this:

    int num;

It harms rather than aids readability.
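In Swift terms, the same comparison (illustrative only):

    // Encoding the type in the name just duplicates the declaration:
    var intNum: Int = 0
    // The declaration alone already carries the type:
    var num: Int = 0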

Reasons why it is not needed:

  • A variable's type is most often clear from the context.
  • Well-designed code will have plenty of hints about variable type.
    • Good practice is to break code into small functions, which (in most languages) have a prototype indicating the type of the arguments (see the sketch just after this list).
    • Variables should be declared as close to their point of use as possible, and globals should be avoided. This is important for other reasons, and means a variable declaration is often quite close to the code you are working on.
  • Type information is also easily available in development tools.
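To illustrate the point about signatures with a small Swift sketch (the function and its names are made up for illustration):

    import Foundation

    // The signature states every type involved, so the body needs no
    // type-encoding prefixes on its names.
    func discountedPrice(_ price: Decimal, discountPercent: Double) -> Decimal {
        // `multiplier` is clearly numeric from the surrounding context.
        let multiplier = Decimal(1.0 - discountPercent / 100.0)
        return price * multiplier
    }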

Reasons why it may cause more harm than good:

  • It adds noise to the code. If you have a variable x that is used 10 times, being forced to name it intx instead adds a lot of extra unnecessary stuff to the code. And all variables of the same type now look kind of similar, making it hard to distinguish them from each other.
  • It really breaks down on more complex types. This scheme would be workable if you only had a few basic types like int, float, etc. But in modern languages (well, actually even in old languages like C), you can have user-defined types that are much more complex. How well does this idea work with object oriented programming, where the type of an object might be DoohickyFrobulatorFactory?
  • It also doesn't help with pointers/references. A pointer may be used to refer to objects of multiple types. While you could call it ptrFoo, arguably this isn't really all that helpful information. What is more meaningful is what sort of object it is pointing to. But the system doesn't help with that.
  • It makes it harder to change the type. What if you decide your int should be a long int, or similar? This should be a trivial change, but it's more complex if you have to rename the variable.

Of course, you could already use this naming convention if you wanted to (just not enforced by the compiler). It's actually an established naming convention called Hungarian notation, and it was fairly popular at one point, but it is widely discouraged today. See the Wikipedia article on Hungarian notation for more information.
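To make the convention concrete in the question's own language, this is roughly what Hungarian-style naming looks like in Swift (purely illustrative names):

    // Hungarian-style: the type is repeated in every identifier.
    let strCustomerName = "Alice"
    let intCustomerAge = 42
    var arrCustomers: [String] = []

    // Without the prefixes, names describe meaning; the declarations
    // (or plain inference) already carry the types.
    let customerName = "Alice"   // inferred as String
    let customerAge = 42         // inferred as Int
    var customers: [String] = []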

Fortran has already been covered so I will say no more about it….

I don’t know of any programming language that infers a type based on a variable or parameter name. However, there are lots of coding standards that recommend how variables, methods, classes, etc. should be named.

You can then use a coding style checker to confirm that the naming standards have been followed, and get a warning (or error) from your build system if they have not.

For example, a checker could flag a binding like the one from the question:

    let isCool = 0
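As a toy sketch of such a check (not a real tool; the pattern is only illustrative), a few lines of Swift using Foundation's regular-expression matching can flag the line above:

    import Foundation

    // Toy style check: warn when an `is`-prefixed constant is bound to a
    // numeric literal, which suggests it was meant to be a Bool.
    let line = "let isCool = 0"
    let pattern = #"let\s+is[A-Z]\w*\s*=\s*\d"#
    if line.range(of: pattern, options: .regularExpression) != nil {
        print("warning: `is`-prefixed name bound to a numeric literal; expected Bool")
    }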
Licensed under: CC-BY-SA with attribution