Language design : How to type higher order functions parameters?

https://softwareengineering.stackexchange.com/questions/395371

28-02-2021
|

Question

I'm trying to design a theoretical programming language and I'm facing a problem with high order functions. The language is strong-typed, so the way to define a standard function is like so :

function name ( type1 arg1 , type2, arg2, ... ) : return_type

There's also a way to define data structures like classes (that are types) :

class MyClass
{
    type1 attribute1;
    type2 attribute2;
    ...
}

Now, I would like to extend the function definition to higher-order so that parameters and return values can be classes (types), attributes or functions. For instance having the capability of defining a function that takes as parameter a class or a class member. Let's take an exemple :

function sort ( List<something> data, ??? attribute_name , ??? sorting_function ) : List

This function is supposed to take as first parameter as List of something-typed instances, as second parameter an attribute of the something class, and as third parameter a comparison function. And attribute_name and sorting_function have to be typed. But how ? What should be instead of the ??? ?

I consider an heresy to create a function, class, or attribute type, since they wouldn't be at the same logic-level than data types. This would also imply to have type type, which is even more confusing.

I'm probably missing something in my language-design logic because I'm somehow trying to mix at the same logic-level data and language structure but I totally don't know how to achieve what I want otherwise.

Edit :

To give more details, what matters to me is to have a true seperation between data and models. For instance what we can do in languages such as python or php like :

getattr(obj, attr_name)

$obj->$attr_name

doesn't suit my logic since we use data-level (strings) to access attributes, which are structure-level.

In my sort function I gave as example (let's forget the last parameter) we have :

function sort ( List<something> data, ??? attribute_name )

The type of attribute_name cannot be a string, since attributes are not data-level. There should be proper type that designate attributes (class members). But such a type is not a data-type. The real problematic is finding a way to distinguish data-types from strucutre/meta-types that correspond to the different langauge structures.

Solution

I consider an heresy to create a function, class, or attribute type, since they wouldn't be at the same logic-level than data types. This would also imply to have type type, which is even more confusing.

Yet a Type type is a staple of almost all object-oriented languages, and for good reason - it lets you treat type information objects, functors, and actual instances of objects similarly. Of course, there's nothing wrong with providing some syntactic sugar on the language level, for example:

function Sort<T, P>(List<T> list, .T - > P property, (P, P-> int) comparer) {
   ...

}

Sort(list, MyClass.Price, (p1, p2) -> return...)

You can then treat property and comparer as special constructs... or simply have your compiler reduce it to a regular function taking objects:

function Sort<T, P>(List<T> list, Property<T, P> property, Function<P, P, int> comparer)...

That should give you a lot of features out of the box that you'd otherwise have to handle specially. Want to assign function to a variable? Just declare one of type Function<... > and do it. Want to make sure the property matches the comparer? Use the same type checking code. Want to map strings to functions to call? No problem, it's just an object map.

Bottom line: while on the language level you can absolutely use different syntax for all those cases, it'll help you greatly to reduce them to a common denominator as early as possible.

OTHER TIPS

Many functional languages have simple solutions for describing function types. The most common is a syntax rougly like the following, which is used to describe the type of any expression:

expr: a where a is the type of the expression. If the expression has a type that is a function, it will be:

expr: a -> b where a is the parameter type and b is the return type. The signature for a sort function would be something like this:

sort: (List<a>, a -> b) -> List<a>

So, sort takes a List<a> and a function from a to b, where b is the type of the field upon which the list is being sorted, and returns a List<a>.

Now, in many functional languages, the concept of currying is employed, such that all functions take exactly one parameter. Any functions like sort which you might think of as taking two parameters are rewritten such that they take one parameter at a time and return a function taking the next parameter. That allows the type signature to look more like:

sort: List<a> -> (a -> b) -> List<a>

However, it is also generally common practice to put the most important parameter last, so you would rewrite sort like this:

sort: (a -> b) -> List<a> -> List<a>

The reason being, since each function takes one parameter and returns a function taking the next parameter you can "partially apply" a parameter like the field by which to sort and create a new function that sorts by that field which you can re-use, such as:

let sortByName = sort (item -> item.Name)

You can look at languages like Haskell and F# to see how having an entire language designed around higher-order functions is handled in the type system.

I consider an heresy to create a function, class, or attribute type, since they wouldn't be at the same logic-level than data types.

But if you are able to pass a function as an argument to another function, then a function is a data type at the same logical level as other data types. And then you have to be able to represent a function as a data type in the type system.

For example the type of a function which takes a string and returns an integer would be Func<string, integer> in C# but string=>integer in TypeScript. What syntax you chose is up to, you but you need some way to represent the function signature in the type system.

But while first-class functions are really useful, I don't think first-class attributes are that useful. A more general solution to the "sort by attribute" use case would be to pass a function which returns the attribute value to sort on. This avoids the thorny issue of how to express a reference to an attribute.

Using the Foo=>Bar syntax for function types, the signature for the sorting function could be:

function sort<itemType, attributeType> ( 
   List<itemType> data, 
   itemType=>attributeType attribute_selector , 
   attributeType=>int sorting_function ) : List<itemType>

Licensed under: CC-BY-SA with attribution

Not affiliated with softwareengineering.stackexchange