Question

Implementing operator overloading for e.g. + (plus) isn't terribly difficult if you know it's a binary operator. One can just parse expression + expression.

But what if the programmer can choose whether + is binary, unary (prefix/postfix), part of a ternary, or something else? E.g.

  • .. ? .. : ..
  • ..²
  • .. | .. | .. | .. | ..
  • .. 大 .. 小 ..
  • 不 ..

The problem seems to be that the arity should be known at parsing time, but it isn't: it's only known after the whole compiling/transpiling step.

Some ideas:

  • I think the whole problem kind of goes away by switching to Lisp-like syntax (not familiar enough with Lisp yet), but I'm gonna make it difficult by saying the syntax has to be somewhat C-like as in the examples.
  • Some kind of restriction like having overloads at the top / in a separate file so they can be compiled first.
  • (Maybe Arity must be fixed for each symbol - doesn't really answer the question though).

Is there a way that isn't terrible?

Does it become more possible by assuming operators can be distinguished from other identifiers at the parsing phase (e.g. other identifiers are alphanumeric)?


Solution

If a language supports user-defined operators (as opposed to overloading of existing operators), then the language is generally structured in a way so that arity, associativity, and precedence are known at parsing time!

Of course, this also prevents you from using a run-of-the-mill parser generator, but there are alternatives (like writing a parser by hand).

An operator must be declared before it is used. This is similar to how C/C++ functions must be declared before they are called.
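To make the "declare before use" idea concrete, here is a minimal sketch of a precedence-climbing (Pratt-style) parser in Python whose operator table is filled in before parsing. The names OperatorTable, declare_prefix, declare_infix, declare_postfix and parse are made up for this illustration, and the sketch assumes operators are separated by whitespace and that no symbol is declared as both infix and postfix.

```python
import re

# Numbers, identifiers, parentheses, and runs of symbol characters (operators).
TOKEN = re.compile(r"\d+|[A-Za-z_]\w*|[()]|[^\s\w()]+")


class OperatorTable:
    """Hypothetical table of operators; everything must be declared before parsing."""

    def __init__(self):
        self.prefix = {}   # symbol -> precedence used for its operand
        self.infix = {}    # symbol -> (precedence, minimum precedence of right operand)
        self.postfix = {}  # symbol -> precedence

    def declare_prefix(self, symbol, precedence):
        self.prefix[symbol] = precedence

    def declare_infix(self, symbol, precedence, assoc="left"):
        # A right-associative operator accepts an equal-precedence operator on its right.
        right = precedence if assoc == "right" else precedence + 1
        self.infix[symbol] = (precedence, right)

    def declare_postfix(self, symbol, precedence):
        self.postfix[symbol] = precedence


def parse(source, table):
    tokens = TOKEN.findall(source)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        if pos >= len(tokens):
            raise SyntaxError("unexpected end of input")
        tok = tokens[pos]
        pos += 1
        return tok

    def expression(min_prec):
        tok = advance()
        # Operand position: only a prefix operator, an atom, or '(' may appear here.
        if tok == "(":
            left = expression(0)
            if advance() != ")":
                raise SyntaxError("expected ')'")
        elif tok in table.prefix:
            left = ("prefix", tok, expression(table.prefix[tok]))
        elif tok[0].isalnum() or tok[0] == "_":
            left = tok
        else:
            raise SyntaxError(f"{tok!r} is not declared as a prefix operator")
        # Operator position: only postfix or infix operators may appear here.
        while True:
            op = peek()
            if op in table.postfix and table.postfix[op] >= min_prec:
                advance()
                left = ("postfix", op, left)
            elif op in table.infix and table.infix[op][0] >= min_prec:
                advance()
                left = ("infix", op, left, expression(table.infix[op][1]))
            else:
                return left

    tree = expression(0)
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r} (undeclared operator?)")
    return tree


# Declarations come first, so arity, precedence and associativity are known while parsing.
table = OperatorTable()
table.declare_infix("+", 10)
table.declare_infix("-", 10)
table.declare_infix("*", 20)
table.declare_prefix("-", 25)
table.declare_infix("**", 30, assoc="right")
table.declare_postfix("!", 40)

print(parse("a - - b", table))
print(parse("2 ** 3 ** 4", table))
```

Note that the same symbol (here -) can be declared as both prefix and infix without ambiguity, because the parser always knows whether it is currently expecting an operand or an operator. An undeclared symbol such as @ is simply a syntax error.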

Example: Haskell. Here operators can be freely declared from a set of allowed symbol characters. User-defined operators are always binary; you specify associativity and precedence level with a fixity declaration like infixr 5 **. Otherwise, operators are defined like ordinary functions, e.g. (++) a b = concat [a, b]. When an operator is wrapped in parentheses, optionally with one operand already supplied (a "section"), it behaves like an ordinary function, e.g. we can write equivalently a + b, (a +) b, (+ b) a, (+) a b.

Example: Scala. In Scala operators are just normal methods, and methods don't have to be invoked with dot notation: so a.m(b) is equivalent to a m b. So if we define a method called ++ we can write a ++ b. Some operators are more special and can only be overloaded with specially named methods, e.g. the function call operator with the apply() method. The precedence level of an operator is determined by the first character in the operator. So the operators +, +*, and +| all have the same precedence.
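The first-character rule is simple enough to sketch in a few lines. The following Python function is only an illustration (scala_precedence is a made-up name, not part of any Scala tooling); the levels follow the precedence table in the Scala language specification, and the sketch deliberately ignores the special case of assignment operators such as +=.

```python
# Precedence classes from lowest to highest, keyed by the operator's first character.
# Level 0 is reserved for operators starting with a letter; anything not listed
# ("all other special characters") gets the highest level.
_LEVELS = ["|", "^", "&", "=!", "<>", ":", "+-", "*/%"]


def scala_precedence(op):
    """Return a precedence level for op based only on its first character."""
    first = op[0]
    if first.isalpha():
        return 0
    for level, chars in enumerate(_LEVELS, start=1):
        if first in chars:
            return level
    return len(_LEVELS) + 1


# +, +* and +| share a level because they all start with '+'.
print(scala_precedence("+"), scala_precedence("+*"), scala_precedence("+|"))
```

The nice property for the parser is that no declaration is needed at all: precedence can be computed from the token itself.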

Example: Perl 6 (now called Raku). The language includes a powerful grammar engine. New rules can be added to the grammar on the fly, for example by declaring a subroutine as an operator. An operator is always declared with an arity, so multi sub prefix:<++>($x), multi sub infix:<++>($x, $y), and multi sub postfix:<++>($x) can coexist as different operators. Operators are not restricted to symbols but can also contain letters. All in all, the operator system of this language is extremely complex and flexible.

Other tips

You didn't mention the language.

In Swift, the lexical syntax first defines which character sequences can be operators. For example, "+" is an operator not because it is built into the language, but because it is implemented as part of the standard library.

Second, there is syntax that determines whether an operator is a unary or binary operator, based on the whitespace around it. In A+B or A + B the "+" is a binary operator, in A * +B it is a unary (prefix) operator. (A*+B would have a binary operator *+).

With this syntax, the operator is found, and then it is treated very much like a function call. If the operator isn't found, that's just like the function call syntax for an undefined function name. (I left out operator precedence and left or right associativity, which are defined per operator).

So you can easily define a unary operator *+ for example, and a binary operator *+ as well, and the compiler has no problem keeping them apart. (Swift beginners are often surprised that a+ b or a +b don't compile).
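The whitespace rule can be sketched in Python as follows. This is a simplification, not Swift's actual lexer: classify_operators is a made-up helper, the operator character set is a small ASCII subset chosen for the example, and the start and end of the input are treated like whitespace.

```python
import re

# Assumption: operators are runs of these ASCII characters; Swift's real set is larger.
OPERATOR = re.compile(r"[-+*/%<>=!&|^~?]+")


def classify_operators(source):
    """Label each operator token as infix, prefix or postfix from surrounding whitespace."""
    result = []
    for m in OPERATOR.finditer(source):
        space_left = m.start() == 0 or source[m.start() - 1].isspace()
        space_right = m.end() == len(source) or source[m.end()].isspace()
        if space_left == space_right:
            kind = "infix"    # whitespace on both sides or on neither side
        elif space_left:
            kind = "prefix"   # whitespace only on the left
        else:
            kind = "postfix"  # whitespace only on the right
        result.append((m.group(), kind))
    return result


for text in ["A+B", "A + B", "A * +B", "A*+B", "a+ b", "a +b"]:
    print(repr(text), classify_operators(text))
```

Under this rule a+ b reads + as a postfix operator and a +b reads it as a prefix operator, so neither is parsed as a binary addition.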

Licensed under: CC-BY-SA with attribution