I used to think this was impossible too, but there was the nagging thought that the Y combinator could be put to good use, it's 2021 now, and constexpr helps create a language within a language that's possibly better than Haskell. Plus it's at the base of a class of problems that occurs often when writing functional language compilers ...
To work towards an optimal solution takes some lateral thinking as to how to crack this chicken/egg situation:
Firstly we can't use auto or templates in local classes, so is there anything we can do instead? We can define lambdas that are functions the type parameter they are sent, ignoring any instance data, allowing them to be constexpr. This means we can define a function that creates the object B, given the type of A ... but we are not finished yet, we still have a data segment to contend with. The class creation function needs to know about this, or we have to explicitly have a using A::i to get the data from the A segment. Thus, just as in assembly code, we have to separate the data and code segments, and since the data has simpler typing, we place it first in the list of dependencies, and derive it twice as a virtual base class, one of the few (only?) valid uses I've found for a virtual base class (a pattern best avoided, but here, unavoidable?).
Then we essentially work with the first iteration of the Y combinator. Inside the context of A::foo, we use the class extender crB to create the precise type of the B object, and casts this to a pointer of that type to "call" it.
Here's where it gets interesting. If the code is sufficiently well typed, the compiler can deduce we are intending a mutual tail recursion, and elide the call to a jmp, a process that is critical for correctly targeted functional code.
Using std::function, which is implemented using a compiler unfriendly virtual function call, scuppers any chances of this optimization, so perhaps this method is more friendly, since the compiler has access to all the types involved, without any indirection?
#include <iostream>
int main(int argc, char* argv[])
{
struct Data
{
int i = 563;
};
constexpr auto crB = [](auto par)
{
using T = decltype(par);
struct B : public T, virtual public Data
{
void foo() {
std::cerr << "B: " << i << "\n";
i = (3 * i) + 1;
T::foo();
}
};
return B();
};
struct A : virtual public Data
{
void foo()
{
std::cerr << "A: " << i << "\n";
i >>= 1;
if (i == 1)
return;
using ForwardT = decltype(crB(A()));
(i&1)? static_cast<ForwardT *>(this)->foo() : foo();
};
};
auto binst = crB(A());
binst.foo();
return 0;
}
So what does this high level assembly language compile to?
.LC0:
.string "B: "
.LC1:
.string "\n"
.LC2:
.string "A: "
main::{lambda(auto:1)#1}::operator()<main::A>(main::A) const::B::foo():
push rbx
mov rbx, rdi
.L3:
mov esi, OFFSET FLAT:.LC0
mov edi, OFFSET FLAT:_ZSt4cerr
call std::basic_ostream<char, std::char_traits<char> >& std::operator<< <std::char_traits<char> >(std::basic_ostream<char, std::char_traits<char> >&, char const*)
mov rdi, rax
mov rax, QWORD PTR [rbx]
mov rax, QWORD PTR [rax-24]
mov esi, DWORD PTR [rbx+rax]
call std::basic_ostream<char, std::char_traits<char> >::operator<<(int)
mov esi, OFFSET FLAT:.LC1
mov rdi, rax
call std::basic_ostream<char, std::char_traits<char> >& std::operator<< <std::char_traits<char> >(std::basic_ostream<char, std::char_traits<char> >&, char const*)
mov rax, QWORD PTR [rbx]
mov rdx, QWORD PTR [rax-24]
add rdx, rbx
imul eax, DWORD PTR [rdx], 3
inc eax
mov DWORD PTR [rdx], eax
.L4:
mov esi, OFFSET FLAT:.LC2
mov edi, OFFSET FLAT:_ZSt4cerr
call std::basic_ostream<char, std::char_traits<char> >& std::operator<< <std::char_traits<char> >(std::basic_ostream<char, std::char_traits<char> >&, char const*)
mov rdi, rax
mov rax, QWORD PTR [rbx]
mov rax, QWORD PTR [rax-24]
mov esi, DWORD PTR [rbx+rax]
call std::basic_ostream<char, std::char_traits<char> >::operator<<(int)
mov esi, OFFSET FLAT:.LC1
mov rdi, rax
call std::basic_ostream<char, std::char_traits<char> >& std::operator<< <std::char_traits<char> >(std::basic_ostream<char, std::char_traits<char> >&, char const*)
mov rax, QWORD PTR [rbx]
mov rdx, QWORD PTR [rax-24]
add rdx, rbx
mov eax, DWORD PTR [rdx]
sar eax
mov DWORD PTR [rdx], eax
cmp eax, 1
je .L1
test al, 1
je .L4
jmp .L3
.L1:
pop rbx
ret
main:
sub rsp, 24
mov rdi, rsp
mov QWORD PTR [rsp], OFFSET FLAT:vtable for main::{lambda(auto:1)#1}::operator()<main::A>(main::A) const::B+24
mov QWORD PTR [rsp+8], 563
call main::{lambda(auto:1)#1}::operator()<main::A>(main::A) const::B::foo()
xor eax, eax
add rsp, 24
ret
_GLOBAL__sub_I_main:
push rax
mov edi, OFFSET FLAT:_ZStL8__ioinit
call std::ios_base::Init::Init() [complete object constructor]
mov edx, OFFSET FLAT:__dso_handle
mov esi, OFFSET FLAT:_ZStL8__ioinit
pop rcx
mov edi, OFFSET FLAT:_ZNSt8ios_base4InitD1Ev
jmp __cxa_atexit
typeinfo for main::Data:
.quad vtable for __cxxabiv1::__class_type_info+16
.quad typeinfo name for main::Data
typeinfo name for main::Data:
.string "*Z4mainE4Data"
typeinfo for main::A:
.quad vtable for __cxxabiv1::__vmi_class_type_info+16
.quad typeinfo name for main::A
.long 0
.long 1
.quad typeinfo for main::Data
.quad -6141
typeinfo name for main::A:
.string "*Z4mainE1A"
typeinfo for main::{lambda(auto:1)#1}::operator()<main::A>(main::A) const::B:
.quad vtable for __cxxabiv1::__vmi_class_type_info+16
.quad typeinfo name for main::{lambda(auto:1)#1}::operator()<main::A>(main::A) const::B
.long 2
.long 2
.quad typeinfo for main::A
.quad 2
.quad typeinfo for main::Data
.quad -6141
typeinfo name for main::{lambda(auto:1)#1}::operator()<main::A>(main::A) const::B:
.string "*ZZ4mainENKUlT_E_clIZ4mainE1AEEDaS_E1B"
vtable for main::{lambda(auto:1)#1}::operator()<main::A>(main::A) const::B:
.quad 8
.quad 0
.quad typeinfo for main::{lambda(auto:1)#1}::operator()<main::A>(main::A) const::B
.quad 0
.quad typeinfo for main::{lambda(auto:1)#1}::operator()<main::A>(main::A) const::B
All the tail recursive calls have been elided. The translation is essentially perfect.