As I pointed out in a comment, Karatsuba's algorithm might help. But there is still a problem, which requires a separate solution.
Assume
A = (A1 << 32) + A2
B = (B1 << 32) + B2.
When we multiply those we get:
A * B = ((A1 * B1) << 64) + ((A1 * B2 + A2 * B1) << 32) + A2 * B2.
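To make the identity concrete, here is a minimal C sketch that builds the full 128-bit product from the four 32-bit partial products. It assumes unsigned 64-bit inputs; the names `u128_parts` and `mul_split` are made up for this illustration.

```c
#include <stdint.h>

/* Hypothetical helper type: a 128-bit value as two 64-bit halves. */
typedef struct { uint64_t hi, lo; } u128_parts;

/* Full product a * b, assembled from 32-bit halves as in the formula above. */
static u128_parts mul_split(uint64_t a, uint64_t b)
{
    uint64_t a1 = a >> 32, a2 = a & 0xFFFFFFFFu;   /* A = (A1 << 32) + A2 */
    uint64_t b1 = b >> 32, b2 = b & 0xFFFFFFFFu;   /* B = (B1 << 32) + B2 */

    /* Each partial product is below 2^64, so none of these overflows. */
    uint64_t hh = a1 * b1;
    uint64_t hl = a1 * b2;
    uint64_t lh = a2 * b1;
    uint64_t ll = a2 * b2;

    /* Everything that lands on bits 32..63; at most 3 * (2^32 - 1). */
    uint64_t mid = (ll >> 32) + (hl & 0xFFFFFFFFu) + (lh & 0xFFFFFFFFu);

    u128_parts r;
    r.lo = (mid << 32) | (ll & 0xFFFFFFFFu);
    r.hi = hh + (hl >> 32) + (lh >> 32) + (mid >> 32);
    return r;
}
```

The sketch sidesteps the overflow by carrying the upper half separately; it does not reduce anything modulo C yet.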
So we have three terms to sum. The first, (A1 * B1) << 64, does not fit in 64 bits unless A1 * B1 is zero, and the middle one, (A1 * B2 + A2 * B1) << 32, can overflow as well.
But this can be solved!
Instead of shifting by 64 bits at once, we can split the shift into smaller shifts and reduce modulo C (the modulus) after each one. The result is the same, because (x << k) mod C depends only on x mod C.
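A minimal C sketch of this shift-and-reduce idea, assuming unsigned 64-bit values and a modulus c with 0 < c <= 2^63 so that doubling a reduced value cannot overflow. The names `shlmod` and `mulmod` are hypothetical, and the one-bit step is just the simplest choice of "smaller shift".

```c
#include <assert.h>
#include <stdint.h>

/* (x << k) mod c, shifting one bit at a time. Assumes 0 < c <= 2^63. */
static uint64_t shlmod(uint64_t x, unsigned k, uint64_t c)
{
    assert(c != 0 && c <= (UINT64_C(1) << 63));
    x %= c;
    while (k-- > 0)
        x = (x << 1) % c;          /* x < c <= 2^63, so x << 1 fits in 64 bits */
    return x;
}

/* (a * b) mod c via the 32-bit split. Same assumption on c. */
static uint64_t mulmod(uint64_t a, uint64_t b, uint64_t c)
{
    uint64_t a1 = a >> 32, a2 = a & 0xFFFFFFFFu;
    uint64_t b1 = b >> 32, b2 = b & 0xFFFFFFFFu;

    uint64_t high = shlmod(a1 * b1, 64, c);              /* (A1*B1) << 64 */
    uint64_t mid  = ((a1 * b2) % c + (a2 * b1) % c) % c; /* both addends < c */
    mid = shlmod(mid, 32, c);                            /* (...) << 32 */
    uint64_t low  = (a2 * b2) % c;                       /* A2*B2 */

    /* Every addend is below c <= 2^63, so the sums stay below 2^64. */
    return ((high + mid) % c + low) % c;
}
```

For example, `mulmod(1000000000000000000, 1000000000000000000, 1000000007)` should give 2401, since 10^18 is congruent to 49 modulo 10^9 + 7.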
This will still be a problem if the modulus C itself is larger than 2^63, because then even a one-bit shift of a value below C can overflow 64 bits, but I think it could be solved even in that case.
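One possible way to handle a large C, sketched in C under the assumption that all values are unsigned 64-bit and already reduced: replace the one-bit shift with a modular addition that never forms the sum itself. This is only one option, and the names `addmod_any` and `dblmod_any` are made up for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* (x + y) mod c for x, y < c, without ever computing x + y directly. */
static uint64_t addmod_any(uint64_t x, uint64_t y, uint64_t c)
{
    assert(c != 0 && x < c && y < c);
    /* x + y >= c exactly when x >= c - y; c - y cannot underflow since y < c. */
    return (x >= c - y) ? x - (c - y) : x + y;
}

/* (2 * x) mod c for x < c: a one-bit "shift" that is safe for any c. */
static uint64_t dblmod_any(uint64_t x, uint64_t c)
{
    return addmod_any(x, x, c);
}
```

Calling `dblmod_any` k times in a row then plays the role of a k-bit shift followed by a reduction, with no restriction on the size of c.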