you have to use BX or BLX or, depending on the ARM architecture version, a pop (ldm) that loads pc. Depending on your linker, you can leave the bl as it is and the linker will add a trampoline, ConvolveC4_from_arm for example. The bl still sets the link register to the return address after the bl, and the trampoline is what switches modes.
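As a rough sketch of what such a trampoline can look like for an ARM caller and a Thumb ConvolveC4: the label names below are illustrative, and the exact instructions and symbol name depend on the linker version and target, so treat this as an assumption rather than what your linker will emit.

```
    .arm
caller:
    bl      ConvolveC4_from_arm     @ lr = address of the instruction after this bl
    @ execution resumes here, in ARM state, when ConvolveC4 does bx lr

ConvolveC4_from_arm:                @ trampoline, normally generated by the linker
    ldr     r12, 1f                 @ load the Thumb entry address (bit 0 set)
    bx      r12                     @ bit 0 set, so the core switches to Thumb state
1:  .word   ConvolveC4 + 1          @ assumes ConvolveC4 is a plain label in Thumb code
```

The trampoline does not touch lr, so the Thumb function returns straight to the ARM caller with bx lr.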
The other approach is to always use blx or bx, unless you are calling a function in the same source file whose mode you know.
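A minimal sketch of the always-interwork approach from ARM code. It assumes ConvolveC4 is marked as a Thumb function so that its address has bit 0 set, and the choice of r3 is arbitrary.

```
    .arm
    ldr     r3, =ConvolveC4     @ target address; bit 0 selects Thumb (1) or ARM (0)
    blx     r3                  @ ARMv5T and later: sets lr and switches state as needed

    @ ARMv4T has no blx, so set lr by hand and use bx:
    @ mov   lr, pc              @ in ARM state pc reads as this instruction + 8
    @ bx    r3
```

Because the state comes from bit 0 of the address, the same call sequence works whether the target turns out to be ARM or Thumb.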
The GNU tools (binutils) can take care of some of this for you if you declare the labels/functions properly.
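For example, with GNU as the declarations might look like the sketch below. The caller name call_from_arm and the empty function body are made up for illustration; .thumb_func and .type are the directives that tell the tools which labels are functions and which of them are Thumb.

```
    .syntax unified
    .text

    .arm
    .globl  call_from_arm
    .type   call_from_arm, %function
call_from_arm:
    push    {r4, lr}
    bl      ConvolveC4          @ left as bl; the linker can fix up the mode switch
    pop     {r4, lr}
    bx      lr

    .thumb
    .globl  ConvolveC4
    .thumb_func                 @ marks the next label as a Thumb function
    .type   ConvolveC4, %function
ConvolveC4:
    bx      lr                  @ returns to the caller's mode via bit 0 of lr
```

With the labels declared this way the linker knows ConvolveC4 is Thumb, so it can either route the bl through a trampoline or, on targets that support it, turn it into a blx.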