If you don't want to lose overflow information, first widen each int8x8_t
to int16x8_t (for example with vmovl_s8),
then do the summing.
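A minimal sketch of the widening approach, assuming the inputs are two arrays of eight int8_t values. The function name add_widen_s8 is made up for illustration. The scalar fallback is only there so the snippet also builds on a non-ARM machine.

```c
#include <stdint.h>
#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Hypothetical helper: add eight int8 lanes into eight int16 lanes.
   The int16 result is wide enough that no sum can overflow. */
void add_widen_s8(const int8_t a[8], const int8_t b[8], int16_t out[8])
{
#if defined(__ARM_NEON)
    int16x8_t wa = vmovl_s8(vld1_s8(a));   /* int8x8_t -> int16x8_t */
    int16x8_t wb = vmovl_s8(vld1_s8(b));
    vst1q_s16(out, vaddq_s16(wa, wb));     /* 16-bit add, no overflow */
    /* vaddl_s8(vld1_s8(a), vld1_s8(b)) does the widen and add in one step. */
#else
    /* Scalar model of the same per-lane operation. */
    for (int i = 0; i < 8; ++i)
        out[i] = (int16_t)((int16_t)a[i] + (int16_t)b[i]);
#endif
}
```

If you need to go back to 8 bits afterwards, you have to decide how to narrow (shift, saturate, or truncate), which is where the other instructions come in.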
If you want the result to saturate, use vqadd.
Vector saturating add: vqadd -> Vr[i]:=sat<size>(Va[i]+Vb[i])
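A rough sketch of what the saturating add does per lane, for signed 8-bit data. The names sat_add_s8 and add_sat_s8 are made up for illustration. The scalar path models the sat<size> formula above so the snippet also builds off-target.

```c
#include <stdint.h>
#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Hypothetical scalar model of one lane: clamp the sum to [-128, 127]. */
int8_t sat_add_s8(int8_t a, int8_t b)
{
    int sum = (int)a + (int)b;          /* computed in int, cannot overflow */
    if (sum > INT8_MAX) return INT8_MAX;
    if (sum < INT8_MIN) return INT8_MIN;
    return (int8_t)sum;
}

/* Hypothetical helper: saturating add of eight int8 lanes. */
void add_sat_s8(const int8_t a[8], const int8_t b[8], int8_t out[8])
{
#if defined(__ARM_NEON)
    vst1_s8(out, vqadd_s8(vld1_s8(a), vld1_s8(b)));
#else
    for (int i = 0; i < 8; ++i)
        out[i] = sat_add_s8(a[i], b[i]);
#endif
}
```

For example, 100 + 100 gives 127 here instead of wrapping around to -56.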
If you just want to convert the C version, use vhadd or vrhadd (the rounding variant). These halve the sum as part of the add, so the intermediate sum does not overflow and you don't need the shift as a second step.
Vector halving add: vhadd -> Vr[i]:=(Va[i]+Vb[i])>>1
Vector rounding halving add: vrhadd -> Vr[i]:=(Va[i]+Vb[i]+1)>>1
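A minimal sketch of both halving adds for signed 8-bit data. The helper names are made up for illustration. The scalar path assumes >> on a negative int is an arithmetic shift, which is implementation-defined in C but is what gcc and clang do.

```c
#include <stdint.h>
#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Hypothetical scalar models of one lane. The sum is formed in int,
   so it cannot overflow before it is halved. */
int8_t hadd_s8(int8_t a, int8_t b)  { return (int8_t)(((int)a + (int)b) >> 1); }
int8_t rhadd_s8(int8_t a, int8_t b) { return (int8_t)(((int)a + (int)b + 1) >> 1); }

/* Hypothetical helper: (a[i] + b[i]) >> 1 for eight lanes. */
void add_half_s8(const int8_t a[8], const int8_t b[8], int8_t out[8])
{
#if defined(__ARM_NEON)
    vst1_s8(out, vhadd_s8(vld1_s8(a), vld1_s8(b)));
#else
    for (int i = 0; i < 8; ++i)
        out[i] = hadd_s8(a[i], b[i]);
#endif
}

/* Hypothetical helper: (a[i] + b[i] + 1) >> 1 for eight lanes. */
void add_half_round_s8(const int8_t a[8], const int8_t b[8], int8_t out[8])
{
#if defined(__ARM_NEON)
    vst1_s8(out, vrhadd_s8(vld1_s8(a), vld1_s8(b)));
#else
    for (int i = 0; i < 8; ++i)
        out[i] = rhadd_s8(a[i], b[i]);
#endif
}
```

Note that 127 and 127 average to 127 here, whereas a plain 8-bit add followed by a shift would already have wrapped.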