Got this from the Intel Forums (answered by Evgueni Petrov):
__m512d V1 = (__m512d)_mm512_extload_epi32(&Addr, _MM_UPCONV_EPI32_NONE, _MM_BROADCAST_4X16, _MM_HINT_NONE);
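where 'Addr' is the location in memory from which we loaded the doubles into vector 'A', so &Addr is its address. The load reads 128 bits (two doubles) from that address and broadcasts them to all four 128-bit lanes, so V1 holds the first two doubles of 'A' repeated four times.
We can do a similar operation for V2, V3 and V4 by loading from the addresses 2, 4 and 6 doubles past &Addr respectively, e.g. &Addr[2], &Addr[4] and &Addr[6] if 'Addr' is an array of doubles.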
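To make the resulting layout concrete, here is a plain-C sketch that models what the broadcast load produces, without using the intrinsic itself. The names broadcast_pair and build_pairs are made up for illustration, and the model assumes the 4X16 broadcast copies one 128-bit block (two doubles) into each of the four 128-bit lanes of the register.

```c
#include <string.h>

/* Scalar model of the 4X16 broadcast load (hypothetical helper):
   copy the two doubles at src into each of the four 128-bit lanes
   of an 8-double "vector". */
static void broadcast_pair(const double *src, double out[8])
{
    for (int lane = 0; lane < 4; ++lane)
        memcpy(&out[2 * lane], src, 2 * sizeof(double));
}

/* Build V1..V4 from the 8 doubles at a (hypothetical helper):
   v[k] is loaded from a + 2*k, i.e. offsets 0, 2, 4 and 6 doubles. */
static void build_pairs(const double a[8], double v[4][8])
{
    for (int k = 0; k < 4; ++k)
        broadcast_pair(a + 2 * k, v[k]);
}
```

With a = {a0, ..., a7}, build_pairs fills v[0] with {a0, a1, a0, a1, a0, a1, a0, a1}, v[1] with the pair {a2, a3} repeated, and so on for V3 and V4.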