-O3 will only use instructions from the instruction sets specified when compiling (or, if none of the switches explained here is given, the default chosen when the toolchain was built). It just tries to optimize more aggressively (as specified here). Most optimizations are actually done in the compiler's "middle end", before the code is even converted into a target-machine-specific form.
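So you can combine any -O level with any instruction set extension by using those two groups of switches: the -O level controls how hard the compiler optimizes, and the instruction set switches control which instructions it is allowed to emit.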
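As a rough illustration of that independence, here is a minimal sketch in C, assuming GCC (or a compatible compiler) on x86. The function names `saxpy` and `enabled_simd` are made up for this example. The predefined macros `__SSE2__`, `__AVX__` and `__AVX2__` reflect the instruction set switches only, not the -O level.

```c
#include <stddef.h>

/* A simple loop that the optimizer may auto-vectorize at higher -O levels.
   Which vector instructions it may use depends on the -m switches, not on -O. */
void saxpy(float a, const float *x, float *y, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        y[i] = a * x[i] + y[i];
}

/* Reports the widest x86 SIMD extension enabled at compile time.
   Assumption: the compiler defines these macros the way GCC does on x86. */
const char *enabled_simd(void)
{
#if defined(__AVX2__)
    return "AVX2";
#elif defined(__AVX__)
    return "AVX";
#elif defined(__SSE2__)
    return "SSE2";
#else
    return "none detected";
#endif
}
```

Compiling this with, for example, `gcc -O3 -c`, `gcc -O3 -mavx2 -c` or `gcc -O2 -mavx2 -c` should change what `enabled_simd` reports only when the `-m` switch changes. Whether the loop is actually vectorized depends on the compiler version and the -O level, so it is worth inspecting the output of `gcc -S` rather than assuming it.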