Interpreting ARM/MachO with LLVM for analysis and optimization?

Question 1

For static analysis of ARM binaries, it is better to translate the semantics of each ARM instruction directly to LLVM IR and apply data-flow analysis to the latter. For example, ADD rd, rd, rm in ARM can be translated to the LLVM IR %rd2 = add i32 %rd1, %rm1.
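As a rough illustration of that mapping, here is a minimal Python sketch that renames registers into SSA-style versions and emits LLVM IR text. The opcode table, the `translate` function, and the `%reg.version` naming scheme are all assumptions made for this example, not part of any existing tool. It handles only three-register data-processing instructions and ignores condition codes, flag updates, shifted operands, immediates, and the program counter, all of which a real translator would have to model.

```python
from collections import defaultdict

# Hypothetical opcode table: ARM mnemonic -> LLVM IR binary operator.
ARM_TO_LLVM = {"ADD": "add", "SUB": "sub", "AND": "and", "ORR": "or", "EOR": "xor"}


def translate(arm_lines):
    """Translate three-register ARM instructions to SSA-style LLVM IR text."""
    # Every register starts at version 1, which stands for its incoming value.
    version = defaultdict(lambda: 1)
    ir = []
    for line in arm_lines:
        mnemonic, operands = line.split(None, 1)
        rd, rn, rm = [op.strip().lower() for op in operands.split(",")]
        # Read the source operands before defining the new version of rd,
        # so that "ADD r0, r0, r1" uses the old r0 on the right-hand side.
        lhs = f"%{rn}.{version[rn]}"
        rhs = f"%{rm}.{version[rm]}"
        version[rd] += 1
        opcode = ARM_TO_LLVM[mnemonic.upper()]
        ir.append(f"%{rd}.{version[rd]} = {opcode} i32 {lhs}, {rhs}")
    return ir


if __name__ == "__main__":
    for ir_line in translate(["ADD r0, r0, r1", "SUB r2, r0, r1"]):
        print(ir_line)
```

Because each definition gets a fresh name, the use-def chains that data-flow analyses rely on can be read straight off the emitted text.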

Decompiling ARM machine code to C (in order to recompile it to LLVM IR) is both cumbersome and unnecessary. Note that decompilers like IDA Pro focus on binary understanding, not on recompilation per se. You would therefore have a hard time recompiling the software, and an even harder time linking your analysis results back to the original binary.

The following links might be useful:

Fracture is an open-source project that attempts to translate ARM binaries directly to LLVM IR.
LLBT is a research project that implemented ARM-to-LLVM IR translation. Its focus, however, is static binary rewriting rather than binary analysis.

Note that you need a robust disassembler if you are considering analyzing stripped binaries. objdump can produce too many disassembly errors on binaries without symbols.

I'm in the early phases of a research project where we develop a processor description language that can make describing instruction semantics in LLVM IR easier. I'll update this answer when we have more results.

Question 2

For (1) - not within the framework of LLVM. There's no "decompiler" in there. You're free to use an external decompiler that translates machine code into C, and then compile that into LLVM IR with clang. YMMV with regard to the quality of such a translation, of course.

(1.5) If I understand what you're asking, then no. Instruction and MCInst are quite different animals, very far apart in their abstraction levels. Read this: http://eli.thegreenplace.net/2012/11/24/life-of-an-instruction-in-llvm/

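(2) Yes, LLVM has an IR interpreter that you can use through the lli tool. Note that lli uses a JIT by default where one is available; pass -force-interpreter to make it "emulate" the LLVM IR directly without lowering it to machine code.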
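To make the idea of interpreting IR without lowering it more concrete, here is a toy Python sketch that evaluates straight-line i32 arithmetic directly from IR text. It is not how lli is implemented; the `interpret` function, the `OPS` table, and the three supported opcodes are assumptions chosen for illustration only.

```python
import operator

# Hypothetical subset of integer operators; real LLVM IR has many more.
OPS = {"add": operator.add, "sub": operator.sub, "mul": operator.mul}
MASK32 = 0xFFFFFFFF


def interpret(ir_lines, env):
    """Evaluate straight-line i32 binary instructions over a value environment."""
    env = dict(env)  # work on a copy so the caller's mapping is not mutated
    for line in ir_lines:
        dest, expr = [part.strip() for part in line.split("=", 1)]
        op, ty, operands = expr.split(None, 2)
        if ty != "i32":
            raise ValueError("this sketch handles only i32")
        lhs, rhs = [o.strip() for o in operands.split(",")]
        # An operand is either an SSA name already in env or an integer literal.
        values = [env[o] if o.startswith("%") else int(o) for o in (lhs, rhs)]
        env[dest] = OPS[op](*values) & MASK32  # wrap the result to 32 bits
    return env


if __name__ == "__main__":
    result = interpret(["%a = add i32 %x, 5", "%b = mul i32 %a, %a"], {"%x": 2})
    print(result["%b"])
```

Each instruction is executed by looking up its operands and storing the result under the destination name, with no machine code generated at any point.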