Optimization using VBO in OpenGL ES 2.0

Question 1

Let's first reason about it with some simple mathematics:

At the moment you don't upload any vertex data to the GPU each frame, but you do upload 12-16 floats of matrix data per quad and perform one matrix-matrix multiplication per quad on the CPU.
If you put everything into one VBO, you have to transfer 4 vertices (~12 floats) per quad but no per-quad matrix data (only the global view-projection matrix, of course), and you have to do 4 matrix-vector multiplications per quad on the CPU, which costs about as much as one matrix-matrix multiplication.
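As a rough sketch of the CPU side of the batched approach, the four matrix-vector multiplies per quad could look like this in C. It assumes one column-major affine model matrix per quad, a unit quad as the base shape, and 3 floats per vertex; all names are made up for illustration and are not taken from the question's code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical layout: 3 floats (x, y, z) per vertex, 4 vertices per quad. */
enum {
    FLOATS_PER_VERTEX = 3,
    VERTS_PER_QUAD = 4,
    FLOATS_PER_QUAD = FLOATS_PER_VERTEX * VERTS_PER_QUAD
};

/* Unit quad in model space, centred on the origin (an assumption). */
static const float UNIT_QUAD[VERTS_PER_QUAD][3] = {
    {-0.5f, -0.5f, 0.0f}, { 0.5f, -0.5f, 0.0f},
    { 0.5f,  0.5f, 0.0f}, {-0.5f,  0.5f, 0.0f}
};

/* Transform one point by a column-major 4x4 matrix (the OpenGL convention),
   treating the point as (x, y, z, 1) and the matrix as affine. */
static void transform_point(const float m[16], const float in[3], float out[3])
{
    for (int row = 0; row < 3; ++row) {
        out[row] = m[row] * in[0] + m[4 + row] * in[1]
                 + m[8 + row] * in[2] + m[12 + row];
    }
}

/* Write the transformed corners of quad_count quads into dst.
   models holds one 16-float model matrix per quad;
   dst must have room for quad_count * FLOATS_PER_QUAD floats. */
void batch_quads(const float *models, size_t quad_count, float *dst)
{
    assert(models != NULL && dst != NULL);
    for (size_t q = 0; q < quad_count; ++q) {
        for (int v = 0; v < VERTS_PER_QUAD; ++v) {
            transform_point(models + 16 * q, UNIT_QUAD[v],
                            dst + FLOATS_PER_QUAD * q + FLOATS_PER_VERTEX * v);
        }
    }
}
```

The output array is what would be uploaded in one go: 12 floats per quad, in place of the 12-16 matrix floats that previously went into uniforms.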

So the amount of work and the amount of data transferred don't really change much. What does change is how the data is transferred: instead of many small uniform updates you do a single large VBO update, which is very likely to be faster. One reason is that a buffer update is probably cheaper on the hardware side than many uniform updates (but don't quote me on that); the other is the much lower driver overhead. On top of that, you save even more overhead by issuing a single large draw call instead of many small ones.
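For the single draw call you also need one index array that covers every quad in the batch. Here is a minimal sketch of building it, assuming two triangles per quad and 16-bit indices (core OpenGL ES 2.0 only guarantees unsigned byte and unsigned short index types, which caps one batch at 65536 vertices, i.e. 16384 quads). The function and constant names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum {
    VERTS_PER_QUAD = 4,
    INDICES_PER_QUAD = 6,
    MAX_QUADS = 65536 / VERTS_PER_QUAD  /* limit imposed by 16-bit indices */
};

/* Fill dst with the triangles (0,1,2) and (0,2,3) for each quad.
   dst must have room for quad_count * INDICES_PER_QUAD entries.
   Returns the number of indices written. */
size_t build_quad_indices(uint16_t *dst, size_t quad_count)
{
    assert(dst != NULL && quad_count <= MAX_QUADS);
    for (size_t q = 0; q < quad_count; ++q) {
        uint16_t base = (uint16_t)(q * VERTS_PER_QUAD);
        uint16_t *out = dst + q * INDICES_PER_QUAD;
        out[0] = base;
        out[1] = (uint16_t)(base + 1);
        out[2] = (uint16_t)(base + 2);
        out[3] = base;
        out[4] = (uint16_t)(base + 2);
        out[5] = (uint16_t)(base + 3);
    }
    return quad_count * INDICES_PER_QUAD;
}

/* With a GL context and both buffers bound, the per-frame work would then be
   roughly the following (vertices and index_count are placeholder names):

     glBufferSubData(GL_ARRAY_BUFFER, 0,
                     quad_count * 12 * sizeof(float), vertices);
     glDrawElements(GL_TRIANGLES, index_count, GL_UNSIGNED_SHORT, 0);
*/
```

The indices never change as long as the quad count stays within the buffer, so they can be uploaded once; only the vertex data has to be refreshed each frame.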

So yes, it is certainly worth a try, though you will have to measure whether it is really a "significant" improvement in your particular application.

Question 2

Would there be a significant performance gain to storing all the necessary values in a VBO and just calling glDrawElements once during each pass?

Yes, it should be much faster. The first reason, as you correctly identified, is the single glDrawElements call. The second is that a VBO keeps its data in memory managed by the GPU driver, so anything that hasn't changed doesn't have to be sent again each frame.

If quads go out of scope, you can reuse their memory for new quads. You can draw sub-ranges of a VBO, so you get a lot of flexibility without any further memory allocations.
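One way to reuse the memory of quads that go out of scope is to treat the VBO as a fixed number of quad-sized slots and keep a stack of free slot numbers on the CPU. The sketch below is only an illustration under that assumption; the pool size, the 12-floats-per-quad layout, and every name in it are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define POOL_CAPACITY 256                    /* quad slots in the VBO (assumed) */
#define BYTES_PER_QUAD (12 * sizeof(float))  /* 4 vertices * 3 floats */

typedef struct {
    int free_slots[POOL_CAPACITY];  /* stack of unused slot numbers */
    int free_count;
} QuadPool;

void pool_init(QuadPool *p)
{
    assert(p != NULL);
    /* Push in reverse order so that slot 0 is handed out first. */
    for (int i = 0; i < POOL_CAPACITY; ++i)
        p->free_slots[i] = POOL_CAPACITY - 1 - i;
    p->free_count = POOL_CAPACITY;
}

/* Returns a slot number, or -1 when every slot is in use. */
int pool_acquire(QuadPool *p)
{
    assert(p != NULL);
    return p->free_count > 0 ? p->free_slots[--p->free_count] : -1;
}

/* Hand a slot back once its quad is no longer needed. */
void pool_release(QuadPool *p, int slot)
{
    assert(p != NULL && slot >= 0 && slot < POOL_CAPACITY);
    assert(p->free_count < POOL_CAPACITY);
    p->free_slots[p->free_count++] = slot;
}

/* Byte offset of a slot inside the VBO, usable as the offset
   argument of glBufferSubData when overwriting that quad. */
size_t slot_byte_offset(int slot)
{
    assert(slot >= 0 && slot < POOL_CAPACITY);
    return (size_t)slot * BYTES_PER_QUAD;
}
```

A new quad then takes a slot with pool_acquire and overwrites just that range of the buffer. Note that a released slot still holds old vertices, so if you draw the whole buffer in one call you would either overwrite it with a zero-area quad or keep the live quads packed at the front and draw only that sub-range.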

By using VBOs you minimise interaction with the GPU, and that is where the performance benefit comes from.