I'm going to answer my own question here.
My conclusion is that what I was trying to achieve just doesn't fit the way OpenGL is meant to be used; the computation I wanted to automate simply depends on too many factors to be handled generically.
Just to clarify: what I intend to do is provide OpenGL bindings for Node (the command-line JavaScript runtime based on Google's V8 engine). Since some OpenGL (or extension) calls return significant amounts of data, and that data can have a non-trivial memory layout, I was hoping to help programmers by allocating the required output buffers through auto-generated code based on Khronos' parseable specification files, the same files already used by projects such as GLEW to generate their extended C bindings.
Based on the answers I received (thank you all!), that idea was naive, and the intent not as helpful as I thought: simply having a buffer of the right size might avoid access violations, but it does not help programmers use the information they obtain. In the end, they still need to know the exact memory layout anyway, so allocating the buffer themselves won't be an issue (well, in theory at least).
In light of all this, computing output buffer sizes, as well as handling their content, is better left to the next-higher software layer. And because no one, to my knowledge, uses the whole of the OpenGL API in the same piece of software, there is no real need for a single library to handle all possible output buffer allocations.
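To illustrate where this leaves the caller, here is a minimal sketch of what such a higher-level call might look like from Node. The module name (`node-gl`) and the binding's function and constant names (`getIntegerv`, `VIEWPORT`) are hypothetical placeholders, not an existing API; the point is only that a caller who knows that `GL_VIEWPORT` yields exactly four GLint values can trivially allocate the output buffer themselves:

```js
// Hypothetical binding module and function names, for illustration only.
const gl = require('node-gl');

// glGetIntegerv(GL_VIEWPORT, ...) fills exactly four GLint values
// (x, y, width, height), so the caller knows the layout and can
// allocate the output buffer without help from the binding layer.
const viewport = new Int32Array(4);
gl.getIntegerv(gl.VIEWPORT, viewport);

console.log(`viewport origin: (${viewport[0]}, ${viewport[1]}), ` +
            `size: ${viewport[2]} x ${viewport[3]}`);
```

In other words, the knowledge needed to size the buffer is the same knowledge needed to interpret its contents, so pushing the allocation into the binding layer saves the caller nothing.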
Thanks to all who responded!