Question

I need to use VirtualAlloc to allocate executable memory for my project, to JIT re-compile a custom script format to x86/etc., but I feel confused where everyone else seems oblivious, and there seems to be a distinct lack of detail about its behaviour.

I understand that it allocates 'virtual' memory meaning it could be anything physically (RAM/disk), but on use it can simply be considered 'memory'. But if, for example, I do something like:

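#define MB (1024 * 1024)
// Reserve 8 MB of address space; the cast allows byte-wise pointer arithmetic (void* does not).
auto pAddr = static_cast<char*>(VirtualAlloc(NULL, 8 * MB, MEM_RESERVE, PAGE_NOACCESS));
// Commit 1 MB, starting 4 MB into the reserved memory.
VirtualAlloc(pAddr + 4 * MB, 1 * MB, MEM_COMMIT, PAGE_EXECUTE_READWRITE);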

Is that just 1MB used, or 5MB? Obviously, I'm not expecting it to be 5MB - I'm just given no clue what to expect in this seemingly obvious scenario. Is it valid? Can any range within the reserved memory be committed and de-committed freely? What's more, can it be used out of order, or should it be allocated incrementally (which, going by the MSDN documentation, looks like all you can do with it)? Or is VirtualAlloc only happy with 'pages' being allocated at a time?

Every example I have found only seems interested in showing me how to allocate pages - which is probably just the most basic use possible but far from the most practical - but I want to use this for allocating compiled code for scripts which may be re-compiled during execution occasionally. I need to try and make some sort of interface for these allocations so I can simply say "gimme some memory for this script compilation" and it will automatically return previously committed space that isn't used or commit some new space - so any tips on how best to allocate from virtual memory (e.g. is it best to not de-commit memory which may likely be committed again?) would also be appreciated.

Solution

OK; I think I understand what you're getting at and I hope this will clarify things.

Conceptually, VirtualAlloc works on individual pages.

For simplicity, let's consider a 32-bit x86 process. The virtual address space is a sequence of 4 KB pages from page 0 through to page 1048575. Each of these pages may or may not be reserved; if reserved, it may or may not be committed. (If committed, it will also have zero or more memory-protection options, and there are various other states a page might be in, but we can pretty much ignore all that for now.)

There is no way for only part of a page to be reserved or committed, or for two parts of the same page to have different memory-protection options. Conversely, it doesn't matter whether the reserved and/or committed pages are consecutive.

If you call VirtualAlloc with a particular starting address and region size, then it acts on every page containing one or more bytes within the virtual address region specified. The address and size are used only to calculate which pages to act on. The only reason that the argument is an address rather than a page number is to simplify things for the programmer.

Conceptually, a single call to VirtualAlloc covering multiple pages is equivalent to calling VirtualAlloc once for each of those pages. The only difference (other than efficiency) is that acting on multiple pages at once is atomic, so will either fail or succeed for the entire range.
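
To make the page arithmetic concrete, here is a minimal sketch of how an address and size translate into a set of pages. It assumes 4 KB pages (real code should ask GetSystemInfo for the page size), and the names kPageSize, PageRange and pages_covered are made up for this illustration; they are not part of the Windows API.

```cpp
#include <cstdint>

// Assumed page size; on a real system, query GetSystemInfo instead.
constexpr std::uint64_t kPageSize = 4096;

struct PageRange {
    std::uint64_t first;  // number of the first page touched
    std::uint64_t count;  // how many pages are touched
};

// Pages containing at least one byte of [address, address + size).
// Assumes size > 0.
constexpr PageRange pages_covered(std::uint64_t address, std::uint64_t size) {
    return PageRange{
        address / kPageSize,
        (address + size - 1) / kPageSize - address / kPageSize + 1};
}
```

Applied to the question's example, an offset of 4 MB with a size of 1 MB covers 256 pages starting at page 1024 of the reservation, so only 1 MB is committed. A 2-byte range that straddles a page boundary, on the other hand, acts on two whole pages.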

Note in particular that if you successfully make multiple calls to VirtualAlloc covering consecutive pages, there is no way to tell afterwards that the pages were allocated separately. The operating system only remembers what state a page is in, not how it got there. [Addendum: Oops; this is wrong. The documentation for VirtualQuery says that it can tell whether consecutive pages were part of the same allocation or not: the AllocationBase field it returns identifies the reservation each page belongs to. The memory manager uses this information too, since VirtualFree with MEM_RELEASE must be given the base address of an entire reservation. Pages committed by separate calls inside one reservation still share the same AllocationBase.]

Keep in mind that the HeapCreate function already allows you to create a heap whose memory blocks allow code execution (pass the HEAP_CREATE_ENABLE_EXECUTE option). Unless your application has very unusual needs, it is unlikely that you will gain anything by writing your own heap manager.

Other tips

It's pages only.

The history of VirtualAlloc (a Windows API function) and its concept of memory pages is tied to the evolution of the Intel x86 processor family.

With the original 8086 processor (introduced in 1978; the first IBM PC, built around the closely related 8088, followed in 1981) addressing at the machine code level was 16-bit. By itself that would have yielded a maximum 64K address range, but the processor treated each address as only an offset into a 64K segment of memory. It supported four logical segments, namely code, data, stack and an "extra" segment, and which of these an address was an offset into depended on the context. The start of each segment in physical memory was determined by 16-bit segment selector registers, called respectively CS, DS, SS and ES. To form the start address of a segment the processor simply shifted the selector 4 bits to the left, corresponding to a multiplication by 16. Thus the physical address corresponding to an offset O used in a context where the segment selector was S was A = 16*S + O, which made for roughly 20-bit physical addresses (but you could wrap around a little at the top).

        logical address = (segment offset, segment selector)
        calculation( offset, selector ) → physical address
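
As a worked version of that calculation, here is a small sketch of A = 16*S + O. The function name physical_address_8086 is made up for this example, and the 20-bit mask models the 8086's 20 address lines, which is where the wrap-around at the top comes from.

```cpp
#include <cstdint>

// Physical address formed by an 8086: shift the segment selector left
// by 4 bits (multiply by 16), add the offset, keep only 20 address bits.
constexpr std::uint32_t physical_address_8086(std::uint16_t selector,
                                              std::uint16_t offset) {
    return ((static_cast<std::uint32_t>(selector) << 4) + offset) & 0xFFFFFu;
}
```

For instance, selector 0x1234 with offset 0x0010 gives 0x12350, while selector 0xFFFF with offset 0x0010 overflows 20 bits and wraps to address 0.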

With the 80286 the segment selectors were mapped to physical addresses via tables, which allowed for much more memory, and allowed segments of different sizes. It had 24-bit physical addresses (16 MB); it was still an essentially 16-bit processor, but supporting (relatively speaking) lots of memory. With the 286 it made sense to talk about the offsets as logical addresses, because a process couldn't just calculate the corresponding physical address.

        logical address = (segment offset, segment selector)
        lookup( offset, selector ) → physical address

The 80386 finally brought 32-bit programming to the PC. A code segment could be 32-bit or 16-bit, determining the interpretation of the machine code there. Moreover it supported transparent automatic memory management, with virtual memory (simulating real memory by using disk storage). To do that it added another layer of address translation, based on fixed size pages. So a program's logical address was first treated as a segment offset, 286-style, and the resulting middle-level address was then decomposed into a memory page selector plus page offset, with each memory page of fixed size. The idea being that fixed size pages could be efficiently swapped to disk, and back.

        logical address = (segment offset, segment selector)
        lookup( soffset, sselector ) → intermediate address

        intermediate address = (page offset, page selector)
        lookup( poffset, pselector ) → physical address or "page not mapped"
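
The split of the intermediate address into page selector and page offset can be sketched like this. It assumes a 32-bit address and 4 KB pages, so the low 12 bits are the offset and the remaining 20 bits select the page; the names kOffsetBits, page_selector and page_offset are invented for the illustration, and the actual table lookup done by the hardware is left out.

```cpp
#include <cstdint>

// Assumes 4 KB pages: 12 offset bits, 20 page-selector bits.
constexpr std::uint32_t kOffsetBits = 12;

// Upper bits of the intermediate address: which page.
constexpr std::uint32_t page_selector(std::uint32_t address) {
    return address >> kOffsetBits;
}

// Lower bits of the intermediate address: where inside the page.
constexpr std::uint32_t page_offset(std::uint32_t address) {
    return address & ((1u << kOffsetBits) - 1u);
}
```

With these assumptions, address 0x00401234 lies at offset 0x234 in page 0x401, and the highest address 0xFFFFFFFF lies in page 1048575, the last page of a 32-bit address space.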

For this to be meaningful you need to have far more possible logical pages than there are pages of physical memory. If not, every logical page could just be mapped to its own physical page, and swapping would have no benefit. And this means that committing logical pages, i.e. asking the system to guarantee backing storage for them (in RAM or in the page file), can fail.

These are the pages handled by VirtualAlloc and family. You can reserve logical pages, and you make them usable by committing them; Windows then maps each committed page to a physical page when it is first accessed. Due to the purpose of the whole scheme, all of this is necessarily done in whole pages: a page is the unit of memory management at the hardware level.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow