If I need to use a piece of memory throughout the lifespan of my program, is it really necessary to free it right before program termination?

softwareengineering.stackexchange https://softwareengineering.stackexchange.com/questions/305930

Question

In many books and tutorials, I've heard the practice of memory management stressed and felt that some mysterious and terrible things would happen if I didn't free memory after I'm done using it.

I can't speak for other systems (although to me it's reasonable to assume that they adopt a similar practice), but at least on Windows, the kernel is basically guaranteed to clean up most resources (with a few odd exceptions) used by a program after it terminates. This includes heap memory, among various other things.

I understand why you would want to close a file after you're done using it in order to make it available to the user, or why you would want to disconnect a socket connected to a server in order to save bandwidth, but it seems silly to have to micromanage ALL the memory used by your program.

Now, I agree that this question is broad since how you should handle your memory is based on how much memory you need and when you need it, so I will narrow the scope of this question to this: If I need to use a piece of memory throughout the lifespan of my program, is it really necessary to free it right before program termination?

Edit: The question suggested as a duplicate was specific to the Unix family of operating systems. Its top answer even specified a tool specific to Linux (e.g. Valgrind). This question is meant to cover most "normal" non-embedded operating systems and why it is or isn't a good practice to free memory that is needed throughout the lifespan of a program.

Solution

If I need to use a piece of memory throughout the lifespan of my program, is it really necessary to free it right before program termination?

It is not mandatory, but it can have benefits (as well as some drawbacks).

If the program allocates memory once during its execution and would otherwise never release it until the process ends, it can be a sensible approach not to release the memory manually and to rely on the OS. On every modern OS I know of, this is safe: at the end of the process, all allocated memory is reliably returned to the system.
In some cases, not cleaning up the allocated memory explicitly may even be notably quicker than doing the clean-up.

However, by explicitly releasing all the memory at the end of execution,

  • during debugging / testing, memory-leak detection tools won't show you "false positives"
  • it might be much easier to move the code that uses the memory, together with its allocation and deallocation, into a separate component and use it later in a different context where the lifetime of the memory needs to be controlled by the user of the component

The lifespan of programs can change. Maybe your program is a small command-line utility today, with a typical lifetime of less than 10 minutes, and it allocates memory in portions of a few KB every 10 seconds - so there is no need to free any allocated memory before the program ends. Later the program is changed and becomes part of a server process with a lifetime of several weeks - now not freeing unused memory in between is no longer an option, otherwise your program starts eating up all available server memory over time. This means you will have to review the whole program and add deallocation code afterwards. If you are lucky, this is an easy task; if not, it may be so hard that chances are high you will miss a spot. And when you are in that situation, you will wish you had added the "free" code to your program beforehand, at the time you added the "malloc" code.

More generally, always writing allocation code and the related deallocation code in pairs counts as a "good habit" among many programmers: by doing this consistently, you decrease the probability of forgetting the deallocation code in situations where the memory must be freed.

OTHER TIPS

Freeing memory at the end of a program's run is just a waste of CPU time. It's like tidying a house before nuking it from orbit.

However, sometimes what was a short-running program can turn into part of a much longer-running one. Then freeing things becomes necessary. If this wasn't thought about to at least some extent, it can involve substantial reworking.

One clever solution to this is "talloc", which lets you make a load of memory allocations and then throw them all away with one call.

You could use a language with garbage collection (such as Scheme, OCaml, Haskell, Common Lisp, and even Java, Scala, Clojure).

(In most GC-ed languages, there is no way to explicitly and manually free memory! Sometimes some values might be finalized: e.g. the GC and the runtime system could close a file handle when that value becomes unreachable. But this is uncertain and unreliable; you should instead close your file handles explicitly, since finalization is never guaranteed.)

You could also, for your program coded in C (or even C++), use Boehm's conservative garbage collector. You would then replace all your malloc calls with GC_malloc and not bother about freeing any pointer. Of course, you need to understand the pros and cons of using Boehm's GC. Read also the GC handbook.

Memory management is a global property of a program. In some ways it (and the liveness of some given data) is non-compositional and non-modular, since it is a whole-program property.

Finally, as others pointed out, explicitly freeing your heap-allocated C memory zones is good practice. For a toy program that does not allocate a lot of memory, you could even decide not to free memory at all (since when the process ends, its resources, including its virtual address space, are freed by the operating system).

If I need to use a piece of memory throughout the lifespan of my program, is it really necessary to free it right before program termination?

No, you don't need to do that. And many real-world programs don't bother freeing some memory which is needed for the entire lifespan (in particular, the GCC compiler does not free some of its memory). However, when you do that (e.g. you don't bother freeing some particular piece of dynamically allocated C data), you'd better comment on that fact to ease the work of future programmers on the same project. I tend to recommend that the amount of unfreed memory stay bounded, and usually relatively small w.r.t. the total used heap memory.

Notice that the system free often does not release memory to the OS (e.g. by calling munmap(2) on POSIX systems) but usually marks a memory zone as reusable by future malloc calls. In particular, the virtual address space (e.g. as seen through /proc/self/maps on Linux; see proc(5)) might not shrink after free (hence utilities like ps or top report the same amount of used memory for your process).

It is not necessary, in the sense that your program will not fail to execute properly if you don't. However, there are reasons you may choose to, given the chance.

One of the most powerful cases I run into (over and over) is that someone writes a small piece of simulation code that runs in their executable. They say "we'd like this code to be integrated into the simulation." I then ask them how they plan to re-initialize between Monte Carlo runs, and they look at me blankly. "What do you mean, re-initialize? You just run the program with new settings?"

Sometimes a clean teardown makes it much easier for your software to be reused. As in the example above, you presume you never need to clean something up, and you make assumptions about how you can handle the data and its lifespan around those presumptions. When you move to a new environment where those presumptions are not valid, entire algorithms may no longer work.

For an example of just how strange things can get, look at how the managed languages deal with finalization at the end of a process, or how C# deals with the programmatic halting of Application Domains. They tie themselves in knots because there are assumptions that fall through the cracks in these extreme cases.

Ignoring languages where you don't free memory manually anyway...

What you think of as "a program" right now might at some point become just a function or method that is part of a larger program. And then that function might get called multiple times. And then memory that you should have "freed manually" will be a memory leak. That is of course a judgment call.

Because it's very probable that at some point you'll want to alter your program: maybe integrate it with something else, or run several instances in sequence or in parallel. Then it will become necessary to manually free this memory, but by then you won't remember the circumstances anymore, and it will cost you much more time to re-understand your program.

Do things while your understanding about them is still fresh.

It's a small investment that may yield big returns in the future.

No, it is not necessary, but it's a Good Idea.

You say you feel that "mysterious and terrible things would happen if I didn't free memory after I'm done using it."

In technical terms, the only consequence of this is that your program will keep eating more memory till either a hard limit is reached (e.g. your virtual address space is exhausted) or performance becomes unacceptable. If the program is about to exit, none of this matters because the process effectively ceases to exist. The "mysterious and terrible things" are purely about the developer's mental state. Finding the source of a memory leak can be an absolute nightmare (this is an understatement) and it takes a lot of skill and discipline to write code that is leak-free. The recommended approach to developing this skill and discipline is to always free memory once it is no longer needed, even if the program is about to end.

Of course this has the added advantage that your code can be re-used and adapted more easily, as others have said.

However, there is at least one case where it is better not to free memory just before program termination.

Consider the case where you have made millions of small allocations, and they have mostly been swapped to disk. When you start freeing everything, most of your memory needs to get swapped back in to RAM so that the bookkeeping information can be accessed, only to immediately discard the data. This can make the program take a few minutes just to exit! Not to mention put a lot of pressure on the disk and physical memory during that time. If physical memory is in short supply to begin with (perhaps the program is being closed because another program is chewing up a lot of memory) then individual pages may need to be swapped in and out several times when several objects need to be freed from the same page, but are not freed consecutively.

If instead the program just aborts, the OS will simply discard all the memory that has been swapped to disk, which is nearly instantaneous because it doesn't require any disk access at all.

It's important to note that in an OO language, calling an object's destructor will also force the memory to be swapped in; if you have to do this you might as well free the memory.

The main reasons to clean up manually are: you’re less likely to mistake a genuine memory leak for a memory block that will just be cleaned up on exit, you will sometimes catch bugs in the allocator only when you walk the entire data structure to deallocate it, and you’ll have a much smaller headache if you ever refactor so that some object does need to get deallocated before the program exits.

The main reasons not to clean up manually are: performance, performance, performance, and the possibility of some kind of free-twice or use-after-free bug in your unnecessary clean-up code, which can turn into a crash bug or security exploit.

If you care about performance, you always want to profile to find out where you're actually wasting time, instead of guessing. A possible compromise is to wrap your optional clean-up code in a conditional block: leave it in while debugging so you get the benefits of writing and exercising the code, and then, if and only if you've determined empirically that it's too much overhead, tell the compiler to skip it in your final executable. And a delay in closing the program is, almost by definition, hardly ever on the critical path.

Licensed under: CC-BY-SA with attribution