Before smart pointers came into being

https://stackoverflow.com/questions/12333292

30-06-2021

Question

Before smart pointers (capable of taking ownership of resources in the dynamic region and freeing them after use) came into being, I wonder how bookkeeping on dynamically created objects was performed when passed as arguments to functions that took resource pointers.

By bookkeeping, I mean that if there is a "new" then at some point later there should be a "delete" following it. Otherwise, the program will suffer from a memory leak.

Here is an example with B being a class and void a_function(B*) being a third party library function:

int main() {

  B* b = new B(); // line1

  a_function(b);  // line2

  ???             // line3
}

What do I do in line 3? Do I assume that the third party function has taken care of de-allocating the memory? If it has not and I assume that it has, then my program suffers from a memory leak. But, if it de-allocates the memory occupied by b and I too do it in main() so as to be on the safe side, then b actually ends up being freed twice! My program will crash due to a double-free error!

Solution

Okay, staying off the impending discussion of why this isn't relevant and you should be using smart pointers anyway...

All other things being equal (no custom allocators or anything fancy like that), the rule is: whoever allocates the memory should deallocate it. Third-party functions, such as the one in your example, should absolutely never deallocate memory that they didn't create, mainly because 1) it's bad practice in general (terrible code smell) and, more importantly, 2) they don't know how the memory was allocated to start with. Imagine the following:

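#include <cstdlib>

void some_awesome_function(void * data);  // declared in the library's header

int main()
{
    void * memory = malloc(sizeof(int));
    some_awesome_function(memory);
}

// meanwhile, in a third-party library...

void some_awesome_function(void * data)
{
    // Wrong on purpose: this memory came from malloc, not from new.
    delete static_cast<int *>(data);
}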

What happens if malloc/free and new/delete use different allocators? The allocator behind delete has no idea what to do with memory that came from malloc's allocator, so the behavior is undefined: a crash if you are lucky, silent heap corruption if you are not. You never free memory that was new'd, and you never delete memory that was malloc'd. Ever.

As for the first point, the fact that you have to ask what would happen if a third-party library deallocated memory and you tried to (or didn't try to) manually free it is exactly why things shouldn't be done that way: because you simply have no way of knowing. So, it's accepted practice that whatever portion of code is responsible for allocation is also responsible for deallocation. If everyone sticks to this rule, everyone can keep track of their memory and nobody is left guessing.
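A minimal sketch of that rule, assuming a made-up class B with an instance counter and a made-up, well-behaved library function use_b (neither comes from a real library). The callee only borrows the pointer, and the code that called new is the code that calls delete:

```cpp
#include <cassert>

// Hypothetical class standing in for B from the question.
struct B {
    static int live;   // number of B objects currently alive
    int value;
    B() : value(42) { ++live; }
    ~B() { --live; }
};
int B::live = 0;

// Stand-in for a well-behaved third-party function: it borrows the
// object for the duration of the call and never deletes it.
int use_b(const B* b) {
    assert(b != nullptr);
    return b->value;
}

// The caller allocates, so the caller deallocates.
int caller() {
    B* b = new B();          // line1: allocation
    int result = use_b(b);   // line2: the callee only reads
    delete b;                // line3: freed by the code that called new
    return result;
}
```

After caller() returns, B::live is back to zero: nothing leaked and nothing was freed twice.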

OTHER TIPS

The two core language features that enable "smart pointers", and more generally the idiom of scope-bound resource management (SBRM, sometimes also onomatopoeically referred to as RAII, for "resource acquisition is initialization"), are:

destructors (automatic gotos)
unconstrained variables (every object can occur as a variable)

Both of these are fundamental core features of C++ and have always been part of the language. Therefore, smart pointers have always been implementable in C++.

[Incidentally, because C lacks those two features, goto is necessary in C to handle resource allocation and multiple exits in a systematic, general fashion, while it is essentially forbidden in C++. C++ absorbs goto into the core language.]
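To make the "automatic goto" idea concrete, here is a rough sketch of a hand-rolled owning pointer. ScopedPtr, Resource, and work are names invented for this illustration. The destructor runs on every exit from the scope, so neither return path needs explicit cleanup code:

```cpp
#include <cassert>

// Hypothetical resource type that counts live instances.
struct Resource {
    static int live;
    Resource() { ++live; }
    ~Resource() { --live; }
};
int Resource::live = 0;

// Minimal owning pointer: the destructor is the "automatic goto"
// that runs on every exit from the enclosing scope.
template <typename T>
class ScopedPtr {
public:
    explicit ScopedPtr(T* p) : ptr_(p) {}
    ~ScopedPtr() { delete ptr_; }
    T* get() const { return ptr_; }
private:
    ScopedPtr(const ScopedPtr&);             // not copyable (declared, never defined)
    ScopedPtr& operator=(const ScopedPtr&);
    T* ptr_;
};

// Two exits, no explicit cleanup code on either path.
int work(bool bail_out_early) {
    ScopedPtr<Resource> r(new Resource());
    assert(Resource::live == 1);
    if (bail_out_early) {
        return -1;   // the destructor of r runs here
    }
    return 0;        // ...and here
}
```

In C, the same function would typically need a cleanup label and a goto on the early-exit path.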

As with any language, it takes a long time before people learn, understand and adopt the "correct" idioms. Especially given the historic connections of C++ with C, lots of programmers who were and are working on C++ projects have come from a C background and have presumably found it more comfortable to stick with familiar patterns, which are still supported by C++ even though they are not advisable ("just replace malloc with new everywhere and we'll be ready to ship").

You destroy what you create, the library destroys what it creates.

If you share data with the library (for example, a char* for file data), the library's documentation will specify if it keeps a reference to your data (in which case don't delete your copy until the library is done using it) or makes a copy of your data (in which case it's the library's job to delete the data when done).
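The two documented behaviors could look roughly like this. CopyingParser and ReferencingParser are hypothetical library types invented for the sketch; the comments stand in for what the library's documentation would say:

```cpp
#include <cassert>
#include <cstring>
#include <string>

// Hypothetical library type documented as: "copies the buffer".
class CopyingParser {
public:
    explicit CopyingParser(const char* data) : copy_(data) {}
    std::size_t length() const { return copy_.size(); }
private:
    std::string copy_;   // the library owns and frees this copy
};

// Hypothetical library type documented as: "keeps a reference;
// the buffer must outlive this object".
class ReferencingParser {
public:
    explicit ReferencingParser(const char* data) : data_(data) {
        assert(data != nullptr);
    }
    std::size_t length() const { return std::strlen(data_); }
private:
    const char* data_;   // borrowed, never deleted here
};

std::size_t use_copying() {
    char* file_data = new char[6];
    std::strcpy(file_data, "hello");
    CopyingParser parser(file_data);
    delete[] file_data;        // safe: the parser has its own copy
    return parser.length();
}

std::size_t use_referencing() {
    char* file_data = new char[6];
    std::strcpy(file_data, "hello");
    std::size_t n;
    {
        ReferencingParser parser(file_data);
        n = parser.length();   // the buffer must still be alive here
    }                          // the library is done with the buffer
    delete[] file_data;        // only now is it safe to delete
    return n;
}
```

In both cases the caller deletes the buffer it created. The only difference is when it is allowed to do so.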

I see a lot of people pointing out that smart pointers have been around from the beginning of C++. But the fact is that not all code uses them, even today. A common approach is to do reference counting manually:

int main() {
  B* b = createB(); //refcount = 1 
  a_function(b);
  releaseB(b); //--refcount
}

void a_function(B* b) {
  acquireB(b); //refcount++ when we store the reference somewhere
  ...
}
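A fuller sketch of the same scheme, under some assumptions: B embeds its own counter, the library keeps the pointer in a variable named stored, and a made-up library_shutdown function is where the library drops its reference. Only createB, acquireB, releaseB, and a_function come from the snippet above; everything else is invented for illustration.

```cpp
#include <cassert>

// Hypothetical B with an embedded reference count.
struct B {
    int refcount;
    static int live;   // number of B objects currently alive
    B() : refcount(1) { ++live; }
    ~B() { --live; }
};
int B::live = 0;

B* createB() { return new B(); }          // refcount = 1
void acquireB(B* b) { ++b->refcount; }
void releaseB(B* b) {
    assert(b->refcount > 0);
    if (--b->refcount == 0) {
        delete b;                          // the last owner frees the object
    }
}

// Assumed library state: the library stores the pointer for later use.
static B* stored = nullptr;

void a_function(B* b) {
    acquireB(b);      // refcount++ because we store the reference
    stored = b;
}

void library_shutdown() {
    if (stored != nullptr) {
        releaseB(stored);   // the library drops its reference
        stored = nullptr;
    }
}

int run() {
    B* b = createB();        // refcount == 1
    a_function(b);           // refcount == 2
    releaseB(b);             // refcount == 1, object still alive
    int alive_while_stored = B::live;
    library_shutdown();      // refcount == 0, object deleted
    return alive_while_stored;
}
```

The object outlives the caller's releaseB because the library still holds a reference. It is deleted exactly once, by whichever side releases last.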

Smart pointers are a way to ease the implementation of a policy. The same policies (attributing the responsibility to delete to one owner or a set of them) were used before. You just had to document the policy and not forget to act accordingly. Smart pointers are both a way to document the chosen policy and to implement it at the same time. In your case, you looked at a_function's documentation and saw what it demanded. Or you took a more or less educated guess if it wasn't documented.

The answer is in the documentation of third party function a_function(). Possible cases could be:

The function just uses data in the object and keeps no references to it after the call has ended (example: printf). You can safely delete the object after the call returns.
The function (in some internal library object) keeps a reference to the object until a later call (say, to b_function()). You are responsible for deleting the object, but you have to keep it alive until you call b_function (example: strtok).
The function takes ownership of the object and doesn't guarantee that the object exists after the call (example: free()). In this case, the documentation usually specifies how the object must be created (malloc, new, my_library_malloc).

These are only some examples of the many behaviors that are possible, but as long as the function is documented well enough, you should be able to do the right thing.
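The three cases can be sketched with the standard functions named above. The wrapper function names are invented for this illustration; printf, strtok, malloc, and free are the real C library calls:

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>
#include <cstring>

// Case 1: printf only reads the buffer during the call.
int borrow_only() {
    char* text = static_cast<char*>(std::malloc(6));
    assert(text != nullptr);
    std::strcpy(text, "hello");
    int written = std::printf("%s\n", text);
    std::free(text);   // safe immediately after the call returns
    return written;
}

// Case 2: strtok keeps an internal pointer into the buffer between calls.
int retained_until_later_call() {
    char* text = static_cast<char*>(std::malloc(6));
    assert(text != nullptr);
    std::strcpy(text, "a b c");
    int tokens = 0;
    for (char* t = std::strtok(text, " "); t != nullptr;
         t = std::strtok(nullptr, " ")) {
        ++tokens;      // the buffer must stay alive across these calls
    }
    std::free(text);   // safe only once tokenizing is finished
    return tokens;
}

// Case 3: free takes ownership; the pointer is dead after the call.
void ownership_transferred() {
    void* block = std::malloc(16);   // free requires memory from the malloc family
    std::free(block);                // block must not be used or freed again
}
```

In each case the right moment to release the memory follows from what the function's documentation says about it, not from anything visible in the signature.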

What do I do in line 3?

You consult the documentation of a_function. The usual rule is that functions do nothing about ownership or lifetime unless they say they do. The need for such documentation is pretty clearly established by reference to C APIs, where smart pointers aren't available.

So, if it doesn't say that it deletes its parameter, then it doesn't. If it doesn't say that it keeps a copy of its parameter beyond the time when it returns, until some other specified time, then it doesn't.

If it says something you act accordingly, and if it says nothing then you delete b (or preferably you write B b; a_function(&b); instead -- observe that by not destroying the object, the function doesn't need to care how you create the object, you're free to decide).
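As a small illustration of both options, assuming a made-up B and an a_function that is documented to neither delete nor retain its argument:

```cpp
#include <cassert>

// Hypothetical B that records how often it was used.
struct B {
    int calls;
    B() : calls(0) {}
};

// Assumed to be documented as neither deleting nor retaining its argument.
void a_function(B* b) {
    assert(b != nullptr);
    ++b->calls;
}

int with_automatic_object() {
    B b;               // automatic storage: destroyed at the end of the scope
    a_function(&b);    // the function cannot tell how b was created
    return b.calls;    // no line 3 needed
}

int with_dynamic_object() {
    B* b = new B();
    a_function(b);
    int calls = b->calls;
    delete b;          // line 3: the caller created it, so the caller deletes it
    return calls;
}
```

If a_function did call delete on its argument, the first variant would be undefined behavior, which is one more reason such a function has to say so in its documentation.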

Hopefully it says whatever it says explicitly, but if you're unlucky it says it via some convention that certain kinds of function in an API take ownership of the objects referred to by their parameters. For example if it's called set_global_B_instance then you might have a sneaking suspicion that it's going to keep that pointer around, and deleting it immediately after setting it would be unwise.

If it doesn't say anything either way, but your code ends up buggy and you eventually discover that a_function called delete on its argument, then you find whoever documented a_function and you ~~punch them in the nose~~ submit a bug on their documentation.

Frequently that person turns out to be yourself, in which case try to learn the lesson -- document object ownership.

As well as helping to avoid coding errors, smart pointers provide some degree of self-documentation for functions that accept or return pointers where there are ownership concerns. In the absence of self-documentation, you have actual documentation. For example, if a function returns auto_ptr instead of a raw pointer, that tells you delete needs to be called on the pointer. You can let the auto_ptr do that for you, or you can assign it to some other smart pointer, or you can release() the pointer and manage it yourself. The function you called doesn't care, and doesn't need to document anything much. If a function returns a raw pointer, then it has to tell you something about the lifetime of the object referred to by the pointer, because there's no way for you to guess.
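The three options can be sketched as follows. Note the substitution: this uses std::unique_ptr rather than auto_ptr, because auto_ptr was deprecated in C++11 and removed in C++17. B and make_b are hypothetical names.

```cpp
#include <cassert>
#include <memory>

// Hypothetical B that counts live instances.
struct B {
    static int live;
    B() { ++live; }
    ~B() { --live; }
};
int B::live = 0;

// Hypothetical factory: the return type alone says "the caller now owns this".
std::unique_ptr<B> make_b() { return std::unique_ptr<B>(new B()); }

// Option 1: let the smart pointer delete the object.
int let_it_clean_up() {
    std::unique_ptr<B> b = make_b();
    return B::live;                    // 1 while b is in scope
}

// Option 2: hand the object to some other smart pointer.
int transfer_to_shared() {
    std::shared_ptr<B> shared = make_b();
    return static_cast<int>(shared.use_count());
}

// Option 3: release() the pointer and manage it manually.
int release_and_delete() {
    B* raw = make_b().release();
    assert(raw != nullptr);
    int live_before_delete = B::live;
    delete raw;
    return live_before_delete;
}
```

make_b needs no ownership documentation beyond its signature, and in all three cases B::live returns to zero afterwards.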

Just look to C APIs for hints. It's pretty common for C APIs to provide explicit create and destroy functions. These typically follow some formal naming convention in the libraries.

Using your example, it would be a bad design if a_function deleted or freed the parameter without being explicitly labeled as a destroy function (in which case, you should not use that parameter after calling the function). In most cases, it is a bad design to assume that it is safe to destroy objects you do not own. Of course, with smart pointers, mechanisms of ownership, lifetimes, and cleanup are often handled by the smart pointer where possible.
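A rough sketch of such a create/destroy pair, written in the style of a C API. The widget type and all widget_* functions are invented for illustration and do not belong to any real library:

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical C-style library API. The caller never learns how a widget
// is allocated, so only widget_destroy can free it correctly.
struct widget { int value; };

widget* widget_create(int value) {
    widget* w = static_cast<widget*>(std::malloc(sizeof(widget)));
    if (w != nullptr) {
        w->value = value;
    }
    return w;
}

// Not labeled as a destroy function, so it must not free its argument.
int widget_get(const widget* w) {
    assert(w != nullptr);
    return w->value;
}

// Explicitly labeled destroy function: w must not be used after this call.
void widget_destroy(widget* w) {
    std::free(w);
}

int use_widget() {
    widget* w = widget_create(7);
    if (w == nullptr) {
        return -1;
    }
    int v = widget_get(w);
    widget_destroy(w);   // paired with widget_create, not with delete or free
    return v;
}
```

The naming convention carries the ownership contract: whatever the library creates goes back to the library's own destroy function, so the caller never has to guess which allocator was used.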

So yes, people used new and delete, and although I wasn't writing C++ before templates -- it would have been more common to see explicit new and delete in programs. Smart pointers are not a very good means to transfer objects and convey ownership without templates -- which, along with exceptions, were introduced around 1990 (7 years after C++ was available). Naturally, it took some time for compilers to support all these features, and for people to implement containers and improve on those implementations. Note that it was possible before templates, but it wasn't always practical to implement/clone a container for arbitrary types because the language did not support generics well prior to templates. Of course, a concrete class with concrete types could easily accomplish the mechanics of smart pointers where the type was invariant in those days… but that does result in forms of code duplication when generics are not available.

But even today, it's an unusual design for the object held by a smart pointer parameter to be replaced or destroyed, unless clearly labeled. This is made even less likely because it's also unusual to pass the smart pointer as the parameter, rather than the object it holds. So the number of memory-related bugs has decreased since then, but some caution and good ownership conventions should still be observed.

Simple answer: read the docs. This is a common thing in C interfaces, and because resource management is an important part of the interface, if the function claims ownership of the object, it will be documented.

Without smart pointers, programmers generally adopt the rule that the entity that allocated is responsible for deallocating.

In your example, it would typically be considered Bad Behavior (despite being valid code) for your third party function to delete the pointer passed to it, and you would be expected to delete it in line 3.

This is a social contract between programmers, and a compiler would typically not enforce this.

Licensed under: CC-BY-SA with attribution
