Question

I've found recently that most of the errors in my C++ programs are of a form like the following example:

#include <iostream>

class Z
{
 public:
 Z(int n) : n(n) {}
 int n;
};

class Y
{
 public:
 Y(const Z& z) : z(z) {}
 const Z& z;
};

class X
{
 public:
 X(const Y& y) : y(y) {}
 Y y;
};

class Big
{
 public:
 Big()
 {
   for (int i = 0; i < 1000; ++i) { a[i] = i + 1000; }
 }
 int a[1000];
};

X get_x() { return X(Y(Z(123))); }

int main()
{
 X x = get_x();
 Big b;
 std::cout << x.y.z.n << std::endl;
}

OUTPUT: 1000

I would expect this program to output 123 (the value of x.y.z.n set in get_x()) but the creation of "Big b" overwrites the temporary Z. As a result, the reference to the temporary Z in the object Y is now overwritten with Big b, and hence the output is not what I would expect.

When I compiled this program with gcc 4.5 with the option "-Wall", it gave no warning.

The fix is obviously to remove the reference from the member Z in the class Y. However, often class Y is part of a library which I have not developed (boost::fusion most recently), and in addition the situation is much more complicated than this example I've given.

This there some sort of option to gcc, or any additional software that would allow me to detect such issues preferably at compile time, but even runtime would be better than nothing?

Thanks,

Clinton

Was it helpful?

Solution

I submitted such cases on the clang-dev mailing list a few months ago, but no one had the time to work on it back then (and neither did I, unfortunately).

Argyrios Kyrtzidis is currently working on it though, and here is his last update on the matter (30 Nov 23h04 GMT):

I reverted the previous commit, much better fix in http://lists.cs.uiuc.edu/pipermail/cfe-commits/Week-of-Mon-20101129/036875.html. e.g. for

struct S {   int x; };

int &get_ref() {   S s;   S &s2 = s;   int &x2 = s2.x;   return x2; }

we get

t3.cpp:9:10: warning: reference to stack memory associated with local variable 's' returned
  return x2;
         ^~
t3.cpp:8:8: note: binding reference variable 'x2' here
  int &x2 = s2.x;
       ^    ~~
t3.cpp:7:6: note: binding reference variable 's2' here
  S &s2 = s;
     ^    ~
1 warning generated.

The previous attempt failed the self-hosting test, so I hope this attempt will pass. I'm pretty glad Argyrios is looking into it anyway :)

It isn't perfect yet, admittedly, as it's quite a complicated problem to tackle (reminds me of pointer aliasing in a way), but this is nonetheless a great step in the right direction.

Could you test your code against this version of Clang ? I'm pretty sure Argyrios would appreciate the feedback (whether it is detected or not).

OTHER TIPS

[Edited third bullet to demonstrate a technique that may help] This is the rabbit hole you go down when a language permits passing arguments by value or reference with the same caller syntax. You have the following options:

  • Change the arguments to non-const references. A temporary value will not match a non-const reference type.

  • Drop the references altogether in cases where this is a possibility. If your const references don't indicate logically shared state between caller and callee (if they did this problem wouldn't occur very frequently), they were probably inserted in an attempt to avoid naive copying of complex types. Modern compilers have advanced copy ellision optimizations that make pass-by-value as efficient as pass-by-reference in most cases; see http://cpp-next.com/archive/2009/08/want-speed-pass-by-value for a great explanation. Copy ellision clearly won't be performed if you're passing the values on to external library functions that might modify the temporaries, but if that were the case then you're either not passing them in as const references or deliberately casting away the const-ness in the original version. This is my preferred solution as it lets the compiler worry about copy optimization and frees me to worry about other sources of error in the code.

  • If your compiler supports rvalue references, use them. If you can at least edit the parameter types of the functions where you worry about this problem, you can define a wrapper metaclass like so:

template < typename T > class need_ref {

T & ref_;

public:

need_ref(T &&x) { /* nothing */ }

need_ref(T &x) : ref_(x) { /* nothing */ }

operator T & () { return ref_; }

};

and then replace arguments of type T& with arguments of type need_ref. For example, if you define the following

class user {

int &z;

public:

user(need_ref< int> arg) : z(arg) { /* nothing */ }

};

then you can safely initalize an object of type user with code of the form "int a = 1, b = 2; user ua(a);", but if you attempt to initialize as "user sum(a+b)" or "user five(5)" your compiler should generate an uninitialized reference error inside the first version of the need_ref() constructor. The technique is obviously not limited to constructors, and imposes no runtime overhead.

The problem here is the code

 Y(const Z& z) : z(z) {}

as the member 'z' is initialized with a reference to the formal parameter 'z'. Once the constructor returns the reference refers to an object which is no longer valid.

I don't think the compiler will or can in many cases detect such logical flaws. The fix then IMO is obviously to be aware of such classes and use them in a manner appropriate to their design. This should really be documented by the library vendor.

BTW, it is better to name the member 'Y::z' as 'Y::mz' if possible. The expression 'z(z)' is not very appealing

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top