Are nested functions a bad thing in gcc ?

https://stackoverflow.com/questions/2929281

05-10-2019
|

문제

I know that nested functions are not part of the standard C, but since they're present in gcc (and the fact that gcc is the only compiler i care about), i tend to use them quite often.

Is this a bad thing ? If so, could you show me some nasty examples ? What's the status of nested functions in gcc ? Are they going to be removed ?

해결책

Nested functions really don't do anything that you can't do with non-nested ones (which is why neither C nor C++ provide them). You say you are not interested in other compilers - well this may be atrue at this moment, but who knows what the future will bring? I would avoid them, along with all other GCC "enhancements".

A small story to illustrate this - I used to work for a UK Polytechinc which mostly used DEC boxes - specifically a DEC-10 and some VAXen. All the engineering faculty used the many DEC extensions to FORTRAN in their code - they were certain that we would remain a DEC shop forever. And then we replaced the DEC-10 with an IBM mainframe, the FORTRAN compiler of which didn't support any of the extensions. There was much wailing and gnashing of teeth on that day, I can tell you. My own FORTRAN code (an 8080 simulator) ported over to the IBM in a couple of hours (almost all taken up with learning how to drive the IBM compiler), because I had written it in bog-standard FORTRAN-77.

다른 팁

There are times nested functions can be useful, particularly with algorithms that shuffle around lots of variables. Something like a written-out 4-way merge sort could need to keep a lot of local variables, and have a number of pieces of repeated code which use many of them. Calling those bits of repeated code as an outside helper routine would require passing a large number of parameters and/or having the helper routine access them through another level of pointer indirection.

Under such circumstances, I could imagine that nested routines might allow for more efficient program execution than other means of writing the code, at least if the compiler optimizes for the situation where there any recursion that exists is done via re-calling the outermost function; inline functions, space permitting, might be better on non-cached CPUs, but the more compact code offered by having separate routines might be helpful. If inner functions cannot call themselves or each other recursively, they can share a stack frame with the outer function and would thus be able to access its variables without the time penalty of an extra pointer dereference.

All that being said, I would avoid using any compiler-specific features except in circumstances where the immediate benefit outweighs any future cost that might result from having to rewrite the code some other way.

Like most programming techniques, nested functions should be used when and only when they are appropriate.

You aren't forced to use this aspect, but if you want, nested functions reduce the need to pass parameters by directly accessing their containing function's local variables. That's convenient. Careful use of "invisible" parameters can improve readability. Careless use can make code much more opaque.

Avoiding some or all parameters makes it harder to reuse a nested function elsewhere because any new containing function would have to declare those same variables. Reuse is usually good, but many functions will never be reused so it often doesn't matter.

Since a variable's type is inherited along with its name, reusing nested functions can give you inexpensive polymorphism, like a limited and primitive version of templates.

Using nested functions also introduces the danger of bugs if a function unintentionally accesses or changes one of its container's variables. Imagine a for loop containing a call to a nested function containing a for loop using the same index without a local declaration. If I were designing a language, I would include nested functions but require an "inherit x" or "inherit const x" declaration to make it more obvious what's happening and to avoid unintended inheritance and modification.

There are several other uses, but maybe the most important thing nested functions do is allow internal helper functions that are not visible externally, an extension to C's and C++'s static not extern functions or to C++'s private not public functions. Having two levels of encapsulation is better than one. It also allows local overloading of function names, so you don't need long names describing what type each one works on.

There are internal complications when a containing function stores a pointer to a contained function, and when multiple levels of nesting are allowed, but compiler writers have been dealing with those issues for over half a century. There are no technical issues making it harder to add to C++ than to C, but the benefits are less.

Portability is important, but gcc is available in many environments, and at least one other family of compilers supports nested functions - IBM's xlc available on AIX, Linux on PowerPC, Linux on BlueGene, Linux on Cell, and z/OS. See http://publib.boulder.ibm.com/infocenter/comphelp/v8v101index.jsp?topic=%2Fcom.ibm.xlcpp8a.doc%2Flanguage%2Fref%2Fnested_functions.htm

Nested functions are available in some new (eg, Python) and many more traditional languages, including Ada, Pascal, Fortran, PL/I, PL/IX, Algol and COBOL. C++ even has two restricted versions - methods in a local class can access its containing function's static (but not auto) variables, and methods in any class can access static class data members and methods. The upcoming C++ standard has lamda functions, which are really anonymous nested functions. So the programming world has lots of experience pro and con with them.

Nested functions are useful but take care. Always use any features and tools where they help, not where they hurt.

As you said, they are a bad thing in the sense that they are not part of the C standard, and as such are not implemented by many (any?) other C compilers.

Also keep in mind that g++ does not implement nested functions, so you will need to remove them if you ever need to take some of that code and dump it into a C++ program.

Nested functions can be bad, because under specific conditions the NX (no-execute) security bit will be disabled. Those conditions are:

GCC and nested functions are used
a pointer to the nested function is used
the nested function accesses variables from the parent function
the architecture offers NX (no-execute) bit protection, for instance 64-bit linux.

When the above conditions are met, GCC will create a trampoline https://gcc.gnu.org/onlinedocs/gccint/Trampolines.html. To support trampolines, the stack will be marked executable. see: https://www.win.tue.nl/~aeb/linux/hh/protection.html

Disabling the NX security bit creates several security issues, with the notable one being buffer overrun protection is disabled. Specifically, if an attacker placed some code on the stack (say as part of a user settable image, array or string), and a buffer overrun occurred, then the attackers code could be executed.

Late to the party, but I disagree with the accepted answer's assertion that

Nested functions really don't do anything that you can't do with non-nested ones.

Specifically:

TL;DR: Nested Functions Can Reduce Stack Usage in Embedded Environments

Nested functions give you access to lexically scoped variables as "local" variables without needing to push them onto the call stack. This can be really useful when working on a system with limited resource, e.g. embedded systems. Consider this contrived example:

void do_something(my_obj *obj) {
    double times2() {
        return obj->value * 2.0;
    }
    double times4() {
        return times2() * times2();
    }
    ...
}

Note that once you're inside do_something(), because of nested functions, the calls to times2() and times4() don't need to push any parameters onto the stack, just return addresses (and smart compilers even optimize them out when possible).

Imagine if there was a lot of state that the internal functions needed to access. Without nested functions, all that state would have to be passed on the stack to each of the functions. Nested functions let you access the state like local variables.

I agree with Stefan's example, and the only time I used nested functions (and then I am declaring them inline) is in a similar occasion.

I would also suggest that you should rarely use nested inline functions rarely, and the few times you use them you should have (in your mind and in some comment) a strategy to get rid of them (perhaps even implement it with conditional #ifdef __GCC__ compilation).

But GCC being a free (like in speech) compiler, it makes some difference... And some GCC extensions tend to become de facto standards and are implemented by other compilers.

Another GCC extension I think is very useful is the computed goto, i.e. label as values. When coding automatons or bytecode interpreters it is very handy.

Nested functions can be used to make a program easier to read and understand, by cutting down on the amount of explicit parameter passing without introducing lots of global state.

On the other hand, they're not portable to other compilers. (Note compilers, not devices. There aren't many places where gcc doesn't run).

So if you see a place where you can make your program clearer by using a nested function, you have to ask yourself 'Am I optimising for portability or readability'.

I'm just exploring a bit different kind of use of nested functions. As an approach for 'lazy evaluation' in C.

Imagine such code:

void vars()
{
  bool b0 = code0; // do something expensive or to ugly to put into if statement
  bool b1 = code1;

  if      (b0) do_something0();
  else if (b1) do_something1();
}

versus

void funcs()
{
  bool b0() { return code0; }
  bool b1() { return code1; }

  if      (b0()) do_something0();
  else if (b1()) do_something1();
}

This way you get clarity (well, it might be a little confusing when you see such code for the first time) while code is still executed when and only if needed. At the same time it's pretty simple to convert it back to original version.

One problem arises here if same 'value' is used multiple times. GCC was able to optimize to single 'call' when all the values are known at compile time, but I guess that wouldn't work for non trivial function calls or so. In this case 'caching' could be used, but this adds to non readability.

I need nested functions to allow me to use utility code outside an object.

I have objects which look after various hardware devices. They are structures which are passed by pointer as parameters to member functions, rather as happens automagically in c++.

So I might have

    static int ThisDeviceTestBram( ThisDeviceType *pdev )
    {
        int read( int addr ) { return( ThisDevice->read( pdev, addr ); }
        void write( int addr, int data ) ( ThisDevice->write( pdev, addr, data ); }
        GenericTestBram( read, write, pdev->BramSize( pdev ) );
    }

GenericTestBram doesn't and cannot know about ThisDevice, which has multiple instantiations. But all it needs is a means of reading and writing, and a size. ThisDevice->read( ... ) and ThisDevice->Write( ... ) need the pointer to a ThisDeviceType to obtain info about how to read and write the block memory (Bram) of this particular instantiation. The pointer, pdev, cannot have global scobe, since multiple instantiations exist, and these might run concurrently. Since access occurs across an FPGA interface, it is not a simple question of passing an address, and varies from device to device.

The GenericTestBram code is a utility function:

    int GenericTestBram( int ( * read )( int addr ), void ( * write )( int addr, int data ), int size )
    {
        // Do the test
    }

The test code, therefore, need be written only once and need not be aware of the details of the structure of the calling device.

Even wih GCC, however, you cannot do this. The problem is the out of scope pointer, the very problem needed to be solved. The only way I know of to make f(x, ... ) implicitly aware of its parent is to pass a parameter with a value out of range:

     static int f( int x ) 
     {
         static ThisType *p = NULL;
         if ( x < 0 ) {
             p = ( ThisType* -x );
         }
         else
         {
             return( p->field );
         }
    }
    return( whatever );

Function f can be initialised by something which has the pointer, then be called from anywhere. Not ideal though.

Nested functions are a MUST-HAVE in any serious programming language.

Without them, the actual sense of functions isn't usable.

It's called lexical scoping.

라이센스 : CC-BY-SA ~와 함께 속성

제휴하지 않습니다 StackOverflow