Question

Was browsing the source code for sudo as provided on this site, and came across this super weird type signature (Bonus question: is there a more C-like term for "type signature"?) for main:

int
main(argc, argv, envp)
    int argc;
    char **argv;
    char **envp;
{

I understand that the style itself is the oldskool K&R. What I'm really interested in is the fact that main is taking a bonus argument, char **envp. Why? sudo is a fairly standard command line tool and is invoked as such. How does the operating system know what to do when it comes across a main function that isn't defined with the usual (int argc, char *argv[])?

Many a time I myself have been lazy and just left off the arguments completely and whatever program I was writing would appear to work just fine (most dangerous thing that can happen with C, I know :p)

Another part of my question is, what cool stuff does all this allow you to do? I have a hunch it helps a heap with embedded programming, but I've sadly had minimal exposure to that and can't really say. I would love to see some concrete examples.


Solution

It's just a pointer to the environment, identical to

extern char **environ;

Both have been available in Unix since Version 7 (Version 6 had no environment variables). The extern named environ was standardized by POSIX; the third argument to main was not. There's no reason to use a three-argument main except as some kind of fashion statement.

The process setup code that calls main doesn't need to know whether main expects 3 arguments, because there's no difference at the assembly level between a function that takes 2 arguments and a function that takes 3 arguments but doesn't use the third one. Or between a function that takes no arguments and a function that takes 2 arguments and doesn't use them, which is why int main(void) also works.

Systems with a not-unix-like ABI may need to know which kind of main they're calling.

Put this in one file:

#include <stdio.h>

int foo(int argc, char **argv)
{
  int i;
  for(i=0;i<argc;++i)
    puts(argv[i]);
  return 0;
}

And this in another:

extern int foo(int argc, char **argv, char **envp);
int main(int argc, char **argv)
{
  /* argv- and envp-style vectors are conventionally NULL-terminated */
  char *foo_args[] = { "foo", "arg", "another arg", NULL };
  char *foo_env[] = { "VAR=val", "VAR2=val2", NULL };
  foo(3, foo_args, foo_env);
  return 0;
}

This is completely wrong from a cross-platform language-lawyer standpoint: we've lied to the compiler about the type of foo and passed it more arguments than it wants. But on Unix, it works. The extra argument either harmlessly occupies a stack slot that the caller accounts for and cleans up after the function returns, or sits briefly in a register where the callee doesn't expect to find anything in particular. The caller already expects the callee to clobber that register, so it doesn't mind if the callee reuses it for another purpose.

That's exactly what happens to envp in a normal C program with a 2-arg main. And what happens to argc and argv in a program with int main(void).

But just like envp itself, there's no good reason to take advantage of this in serious code. Type checking is good for you, and knowing you can evade it should go along with knowing that you shouldn't.

OTHER TIPS

http://en.wikipedia.org/wiki/Main_function

From the very first paragraph:

Other platform-dependent formats are also allowed by the C and C++ standards, except that in C++ the return type must always be int;[3] for example, Unix (though not POSIX.1) and Microsoft Windows have a third argument giving the program's environment, otherwise accessible through getenv in stdlib.h:

Google is your friend. Also, the operating system doesn't need to know anything about main in this case: it's the compiler that does the work, and as long as the arguments are valid and accepted by the compiler, there is no problem.

Licensed under: CC-BY-SA with attribution