Question

I basically need to write a C preprocessor in Python. I've searched around, and since I need to fully customize my code and understand exactly what's going on, I'd better write my own.

Here's where I am: I first parse some header files (.h) to find #define keywords and build a dictionary of all the directives found (with their values, if they have one). Then I need to parse source files (.c) according to the directives found earlier. The mechanism I currently use to check whether the code needs to be processed is the following: I take all my defines' names and values and do an exec("define_name = define_value") (with the value '1' when none is specified). Then, to resolve a condition such as #if defined DEFINE_1 || defined DEFINE_2 && (DEFINE_3 == 10) ..., I replace the C preprocessor keywords with their Python equivalents, which produces DEFINE_1 or DEFINE_2 and (DEFINE_3 == 10).

And I finally use eval(...) on that string to find the result.

THE QUESTION: is the use of exec / eval really necessary? Many people are somewhat reluctant to use them, so is there a better solution?

Solution

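Certainly the exec() isn't needed and shouldn't be used. In Python, exec() runs the string as Python code, so every macro becomes a variable in your own program's namespace, where it can clash with your own names, and whatever text ends up in a value gets executed. A dict holds the same information without either problem.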
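To illustrate, here is a minimal sketch of keeping macros in a dict in place of exec()-created variables. The names `DEFINE_RE` and `collect_defines` are made up for this example, and it assumes simple object-like macros only (no function-like macros, no line continuations); a valueless #define gets '1', as in the question.

```python
import re

# Hypothetical pattern: "#define NAME" optionally followed by a value.
DEFINE_RE = re.compile(r'^\s*#\s*define\s+(\w+)[ \t]*(.*)$')

def collect_defines(text, defines=None):
    """Return a dict mapping macro name -> value string ('1' if no value)."""
    defines = {} if defines is None else defines
    for line in text.splitlines():
        m = DEFINE_RE.match(line)
        if m:
            name, value = m.group(1), m.group(2).strip()
            # Later definitions of the same name overwrite earlier ones.
            defines[name] = value or '1'
    return defines

header = """\
#define DEFINE_1
#define DEFINE_3 10
int not_a_macro;
"""
defines = collect_defines(header)
print(defines)
```

Looking a macro up is then just `defines.get(name)`, and nothing read from the file is ever executed.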

And generally, you should avoid eval() as well, as it's rarely the right thing to do.

So, what can you do?

1) First off, because one statement in a program can override a previous one, you can't pre-process the .h files up front (or even assume the #defines you're looking for are only in .h files in the first place) and have it work. Consider this:

#define foo 1
#if foo == 1
this line is true!
#endif
#define foo 0

If you pre-process everything first, you'll set "foo" to 1, then to 0, and only then evaluate the #if, which will see 0 even though foo is 1 at that point in the file. You can't do that...
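The difference can be seen with a deliberately tiny sketch. It assumes only three directive forms (`#define NAME VALUE`, `#if NAME == VALUE`, `#endif`) with no nesting, and the function name `preprocess` and its `apply_defines` flag are invented for the demonstration.

```python
import re

def preprocess(lines, defines, apply_defines=True):
    """Walk the lines in order and return the ones that survive."""
    out = []
    keep = True
    for line in lines:
        m = re.match(r'#define\s+(\w+)\s+(\S+)', line)
        if m:
            if keep and apply_defines:
                defines[m.group(1)] = m.group(2)
            continue
        m = re.match(r'#if\s+(\w+)\s*==\s*(\S+)', line)
        if m:
            # An undefined name compares as '0' here, mirroring C's #if.
            keep = defines.get(m.group(1), '0') == m.group(2)
            continue
        if line.startswith('#endif'):
            keep = True
            continue
        if keep:
            out.append(line)
    return out

source = [
    '#define foo 1',
    '#if foo == 1',
    'this line is true!',
    '#endif',
    '#define foo 0',
]

# Line by line: foo is 1 when the #if is reached.
in_order = preprocess(source, {})

# Collect every #define first (the last one wins), then evaluate the #if.
collected = {}
for line in source:
    m = re.match(r'#define\s+(\w+)\s+(\S+)', line)
    if m:
        collected[m.group(1)] = m.group(2)
two_pass = preprocess(source, collected, apply_defines=False)

print(in_order)
print(two_pass)
```

The in-order run keeps the line inside the #if; the collect-first run drops it, because by then foo has already been overwritten with 0.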

2) A more common approach is to write a parser that goes line by line and deals with each line as it comes. This way you can even write a recursive function to handle #include statements, so you can start with just the .c file and let it pull in the headers it actually uses, rather than requiring them to be specified in some other way.

In the end, you should end up with something like (in a function called "read_file"):

import re  # at the top of the module

# ... file opening not shown ...

for line in file:
    includematch = re.match(r'#include\s+"(.*)"', line)
    if includematch:
        # deal with an include statement by calling a function to process it
        read_file(includematch.group(1), definedict)

    definematch = re.match(r'#define\s+(\w+)\s+(.*)', line)
    if definematch:
        # deal with define statements by saving the name and value in a dict
        definedict[definematch.group(1)] = definematch.group(2)

    #....

Obviously if I showed you the whole solution (and the above is hardly pretty code, but it's concise for showing purposes) I'd be solving your problem (homework?) for you. But the above is a better way to architect the whole thing than the path you were heading down.
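As for the eval() half of the question, one common alternative is a small hand-written expression parser that reads the #if condition directly and looks names up in the dict. The following is only a sketch under stated assumptions: it supports `defined NAME`, `defined(NAME)`, `!`, `==`, `!=`, `&&`, `||`, parentheses and decimal integers, and it treats any macro whose value is not a plain decimal integer as 0. The names `tokenize`, `ConditionParser` and `evaluate_condition` are made up for this example.

```python
import re

TOKEN_RE = re.compile(r'\s*(\|\||&&|==|!=|[()!]|\w+)')

def tokenize(expr):
    tokens, pos = [], 0
    expr = expr.rstrip()
    while pos < len(expr):
        m = TOKEN_RE.match(expr, pos)
        if not m:
            raise ValueError('unexpected text at: %r' % expr[pos:])
        tokens.append(m.group(1))
        pos = m.end()
    return tokens

class ConditionParser:
    """Recursive descent; precedence from loosest to tightest: || && ==/!= !"""

    def __init__(self, tokens, defines):
        self.tokens, self.pos, self.defines = tokens, 0, defines

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self):
        tok = self.peek()
        if tok is None:
            raise ValueError('unexpected end of condition')
        self.pos += 1
        return tok

    def expect_close(self):
        if self.take() != ')':
            raise ValueError("expected ')'")

    def parse_or(self):
        value = self.parse_and()
        while self.peek() == '||':
            self.take()
            rhs = self.parse_and()
            value = int(bool(value) or bool(rhs))
        return value

    def parse_and(self):
        value = self.parse_equality()
        while self.peek() == '&&':
            self.take()
            rhs = self.parse_equality()
            value = int(bool(value) and bool(rhs))
        return value

    def parse_equality(self):
        value = self.parse_unary()
        while self.peek() in ('==', '!='):
            op = self.take()
            rhs = self.parse_unary()
            value = int(value == rhs) if op == '==' else int(value != rhs)
        return value

    def parse_unary(self):
        if self.peek() == '!':
            self.take()
            return int(not self.parse_unary())
        return self.parse_primary()

    def parse_primary(self):
        tok = self.take()
        if tok == '(':
            value = self.parse_or()
            self.expect_close()
            return value
        if tok == 'defined':
            name = self.take()
            if name == '(':
                name = self.take()
                self.expect_close()
            return int(name in self.defines)
        if not re.fullmatch(r'\w+', tok):
            raise ValueError('unexpected token: %r' % tok)
        if tok.isdigit():
            return int(tok)
        # Identifier: its integer value if defined, else 0 (assumption above).
        value = self.defines.get(tok, '0')
        return int(value) if value.isdigit() else 0

def evaluate_condition(expr, defines):
    parser = ConditionParser(tokenize(expr), defines)
    result = parser.parse_or()
    if parser.peek() is not None:
        raise ValueError('unexpected token: %r' % parser.peek())
    return bool(result)

condition = 'defined DEFINE_1 || defined DEFINE_2 && (DEFINE_3 == 10)'
print(evaluate_condition(condition, {'DEFINE_2': '1', 'DEFINE_3': '10'}))
print(evaluate_condition(condition, {'DEFINE_3': '10'}))
```

Because the parser only understands the operators listed, anything else in a condition raises a ValueError instead of being executed, which is the main point of not using eval().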

Licensed under: CC-BY-SA with attribution