Question

I have been trying to track down weird problems with my mod_wsgi/Python web application. I have the application handler which creates an object and calls a method:

def my_method(self, file):
    self.sapi.write("In my method for the %d time" % self.mmcount)
    self.mmcount += 1

    # ... open file (absolute path to file), extract list of files inside
    # ... exit if file contains no path/file strings
    for f in extracted_files:
        self.num_files_found += 1
        self.my_method(f)

At the start and end of this, I write

obj.num_files_found

to the browser.

So this is a recursive function that goes down a tree of file-references inside files. Any references in a file are printed and then those references are opened and examined and so on until all files are leaf-nodes containing no files. Why I am doing this isn't really important ... it is more of a pedantic example.

You would expect the output to be deterministic, such as:

Files found: 0
In my method for the 0 time
In my method for the 1 time
In my method for the 2 time
In my method for the 3 time
...
In my method for the n time
Files found: 128

And for the first few requests it is as expected. Then I get the following for as long as I keep refreshing:

Files found: 0
In my method for the 0 time
Files found: 128

This happens even though I know, from previous refreshes and with no code or file alterations, that it takes n calls to enumerate the 128 files.

So the question then: Does mod_wsgi/Python include internal optimizations that would stop complete execution? Does it guess the output is deterministic and cache?

As a note, in the refreshes when it is as expected, REMOTE_PORT increments by one each time ... when it uses a short output, the increment of REMOTE_PORT jumps wildly. Might be unrelated however.

I am new to Python, be gentle

Solved

Who knows what it was, but ripping out Apache, mod_python, mod_wsgi and nearly everything HTTP related and re-installing fixed the problem. Something was pretty broken but seems ok now :)


Solution

"Does mod_wsgi/Python include internal optimizations that would stop complete execution? Does it guess the output is deterministic and cache?"

No.

The problem is (generally) that you have a global variable somewhere in your program that is not getting reset the way you hoped it would.

Sometimes this can be unintentional, since Python checks local namespace and global namespace for variables.

You can -- inadvertently -- have a function that depends on some global variable. I'd bet on this.

What you're likely seeing is a number of mod_wsgi daemon processes, each with a global variable problem. The first request for each daemon works. Then your global variable is in a state that prevents work from happening. [File is left open, top-level directory variable got overwritten, who knows?]

After the first few, all the daemons are stuck in the "other" mode where they report the answer without doing the real work.
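As a rough illustration of that failure mode (not a diagnosis of the code in the question), here is a minimal sketch of module-level state surviving between requests in a long-lived process. The names `_seen`, `count_refs_leaky`, `count_refs_fixed` and `TREE` are hypothetical, and the dictionary stands in for files that reference other files:

```python
# Hypothetical illustration: module-level state outlives a single request
# when the same process handles many requests (as a mod_wsgi daemon does).

_seen = set()  # created once per process, NOT once per request


def count_refs_leaky(name, tree):
    """Count references below `name`, skipping anything in the global _seen."""
    if name in _seen:
        return 0  # on a later request the root is already "seen"
    _seen.add(name)
    return sum(1 + count_refs_leaky(child, tree) for child in tree.get(name, []))


def count_refs_fixed(name, tree, seen=None):
    """Same walk, but the bookkeeping is created fresh for each top-level call."""
    if seen is None:
        seen = set()
    if name in seen:
        return 0
    seen.add(name)
    return sum(1 + count_refs_fixed(child, tree, seen) for child in tree.get(name, []))


# A tiny in-memory stand-in for "files that reference other files".
TREE = {"root": ["a", "b"], "a": ["c"], "b": [], "c": []}

first_leaky = count_refs_leaky("root", TREE)   # first "request": full walk
second_leaky = count_refs_leaky("root", TREE)  # second "request": short-circuits
first_fixed = count_refs_fixed("root", TREE)
second_fixed = count_refs_fixed("root", TREE)

print(first_leaky, second_leaky, first_fixed, second_fixed)
```

The leaky version does the full walk once per process and then returns immediately. With several daemon processes, that would look like "works for the first few requests, then stops working".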

OTHER TIPS

Apache/mod_wsgi can run in multi-process and multi-threaded configurations, and this can trip up code written on the assumption that it runs in a single process, possibly a single-threaded one. For a discussion of the different configuration possibilities and what they mean for shared data, see:

http://code.google.com/p/modwsgi/wiki/ProcessesAndThreading
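One way to stay safe under either configuration is to keep per-request data in objects created inside the request, and to put a lock around anything that really is shared. The sketch below assumes a plain WSGI callable. `SharedCounter`, `RequestState` and `requests_seen` are hypothetical names, and the counter is shared only between the threads of one process (each daemon process has its own copy):

```python
import threading


class SharedCounter:
    """A process-wide counter that is safe to bump from several threads."""

    def __init__(self):
        self._lock = threading.Lock()
        self._value = 0

    def increment(self):
        # The lock makes the read-modify-write a single atomic step.
        with self._lock:
            self._value += 1
            return self._value


class RequestState:
    """Per-request counters; a new instance is created for every request."""

    def __init__(self):
        self.mmcount = 0
        self.num_files_found = 0


requests_seen = SharedCounter()  # deliberately shared, hence the lock


def application(environ, start_response):
    state = RequestState()          # fresh for each request, never shared
    n = requests_seen.increment()   # the only shared value, guarded by a lock
    body = ("request %d in this process, files found: %d\n"
            % (n, state.num_files_found)).encode("utf-8")
    start_response("200 OK", [("Content-Type", "text/plain"),
                              ("Content-Length", str(len(body)))])
    return [body]
```

Because `RequestState` is built inside `application`, nothing from an earlier request can leak into a later one, whichever process or thread handles it.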

It seems the Python/mod_wsgi installation must be broken. I have never seen such weird bugs. Traces next to returns:

self.sapi.write("Returning at line 22 for call %d"%self.times_called)
return someval

appear to happen numerous times:

Returning at line 22 for call 3

Returning at line 22 for call 3

Returning at line 22 for call 3

There is just no consistent logic in the control-flow of anything :( I am also pretty sure that I can write simple incrementing code to count the number of times a method is called. Absolute, frustrating, nonsense. I even put the epoch time next to every call to sapi.write() to make sure it wasn't mindlessly repeating code. They are unique :S

Time to rip-out Apache, Python, mod_wsgi and the rest and start again.


Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow