Frage

I'm parsing some XML using Python's Expat (by calling parser = xml.parsers.expat.ParserCreate() and then setting the relevant callbacks to my methods).

It seems that when Expat calls read(nbytes) to return new data, nbytes is always 2,048. I have quite a lot of XML to process, and suspect that these small read()s are making the overall process rather slow. As a point of reference, I'm seeing throughput around 9 MB/s on an Intel Xeon X5550, 2.67 GHz running Windows 7.

I've tried setting parser.buffer_text = True and parser.buffer_size = 65536, but Expat is still calling the read() method with an argument of just 2,048.

Is it possible to increase this?

War es hilfreich?

Lösung

You're talking about the xmlparse.ParseFile method, right?

Unfortunately, no, that value is hardcoded as BUF_SIZE = 2048 in pyexpat.c.

Lizenziert unter: CC-BY-SA mit Zuschreibung
Nicht verbunden mit StackOverflow
scroll top