Question

I'm parsing some XML using Python's Expat (by calling parser = xml.parsers.expat.ParserCreate() and then setting the relevant callbacks to my methods).

It seems that when Expat calls read(nbytes) to fetch more data, nbytes is always 2,048. I have quite a lot of XML to process, and I suspect that these small read()s are making the overall process rather slow. As a point of reference, I'm seeing throughput of around 9 MB/s on an Intel Xeon X5550 (2.67 GHz) running Windows 7.

I've tried setting parser.buffer_text = True and parser.buffer_size = 65536, but Expat is still calling the read() method with an argument of just 2,048.
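For reference, the behaviour described above can be observed with a small sketch like the following. RecordingReader, observe_read_sizes, and the sample document are hypothetical names made up for illustration; only ParserCreate, buffer_text, buffer_size, and ParseFile come from the setup in the question.

```python
import io
import xml.parsers.expat


class RecordingReader(io.BytesIO):
    """Hypothetical helper: remembers the size passed to every read() call."""

    def __init__(self, data):
        super().__init__(data)
        self.requested_sizes = []

    def read(self, size=-1):
        self.requested_sizes.append(size)
        return super().read(size)


def observe_read_sizes(xml_bytes):
    """Parse xml_bytes with ParseFile and return the set of sizes Expat asked for."""
    parser = xml.parsers.expat.ParserCreate()
    # The two settings tried in the question.
    parser.buffer_text = True
    parser.buffer_size = 65536
    source = RecordingReader(xml_bytes)
    parser.ParseFile(source)
    return set(source.requested_sizes)


# Made-up sample document, large enough to need several read() calls.
sample = b"<root>" + b"<item>x</item>" * 1000 + b"</root>"
print(observe_read_sizes(sample))
```

The printed set shows which read sizes were requested, regardless of the buffer_size setting.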

Is it possible to increase this?


Solution

You're talking about the xmlparser.ParseFile method, right?

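Unfortunately, no: that value is hardcoded as BUF_SIZE = 2048 in pyexpat.c. The buffer_text and buffer_size attributes only control how character data is buffered before it is passed to your handlers; they do not affect the size of the read() calls made by ParseFile.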
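One possible workaround, offered as a sketch rather than a tested fix, is to skip ParseFile and feed the parser yourself through xmlparser.Parse(data, isfinal), reading the input in whatever chunk size you like. The names parse_in_chunks, count_elements, and CHUNK_SIZE below are hypothetical, and the 1 MiB chunk size is an assumption. Whether this actually improves throughput depends on where the time is really going, so measure before relying on it.

```python
import io
import xml.parsers.expat

CHUNK_SIZE = 1 << 20  # assumption: 1 MiB per read; tune for your data


def parse_in_chunks(stream, parser, chunk_size=CHUNK_SIZE):
    """Feed a binary stream to an Expat parser in caller-chosen chunks."""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        # isfinal=False: more data may follow this chunk.
        parser.Parse(chunk, False)
    # An empty final call tells Expat that the document is complete.
    parser.Parse(b"", True)


def count_elements(stream):
    """Example handler setup: count start tags in the document."""
    count = 0

    def on_start(name, attrs):
        nonlocal count
        count += 1

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = on_start
    parse_in_chunks(stream, parser)
    return count


# Made-up sample document: one root element plus 1000 children.
sample = b"<root>" + b"<item>x</item>" * 1000 + b"</root>"
print(count_elements(io.BytesIO(sample)))  # 1001
```

With a real file, open it in binary mode ("rb") and pass the file object as stream.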

Licensed under: CC-BY-SA with attribution