Question

I have a simulation written in Python/NumPy/Cython. Since I need to average over many simulation runs, I use the multiprocessing module to run the individual runs in batches.

At the office I have an i7-920 workstation with Hyper-Threading (HT). At home I have an i5-560 without it. I thought I could run twice as many instances of the simulation in each batch at the office and cut my running time in half. Surprisingly, the run time of each individual instance doubled compared to the time it takes on my home workstation. That is, running 3 simulation instances in parallel at home takes, say, 8 minutes, while running 6 instances at the office takes about 15 minutes. Using 'cat /proc/cpuinfo' I verified 'siblings' = 8 and 'cpu cores' = 4, so HT is enabled.

I am not aware of any "conservation of total runtime" law (though from a scientific point of view it could be quite interesting :) ), and I am hoping someone here might shed some light on this conundrum.


Solution

Maybe the context switches produce more overhead, since there are 6 compute-heavy processes and only 4 real cores. If the processes compete for CPU resources, they may also use the CPU caches inefficiently.

If you run only 4 instances instead of 6, what's the result?
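One way to try this is to time the same batch with different worker counts. The following is only a minimal sketch: `run_simulation` is a hypothetical stand-in for the real simulation, and the counts 4 and 6 are taken from the question, so the numbers it prints say nothing about the actual workload.

```python
import multiprocessing as mp
import time


def run_simulation(seed):
    # Hypothetical stand-in for one CPU-bound simulation run.
    total = 0
    for i in range(200_000):
        total += (i * seed) % 7
    return total


def time_batch(n_workers, n_runs):
    """Run n_runs simulations on n_workers processes; return (seconds, results)."""
    start = time.perf_counter()
    with mp.Pool(processes=n_workers) as pool:
        results = pool.map(run_simulation, range(n_runs))
    return time.perf_counter() - start, results


if __name__ == "__main__":
    # Same total amount of work, different degree of parallelism.
    for n_workers in (4, 6):
        elapsed, _ = time_batch(n_workers, n_runs=24)
        print(f"{n_workers} workers: {elapsed:.2f} s for 24 runs")
```

What matters is the total time for a fixed number of runs, not the time per instance. If 6 workers are not clearly faster than 4, the extra hyperthreads are not helping this workload.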

Other tips

Hyperthreading may be good for some kinds of workload. Intense numeric computation is not one of them: when you want to do some number crunching, you'd better turn off hyperthreading. What hyperthreading gives you is "free context switching" between tasks, but the CPU has only so many execution units.

In this case, it can make things worse, because the OS can't know which processes are running on separate cores (where they'd get full performance) and which are on the same core, just on different "hyperthreads".

(Actually, I'd bet the Linux kernel provides a way to get fine control over that, but Python's multiprocessing module will just launch extra processes that use the default resource allocation.)
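For what it's worth, Linux does expose this kind of control through CPU affinity, and Python 3.3+ wraps it as `os.sched_setaffinity` (Linux only). The sketch below is one possible way to pin each pool worker to a different physical core. It assumes the sysfs file `thread_siblings_list` is readable, and the helper names (`parse_cpu_list`, `one_cpu_per_core`, `read_sibling_lists`, `pin_worker`) are made up for illustration.

```python
import glob
import os


def parse_cpu_list(text):
    """Parse a kernel CPU list such as '0,4' or '0-1' into a sorted list of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        elif part:
            cpus.add(int(part))
    return sorted(cpus)


def one_cpu_per_core(sibling_lists):
    """Keep the lowest-numbered logical CPU of every physical core."""
    return sorted({min(parse_cpu_list(s)) for s in sibling_lists})


def read_sibling_lists():
    """Read the hyperthread sibling list of each logical CPU from sysfs (Linux only)."""
    pattern = "/sys/devices/system/cpu/cpu[0-9]*/topology/thread_siblings_list"
    lists = []
    for path in glob.glob(pattern):
        with open(path) as f:
            lists.append(f.read())
    return lists


def pin_worker(cpu_queue):
    """Pool initializer: take one CPU id from the queue and pin this process to it."""
    os.sched_setaffinity(0, {cpu_queue.get()})


if __name__ == "__main__" and hasattr(os, "sched_setaffinity"):
    import multiprocessing as mp

    allowed = os.sched_getaffinity(0)  # CPUs this process may run on
    cores = [c for c in one_cpu_per_core(read_sibling_lists()) if c in allowed]
    print("one logical CPU per physical core:", cores)
    if cores:
        queue = mp.Queue()
        for cpu in cores:
            queue.put(cpu)
        # One worker per physical core; abs() is a placeholder for the simulation.
        with mp.Pool(len(cores), initializer=pin_worker, initargs=(queue,)) as pool:
            print(pool.map(abs, range(-4, 0)))
```

With one pinned worker per physical core, no two simulation processes share a core's execution units, which approximates turning HT off without touching the BIOS.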

Bottom line: turn HT off if you can. At least you will make full use of the 4 cores.

The others have pretty much given you insight into the problem. I just want to contribute by linking this article, which explains a bit more about how HT works and what the implications are for the performance of a multithreaded program: http://software.intel.com/en-us/articles/performance-insights-to-intel-hyper-threading-technology/

On my HP workstation (16 cores per CPU, which comes to 32 logical processors with hyper-threading), turning hyper-threading on even broke Python when I ran a numerical simulation. The error code was 0x000005. This puzzled me for a long time until I turned HT off, and then the simulation worked well! Maybe you could check and compare the run time with HT on and off.

License: CC-BY-SA with attribution
Not affiliated with StackOverflow