Question

I have an application that transfers data between 2 databases. Most of its operations are independent and run concurrently. The application used to run on a 4-core Intel machine and now needs to be ported to an AMD quad-core (4-core) machine. I have doubts about the two points below.

  1. I found that AMD does not support hyper-threading (HTT), which obviously means application performance (throughput) will degrade. Will performance degrade due to context switching? If so, will reducing the number of threads running concurrently help?

  2. Are any code changes required on my side to increase application throughput?


Solution 2

Java was designed to be hardware agnostic. You should not need to be concerned about which features the CPU provides.

BTW, the performance improvement from HTT has always been very limited in most benchmarks (5-10%).

Remember: Not every manufacturer has it and not every processor has it.

As far as the performance of your DB is concerned, you should think about maximizing parallelism and minimizing context switches.
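One way to act on that in Java is to run the independent transfer jobs on a bounded thread pool sized from the logical CPU count the JVM reports, rather than starting one thread per job. The following is only a minimal sketch: the class name TransferPool, the helpers poolSize and runAll, and the wait-to-compute ratio of 1.0 are all assumptions for illustration, not something taken from the question. The ratio in particular has to be measured for your own workload.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;

class TransferPool {
    // Hypothetical helper: choose a worker count from the logical CPU count,
    // scaled by an assumed ratio of time spent waiting (I/O) to computing.
    public static int poolSize(int logicalCpus, double waitToComputeRatio) {
        int n = (int) Math.round(logicalCpus * (1.0 + waitToComputeRatio));
        return Math.max(1, n);
    }

    // Runs independent tasks on a fixed-size pool and sums their results.
    public static int runAll(List<Callable<Integer>> tasks, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            int total = 0;
            for (Future<Integer> f : pool.invokeAll(tasks)) {
                total += f.get();
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int cpus = Runtime.getRuntime().availableProcessors();
        // Assumption: each job waits on the database about as long as it computes.
        int threads = poolSize(cpus, 1.0);
        List<Callable<Integer>> tasks = new ArrayList<>();
        for (int i = 0; i < 16; i++) {
            tasks.add(() -> 1); // placeholder for one independent transfer job
        }
        System.out.println("threads=" + threads + " completed=" + runAll(tasks, threads));
    }
}
```

With a fixed pool, extra jobs queue up instead of becoming extra threads, so the scheduler has fewer runnable threads to switch between than CPUs can absorb.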

OTHER TIPS

Instead of hyperthreading, AMD took an alternative route as of Bulldozer, called (by some) clustering.
As explained in the link MinGW brought, this means that a single AMD core can now sustain 2 integer "HW threads" (much like HT) + one dedicated floating-point one. Note that unlike HT, which shares all core resources between the HW threads, only the frontend (instruction fetch & decode) is shared in this scheme. The backend is duplicated, meaning that you should be able to get 2x the resource bandwidth of HT if you are backend-bound (execution is taking most of the time for you), and roughly the same as HT if you are frontend-bound (e.g. you have a complex control flow with multiple branches).

Notice the following quote saying pretty much the same:

All else being the same, it should give you more threaded performance than a single SMT (Hyper Threaded) core but less than two dedicated cores

So essentially each HW thread is now more than a single Intel HW thread, but less than a full Intel core. You can consider it either a super HW thread or a lame core, depending on your personal preference.

However, and this is a big "however", AMD used to cheat a bit here: they published core counts based on these "super" threads and not on the actual combos (newly dubbed "modules"). This means that a 4-core AMD machine actually has 2 modules with 4 super threads, and would therefore have the same HW thread count as a 2-core Intel machine with HT (although with stronger threads), but half the HW threads of a 4-core Intel machine with HT enabled. You did not specify which machine you intend to use, so make sure the core count has the right meaning.
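To make the counting above concrete, here is a small sketch of the arithmetic, plus a check of what the JVM itself sees. The class and method names are made up for illustration, and the factor of 2 (two threads per module, two HW threads per HT core) is just the figure used in this answer. Runtime.availableProcessors() reports logical processors available to the JVM, so on its own it does not tell you whether those are full cores, HT siblings, or module halves.

```java
class HwThreadCount {
    // Bulldozer-style counting as described above: each advertised "core"
    // is one HW thread, and two of them share a module.
    public static int amdModules(int advertisedCores) {
        return advertisedCores / 2;
    }

    public static int amdHwThreads(int advertisedCores) {
        return advertisedCores;
    }

    // Intel counting: two HW threads per core when HT is enabled.
    public static int intelHwThreads(int cores, boolean hyperThreading) {
        return hyperThreading ? cores * 2 : cores;
    }

    public static void main(String[] args) {
        System.out.println("4-core AMD: modules=" + amdModules(4)
                + " hwThreads=" + amdHwThreads(4));
        System.out.println("2-core Intel with HT: hwThreads=" + intelHwThreads(2, true));
        System.out.println("4-core Intel with HT: hwThreads=" + intelHwThreads(4, true));
        // Logical processors visible to this JVM on the current machine.
        System.out.println("JVM sees: " + Runtime.getRuntime().availableProcessors());
    }
}
```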

The performance may vary, as I said above. For execution-intensive workloads you may see similar results between a 4-core AMD and a 4-core Intel, since you have the same number of parallel pipelines and HT may not help Intel much (although "may" is used here in a very wide sense; a better comparison would take into account the sizes of the different buffers on each machine, the number of parallel ALUs and ports, issue width, etc.). On the other hand, on branchy or memory-intensive workloads, where you tend to get stuck a lot waiting for data or branch resolutions, Intel can pull in the extra 4 HW threads in parallel without any context-switching overhead, getting more work done.
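Since the outcome depends on the workload, the safest way to answer "will fewer threads help?" is to measure it on the target machine. Below is a crude sketch of such a sweep. Everything in it is an assumption for illustration: the class ThreadSweep, the busyWork stand-in (pure CPU work, unlike a real database transfer), and the job counts. It also does no JIT warm-up, so treat the numbers as rough.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;

class ThreadSweep {
    // Stand-in for one unit of work; replace with a real transfer job.
    public static long busyWork(int iterations) {
        long acc = 0;
        for (int i = 0; i < iterations; i++) {
            acc += (i * 31L) ^ (acc >>> 3);
        }
        return acc;
    }

    // Runs `jobs` copies of busyWork on `threads` workers and
    // returns the elapsed wall-clock time in nanoseconds.
    public static long timeRun(int threads, int jobs, int iterations) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<Long>> tasks = new ArrayList<>();
            for (int i = 0; i < jobs; i++) {
                tasks.add(() -> busyWork(iterations));
            }
            long start = System.nanoTime();
            for (Future<Long> f : pool.invokeAll(tasks)) {
                f.get(); // wait for every job to finish
            }
            return System.nanoTime() - start;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int cpus = Runtime.getRuntime().availableProcessors();
        // Try 1, 2, 4, ... threads up to twice the logical CPU count.
        for (int t = 1; t <= cpus * 2; t *= 2) {
            long ms = timeRun(t, 64, 2_000_000) / 1_000_000;
            System.out.println(t + " threads: " + ms + " ms");
        }
    }
}
```

If the time stops improving (or gets worse) beyond some thread count, that is roughly where adding threads only adds context switching for this kind of work.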

I personally think AMD chips are actually great value for multi-threading.

How Piledriver architecture works: http://www.anandtech.com/show/3863/a...t-chips-2010/4

How hyperthreading works: http://en.wikipedia.org/wiki/Hyper-threading

Licensed under: CC-BY-SA with attribution