Java memory allocation performance (SunOS vs Windows)

https://stackoverflow.com/questions/1346513

20-09-2019

Question

I have a very simple unit test that just allocates a lot of Strings:

import junit.framework.TestCase;

public class AllocationSpeedTest extends TestCase {

    public void testAllocation() throws Exception {

        for (int i = 0; i < 1000; i++) {
            long startTime = System.currentTimeMillis();
            String a = "dummy";
            for (int j = 0; j < 1000; j++) {
                a += "allocation driven";
            }
            System.out.println(i + ": " + (System.currentTimeMillis() - startTime) + "ms " + a.length());
        }

    }

}

On my Windows PC (Intel Core Duo, 2.2GHz, 2GB) this prints on average:

...
71: 47ms 17005
72: 47ms 17005
73: 46ms 17005
74: 47ms 17005
75: 47ms 17005
76: 47ms 17005
77: 47ms 17005
78: 47ms 17005
79: 47ms 17005
80: 62ms 17005
81: 47ms 17005
...

On SunOS (5.10 Generic_138888-03 sun4v sparc SUNW,SPARC-Enterprise-T5120):

...
786: 227ms 17005
787: 294ms 17005
788: 300ms 17005
789: 224ms 17005
790: 260ms 17005
791: 242ms 17005
792: 263ms 17005
793: 287ms 17005
794: 219ms 17005
795: 279ms 17005
796: 278ms 17005
797: 231ms 17005
798: 291ms 17005
799: 246ms 17005
800: 327ms 17005
...

The JDK version is 1.4.2_18 on both machines. The JVM parameters are the same on both:

-server -Xmx256m -Xms256m

Can anyone explain why the Sun super server is slower?

(http://www.sun.com/servers/coolthreads/t5120/performance.xml)

Solution

The CPU is indeed slower on SPARC (1.2GHz), and as one of Sun's engineers answered, the T2 is usually about 3 times slower than modern Intel processors for a single-threaded application. He also stated that in a multi-threaded environment SPARC should be faster.

I wrote a multi-threaded test using the GroboUtils library and tested both allocations (through concatenations) and simple calculations (a += j * j) to exercise the processor. I got the following results:

1 thread : Intel : Calculations test : 43ms
100 threads : Intel : Calculations test : 225ms

1 thread : Intel : Allocations test : 35ms
100 threads : Intel : Allocations test : 1754ms

1 thread : SPARC : Calculations test : 197ms
100 threads : SPARC : Calculations test : 261ms

1 thread : SPARC : Allocations test : 236ms
100 threads : SPARC : Allocations test : 1517ms

SPARC shows its power here, outperforming Intel on 100 threads.
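To make the comparison easier to read, the figures above can be turned into slowdown factors when going from 1 thread to 100 threads. This is only a minimal sketch: the class and method names are made up for illustration, and it assumes the eight numbers are directly comparable timings.

```java
// Slowdown factor from 1 thread to 100 threads, using the timings reported above.
// Class and method names are hypothetical, chosen only for this illustration.
class ThreadScaling {

    // How many times slower the 100-thread run is than the 1-thread run.
    static double slowdown(long oneThreadMs, long hundredThreadsMs) {
        return (double) hundredThreadsMs / oneThreadMs;
    }

    public static void main(String[] args) {
        System.out.println("Intel calculations: " + slowdown(43, 225));
        System.out.println("Intel allocations:  " + slowdown(35, 1754));
        System.out.println("SPARC calculations: " + slowdown(197, 261));
        System.out.println("SPARC allocations:  " + slowdown(236, 1517));
    }
}
```

With these inputs Intel slows down by roughly 5.2x (calculations) and 50x (allocations), while SPARC slows down by roughly 1.3x and 6.4x. In absolute terms SPARC is ahead at 100 threads in the allocation test (1517ms vs 1754ms), while Intel is still slightly ahead in the calculation test (225ms vs 261ms).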

Here is the multi-threaded calculation test:

import java.util.ArrayList;
import java.util.List;

import net.sourceforge.groboutils.junit.v1.MultiThreadedTestRunner;
import net.sourceforge.groboutils.junit.v1.TestRunnable;
import junit.framework.TestCase;

public class TM1_CalculationSpeedTest extends TestCase {

    public void testCalculation() throws Throwable {

        List threads = new ArrayList();
        for (int i = 0; i < 100; i++) {
            threads.add(new Requester());
        }
        MultiThreadedTestRunner mttr = new MultiThreadedTestRunner((TestRunnable[]) threads.toArray(new TestRunnable[threads.size()]));
        mttr.runTestRunnables(2 * 60 * 1000);

    }

    public class Requester extends TestRunnable {

        public void runTest() throws Exception {
            long startTime = System.currentTimeMillis();
            long a = 0;
            for (int j = 0; j < 10000000; j++) {
                a += j * j;
            }
            long endTime = System.currentTimeMillis();
            System.out.println(this + ": " + (endTime - startTime) + "ms " + a);
        }

    }

}

Here is the multi-threaded allocation test:

import java.util.ArrayList;
import java.util.List;

import junit.framework.TestCase;
import net.sourceforge.groboutils.junit.v1.MultiThreadedTestRunner;
import net.sourceforge.groboutils.junit.v1.TestRunnable;

public class TM2_AllocationSpeedTest extends TestCase {

    public void testAllocation() throws Throwable {

        List threads = new ArrayList();
        for (int i = 0; i < 100; i++) {
            threads.add(new Requester());   
        }
        MultiThreadedTestRunner mttr = new MultiThreadedTestRunner((TestRunnable[]) threads.toArray(new TestRunnable[threads.size()]));
        mttr.runTestRunnables(2 * 60 * 1000);

    }

    public class Requester extends TestRunnable {

        public void runTest() throws Exception {
            long startTime = System.currentTimeMillis();
            String a = "dummy";
            for (int j = 0; j < 1000; j++) {
                a += "allocation driven";
            }
            long endTime = System.currentTimeMillis();
            System.out.println(this + ": " + (endTime - startTime) + "ms " + a.length());
        }

    }

}
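For readers without GroboUtils or JUnit at hand, roughly the same allocation test can be sketched with plain java.lang.Thread. This is not the code used for the results above. The class name, the helper methods, and the choice to collect the timings in an array are assumptions made for this illustration.

```java
// Dependency-free sketch of the 100-thread allocation test (not the original GroboUtils code).
class PlainThreadAllocationTest {

    // Same workload as Requester.runTest(): 1000 String concatenations.
    static int runAllocation() {
        String a = "dummy";
        for (int j = 0; j < 1000; j++) {
            a += "allocation driven";
        }
        return a.length();
    }

    // Starts threadCount threads and returns each thread's elapsed time in ms.
    static long[] runThreads(int threadCount) throws InterruptedException {
        final long[] elapsed = new long[threadCount];
        Thread[] threads = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++) {
            final int index = i;
            threads[i] = new Thread(new Runnable() {
                public void run() {
                    long start = System.currentTimeMillis();
                    runAllocation();
                    elapsed[index] = System.currentTimeMillis() - start;
                }
            });
        }
        for (int i = 0; i < threadCount; i++) {
            threads[i].start();
        }
        for (int i = 0; i < threadCount; i++) {
            threads[i].join(); // join() makes each thread's write to elapsed[] visible here
        }
        return elapsed;
    }

    public static void main(String[] args) throws InterruptedException {
        long[] elapsed = runThreads(100);
        for (int i = 0; i < elapsed.length; i++) {
            System.out.println("thread " + i + ": " + elapsed[i] + "ms");
        }
    }
}
```

Printing happens only after all threads have finished, so console output does not interleave with the timed work.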

Other answers

It's my understanding that UltraSPARC T2-based machines are aimed at performance per watt rather than raw performance. You might try dividing the allocation time by the power consumption and see what kind of numbers you get. :)

Is there a reason you're running 1.4.2 rather than 1.6?

The SunOS hardware is slower, and the VM may be somewhat slower as well.

I don't think this is measuring memory allocation. For a start, there is an awful lot of character copying going on in a += "allocation driven";. But I suspect that the real bottleneck is in getting the output from System.out.println(...) through the network layers from the application on the Sun server to your remote workstation.
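To put a number on the character copying, here is a rough estimate for the question's inner loop. It assumes each a += s builds a new String containing the old contents plus s. The exact number of internal copies depends on the JDK version, and the class and method names are made up for this sketch.

```java
// Rough estimate of the characters copied by the question's inner loop.
// Assumption: every `a += s` produces a new String holding old a plus s.
class ConcatCopyEstimate {

    // Total characters written into result strings over `iterations` appends.
    static long charsCopied(int initialLength, int appendLength, int iterations) {
        long total = 0;
        long length = initialLength;
        for (int i = 0; i < iterations; i++) {
            length += appendLength; // length of the newly built String
            total += length;        // every character of it has to be written
        }
        return total;
    }

    public static void main(String[] args) {
        int initial = "dummy".length();            // 5
        int append = "allocation driven".length(); // 17
        System.out.println("final length: " + (initial + 1000 * append));
        System.out.println("chars copied: " + charsCopied(initial, append, 1000));
    }
}
```

Under that assumption, building the final 17005-character string involves about 8.5 million character copies per outer iteration, which is why the loop measures copying at least as much as allocation.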

As an experiment, try multiplying the inner loop count by 10 and by 100, and see if that "speeds up" the Sun server relative to your workstation.
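The experiment could be sketched as follows. The class name, the method names, and the default multipliers are assumptions for this illustration. Only x1 and x10 run by default, because each += copies the whole string, so the work grows roughly with the square of the loop count and x100 takes far longer.

```java
// Sketch of the "multiply the inner loop count" experiment (names are hypothetical).
class InnerLoopScaling {

    // The question's inner loop with a configurable iteration count.
    static int concatenate(int innerCount) {
        String a = "dummy";
        for (int j = 0; j < innerCount; j++) {
            a += "allocation driven";
        }
        return a.length();
    }

    // Elapsed milliseconds for one run of the inner loop.
    static long timeRun(int innerCount) {
        long start = System.currentTimeMillis();
        concatenate(innerCount);
        return System.currentTimeMillis() - start;
    }

    public static void main(String[] args) {
        int[] multipliers = {1, 10}; // add 100 for the full experiment; expect a long run
        for (int m = 0; m < multipliers.length; m++) {
            int innerCount = 1000 * multipliers[m];
            long elapsed = timeRun(innerCount);
            System.out.println("x" + multipliers[m] + ": " + elapsed + "ms");
        }
    }
}
```

Running this on both machines and comparing the SPARC/Intel ratio at each multiplier shows whether the gap shrinks as the per-iteration work grows, which would point to a fixed overhead rather than slow allocation.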

Another thing you could try is to move the inner loop into a separate procedure. Since you are doing all of the work in a single invocation of main, it is possible that the JIT compiler never gets a chance to compile it.
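A minimal sketch of that suggestion is shown below. The class name, the method name, and the warm-up and measurement counts are arbitrary assumptions. The idea is only that a small method called many times gives the JIT compiler a unit it can compile before the timed runs start.

```java
// Inner loop moved into its own method, with untimed warm-up calls first.
// Names and iteration counts are hypothetical, picked for illustration.
class SeparateMethodTiming {

    // The question's inner loop as a separate, repeatedly invoked method.
    static String concatenate(int count) {
        String a = "dummy";
        for (int j = 0; j < count; j++) {
            a += "allocation driven";
        }
        return a;
    }

    public static void main(String[] args) {
        // Warm-up: give the JIT a chance to compile concatenate() before timing.
        for (int i = 0; i < 200; i++) {
            concatenate(1000);
        }

        int runs = 100;
        int length = 0;
        long start = System.currentTimeMillis();
        for (int i = 0; i < runs; i++) {
            length = concatenate(1000).length();
        }
        long elapsed = System.currentTimeMillis() - start;

        // A single println after the timed region keeps console I/O out of the loop.
        System.out.println(runs + " runs: " + elapsed + "ms total, length " + length);
    }
}
```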

(Artificial "micro-benchmarks" like this are always susceptible to effects like these. I tend to distrust them.)

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow