Alphabetical order in Chinese - java.text.Collator

Question 1

Why is it different? Because there are several different methods of sorting ideographic characters, or even entire words. The ones that stuck in my mind are:

by number of strokes
by using a Latin transliteration and then ordering it "naturally" (according to rules specific to the Chinese language, of course)

There are other methods as well; for example, Unicode Technical Standard #35 mentions some of them (more by coincidence than on purpose), but you would need plenty of time to go through it.

To answer your question about why these sort orders differ: Java contains its own collation rules and does not rely on the operating system's (as Excel does), so the two sets of rules may differ. You might also want to try ICU, which is the source of the classes and rules in Java (and is usually a step ahead of the JDK).
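To see that a Collator applies its own rules rather than plain character-code order, here is a minimal sketch that sorts the same words both ways. The class name, method names, and sample words are made up for illustration; the output depends on the collation data shipped with your JDK, so none is shown.

```java
import java.text.Collator;
import java.util.Arrays;
import java.util.Locale;

public class CollatorVsCodeUnits {
    // Sorts with String.compareTo, i.e. by UTF-16 code unit values.
    static String[] sortByCodeUnits(String[] words) {
        String[] sorted = words.clone();
        Arrays.sort(sorted);
        return sorted;
    }

    // Sorts with the collation rules the JDK ships for the given locale.
    static String[] sortByLocale(String[] words, Locale locale) {
        String[] sorted = words.clone();
        Arrays.sort(sorted, Collator.getInstance(locale));
        return sorted;
    }

    public static void main(String[] args) {
        // Hypothetical sample input
        String[] words = {"馍", "啊", "只", "波", "了"};
        System.out.println("code units: " + String.join("<", sortByCodeUnits(words)));
        System.out.println("zh_CN:      " + String.join("<", sortByLocale(words, Locale.CHINA)));
    }
}
```

If the two lines printed differ, the difference comes from the JDK's bundled rules for zh_CN, not from the operating system.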

Question 2

There isn't a Collator in Java 6 or 7 that will sort Chinese in the same order as the first sample. The following program tries every available locale against both orderings:

import java.text.Collator;
import java.util.Arrays;
import java.util.Locale;

public class FindCollatorLocale {
    public static void main(String... args) {
        // First sample: 馍 sorts between 了 and 呢
        String text1 = "啊<波<词<的<俄<佛<歌<和<及<课<了<馍<呢<票<气<日<四<特<瓦<喜<以<只";
        findLocaleForSortedOrder(text1);
        // Second sample: 馍 sorts last
        String text2 = "啊<波<词<的<俄<佛<歌<和<及<课<了<呢<票<气<日<四<特<瓦<喜<以<只<馍";
        findLocaleForSortedOrder(text2);
    }

    // Prints every locale whose Collator leaves the given order unchanged.
    private static void findLocaleForSortedOrder(String text) {
        System.out.println("For " + text + " found...");
        String[] preSorted = text.split("<");
        for (Locale locale : Collator.getAvailableLocales()) {
            String[] sorted = preSorted.clone();
            Arrays.sort(sorted, Collator.getInstance(locale));
            if (Arrays.equals(preSorted, sorted)) {
                System.out.println("Locale " + locale + " has the same sorted order");
            }
        }
        System.out.println();
    }
}

prints

For 啊<波<词<的<俄<佛<歌<和<及<课<了<馍<呢<票<气<日<四<特<瓦<喜<以<只 found...

For 啊<波<词<的<俄<佛<歌<和<及<课<了<呢<票<气<日<四<特<瓦<喜<以<只<馍 found...
Locale zh_CN has the same sorted order
Locale zh has the same sorted order
Locale zh_SG has the same sorted order
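
If you need the order of the first sample anyway, one possible workaround is to define it yourself with java.text.RuleBasedCollator. This is only a sketch: the class and method names are made up, and it assumes that the characters to be sorted are exactly the ones listed in the rule string. A leading "<" is enough to turn the sample into a rule string.

```java
import java.text.Collator;
import java.text.ParseException;
import java.text.RuleBasedCollator;
import java.util.Arrays;

public class CustomChineseOrder {
    // Desired order, copied from the first sample.
    static final String ORDER = "啊<波<词<的<俄<佛<歌<和<及<课<了<馍<呢<票<气<日<四<特<瓦<喜<以<只";

    // Builds a collator whose primary order follows ORDER.
    static Collator customCollator() {
        try {
            return new RuleBasedCollator("<" + ORDER);
        } catch (ParseException e) {
            // Only happens if the rule string above is malformed
            throw new IllegalStateException("Invalid collation rules", e);
        }
    }

    // Returns a sorted copy; the input array is left untouched.
    static String[] sortWithCustomOrder(String[] words) {
        String[] sorted = words.clone();
        Arrays.sort(sorted, customCollator());
        return sorted;
    }

    public static void main(String[] args) {
        // Hypothetical sample input
        String[] words = {"馍", "只", "啊", "呢", "了"};
        System.out.println(String.join("<", sortWithCustomOrder(words)));
    }
}
```

With these rules 馍 should land between 了 and 呢, as in the first sample. How characters outside the rule string are ordered is not covered by this sketch.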