Question

I have applied the simulated annealing algorithm (in Java) to a personal project that I am working on, but I was wondering whether the SA algorithm would perform slightly better on the same dataset if it were written in another language (e.g. C++, Python, etc.).

The dummy dataset that I have created consists of five thousand users' names, postcodes and dates of birth.

The SA algorithm is applied to the dataset in order to extract many different pieces of information.

At present, my most recent test attempts to get the SA algorithm to detect all users whose birth dates fall within one week of each other (in any given year). The SA algorithm works very well; however, as I am a perfectionist, I would like to achieve slightly faster results, and I want to know whether anybody has had good experiences with SA producing excellent results on a similar-sized dataset in other languages.

At the moment, the SA algorithm takes just under five seconds to perform a successful search.
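
For reference, the annealing loop itself follows the standard accept/reject scheme. The sketch below is a stripped-down, generic version rather than my actual project code; the energy and neighbour functions, the cooling schedule and the class name are just placeholders:

import java.util.Random;
import java.util.function.ToDoubleFunction;
import java.util.function.UnaryOperator;

public class Annealer<S> {
    private final Random rand = new Random();

    // Minimises the given energy function by repeatedly proposing a neighbouring
    // state and accepting worse states with a probability that shrinks as the
    // temperature cools (geometric cooling schedule).
    public S anneal(S initial, ToDoubleFunction<S> energy, UnaryOperator<S> neighbour,
                    double startTemp, double coolingRate, int iterations) {
        S current = initial, best = initial;
        double currentEnergy = energy.applyAsDouble(current);
        double bestEnergy = currentEnergy;
        double temp = startTemp;
        for (int i = 0; i < iterations && temp > 1e-9; i++) {
            S candidate = neighbour.apply(current);
            double delta = energy.applyAsDouble(candidate) - currentEnergy;
            // Always accept improvements; accept worse moves with probability exp(-delta / temp).
            if (delta < 0 || rand.nextDouble() < Math.exp(-delta / temp)) {
                current = candidate;
                currentEnergy += delta;
                if (currentEnergy < bestEnergy) {
                    best = current;
                    bestEnergy = currentEnergy;
                }
            }
            temp *= coolingRate;
        }
        return best;
    }
}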


Solution

I would write it in Java:

import java.util.ArrayList;
import java.util.List;
import java.util.Random;

public class UserSearchMain {
    static class Person {
        int id;
        int dateOfBirth;

        static Person[] generatePeople(int num) {
            Random rand = new Random();
            Person[] people = new Person[num];
            for (int i = 0; i < num; i++) {
                Person p = new Person();
                p.id = i;
                p.dateOfBirth = rand.nextInt(80 * 365); // any day within an 80-year range
                people[i] = p;
            }
            return people;
        }
    }

    public static void main(String... args) {
        Person[] people = Person.generatePeople(5000);
        List<List<Person>> peopleByDOB = new ArrayList<List<Person>>();
        for (Person person : people) {
            int dob = person.dateOfBirth;
            while (peopleByDOB.size() <= dob) peopleByDOB.add(new ArrayList<Person>());
            peopleByDOB.get(dob).add(person);
        }

        Random rand = new Random();

        int warmup = 12 * 1000;
        int runs = 1000 * 1000;
        long start = 0;

        for (int i = -warmup; i < runs; i++) {
            if (i == 0) start = System.nanoTime(); // start timing only after the warm-up iterations
            int dayToSearch = rand.nextInt(80 * 365);
            for (int j = Math.max(0, dayToSearch - 7); j <= dayToSearch + 7 && j < peopleByDOB.size(); j++) {
                List<Person> peopleWithCloseDOB = peopleByDOB.get(j); // bucket of users born on day j
            }
        }
        long time = System.nanoTime() - start;
        System.out.printf("Each search took an average of %,d ns%n", time / runs);
    }
}

prints

Each search took an average of 85 ns

Often the algorithm you use is more important than the choice of language.
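
To make that concrete: once the people are bucketed by day of birth, collecting everyone born within a week of a given day is just a walk over at most 15 adjacent buckets. Here is a minimal sketch reusing the Person and peopleByDOB structures from the code above (the helper name findPeopleWithinWeek is mine):

static List<Person> findPeopleWithinWeek(List<List<Person>> peopleByDOB, int dayOfBirth) {
    List<Person> matches = new ArrayList<Person>();
    int from = Math.max(0, dayOfBirth - 7);
    int to = Math.min(peopleByDOB.size() - 1, dayOfBirth + 7);
    for (int day = from; day <= to; day++) {
        matches.addAll(peopleByDOB.get(day)); // each bucket holds everyone born on that day
    }
    return matches;
}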

EDIT: In answer to your original question, could you make this faster in C++ with the same algorithm? I would guess yes, but not by a lot.


To speed it up further, you could use multiple threads:

// additional imports needed: java.util.concurrent.ExecutionException, ExecutorService, Executors, Future
public static void main(String... args) throws ExecutionException, InterruptedException {
    Person[] people = Person.generatePeople(5000);
    final List<List<Person>> peopleByDOB = new ArrayList<List<Person>>();
    for (Person person : people) {
        int dob = person.dateOfBirth;
        while (peopleByDOB.size() <= dob) peopleByDOB.add(new ArrayList<Person>());
        peopleByDOB.get(dob).add(person);
    }

    final int runs = 10 * 1000 * 1000;
    long start = System.nanoTime();

    int processors = Runtime.getRuntime().availableProcessors();
    ExecutorService service = Executors.newFixedThreadPool(processors);
    List<Future<?>> futures = new ArrayList<Future<?>>();
    for (int i = 0; i < processors; i++) {
        futures.add(service.submit(new Runnable() {
            @Override
            public void run() {
                Random rand = new Random(); // each thread uses its own Random to avoid contention
                for (int i = 0; i < runs; i++) {
                    int dayToSearch = rand.nextInt(80 * 365);
                    for (int j = Math.max(0, dayToSearch - 7); j <= dayToSearch + 7 && j < peopleByDOB.size(); j++) {
                        List<Person> peopleWithCloseDOB = peopleByDOB.get(j); // bucket of users born on day j
                    }
                }
            }
        }));
    }
    for (Future<?> future : futures) {
        future.get();
    }
    service.shutdown();
    double timeSec = (System.nanoTime() - start) / 1e9;
    double rate = processors * runs / timeSec / 1e6;
    System.out.printf("The search throughput was %.1f million per second%n", rate);
}

prints

The search throughput was 32.4 million per second

which implies an average of about 31 ns per search (1 second / 32.4 million searches ≈ 31 ns).
