TL;DR: comparing the speed of two code variants at just one input size is meaningless; comparing their empirical orders of growth truly reflects the algorithmic nature of the code, and will be consistent across different test platforms for the same range of test input sizes. Comparing absolute speed values is only meaningful for code variants which exhibit the same asymptotic, or at least the same local, growth behaviour.
It is not enough to measure the speed of your two implementations at just one input size. Usually several data points are needed, to assess the empirical orders of growth of our code's run time (because the code can be run at varying input sizes). An empirical order of growth is found as the logarithm of the ratio of run times, taken in base of the ratio of input sizes.
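That is, for two measurements (n1, t1) and (n2, t2) it is log(t2/t1) / log(n2/n1). A minimal sketch in C (the function name here is mine, just for illustration):

```c
#include <math.h>

/* empirical order of growth between two measurements (n1, t1) and (n2, t2):
   the logarithm of the run-time ratio, taken in base of the input-size ratio */
double order_of_growth(double n1, double t1, double n2, double t2)
{
    return log(t2 / t1) / log(n2 / n1);
}
```

For example, if doubling the input size doubles the run time, this gives log 2 / log 2 = 1.0, i.e. ~ n^1.0, linear growth; if the run time quadruples, it gives ~ n^2.0.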
So even if at some input size code_1 runs 10 times faster than code_2, if its run time doubles with each doubling of the input size whereas for code_2 it only grows 1.1x, then very soon code_2 will become much, much faster than code_1.
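To see how quickly that happens, here is a tiny hypothetical simulation (the 10x head start and the 2.0x and 1.1x per-doubling factors are taken from the paragraph above; everything else is my own illustration):

```c
#include <stdio.h>

/* hypothetical: code_1 starts out 10x faster, but its run time doubles with
   each doubling of the input size, while code_2's only grows 1.1x per doubling */
int main(void)
{
    double t1 = 1.0, t2 = 10.0;          /* relative run times at the base input size */
    for (int k = 1; k <= 10; ++k) {      /* k doublings => input size is 2^k times bigger */
        t1 *= 2.0;
        t2 *= 1.1;
        printf("doublings=%2d  t1=%8.2f  t2=%8.2f  %s\n",
               k, t1, t2, t1 > t2 ? "code_2 is now faster" : "");
    }
    return 0;
}
```

With these numbers code_2 already overtakes code_1 after four doublings, i.e. at only 16 times the original input size.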
So the real measure of an algorithm's efficiency is its run-time complexity (and its space, i.e. memory, complexity). And when we measure it empirically, we only measure it for the particular code at hand (and for a particular range of input sizes), not for the algorithm itself, i.e. not for its ideal implementation.
In particular, the theoretical complexity of trial division is O(n^1.5 / (log n)^0.5), in n primes produced, usually seen as a ~ n^1.40..1.45 empirical order of growth (though it can be ~ n^1.3 initially, for smaller input sizes). For the sieve of Eratosthenes it is O(n log n log (log n)), usually seen as ~ n^1.1..1.2. But there certainly are sub-optimal implementations of both trial division and the sieve of Eratosthenes that run at ~ n^2.0 or worse.
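As a rough sketch of such a measurement, the C program below produces the first n primes by simple trial division (dividing by every number up to the square root, so not quite the optimal variant that divides by primes only), times it at three sizes, and prints the empirical orders of growth between consecutive measurements. The sizes and names are my own choices; enlarge the sizes if the timings come out too noisy on your machine:

```c
#include <stdio.h>
#include <math.h>
#include <time.h>

/* n-th prime by simple trial division: divide each candidate by 2, 3, 4, ...
   up to its square root */
static int nth_prime(int n)
{
    int count = 0, candidate = 1;
    while (count < n) {
        ++candidate;
        int is_prime = 1;
        for (int d = 2; (long)d * d <= candidate; ++d)
            if (candidate % d == 0) { is_prime = 0; break; }
        count += is_prime;
    }
    return candidate;
}

int main(void)
{
    int sizes[] = { 20000, 40000, 80000 };   /* arbitrary test sizes */
    double times[3];

    for (int i = 0; i < 3; ++i) {
        clock_t start = clock();
        int p = nth_prime(sizes[i]);
        times[i] = (double)(clock() - start) / CLOCKS_PER_SEC;
        printf("n = %6d   nth prime = %8d   t = %.4f s\n", sizes[i], p, times[i]);
    }
    /* empirical order of growth: log of run-time ratio, in base of size ratio */
    for (int i = 1; i < 3; ++i)
        printf("%d -> %d : ~ n^%.2f\n", sizes[i - 1], sizes[i],
               log(times[i] / times[i - 1]) / log((double)sizes[i] / sizes[i - 1]));
    return 0;
}
```

Note that nothing is printed inside the timed region, so only the computation itself is being measured.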
So no, this proves nothing. One data point is meaningless; at least three are needed to get a "big picture", i.e. to be able to predict with some certainty the run time / space needed for bigger input sizes.
Prediction with known certainty is what the scientific method is all about.
BTW your run times are very long. The calculation of 10,000 primes should be nearly instantaneous, much less than 1/100th of a second for a C program run on a fast box. Perhaps you're measuring printing time as well. Don't. :)