I assume there is a way to compare an element x
from the first list with an element y
from the second list in a way to find out if they refer to the same element, beyond the gang point.
- Compare
x1
from the first list againsty1
andy2
, the first two elements of the second list. - Compare
y1
withx1
andx2
. - Compare
x2
withy2 ... y4
. - Compare
y2
withx2 ... x4
. - Compare
x4
withy4 ... y8
. - ...
In general:
for q in 1, 2, 4, 8, 16, ...
Compare x_q with y_q ... y_2q
Compare y_q with x_q ... x_2q
Basically, in each step you double the search range. Let's say m <= n
. Then at some point there is m<q
and n-m<q
. Now you will find a match x_q==y_(q+n-m)
. This match is somewhere beyond the gang point, from there you just have to go back to the gang point. This is O(n+m)
.
Edit OK, the go back part doesn't work with a singly linked list. However, this is a minor technical issue, because at this point we know the difference n-m
and can start over and search again from the beginning, comparing x_k
with y_(k+n-m)
for all k=1,2,3,...
until we find a match.
Edit2 The Complexity, a rough overview of the proof. For each q
we do 2*q
comparisons. Let's say we find a match for q=q1
. Then summing over all q
up to q1
gives 4*q1
comparisons altogether, sum of the geometric series. Now we just need to prove q1=O(n+m)
. We know there is no match up to q=q1/2
, this means either m>q1/2
or n>2*q1/2
, because otherwise there would be a match. Together this can be writen as n+m>q1/2
or q1<2*(n+m)
. For the number of comparisons we get 4*q1<8*(n+m)
, which is well within O(n+m)
. Here we always assume n>=m
, without loss of generality. Finally, we have to add the go back part, but this is linear as well, so the total algorithm is O(n+m)
because each part is.