Question

In my algorithms class, we have to turn in algorithms for removing duplicates from a list of integers and shoot for the lowest possible complexity. In my algorithm, when I see a duplicate integer, I shift every element after that integer down one index in order to delete the duplicated element, using a for loop like so:

for(int i=dup_index; i<arr_size-1; i++)
{
  arr[i] = arr[i+1];
}

Would it be more efficient for my algorithm to use memmove? Furthermore if it's my job to design the algorithm, and assuming memmove lowers the complexity of my algorithm, could using memmove be seen as 'cheating'?


Solution

I don't know about cheating, but memmove basically does what your loop does, only more efficiently.

Besides, it's one of the most basic utility functions, so I don't see why you should not use it.

As for complexity, the order of the algorithm will not change; it will simply be faster. memmove is typically implemented in hand-tuned assembly and takes advantage of alignment to copy word by word instead of byte by byte.
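For instance, the shifting loop from the question can be expressed as a single memmove call. A minimal sketch, assuming arr, arr_size, and dup_index have the same meaning as in the question (the wrapper function name is my own):

```c
#include <string.h>

/* Shift every element after dup_index down by one slot, overwriting
 * the duplicate.  memmove (not memcpy) is required here because the
 * source and destination regions overlap. */
void remove_at(int arr[], int arr_size, int dup_index)
{
    memmove(&arr[dup_index], &arr[dup_index + 1],
            (size_t)(arr_size - dup_index - 1) * sizeof arr[0]);
}
```

After the call, the last slot still holds its old value; the caller is expected to treat the array as one element shorter.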

EDIT:

Well, okay: there might be cases where the manual copy is a couple of instructions shorter than the call to memmove, but if you are moving data around in memory you are already doing an inherently costly operation, so shaving off a couple of CPU cycles will not make any difference in the big picture.

If your design involves in-place moves of performance-critical data, you would be better off changing the underlying data structure to something that avoids copies altogether (a list, tree, hash table, whatever).
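For instance, removing an element from a singly linked list involves no copying at all. A minimal sketch (the node layout and function name are my own illustration, not part of the answer):

```c
#include <stdlib.h>

struct node {
    int value;
    struct node *next;
};

/* Unlink and free the node following 'prev' in O(1).
 * No element is copied or shifted, which is the advantage of a
 * linked structure over an array for this operation. */
void remove_after(struct node *prev)
{
    struct node *victim = prev->next;
    if (victim != NULL) {
        prev->next = victim->next;
        free(victim);
    }
}
```

The trade-off is that finding the duplicate in the first place still costs a linear scan, and lists lose the cache friendliness of a contiguous array.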

OTHER TIPS

Your program might run faster using memmove as far as wall clock time, but your for-loop basically achieves the same thing. The algorithm complexity won't change.

Functionally, memmove does exactly what your loop does. However, the compiler and/or C runtime may implement memmove using more efficient machine code. Some architectures have specialized instructions that can perform the whole move in a single instruction, and the compiler may also inline the call.

It does not change the complexity of the algorithm, it just gets it done faster.

As to whether it would be considered cheating in your assignment - that is a question for your professor, surely? No one here can tell you that.

If there are multiple duplicates, then you can use an auxiliary array to store the non-duplicates:

int aux[arr_size], cnt = 0;   /* variable-length array of the same size */

for (int i = 0; i < arr_size; i++) {
    if (arr[i] != dup) {      /* dup is the value being removed */
        aux[cnt++] = arr[i];  /* cnt ends up as the new size */
    }
}

If you need it in place:

int i = dup_index;
int j = dup_index + 1;
int dup = arr[i];

while (j < arr_size) {
    if (arr[j] != dup) {
        arr[i++] = arr[j];  /* keep non-duplicates, compacting in place */
    }
    j++;
}
/* i is now the new logical size of the array */
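Wrapped in a function, the in-place compaction above can be exercised like this (a sketch; the function name is my own, and i is returned as the new size):

```c
/* Remove the element at dup_index and every later occurrence of the
 * same value, compacting the array in place.  Returns the new size. */
int remove_dups_from(int arr[], int arr_size, int dup_index)
{
    int i = dup_index;
    int dup = arr[dup_index];
    for (int j = dup_index + 1; j < arr_size; j++) {
        if (arr[j] != dup)
            arr[i++] = arr[j];
    }
    return i;
}
```

Note that dup_index should point at the second (duplicated) occurrence, so the first occurrence before it survives.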

Library functions (like memmove(3)) are carefully optimized. If you look at how your favorite compiler writes that out into assembly language for various lengths (given as constants), you'll probably get a huge surprise.

That said, copying $n$ bytes from one place to another will take $\Theta(n)$ time, unless you know something about the data that lets you avoid more than a constant fraction of the shuffling around.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow