Question

rmdup :: [Int] -> [Int]
rmdup [] = []
rmdup (x:xs) | x `elem` xs = rmdup xs
             | otherwise = x: rmdup xs

The code above removes duplicate from a list of Integer but it removes the first occurrence and keeps the second one. For instance:

rmdup [1,2,3,1,4]

will result:

[2,3,1,4]

How can I change it to keep the order and yield this: [1,2,3,4]? Note, I don't want to use built-in functions.

Was it helpful?

Solution

How about the following? This avoids the crazily inefficient acc ++ [x] and also to reverse the given list twice:

rmdup :: Eq a => [a] => [a]
rmdup xs = rmdup' [] xs
  where
    rmdup' acc [] = []
    rmdup' acc (x:xs)
      | x `elem` acc = rmdup' acc xs
      | otherwise    = x : rmdup' (x:acc) xs

OTHER TIPS

One way to achieve what you want is to pass the input list in the reverse order and once when the computation is finished then reverse the result again. Although, this solution is not efficient.

rmdup :: [Int] -> [Int]
rmdup xs = reverse $ rmdup' (reverse xs)
  where
    rmdup' [] = []
    rmdup' (x:xs) | x `elem` xs = rmdup' xs
                  | otherwise = x: rmdup' xs

Demo:

ghci> rmdup [1,2,3,1,4]
[1,2,3,4]

You want to ignore those later occurrences of an element if you saw it before, then you need to record what you have seen, looks like foldl or foldl' is what you are looking for.

Here is a possible implementation:

import Data.List (foldl')

rmdup :: (Eq a) => [a] -> [a]
rmdup = foldl' step []
    where step acc x
            | x `elem` acc = acc
            | otherwise = acc++[x]

Since elem is O(n), the solutions based on using it to check each element are O(n^2). The "standard" efficient solution to the duplicates problem is to sort the list before checking for duplicates. Here, since we need to preserve elements, we have to be a bit more careful.

import Data.List
import Data.Ord

rmdupSorted :: Eq b => [(a,b)] -> [(a,b)]
rmdupSorted (x@(_,xb):xs@((_,yb):_)) | xb == yb  = rmdupSorted xs
                                     | otherwise = x : rmdupSorted xs
rmdupSorted xs = xs      -- 0 or 1 elements

rmdup :: Ord a => [a] -> [a]
rmdup = map snd . sort . rmdupSorted . sortBy (comparing snd) . zip [0..]

main = print $ rmdup [1,2,3,4,5,4,6,1,7]

Assuming that the sortBy function is a stable sort, the rmdup function will remove all the duplicate occurrences of any element but for the one occurring last. If sortBy is not stable, then rmdup will remove all the occurrences but for an unspecified one (i.e., rmdup [1,2,1] could return [1,2] instead of [2,1].).

Complexity is now O(n log n).

We now need to rewrite the above without library functions, as the OP requested. I will leave this as an exercise to the reader. :-P

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top