Question

I want a Bag container which hides its 'real' order from its clients.

It also must be fully polymorphic, that is shouldn't require any constraints over its element type.

I found at least three implementations of bags: Bag module from ghc package, Data.Bag from bag and Math.Combinatorics.Multiset from multiset-comb.

However, they all have toList and fold* operations which expose the internal order of elements which may depend on implementation details or the order of bag construction.

toList is impossible, at least with type Bag a -> [a]. However, folding does not always expose the order.

For example, fold (+) 0 does not expose.

The question is, how should I design the folding interface? Is there a necessary and sufficient condition for safety of the a -> a -> a folding function? As fmap does not expose the order, is there a loss of genericity from folding with a -> b -> b?

I'm thinking of commutative monoids - they seem sufficient, but I'm not sure if associativity and identity element are necessary.

Was it helpful?

Solution

An identity is probably necessary if your bags can be empty - you have to return something in that case, and if you want your fold to be a homomorphism (so that combining the results of folding some bags is the same as folding a bag made by combining the bags, which is a pretty natural property to expect), it must be an identity element.

Associativity, likewise, is a good idea. Suppose I have a type and operation like so:

data T a = Zero | One a | Two (T a) (T a)
  deriving (Eq, Ord, Show)

(+-+) :: Ord a => T a -> T a -> T a
Zero +-+ x = x
x +-+ Zero = x
a +-+ b = Two (min a b) (max a b)

Clearly (+-+) is commutative and has an identity, but is non-associative. Suppose I then implement a bag as a list:

newtype Bag a = Bag [a]

-- pre-condition: f is commutative and z is an identity for it
foldBag :: (a -> a -> a) -> a -> Bag a -> a
foldBag f z (Bag xs) = foldr f z xs

foldNonAssoc :: (Ord a) => Bag (T a) -> T a
foldNonAssoc = foldBag (+-+) Zero

Even if I demand the stated precondition, I can still use my foldNonAssoc to distinguish between Bag [One 1,One 2,One 3], which will fold to Two (One 1) (Two (One 2) (One 3)) and Bag [One 3,One 2,One 1], which will fold to Two (One 3) (Two (One 1) (One 2)) (notice that not all of the structure is preserved, but on a long list I'll get the entire list order back except for the ordering of the last two elements).

A priori, if you combine all your items with an operation, you'll have a tree of applications, something like a +-+ (b +-+ (c +-+ d)). Commutativity will let you do some rearrangement, but no matter what you do, c will always be combined with d. So if you want that to be the same as (a +-+ c) +-+ (b +-+ d), you really need associativity, too.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top