سؤال

How do I achieve parallel filtering in F#? Basically I want to do Array.Parallel.Choose except with a sequence expression to create an Array.

I tried:

Async.Parallel [for x in 1..40 do yield async { if x % 5 = 0 then return x }]

but this is a type mismatch since the if doesn't always have a value (i.e. Async<unit> isn't an Async<'a>).

I'm iterating over a large set of numbers (1,000,000,000 +) so I don't want to generate the sequence upfront.

The actual check in the if statement is:

let isPalindrome (x : int) = let numberArray = x.ToString().ToCharArray()
                             numberArray = Array.rev numberArray

Tried using PSeq:

[for x in 990000..999999 do for y in 990000..999999 do yield (x, y, x*y)]
|> PSeq.filter(fun (x, y, z) -> isPalindrome z)
|> Seq.max

this results in an OutOfMemoryException.

Ugly workaround:

let numberArray = [|990000..999999|]
let result = numberArray |> Array.Parallel.collect(fun x -> [| for y in numberArray do if isPalindrome (x*y) then yield (x, y, x*y)|])
             |> Array.maxBy(fun (x, y, z) -> z)
هل كانت مفيدة؟

المحلول

let isPalindrome (x : int) = let numberArray = x.ToString().ToCharArray()
                             numberArray = Array.rev numberArray

I figured out how to use the PSeq library to do lazy filtering on a sequence:

let generator =
    seq {
        for x in 990000..999999 do 
            for y in 990000..999999 do 
                yield (x, y, x*y) 
    }

generator
|> PSeq.filter(fun (x, y, z) -> isPalindrome z)
|> Seq.max

Real: 00:00:28.938, CPU: 00:01:49.078, GC gen0: 3448, gen1: 2, gen2: 0

Sadly, this was slower than my ugly hack:

let numberArray = [|990000..999999|]
let result = numberArray |> Array.Parallel.collect(fun x -> [| for y in numberArray do if isPalindrome (x*y) then yield (x, y, x*y)|])
             |> Array.maxBy(fun (x, y, z) -> z)

Real: 00:00:24.779, CPU: 00:01:32.109, GC gen0: 2680, gen1: 18, gen2: 1

نصائح أخرى

A much more efficient solution based on the implementation of Array.Choose:

 open System.Collections.Concurrent
 open System.Threading.Tasks
 open System


 let parallelFilter f (array: 'T[]) = 

            let inputLength = array.Length                      

            let isChosen : bool [] = Array.zeroCreate inputLength                
            let mutable outputLength = 0        
            let range = Partitioner.Create(0,inputLength)
            Parallel.ForEach(
                         range,
                         (fun () ->0),
                         (fun (start,finish) _ count -> 
                            let mutable localCount = 0
                            for i in start .. finish-1 do
                                match f array.[i] with 
                                | true -> () 
                                | false -> 
                                    isChosen.[i] <- true                                      
                                    localCount <- localCount+1
                            localCount),
                          Action<int> (fun x -> System.Threading.Interlocked.Add(&outputLength,x) |> ignore )
                          ) |> ignore         

            let output = Array.zeroCreate outputLength
            let mutable curr = 0
            for i = 0 to isChosen.Length-1 do 
                if isChosen.[i] then 
                    output.[curr] <- output.[i]
                    curr <- curr + 1
            output
مرخصة بموجب: CC-BY-SA مع الإسناد
لا تنتمي إلى StackOverflow
scroll top