Question

TPL Dataflow provides a TransformBlock for transforming input, e.g.:

var tb = new TransformBlock<int, int>(i => i * 2);

Is it possible to not output some of the input, e.g. if the input fails some validation test?

var tb = new TransformBlock<InputType, OutputType>(i =>
{
    if (!ValidateInput(i))
    {
        // Do something to not output anything for this input
    }
    // Normal output
});

If that is not possible, what would be the best pattern to achieve that end?
Something like the following?

BufferBlock<OutputType> output = new BufferBlock<OutputType>();

var ab = new ActionBlock<InputType>(i =>
{
    if (ValidateInput(i)) 
    {
        output.Post(MyTransform(i));
    }
});

Solution

There are several ways to do this:

  1. Use TransformManyBlock as Jon suggested and return a collection containing 1 or 0 items.
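  2. Use TransformBlock with some special value representing “no value” (e.g. null) and then use LinkTo() with a filter so that only real values reach the next block. You also have to link the TransformBlock to a null target block (DataflowBlock.NullTarget<T>()) without a filter, after the filtered link, to drain the special values. Otherwise an unconsumed special value stays at the head of the output queue and blocks everything behind it.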
  3. I would consider this something of a hack, but you can also use the Task-based constructor of TransformBlock: use Task.FromResult() when you want to return something and null when you don't. For example:

    new TransformBlock<int, int>(i => i % 2 == 0 ? Task.FromResult(i * 2) : null)
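
Option 2 can be sketched roughly as follows. This is only an illustration under some assumptions: `NullMarkerDemo` and `RunAsync` are made-up names, "even numbers are valid" stands in for a real validation rule, and `int?` is used so that null can serve as the "no value" marker.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

static class NullMarkerDemo
{
    // Hypothetical demo: doubles even inputs and drops odd ones.
    public static async Task<List<int>> RunAsync(IEnumerable<int> inputs)
    {
        // null is the "no value" marker for inputs that fail validation.
        var tb = new TransformBlock<int, int?>(i => i % 2 == 0 ? i * 2 : (int?)null);

        var results = new List<int>();
        // ActionBlock handles one message at a time by default, so no lock is needed.
        var sink = new ActionBlock<int?>(v => results.Add(v.Value));

        // Filtered link first: only real values reach the sink.
        tb.LinkTo(sink, new DataflowLinkOptions { PropagateCompletion = true }, v => v.HasValue);
        // Unfiltered link second: drains the markers the first link declined.
        tb.LinkTo(DataflowBlock.NullTarget<int?>());

        foreach (var i in inputs) tb.Post(i);
        tb.Complete();
        await sink.Completion;
        return results;
    }

    static async Task Main()
    {
        var results = await RunAsync(new[] { 1, 2, 3, 4, 5, 6 });
        Console.WriteLine(string.Join(", ", results)); // prints "4, 8, 12"
    }
}
```

The order of the two LinkTo() calls matters: a source offers each message to its links in the order they were created, so the unfiltered NullTarget link has to come last.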
    

OTHER TIPS

I haven't used Dataflow myself, but I think you could use a TransformManyBlock, and just make each step return either an empty collection or a single item.

var tmb = new TransformManyBlock<InputType, OutputType>(i =>
{
    if (!ValidateInput(i))
    {
        return Enumerable.Empty<OutputType>();
    }
    ...
    // Or return new[] { outputValue };
    return Enumerable.Repeat(outputValue, 1);
});

You could even potentially generalize this to a FilterBlock<T> which just has a filter predicate, and passes appropriate matches through (just like Where in LINQ). You could initially implement this using TransformManyBlock as above, but then make it more efficient later.
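
A first, unoptimized version of such a filter might look like the sketch below. `FilterBlock.Create` is a hypothetical helper, not something TPL Dataflow ships; it simply wraps a TransformManyBlock that emits either the item itself or nothing. `FilterBlockDemo` is likewise only there to show usage.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

static class FilterBlock
{
    // Hypothetical helper: passes through only the items matching the predicate.
    public static IPropagatorBlock<T, T> Create<T>(Predicate<T> predicate)
    {
        return new TransformManyBlock<T, T>(
            item => predicate(item) ? new[] { item } : Enumerable.Empty<T>());
    }
}

static class FilterBlockDemo
{
    public static async Task<List<int>> RunAsync(IEnumerable<int> inputs)
    {
        var filter = FilterBlock.Create<int>(i => i % 2 == 0);

        var seen = new List<int>();
        var sink = new ActionBlock<int>(i => seen.Add(i));
        filter.LinkTo(sink, new DataflowLinkOptions { PropagateCompletion = true });

        foreach (var i in inputs) filter.Post(i);
        filter.Complete();
        await sink.Completion;
        return seen;
    }

    static async Task Main()
    {
        var seen = await RunAsync(Enumerable.Range(1, 6));
        Console.WriteLine(string.Join(", ", seen)); // prints "2, 4, 6"
    }
}
```

Returning IPropagatorBlock<T, T> rather than the concrete block type leaves room to swap in a more efficient implementation later without touching callers.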

This is a fairly old question, but I want to add some experience here: you can put a BufferBlock in front of your data instead of using an ActionBlock, and use the LinkTo extension method with a predicate, so that valid values proceed to the TransformBlock and invalid ones do not. To discard the invalid ones you can link to a NullTarget block, which simply ignores the data it receives. The final code could look like this:

var input = new BufferBlock<int>();
var tb = new TransformBlock<int, int>(i => i * 2);
var output = new BufferBlock<int>();

// valid integers will pass to the transform
input.LinkTo(tb, i => ValidateInput(i));

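// everything the first link declined (the invalid values) is discarded;
// this unfiltered link must be added after the filtered one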
input.LinkTo(DataflowBlock.NullTarget<int>());

// transformed data will come to the output
tb.LinkTo(output);

Linking can also be adjusted by passing a DataflowLinkOptions instance to another LinkTo overload.
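
As one example of such options, PropagateCompletion lets completion flow down the pipeline so the caller can await the last block. The sketch below is an assumption-laden variant of the code above: `LinkOptionsDemo` and `RunAsync` are made-up names, `i > 0` stands in for ValidateInput, and the output BufferBlock is replaced by an ActionBlock that collects results.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

static class LinkOptionsDemo
{
    public static async Task<List<int>> RunAsync(IEnumerable<int> inputs)
    {
        var input = new BufferBlock<int>();
        var tb = new TransformBlock<int, int>(i => i * 2);
        var results = new List<int>();
        var output = new ActionBlock<int>(i => results.Add(i));

        // Completing a source also completes the linked target once it is drained.
        var options = new DataflowLinkOptions { PropagateCompletion = true };

        input.LinkTo(tb, options, i => i > 0);          // stand-in for ValidateInput
        input.LinkTo(DataflowBlock.NullTarget<int>());  // discards the rest
        tb.LinkTo(output, options);

        foreach (var i in inputs) input.Post(i);
        input.Complete();
        await output.Completion; // finishes after every valid item was handled
        return results;
    }

    static async Task Main()
    {
        var results = await RunAsync(new[] { -1, 1, 2, -3, 4 });
        Console.WriteLine(string.Join(", ", results)); // prints "2, 4, 8"
    }
}
```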

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow