Skip Item in Dataflow TransformBlock
23-07-2021
Question
TPL Dataflow provides a TransformBlock for transforming input, e.g.:
var tb = new TransformBlock<int, int>(i => i * 2);
Is it possible to not output some of the input, e.g. if the input fails some validation test?
var tb = new TransformBlock<InputType, OutputType>(i =>
{
    if (!ValidateInput(i))
    {
        // Do something to not output anything for this input
    }
    // Normal output
});
If that is not possible, what would be the best pattern to achieve that end?
Something like the following?
BufferBlock<OutputType> output = new BufferBlock<OutputType>();
var ab = new ActionBlock<InputType>(i =>
{
    if (ValidateInput(i))
    {
        output.Post(MyTransform(i));
    }
});
Solution
There are several ways to do this:
- Use TransformManyBlock, as Jon suggested, and return a collection containing 1 or 0 items.
- Use TransformBlock with some special value representing “no value” (e.g. null), and then use a LinkTo() with a filter to remove those. You also have to link the TransformBlock to a null block (DataflowBlock.NullTarget<T>()) without a filter, to drain the special values.
- I would consider this something of a hack, but you can also use the Task-based constructor of TransformBlock: use Task.FromResult() when you want to return something and null when you don't. For example:
new TransformBlock<int, int>(i => i % 2 == 0 ? Task.FromResult(i * 2) : null)
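The second option (a special "no value" plus a filtered link) has no code in the answer, so here is a minimal sketch of how it might look. It assumes int inputs, uses int? so that null can serve as the special value, and the class name, the ValidateInput rule, and the List-based sink are hypothetical, chosen only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

public static class SentinelFilterSketch
{
    // Hypothetical validation rule, for illustration only: accept even numbers.
    private static bool ValidateInput(int i) => i % 2 == 0;

    public static async Task<List<int>> RunAsync(IEnumerable<int> inputs)
    {
        // int? lets null stand in for "no value".
        var tb = new TransformBlock<int, int?>(
            i => ValidateInput(i) ? i * 2 : (int?)null);

        // Hypothetical sink that collects the real values.
        var results = new List<int>();
        var sink = new ActionBlock<int?>(v => results.Add(v.Value));

        // Filtered link: only real values reach the sink.
        tb.LinkTo(sink, v => v.HasValue);
        // Unfiltered link, added second: drains the null values,
        // which would otherwise stay at the head of tb's output.
        tb.LinkTo(DataflowBlock.NullTarget<int?>());

        foreach (var i in inputs)
        {
            tb.Post(i);
        }

        // tb completes once all of its output has been handed to a target.
        tb.Complete();
        await tb.Completion;

        sink.Complete();
        await sink.Completion;
        return results;
    }
}
```

The order of the two LinkTo calls matters here: targets are offered messages in the order they were linked, so the unfiltered NullTarget link has to come after the filtered one.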
OTHER TIPS
I haven't used DataFlow myself, but I think you could use a TransformManyBlock, and just make each step return either an empty collection or a single item.
var tmb = new TransformManyBlock<InputType, OutputType>(i =>
{
    if (!ValidateInput(i))
    {
        return Enumerable.Empty<OutputType>();
    }
    ...
    // Or return new[] { outputValue };
    return Enumerable.Repeat(outputValue, 1);
});
You could even potentially generalize this to a FilterBlock<T> which just has a filter predicate, and passes appropriate matches through (just like Where in LINQ). You could initially implement this using TransformManyBlock as above, but then make it more efficient later.
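As a rough sketch of that FilterBlock<T> idea, the simple TransformManyBlock-based version might look like the following. There is no FilterBlock<T> in TPL Dataflow; the FilterBlockSketch class, its Create factory method, and the DemoAsync helper are hypothetical names used only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

public static class FilterBlockSketch
{
    // Hypothetical factory: returns a block that passes through only the
    // items matching the predicate, much like Where in LINQ.
    public static IPropagatorBlock<T, T> Create<T>(Predicate<T> predicate)
    {
        return new TransformManyBlock<T, T>(item =>
        {
            if (predicate(item))
            {
                // One-element sequence: the item is passed through.
                return (IEnumerable<T>)new[] { item };
            }
            // Empty sequence: nothing is output for this item.
            return Enumerable.Empty<T>();
        });
    }

    // Small demo: keep the even numbers from 1..6.
    public static async Task<List<int>> DemoAsync()
    {
        var filter = Create<int>(i => i % 2 == 0);
        var results = new List<int>();
        var sink = new ActionBlock<int>(v => results.Add(v));

        filter.LinkTo(sink, new DataflowLinkOptions { PropagateCompletion = true });

        for (var i = 1; i <= 6; i++)
        {
            filter.Post(i);
        }

        filter.Complete();
        await sink.Completion;
        return results; // 2, 4, 6
    }
}
```

Returning IPropagatorBlock<T, T> rather than the concrete TransformManyBlock leaves room to swap in a more efficient implementation later without changing callers.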
This is a fairly old question, but I want to add some experience here: you can introduce a BufferBlock instead of an ActionBlock for your data, and use the LinkTo extension method with a condition predicate, so the valid values proceed to the TransformBlock and the invalid ones are not passed to it. To discard the invalid ones, you can use a NullTarget block, which simply ignores the data it receives. So the final code could look like this:
var input = new BufferBlock<int>();
var tb = new TransformBlock<int, int>(i => i * 2);
var output = new BufferBlock<int>();
// valid integers will pass to the transform
input.LinkTo(tb, i => ValidateInput(i));
// invalid integers are discarded (this unfiltered link must be added after the filtered one)
input.LinkTo(DataflowBlock.NullTarget<int>());
// transformed data will come to the output
tb.LinkTo(output);
Linking can also be adjusted by passing DataflowLinkOptions to another LinkTo overload.
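For instance, here is a sketch of the same pipeline using the LinkTo overloads that take DataflowLinkOptions, with PropagateCompletion set so that completing the input block flows downstream. The class name, the ValidateInput rule, and the RunAsync wrapper are assumptions made for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

public static class LinkOptionsSketch
{
    // Hypothetical validation rule, for illustration only: accept non-negative values.
    private static bool ValidateInput(int i) => i >= 0;

    public static async Task<IList<int>> RunAsync(IEnumerable<int> values)
    {
        var input = new BufferBlock<int>();
        var tb = new TransformBlock<int, int>(i => i * 2);
        var output = new BufferBlock<int>();

        // Completion of a source is passed on to the linked target.
        var options = new DataflowLinkOptions { PropagateCompletion = true };

        // valid integers pass to the transform
        input.LinkTo(tb, options, i => ValidateInput(i));
        // invalid integers are discarded
        input.LinkTo(DataflowBlock.NullTarget<int>());
        // transformed data arrives in the output
        tb.LinkTo(output, options);

        foreach (var v in values)
        {
            input.Post(v);
        }

        // Completing the input is enough; tb is completed through the link.
        input.Complete();
        await tb.Completion;

        // output is a BufferBlock, so it only completes once it has been emptied;
        // read whatever it holds instead of awaiting its Completion.
        return output.TryReceiveAll(out var items) ? items : new List<int>();
    }
}
```

Without PropagateCompletion, each block in the chain would have to be completed by hand after the previous one finished.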