How to write a PowerShell advanced function that works with both piped-in objects and objects passed as a parameter value?

StackOverflow https://stackoverflow.com/questions/20886024

Question

I'm writing a function, Chunk-Object, that chunks an array of objects into sub-arrays. For example, if I pass it the array @(1, 2, 3, 4, 5) and specify 2 elements per chunk, it returns 3 arrays: @(1, 2), @(3, 4) and @(5). The user can also provide an optional scriptblock parameter to process each element before it is chunked. My current code is:

function Chunk-Object()
{
    [CmdletBinding()]
    Param(
        [Parameter(Mandatory = $true,
                   ValueFromPipeline = $true,
                   ValueFromPipelineByPropertyName = $true)] [object[]] $InputObject,
        [Parameter()] [scriptblock] $Process,
        [Parameter()] [int] $ElementsPerChunk
    )

    Begin {
        $cache = @();
        $index = 0;
    }

    Process {
        foreach($o in $InputObject) {
            $current_element = $o;
            if($Process) {
                $current_element = & $Process $current_element;
            }
            if($cache.Length -eq $ElementsPerChunk) {
                ,$cache;
                $cache = @($current_element);
                $index = 1;
            }
            else {
                $cache += $current_element;
                $index++;
            }
        }
    }

    End {
        if($cache) {
            ,$cache;
        }
    }
}


(Chunk-Object -InputObject (echo 1 2 3 4 5 6 7) -Process {$_ + 100} -ElementsPerChunk 3)
Write-Host "------------------------------------------------"
(echo 1 2 3 4 5 6 7 | Chunk-Object -Process {$_ + 100} -ElementsPerChunk 3)

The result is:

PS C:\Users\a> C:\Untitled5.ps1
100
100
100
100
100
100
100
------------------------------------------------
101
102
103
104
105
106
107

PS C:\Users\a> 

As you can see, it works with piped-in objects, but not with values passed through the parameter. How can I modify the code so that it works in both cases?

Solution

The difference is that when you pipe the array into Chunk-Object, the function executes the process block once for each element in the array passed as a sequence of pipeline objects, whereas when you pass the array as an argument to the -InputObject parameter, the process block executes once for the entire array, which is assigned as a whole to $InputObject.

So, let's take a look at your pipeline version of the command:

echo 1 2 3 4 5 6 7 | Chunk-Object -Process {$_ + 100} -ElementsPerChunk 3

The reason this one works is that for each iteration of the pipeline, $_ is set to the value of the current array element in the pipeline, which is also assigned to the $InputObject variable (as a single-element array, due to the [object[]] typecast). The foreach loop is actually extraneous in this case, because the $InputObject array always has a single element for each invocation of the process block. You could remove the loop and change $current_element = $o to $current_element = $InputObject, and you'd get the exact same results.

Now, let's examine the version that passes an array argument to -InputObject:

Chunk-Object -InputObject (echo 1 2 3 4 5 6 7) -Process {$_ + 100} -ElementsPerChunk 3

The reason this doesn't work is that the scriptblock you're passing to the -Process parameter contains $_, but the foreach loop assigns each element to $o, and $_ isn't defined anywhere. All elements in the results are 100 because each iteration sets $current_element to the results of the scriptblock {$_ + 100}, which always evaluates to 100 when $_ is null. To prove this out, try changing $_ in the scriptblock to $o, and you'll get the expected results:
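
The effect is easy to reproduce outside the function. A minimal sketch, assuming it runs at script level where $_ has no value (the variable names are illustrative):

```powershell
$sb = { $_ + 100 }

# foreach assigns each element to $o; nothing sets $_, so $null + 100 = 100.
$fromForeach = foreach ($o in 1, 2, 3) { & $sb }

# ForEach-Object sets $_ to the current element, and the scriptblock sees it.
$fromPipeline = 1, 2, 3 | ForEach-Object { & $sb }

"foreach:        $($fromForeach -join ', ')"    # 100, 100, 100
"ForEach-Object: $($fromPipeline -join ', ')"   # 101, 102, 103
```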

Chunk-Object -InputObject (echo 1 2 3 4 5 6 7) -Process {$o + 100} -ElementsPerChunk 3

If you want to be able to use $_ in the scriptblock, change the foreach loop to a pipeline by replacing foreach($o in $InputObject) { with $InputObject | %{ and $current_element = $o with $current_element = $_. That way both versions work, because the Chunk-Object function uses a pipeline internally, so $_ is set sequentially to each element of the array, regardless of whether the process block is invoked multiple times for a series of individual array elements passed in as pipeline input, or just once for a multiple-element array.


UPDATE:

I looked at this again and noticed that in the line

$current_element = & $Process $current_element;

you appear to be trying to pass $current_element as an argument to the scriptblock in $Process. This doesn't work because parameters passed to a scriptblock work largely the same as in functions. If you invoke MyFunction 'foo', then 'foo' isn't automatically assigned to $_ within the function; likewise, & {$_ + 100} 'foo' doesn't set $_ to 'foo'. Change your scriptblock argument to {$args[0] + 100}, and you'll get the expected results with or without passing in pipeline input:

Chunk-Object -InputObject (echo 1 2 3 4 5 6 7) -Process {$args[0] + 100} -ElementsPerChunk 3
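
To see the difference between arguments and $_ in isolation, here is a small sketch with illustrative values. A param() block is another standard way to name the argument:

```powershell
# Arguments passed with & land in $args (or in declared parameters), not in $_.
$viaArgs  = & { $args[0] + 100 } 1          # 101
$viaParam = & { param($x) $x + 100 } 1      # 101

# $_ is only set for pipeline input that reaches a process block.
$viaPipe  = 1 | & { process { $_ + 100 } }  # 101
```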

Note that this version of the scriptblock argument works even if you keep the foreach loop. The reason to prefer ForEach-Object ($InputObject | %{) here is that it sets $_, so callers can write either $_ or $args[0] in the scriptblock. It is not a speed optimization: for an array that is already in memory, the foreach statement is generally the faster of the two.
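
Pulling these points together, here is one possible sketch of the function. It is not the asker's code: the name Split-Chunk, the List-based cache, and the default chunk size are assumptions made for illustration. The inner pipeline is what lets -Process use $_ in both calling styles.

```powershell
function Split-Chunk {
    [CmdletBinding()]
    param(
        [Parameter(Mandatory = $true, ValueFromPipeline = $true)]
        [object[]] $InputObject,

        [scriptblock] $Process,

        [ValidateRange(1, 2147483647)]
        [int] $ElementsPerChunk = 2   # assumed default, not from the question
    )

    begin {
        $cache = New-Object 'System.Collections.Generic.List[object]'
    }

    process {
        # Runs once per piped element, or once for the whole array when
        # -InputObject is used; the inner pipeline handles both cases.
        $InputObject | ForEach-Object {
            $item = $_
            if ($Process) {
                # ForEach-Object sets $_ inside the caller's scriptblock.
                $item = $_ | ForEach-Object -Process $Process
            }
            $cache.Add($item)
            if ($cache.Count -eq $ElementsPerChunk) {
                , $cache.ToArray()   # unary comma emits the chunk as one object
                $cache.Clear()
            }
        }
    }

    end {
        # Emit the final, possibly shorter, chunk.
        if ($cache.Count -gt 0) {
            , $cache.ToArray()
        }
    }
}

$fromParam = Split-Chunk -InputObject (1..7) -Process { $_ + 100 } -ElementsPerChunk 3
$fromPipe  = 1..7 | Split-Chunk -Process { $_ + 100 } -ElementsPerChunk 3
$fromParam | ForEach-Object { $_ -join ' ' }
```

Both calls should yield three chunks: 101 102 103, then 104 105 106, then 107.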

OTHER TIPS

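The issue isn't technically the parameter attributes. It's partly how you write the arguments, and mostly how you process them.

Problem: (echo 1 2 3 4 5 6 7) is an indirect way to build the input. It does produce an array of seven integers rather than the string "1 2 3 4 5 6 7" (echo is an alias for Write-Output, which emits each argument separately), but an array literal states the intent more clearly.

Solution: use an array: @(1, 2, 3, 4, 5, 6, 7)

Problem: You are using a foreach statement. It iterates with a named loop variable ($o) and never sets $_, which your -Process scriptblock relies on.

Solution: Use ForEach-Object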

Process {
    $InputObject | Foreach-Object {
        ...
    }
}

foreach($foo in $bar) gathers all items first, then iterates over them. $list | ForEach-Object { ... } processes each item separately, which lets the pipeline continue, and it sets $_ to the current item.

Note: If the input is actually a string, you will also have to split the string and convert each element to an integer. Alternatively, change the parameter type to an integer type if that is what you expect.

Final answer:

function Chunk-Object()
{
    [CmdletBinding()]
    Param(
        [Parameter(Mandatory = $true,
                   ValueFromPipeline = $true,
                   ValueFromPipelineByPropertyName = $true)] [object[]] $InputObject,
        [Parameter()] [scriptblock] $Process,
        [Parameter()] [int] $ElementsPerChunk
    )

    Begin {
        $cache = @();
        $index = 0;
    }

    Process {
        $InputObject | ForEach-Object {
            $current_element = $_;
            if($Process) {
                $current_element = & $Process $current_element;
            }
            if($cache.Length -eq $ElementsPerChunk) {
                ,$cache;
                $cache = @($current_element);
                $index = 1;
            }
            else {
                $cache += $current_element;
                $index++;
            }
        }
    }

    End {
        # Test the count, not the array itself: a one-element array such as @(0) is falsy.
        if($cache.Count -gt 0) {
            ,$cache;
        }
    }
}


Set-PSDebug -Off
Write-Host "Input Object is array"
Chunk-Object -InputObject @(1, 2, 3, 4, 5, 6, 7) -Process {$_ + 100} -ElementsPerChunk 3
Write-Host "------------------------------------------------"
Write-Host "Input Object is on pipeline"
@(1, 2, 3, 4, 5, 6, 7) | Chunk-Object -Process {$_ + 100} -ElementsPerChunk 3
Write-Host "------------------------------------------------"
Write-Host "Input object is string"
(echo "1 2 3 4 5 6 7")  | Chunk-Object -Process {$_ + 100} -ElementsPerChunk 3
Write-Host "------------------------------------------------"
Write-Host "Input object is split string"
(echo "1 2 3 4 5 6 7") -split ' ' | Chunk-Object -Process {$_ + 100} -ElementsPerChunk 3
Write-Host "------------------------------------------------"
Write-Host "Input object is int[] converted from split string"
([int[]]("1 2 3 4 5 6 7" -split ' '))  | Chunk-Object -Process {$_ + 100} -ElementsPerChunk 3
Write-Host "------------------------------------------------"
Write-Host "Input object is split and converted"
(echo "1 2 3 4 5 6 7") -split ' ' | Chunk-Object -Process {[int]$_ + 100} -ElementsPerChunk 3

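PowerShell automatically unwraps (enumerates) arrays that are piped in and sends their elements down the pipeline one at a time, whereas an array passed as a parameter value is bound as a whole. Hence the difference in behavior.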

Consider the following code:

function Test {
    [CmdletBinding()]
    param (
        [Parameter(ValueFromPipeline = $true)]
        [Object[]] $InputObject
    )

    process {
        $InputObject.Count;
    }
}

# This example shows how the single array is passed
# in, containing 4 items.
Test -InputObject (1,2,3,4);

# Result: 4

# This example shows how PowerShell unwraps the
# array and treats each object individually.
1,2,3,4 | Test;

# Result: 1,1,1,1

With this in mind, we have to treat the input differently, depending on how it's being passed in.

function Test {
    [CmdletBinding()]
    param (
        [Parameter(ValueFromPipeline = $true)]
        [Object[]] $InputObject
        , [ScriptBlock] $Process
    )

    process {
        if ($InputObject.Count -gt 1) {
            foreach ($Object in $InputObject) {
                Invoke-Command -ScriptBlock $Process -ArgumentList $Object;
            }
        }
        else {
            Invoke-Command -ScriptBlock $Process -ArgumentList $InputObject;
        }
    }
}

Test -InputObject (1,2,3,4) -Process { $args[0] + 100 };

Write-Host -Object '-----------------';

1,2,3,4 | Test -Process { $args[0] + 100; };

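If you want the user to be able to use $_ instead of $args[0], one way is to pipe each object into the scriptblock. In that case you'll have to make sure that the user of the function includes a process { ... } block inside their ScriptBlock, because without one the body runs as an end block, where $_ is not set. See the following example.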

function Test {
    [CmdletBinding()]
    param (
        [Parameter(ValueFromPipeline = $true)]
        [Object[]] $InputObject
        , [ScriptBlock] $Process
    )

    process {
        if ($InputObject.Count -gt 1) {
            foreach ($Object in $InputObject) {
                $Object | & $Process;
            }
        }
        else {
            $InputObject | & $Process;
        }
    }
}

Test -InputObject (1,2,3,4) -Process { process { $_ + 100; }; };

Write-Host -Object '-----------------';

1,2,3,4 | Test -Process { process { $_ + 100; }; };
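
Another option, offered here as an assumption rather than as part of the answer above, is to hand the scriptblock to ForEach-Object. That cmdlet sets $_ itself, so the caller does not need to write a process block. The function name Test-Each is hypothetical:

```powershell
function Test-Each {
    [CmdletBinding()]
    param (
        [Parameter(ValueFromPipeline = $true)]
        [Object[]] $InputObject,

        [ScriptBlock] $Process
    )

    process {
        # Enumerates one element (pipeline input) or many (parameter input).
        $InputObject | ForEach-Object -Process $Process
    }
}

$fromParam = Test-Each -InputObject (1,2,3,4) -Process { $_ + 100 }
$fromPipe  = 1,2,3,4 | Test-Each -Process { $_ + 100 }
```

With this shape there is no need to branch on $InputObject.Count, since piping a one-element array and a four-element array go through the same code path.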

Instead of using $InputObject, try giving the parameter a descriptive name of its own (but not $Input, which is an automatic variable). Here's a sample function I use for teaching that shows how:

Function Get-DriveC {
    [cmdletbinding()]
    Param(
        [Parameter(ValueFromPipeline)]
        [ValidateNotNullorEmpty()]
        [string[]]$Computername = $env:computername
    )

    Begin {
        Write-Verbose "Starting Get-DriveC"
        #define a hashtable of parameters to splat
        $param = @{
            Computername = $null
            class        = "win32_logicaldisk"
            errorAction  = "Stop"
            filter       = "deviceid='c:'"
        }
    }

    Process {
        foreach ($computer in $computername) {
            Try {
                Write-Verbose "Querying $computer"
                $param.Computername = $computer
                Get-CimInstance @param
            }
            Catch {
                Write-Warning "Oops. $($_.exception.message)"
            }
        } #foreach
    } #process

    End {
        Write-Verbose "Ending Get-DriveC"
    }
} #end function

I can pipe computer names to it, or pass an array as a parameter value.
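
The same pattern can be tried without querying any machines. In this sketch the function Get-NameLength and its output shape are hypothetical. Only the parameter declaration and the foreach loop inside the process block mirror the function above:

```powershell
function Get-NameLength {
    [CmdletBinding()]
    param(
        [Parameter(Mandatory = $true, ValueFromPipeline = $true)]
        [string[]] $Name
    )

    process {
        # One pass per piped string, or a single pass over the whole
        # array when it is bound to -Name.
        foreach ($n in $Name) {
            [pscustomobject]@{ Name = $n; Length = $n.Length }
        }
    }
}

$fromPipe  = 'alpha', 'be' | Get-NameLength
$fromParam = Get-NameLength -Name 'alpha', 'be'
```

Both calls should return two objects, one per name, because the foreach loop runs once per process invocation for piped input and once per element for parameter input.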

InputObject is not a reserved word. It is a conventional parameter name that many built-in cmdlets use, so you can use it without setting up a separate parameter set.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow