
segmenting list of values

Tags:

powershell

I'm writing some code to figure out how to segment a list of values into more manageable chunks. The reason I want to do this is that I'll have about 100,000 live values and I want to minimise the failure risk.


$wholeList = 1..100

$nextStartingPoint

$workingList

function Get-NextTenItems()
{
    $workingList = (1+$nextStartingPoint)..(10+$nextStartingPoint)

    $nextStartingPoint += 10

    Write-Host "inside Get-NextTenItems"
    write-host "Next StartingPoint: $nextStartingPoint"
    $workingList
    Write-Host "exiting Get-NextTenItems"
}

function Write-ListItems()
{
    foreach ($li in $workingList)
    {
        Write-Host $li
    }
}
Get-NextTenItems
Write-ListItems
Get-NextTenItems
Write-ListItems

I ran the code in the PowerGUI debugger and I've noticed my $nextStartingPoint is resetting to 0 when I exit the Get-NextTenItems function.

Why is this happening and how can I prevent it?

Should I also assume that the same thing is happening to $workingList?

Mike asked Jan 24 '26 08:01


1 Answer

My suggestion is to use pipelines. One function produces chunks and the other consumes them.

With this approach you don't need to pollute the global or script scope, which is not a good idea. Everything needed is kept in the function that needs it.
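For context on the reset seen in the question: PowerShell functions can read variables from the parent scope, but assigning to one inside a function creates a new local variable that disappears when the function returns. The following is a minimal sketch of that behaviour; the names $counter, Add-TenLocal and Add-TenScript are made up for illustration and do not appear in the question.

```powershell
$counter = 0

function Add-TenLocal {
    # The assignment creates a NEW local $counter (parent value 0 + 10).
    # The script-level $counter is read but never changed.
    $counter += 10
    "inside Add-TenLocal: $counter"
}

function Add-TenScript {
    # The script: modifier targets the script-level variable explicitly.
    $script:counter += 10
    "inside Add-TenScript: $script:counter"
}

Add-TenLocal
"after Add-TenLocal: $counter"     # still 0, the local copy is gone
Add-TenScript
"after Add-TenScript: $counter"    # now 10
```

The same applies to $workingList in the question. The pipeline-based functions below avoid shared state altogether: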

function Get-Chunk
{
    param(
        [Parameter(Mandatory=$true)]$collection,
        [Parameter(Mandatory=$false)]$count=10
    )
    $tmp = @()                      # temporary array holding the current chunk
    $i = 0                          # number of items in the current chunk
    foreach($c in $collection) {
        $tmp += $c                  # add item $c to the chunk
        $i++                        # count it
        if ($i -eq $count) {        # chunk is full
            ,$tmp                   # send the chunk to the pipeline as one array
            $i = 0                  # reset for the next chunk
            $tmp = @()
        }
    }
    if ($tmp.Count -gt 0) {         # send any remaining partial chunk
        ,$tmp                       # (testing Count avoids dropping a lone 0 or '')
    }
}

function Write-ListItems
{
    param(
        [Parameter(Mandatory=$true, ValueFromPipeline=$true)]$chunk
    )
    process {
        Write-Host "Chunk: $chunk"
    }
}

Test the functions:

$wholeList = 1..100
Get-Chunk $wholeList | Write-ListItems
Chunk: 1 2 3 4 5 6 7 8 9 10
Chunk: 11 12 13 14 15 16 17 18 19 20
Chunk: 21 22 23 24 25 26 27 28 29 30
Chunk: 31 32 33 34 35 36 37 38 39 40
Chunk: 41 42 43 44 45 46 47 48 49 50
Chunk: 51 52 53 54 55 56 57 58 59 60
Chunk: 61 62 63 64 65 66 67 68 69 70
Chunk: 71 72 73 74 75 76 77 78 79 80
Chunk: 81 82 83 84 85 86 87 88 89 90
Chunk: 91 92 93 94 95 96 97 98 99 100

Get-Chunk $wholeList 32 | Write-ListItems
Chunk: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32
Chunk: 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64
Chunk: 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96
Chunk: 97 98 99 100

Update

I added some comments to clarify the code. Two notes on sending content to the pipeline: (a) I don't use return, because that would exit the function. (b) The comma at the beginning wraps the content of $tmp in an array, so it creates a new array with one item (which is itself the array of N items). Why? Because PowerShell automatically unrolls arrays that are written to the pipeline. Without the wrapper, the items would be unwrapped and flattened, and the result would be one big array again.

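Example: without the wrapping commas, the arrays are unrolled, so the pipeline receives nine separate items: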

function Get-Arrays {
  1,2,3
  "a", "b"
  ,(get-date)
  4,5,6
}
Get-Arrays | % { "$_" }

With every array wrapped by the unary comma, this works as expected and emits four items, one per array:

function Get-Arrays {
  ,(1,2,3)
  ,("a", "b")
  ,(get-date)
  ,(4,5,6)
}
Get-Arrays | % { "$_" }
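
A quick way to check the difference is to count what actually comes out of the pipeline. This is a minimal sketch; Get-Unrolled and Get-Wrapped are hypothetical names used only for this illustration.

```powershell
function Get-Unrolled {
    1,2,3          # unrolled: three separate items
    'a','b'        # unrolled: two separate items
}

function Get-Wrapped {
    ,(1,2,3)       # unary comma: one item, the whole array
    ,('a','b')     # unary comma: one item, the whole array
}

# Measure-Object counts the objects that travel down the pipeline.
$unrolledCount = (Get-Unrolled | Measure-Object).Count   # 5
$wrappedCount  = (Get-Wrapped  | Measure-Object).Count   # 2

"Unrolled items: $unrolledCount"
"Wrapped items:  $wrappedCount"
```

Each item emitted by Get-Wrapped is still a complete array, which is what lets a downstream function receive a whole chunk at a time.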
stej answered Jan 26 '26 02:01


