I taught myself PowerShell, so I do not know everything about it.
I need to search a database with the exact number of lines I have put in (the database is predefined); it contains over 11,800 entries.
Can you please help me find what is making this slow?
Code:
$Dict = Get-Content "C:\Users\----\Desktop\Powershell Program\US.txt"
if ($Right -ne "") {
    $Comb = $Letter + $Right
    $total = [int]0
    $F = ""
    do {
        $F = $Dict | Select-Object -Index $total
        if ($F.Length -eq $Num) {
            if ($F.Chars("0") + $F.Chars("1") -eq $Comb) {
                Add-Content "C:\Users\----\Desktop\Powershell Program\Results.txt" "$F"
            }
        }
        $total++
        Write-Host $total
    } until ([int]$total -gt [int]118619)
    $total = [int]0
    $F = ""
}
How do I speed up this line-by-line searching/matching process? Should I use multi-threading? If so, how?
It seems like you knew at least one other language before PowerShell, and you're starting out by basically replicating what you might have done in that language in this one. That's a great way to learn a new language, but of course in the beginning you might end up with approaches that are a bit strange or not performant.
So first I want to break down what your code is actually doing, as a rough overview: you read the entire file into the $Dict array, and then for every index from 0 to 118,619 you retrieve that line with $Dict | Select-Object -Index $total, check its length and its first two characters, and append any match to Results.txt.
That Select-Object -Index lookup is the expensive part. To understand why, you need to know a little bit about pipelines in PowerShell. Cmdlets that accept and work on pipelines take one or more objects, but they process a single object at a time. They don't even have access to the rest of the pipeline.
This is also true for the Select-Object cmdlet. So when you take an array with 18,500 objects in it and pipe it into Select-Object -Index 18000, the 18,000 objects before that index all have to be sent in for inspection/processing before it can give you the one you want. You can see how the time taken gets longer and longer the larger the index is.
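You can see this one-object-at-a-time streaming with a toy pipeline (a made-up illustration, not part of your code):
# Each number passes through ForEach-Object individually on its way to Select-Object,
# so the upstream block runs for every element up to the index you asked for.
1..5 | ForEach-Object { Write-Host "streaming $_"; $_ } | Select-Object -Index 3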
Since you already have an array, you can directly access any array member by index with square brackets ([]), like so:
$Dict[18000]
For a given array, that takes the same amount of time no matter what the index is.
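Applied to your loop, that means swapping the Select-Object call for a direct index lookup. A minimal sketch, assuming $Comb and $Num are defined earlier in your script as in the question, and that $Num is at least 2 (Substring(0, 2) stands in for the two Chars() calls, comparing the first two characters as one string):
$total = 0
do {
    # Direct index access instead of piping the whole array through Select-Object
    $F = $Dict[$total]
    if ($F.Length -eq $Num) {
        if ($F.Substring(0, 2) -eq $Comb) {
            Add-Content "C:\Users\----\Desktop\Powershell Program\Results.txt" "$F"
        }
    }
    $total++
} until ($total -gt 118619)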
Now, for a single call to Select-Object -Index, you probably aren't going to notice how long it takes, even with a very large index; the problem is that you're already looping through the entire array, so the cost compounds greatly.
You're essentially having to do the sum of 1..18000, which is approximately 162,000,000 iterations! (Thanks to user2460798 for correcting my math.)
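If you want to check that figure, it's just the sum of the first 18,000 integers, 18000 * 18001 / 2:
# Both of these print 162009000
18000 * 18001 / 2
(1..18000 | Measure-Object -Sum).Sum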
I tested this. First, I created an array with 19,000 objects:
$a = 1..19000 | %{"zzzz~$_"}
Then I measured both methods of accessing it. First, with select -index:
measure-command { 1..19000 | % { $a | select -Index ($_ - 1) } | out-null }
Result:
TotalMinutes : 20.4383861316667
TotalMilliseconds : 1226303.1679
Then with the indexing operator ([]):
measure-command { 1..19000 | % { $a[$_-1] } | out-null }
Result:
TotalMinutes : 0.00788774666666667
TotalMilliseconds : 473.2648
The results are pretty striking: it takes nearly 2,600 times longer to use Select-Object.
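That factor comes straight from the TotalMilliseconds figures above:
# Roughly 2,591 -- nearly 2,600 times slower
1226303.1679 / 473.2648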
The above is the single thing causing your major slowdown, but I wanted to point out something else.
Typically, in most languages, you would use a for loop to count. In PowerShell it looks like this:
for ($i = 0; $i -lt $total ; $i++) {
# $i has the value of the iteration
}
In short, there are three statements in the for loop. The first is an expression that gets run before the loop starts; $i = 0 initializes the iterator to 0, which is the typical usage of this first statement.
Next is a conditional; this is tested on each iteration, and the loop continues as long as it returns true. Here $i -lt $total checks that $i is less than the value of $total, some other variable defined elsewhere, presumably the maximum value.
The last statement gets executed on each iteration of the loop; $i++ is the same as $i = $i + 1, so in this case we're incrementing $i on each iteration.
It's a bit more concise than using a do/until loop, and it's easier to follow because the meaning of a for loop is well known.
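Rewriting the counting part of your loop as a for loop might look like this (a sketch; $Dict.Count is used in place of the hard-coded upper bound, and the body would hold the same checks as before):
for ($i = 0; $i -lt $Dict.Count; $i++) {
    $F = $Dict[$i]
    # ... same length and first-two-character checks as above ...
}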
If you're interested in more feedback about working code you've written, have a look at Code Review. Please read the rules there carefully before posting.
To my surprise, using the array's GetEnumerator is faster than indexing; it takes about 5/8 of the time. However, this test is pretty unrealistic, in that the body of each loop is about as small as it can be.
$size = 64kb
# Allocate an int array with $size elements
$array = New-Object 'int[]' $size
# Initializing the array takes quite a bit of time compared to the loops below
0..($size - 1) | % { $array[$_] = Get-Random }

write-host "`n`nMeasure using indexing"
[uint64]$sum = 0
Measure-Command {
    for ($ndx = 0; $ndx -lt $size; $ndx++) {
        $sum += $array[$ndx]
    }
}
write-host "Average = $($sum / $size)"

write-host "`n`nMeasure using array enumerator"
[uint64]$sum = 0
Measure-Command {
    foreach ($element in $array.GetEnumerator()) {
        $sum += $element
    }
}
write-host "Average = $($sum / $size)"
Measure using indexing
Days : 0
Hours : 0
Minutes : 0
Seconds : 0
Milliseconds : 898
Ticks : 8987213
TotalDays : 1.04018668981481E-05
TotalHours : 0.000249644805555556
TotalMinutes : 0.0149786883333333
TotalSeconds : 0.8987213
TotalMilliseconds : 898.7213
Average = 1070386366.9346
Measure using array enumerator
Days : 0
Hours : 0
Minutes : 0
Seconds : 0
Milliseconds : 559
Ticks : 5597112
TotalDays : 6.47813888888889E-06
TotalHours : 0.000155475333333333
TotalMinutes : 0.00932852
TotalSeconds : 0.5597112
TotalMilliseconds : 559.7112
Average = 1070386366.9346
Code for these two loops in assembler might look like this:
; Using Indexing
mov esi, <addr of array>
xor ebx, ebx
lea edi, <addr of $sum>
loop:
mov eax, dword ptr [esi][ebx*4]
add dword ptr [edi], eax
inc ebx
cmp ebx, 65536
jl loop
; Using enumerator
mov esi, <addr of array>
lea edx, [esi + 65536*4]
lea edi, <addr of $sum>
loop:
mov eax, dword ptr [esi]
add dword ptr [edi], eax
add esi, 4
cmp esi, edx
jl loop
The only difference is the first mov instruction in the loop: one uses an index register and the other doesn't. I doubt that alone would explain the observed difference in speed; I guess the JITter must add additional overhead.
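Applied to the original question, you can skip index bookkeeping altogether and let foreach enumerate the lines directly. A rough sketch, assuming $Comb and $Num are defined as in the question and $Num is at least 2 (collecting the matches and writing them once also avoids reopening Results.txt for every hit):
$found = foreach ($F in $Dict) {
    # Keep lines of the right length whose first two characters match $Comb
    if ($F.Length -eq $Num -and $F.Substring(0, 2) -eq $Comb) {
        $F
    }
}
$found | Add-Content "C:\Users\----\Desktop\Powershell Program\Results.txt"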