Is it possible in PowerShell to truncate a string (using SubString()?) to a given maximum number of characters, even if the original string is already shorter?
For example:
foreach ($str in "hello", "good morning", "hi") { $str.subString(0, 4) }
The truncation works for hello and good morning, but I get an error for hi.
I would like the following result:
hell
good
hi
You need to evaluate the current item and get the length of it. If the length is less than 4 then use that in the substring function.
foreach ($str in "hello", "good morning", "hi") {
    $str.subString(0, [System.Math]::Min(4, $str.Length))
}
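If the truncation is needed in more than one place, the Min approach can be wrapped in a small helper. This is only a sketch: the function name Limit-String and its parameter names are made up for illustration and are not built-in cmdlets.

```powershell
# Hypothetical helper (not a built-in cmdlet): returns at most $MaxLength
# characters from the start of $Text.
function Limit-String {
    param(
        [string]$Text,
        [int]$MaxLength
    )
    # Min() keeps the requested length within the string, so Substring never throws.
    $Text.Substring(0, [System.Math]::Min($MaxLength, $Text.Length))
}

$results = foreach ($str in "hello", "good morning", "hi") {
    Limit-String -Text $str -MaxLength 4
}
$results   # hell, good, hi
```

A negative $MaxLength would still make Substring throw; the sketch leaves that check out.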
Or you could just keep it simple, using PowerShell's alternative to a ternary operator:
foreach ($str in "hello", "good morning", "hi") {
    $(if ($str.length -gt 4) { $str.substring(0, 4) } else { $str })
}
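For what it's worth, PowerShell 7.0 and later do have a real ternary operator, so the same check can be written more compactly there. A minimal sketch, assuming PowerShell 7 or newer (Windows PowerShell 5.1 will report a parse error on this syntax):

```powershell
# Requires PowerShell 7.0 or later.
$truncated = foreach ($str in "hello", "good morning", "hi") {
    # condition ? value-if-true : value-if-false
    $str.Length -gt 4 ? $str.Substring(0, 4) : $str
}
$truncated   # hell, good, hi
```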
While all the other answers are "correct", their efficiencies go from sub-optimal to potentially horrendous. The following is not a critique of the other answers, but it is intended as an instructive comparison of their underlying operation. After all, scripting is more about getting it running soon than getting it running fast.
In order:
foreach ($str in "hello", "good morning", "hi") {
    $str.subString(0, [System.Math]::Min(4, $str.Length))
}
This is basically the same as my offering, except that instead of just returning $str when it is too short, we call Substring and tell it to return the whole string. Hence, sub-optimal. It is still doing the if..then..else, just inside Min, viz.:
if (4 -lt $str.length) {4} else {$str.length}
foreach ($str in "hello", "good morning", "hi") { $str -replace '(.{4}).+','$1' }
Using regular expression matching to grab the first four characters and then replace the whole string with them means that the entire (possibly very long) string must be scanned by the matching engine of unknown complexity/efficiency.
While a person can see that the '.+' is simply there to match the entire remainder of the string, the matching engine could be building up a large list of backtracking alternatives, since the pattern is not anchored (no ^ at the beginning). The (not described) clever bit here is that if the string is shorter than five characters (four times . followed by one or more .) then the whole match fails and -replace returns $str unaltered.
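One edge case worth knowing about with the regex variant: by default . does not match a newline, so a multi-line string is only shortened up to its first line break. The sketch below shows that, along with an anchored variant using the (?s) inline option; the sample string is made up for illustration.

```powershell
# A two-line string whose first line is longer than four characters.
$multi = "abcdef`nxyz"

# Without (?s), '.' stops at the newline, so only the first line is shortened.
$partial = $multi -replace '(.{4}).+', '$1'        # "abcd`nxyz"

# (?s) lets '.' match newlines too; ^ anchors the match at the start of the string.
$whole = $multi -replace '(?s)^(.{4}).+', '$1'     # "abcd"

# Short strings still fail to match and come back unchanged.
$short = 'hi' -replace '(?s)^(.{4}).+', '$1'       # "hi"
```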
foreach ($str in "hello", "good morning", "hi") {
    try {
        $str.subString(0, 4)
    }
    catch [ArgumentOutOfRangeException] {
        $str
    }
}
Deliberately throwing exceptions instead of programmatic boundary checking is an interesting solution, but who knows what is going on as the exception bubbles up from the try block to the catch. Probably not much in this simple case, but it would not be a recommended general practice except in situations where there are many possible sources of errors (making it cumbersome to check for all of them), but only a few responses.
Interestingly, an answer to a similar question elsewhere using -join and array slices (which don't cause errors on index out of range, just ignore the missing elements):
$str[0..3] -join "" # Infix
(or more simply)
-join $str[0..3] # Prefix
could be the most efficient (with appropriate optimisation) given the strong similarity between the storage of string and char[]. Optimisation would be required since, by default, $str[0..3] is an object[], each element being a single char, and so bears little resemblance to a string (in memory). Giving PowerShell a little hint could be useful,
-join [char[]]$str[0..3]
However, maybe just telling it what you actually want,
new-object string (,$str[0..3]) # Need $str[0..3] to be a member of an array of constructor arguments
thereby directly invoking new String(char[]), is best.
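None of the efficiency claims above come with measurements, so here is a rough way to check them on your own machine with Measure-Command. It is a sketch, not a rigorous benchmark: the sample data and iteration count are arbitrary, and the cost of invoking a script block is included in every timing, which can mask small differences between the methods.

```powershell
# Sample input: a mix of short and long strings (made-up data for illustration).
$samples = "hello", "good morning", "hi", ("x" * 10000)
$iterations = 2000

# Each candidate is a script block that takes one string and returns at most 4 characters.
$candidates = [ordered]@{
    'if/else' = { param($s) if ($s.Length -gt 4) { $s.Substring(0, 4) } else { $s } }
    'Min'     = { param($s) $s.Substring(0, [System.Math]::Min(4, $s.Length)) }
    'regex'   = { param($s) $s -replace '(.{4}).+', '$1' }
    'slice'   = { param($s) -join $s[0..3] }
}

$timings = foreach ($name in $candidates.Keys) {
    $block = $candidates[$name]
    # Time the whole loop for this candidate; the results themselves are discarded.
    $elapsed = Measure-Command {
        for ($i = 0; $i -lt $iterations; $i++) {
            foreach ($s in $samples) { $null = & $block $s }
        }
    }
    [pscustomobject]@{
        Method       = $name
        Milliseconds = [math]::Round($elapsed.TotalMilliseconds, 1)
    }
}

# Fastest first; the actual numbers depend on the machine and PowerShell version.
$timings | Sort-Object Milliseconds | Format-Table -AutoSize
```

The very long sample string is there on purpose, since that is where the regex version is expected to do the most extra work.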
You could trap the exception:
foreach ($str in "hello", "good morning", "hi") {
    try {
        $str.subString(0, 4)
    }
    catch [ArgumentOutOfRangeException] {
        $str
    }
}
More regex love, using lookbehind:
PS > 'hello','good morning','hi' -replace '(?<=(.{4})).+'
hell
good
hi
I'm late to the party as always! I have used the PadRight string function to address such an issue. I cannot comment on its relative efficiency compared to the other suggestions:
foreach ($str in "hello", "good morning", "hi") { $str.PadRight(4, " ").SubString(0, 4) }
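One thing to be aware of with the PadRight approach: short strings come back padded to exactly four characters, so "hi" is returned with two trailing spaces rather than as the bare hi shown in the question. A quick check:

```powershell
$padded = "hi".PadRight(4, " ").Substring(0, 4)

# Brackets make the trailing spaces visible.
$display = "[$padded]"     # [hi  ]
$length  = $padded.Length  # 4, not 2
```

That is handy when fixed-width output is wanted. Calling TrimEnd() afterwards would remove the padding, but it would also strip any trailing whitespace that was part of the original string.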