I'm running
'S-tst','ssrst','srst2','s-zaa','s-a','s-zf' | Sort-Object
I expected it to return:
s-a
S-tst
s-zaa
s-zf
srst2
ssrst
but instead I get the following:
s-a
srst2
ssrst
S-tst
s-zaa
s-zf
How is this possible? Does Sort-Object only look at letters when sorting? Is there any way to make it take special characters into account?
This behaviour is by design, but it is not always what people want or expect. If you want strings sorted by character code (ordinal order, which matches ASCII order for ASCII characters), ignoring case, use this:
Add-Type @"
using System;
using System.Collections;
using System.Collections.Generic;
using System.Globalization;
public class SimpleStringComparer: IComparer, IComparer<string>
{
private static readonly CompareInfo compareInfo = CompareInfo.GetCompareInfo(CultureInfo.InvariantCulture.Name);
public int Compare(object x, object y)
{
return Compare(x as string, y as string);
}
public int Compare(string x, string y)
{
return compareInfo.Compare(x, y, CompareOptions.OrdinalIgnoreCase);
}
public SimpleStringComparer() {}
}
"@
[string[]]$myList = 's-a','s-a1','s''a','s''a1', 'sa','sa1','s^a','S-a','S-a1','S''a','S''a1', 'Sa','Sa1','S^a'
[System.Collections.Generic.List[string]]$list = [System.Collections.Generic.List[string]]::new()
$list.AddRange($myList)
[SimpleStringComparer]$comparer = [SimpleStringComparer]::new()
$list.Sort($comparer)
$list
Outputs:
s'a
S'a
s'a1
S'a1
s-a
S-a
s-a1
S-a1
sa
Sa
sa1
Sa1
s^a
S^a
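If compiling a comparer with Add-Type feels heavy, the built-in .NET comparers may be enough. The sketch below assumes that the stock [System.StringComparer]::OrdinalIgnoreCase comparer is an acceptable stand-in for SimpleStringComparer; the variable name $names is only for illustration. It sorts the list from the question:

```powershell
# The list from the question, typed as string[] so Array.Sort compares strings.
[string[]]$names = 'S-tst', 'ssrst', 'srst2', 's-zaa', 's-a', 's-zf'

# Array.Sort sorts in place. OrdinalIgnoreCase compares character codes
# while ignoring case, so '-' (45) sorts before any letter.
[Array]::Sort($names, [System.StringComparer]::OrdinalIgnoreCase)

$names
# s-a
# S-tst
# s-zaa
# s-zf
# srst2
# ssrst
```

Swapping in [System.StringComparer]::Ordinal gives a case-sensitive comparison instead, in which case S-tst moves to the top because uppercase letters have lower character codes than lowercase ones.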
More Info
Per @TessellatingHeckler in the comments, you can get closer to character code (ordinal) order by casting each string to a char array. However, hyphens and apostrophes are still handled in a potentially unexpected way (they are ignored):
$myList = 's-a','s-a1','s''a','s''a1', 'sa','sa1','s^a','S-a','S-a1','S''a','S''a1', 'Sa','Sa1','S^a'
$myList | Sort-Object -Property { [char[]] $_ }
s'a
S'a
s-a
S-a
s'a1
S'a1
s-a1
S-a1
s^a
S^a
sa
Sa
sa1
Sa1
The current sorting behaviour is by design. It appears that PowerShell implements a "Word Sort". This is documented here: https://msdn.microsoft.com/en-us/library/windows/desktop/dd318144(v=vs.85).aspx#SortingFunctions
In addition to ignoring hyphens and apostrophes (except when comparing otherwise identical strings), this sort also treats punctuation characters as coming before alphanumerics, and handles accented letters alongside their counterparts. A simple demo of this can be seen like so:
32..255 | ForEach-Object { [string][char][byte]$_ } | Sort-Object
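To see the difference for a single pair of strings, an ordinal comparison and a culture-aware comparison can be put side by side. This is only a sketch: the culture-aware result depends on the .NET runtime and the globalization library underneath it, so no output is shown for that call.

```powershell
# Ordinal: compares character codes. '-' (45) is lower than 'a' (97),
# so the result is negative and 's-a' sorts before 'sa'.
[string]::CompareOrdinal('s-a', 'sa')

# Culture-aware (invariant culture, ignoring case): a linguistic comparison
# similar in kind to what Sort-Object performs. The sign may vary by platform.
[string]::Compare('s-a', 'sa', [System.StringComparison]::InvariantCultureIgnoreCase)
```

A negative number means the first string sorts first, zero means the two are considered equal, and a positive number means the second string sorts first.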
To define other sorting behaviours, you currently need to dip into .NET, like so:
Add-Type @"
using System;
using System.Runtime.InteropServices;
using System.Collections;
public class NumericStringComparer: IComparer
{
//https://msdn.microsoft.com/en-us/library/windows/desktop/bb759947%28v=vs.85%29.aspx?f=255&MSPPError=-2147217396
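// StrCmpLogicalW expects wide (UTF-16) strings, so marshal as Unicode instead of the default ANSI.
[DllImport("shlwapi.dll", CharSet = CharSet.Unicode)]
public static extern int StrCmpLogicalW(string psz1, string psz2);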
public int Compare(object x, object y)
{
return Compare(x as string, y as string);
}
public int Compare(string x, string y)
{
return StrCmpLogicalW(x, y);
}
public NumericStringComparer() {}
}
"@
[System.Collections.ArrayList]$myList = 's-a','s-a1','s''a','s''a1', 'sa','sa1','s^a','S-a','S-a1','S''a','S''a1', 'Sa','Sa1','S^a', '100a','1a','001a','2a','20a'
$myList.Sort([NumericStringComparer]::new())
$myList -join ', '
The above is intended to sort strings the way Windows Explorer does (i.e. comparing runs of digits by numeric value rather than character by character). The output reported for this list:
s'a, s'a1, S'a, s-a, S-a, S-a1, S'a1, s-a1, S^a, s^a, 1a, 001a, 2a, Sa, Sa1, sa, sa1, 20a, 100a
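Where shlwapi.dll is not available (for example PowerShell on Linux or macOS), a natural sort can be approximated in plain PowerShell. The following is a different technique from StrCmpLogicalW, not a reimplementation of it: each run of digits is left-padded with zeros to a fixed width (20 here, an arbitrary assumption) before comparing, so it will not reproduce Explorer's handling of punctuation or of leading zeros.

```powershell
$items = '100a', '1a', '2a', '20a', '10b'

# Build a sort key in which every digit run is zero-padded to 20 characters,
# so that comparing the keys as text orders the numbers by value.
$sorted = $items | Sort-Object -Property {
    [regex]::Replace($_, '\d+', { param($m) $m.Value.PadLeft(20, '0') })
}

$sorted -join ', '
# 1a, 2a, 10b, 20a, 100a
```

Because the padded key is only used for comparison, the original strings come back unchanged. Values that differ only in leading zeros (such as 1a and 001a) produce identical keys, so their relative order is not determined by this approach.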
I've submitted a feature suggestion to provide more PowerShell-friendly solutions in Sort-Object. See https://github.com/PowerShell/PowerShell/issues/4098