String comparison in Swift is not transitive

Tags:

swift

I ran into an example where s1 < s2 and s2 < s3 are both true, but s1 < s3 is false:

var str1 = "あいかぎ"
var str2 = "あいかくしつ"
var str3 = "あいがみ:"

print(str1 < str2)       // true
print(str2 < str3)       // true
print(str1 < str3)       // false (?)

Is this a bug, or is it true that we cannot rely on string comparison being transitive? (This breaks sorting of my string array.) I'm running Swift 3.
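To narrow down which strings are affected, a brute-force check over every triple can help. This is only a minimal sketch: `transitivityViolations` is a hypothetical helper name, not anything from the standard library, and whether it reports anything for the strings above depends on the Swift version and the ICU build it uses.

```swift
/// Returns every triple (a, b, c) drawn from `values` for which
/// a < b and b < c hold but a < c does not.
/// Hypothetical helper, written for illustration only.
func transitivityViolations<T: Comparable>(in values: [T]) -> [(T, T, T)] {
    var violations: [(T, T, T)] = []
    for a in values {
        for b in values where a < b {
            for c in values where b < c {
                // Transitivity requires a < c at this point.
                if !(a < c) {
                    violations.append((a, b, c))
                }
            }
        }
    }
    return violations
}

let samples = ["あいかぎ", "あいかくしつ", "あいがみ:"]
let found = transitivityViolations(in: samples)
print(found.isEmpty ? "no violations" : "violations: \(found)")
```

The check is cubic in the number of strings, so it is only practical for a small sample taken from the array that fails to sort.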

Update: all of these print false:

print(str1 < str3)       // false (?)
print(str1 == str3)      // false (?)
print(str1 > str3)       // false (?)

So some strings are not comparable with each other?
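For a well-behaved ordering, exactly one of <, == and > should hold for any pair. A small sketch that reports which relations hold for a given pair (`relationsHolding` is a hypothetical name used only for this example):

```swift
/// Lists which of the three relations hold between `a` and `b`.
/// A total order should always produce exactly one entry.
func relationsHolding<T: Comparable>(_ a: T, _ b: T) -> [String] {
    var held: [String] = []
    if a < b { held.append("<") }
    if a == b { held.append("==") }
    if a > b { held.append(">") }
    return held
}

// An empty array here would mean the pair is effectively "incomparable".
print(relationsHolding("あいかぎ", "あいがみ:"))
```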

Update: a comment on "How does the Swift string more than operator work" pointed out that the source code for the < operator is in https://github.com/apple/swift/blob/master/stdlib/public/core/String.swift, and that the comparison is handled by _swift_stdlib_unicode_compare_utf8_utf8 in https://github.com/apple/swift/blob/master/stdlib/public/stubs/UnicodeNormalization.cpp

Update: these are both true:

print(str1 >= str3)  // true
print(str1 <= str3)  // true
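One possible explanation for <= and >= both being true, assuming String gets these operators from the default implementations that Comparable provides (where a <= b is computed as !(b < a) and a >= b as !(a < b)): if < returns false in both directions, both derived operators return true. The sketch below reproduces that with a deliberately broken type; `Incomparable` is a made-up name and has nothing to do with how String is actually implemented.

```swift
// A deliberately broken type: nothing is ever less than or equal to anything.
struct Incomparable: Comparable {
    let id: Int
    static func < (lhs: Incomparable, rhs: Incomparable) -> Bool { return false }
    static func == (lhs: Incomparable, rhs: Incomparable) -> Bool { return false }
}

let x = Incomparable(id: 1)
let y = Incomparable(id: 2)

// >, <= and >= are not defined above, so they come from Comparable's defaults.
print(x < y, x == y, x > y)   // false false false
print(x <= y, x >= y)         // true true
```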

Update: String.localizedCompare() has a similar issue. Here are two strings where s1 = s2 but s2 > s1:

str1 = "bảo toàn"
str2 = "bảo tồn"

print(str1.localizedCompare(str2) == .orderedSame) // true
print(str2.localizedCompare(str1) == .orderedDescending) // true
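The same kind of check can be written for any comparator that returns a ComparisonResult: comparing (a, b) and (b, a) should give mirrored results. A minimal sketch, where `isAntisymmetric` is a hypothetical helper and the result for these two strings depends on the current locale and OS version:

```swift
import Foundation

/// Returns true when compare(a, b) and compare(b, a) mirror each other.
/// Relies on ComparisonResult raw values being -1, 0 and 1.
func isAntisymmetric(_ a: String, _ b: String,
                     using compare: (String, String) -> ComparisonResult) -> Bool {
    let forward = compare(a, b)
    let backward = compare(b, a)
    return forward.rawValue == -backward.rawValue
}

let consistent = isAntisymmetric("bảo toàn", "bảo tồn") { $0.localizedCompare($1) }
print(consistent)
```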


asked Sep 15 '17 01:09

Pinch

1 Answer

It looks like this is not supposed to happen:

Q: Is transitive consistency maintained by the [Unicode Collation Algorithm]?

A: Yes, for any strings A, B, and C, if A < B and B < C, then A < C. However, implementers must be careful to produce implementations that accurately reproduce the results of the Unicode Collation Algorithm as they optimize their own algorithms. It is easy to perform careless optimizations — especially with Incremental Comparison algorithms — that fail this test. Other items to check are the proper distinction between the bases of accents. For example, the sequence <u-macron, u-diaeresis-macron> should compare as less than <u-macron-diaeresis, u-macron>; this is a secondary distinction, based on the weighting of the accents, which must be correctly associated with the primary weights of their respective base letters.

(Source: Unicode Collation FAQ)

UnicodeNormalization.cpp calls ucol_strcoll and ucol_strcollIter, both of which are part of the ICU project, so this may be a bug in either the Swift standard library or ICU. I reported the issue to the Swift Bug Tracker.
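Until that is resolved, one possible workaround for sorting is to bypass collation and compare Unicode scalar values directly. This is only a sketch under that assumption: `scalarOrderPrecedes` is a hypothetical name, and the resulting order is neither linguistically meaningful nor aware of canonical equivalence (precomposed and decomposed forms of the same character sort differently). It is, however, a consistent total order, because it is a plain lexicographic comparison of integer code points.

```swift
/// Orders two strings by their Unicode scalar values, ignoring collation.
/// Hypothetical helper; not a replacement for locale-aware sorting.
func scalarOrderPrecedes(_ lhs: String, _ rhs: String) -> Bool {
    return lhs.unicodeScalars.lexicographicallyPrecedes(rhs.unicodeScalars)
}

let words = ["あいがみ:", "あいかくしつ", "あいかぎ"]
let sortedWords = words.sorted(by: scalarOrderPrecedes)
print(sortedWords)
```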


answered Sep 26 '22 11:09

Palle
