String search with Turkish dotless i

When searching the text Çınaraltı Café for the text Ci using this code:

NSStringCompareOptions options =
    NSCaseInsensitiveSearch |
    NSDiacriticInsensitiveSearch |
    NSWidthInsensitiveSearch;
NSLocale *locale = [NSLocale localeWithLocaleIdentifier:@"tr"];
NSRange range = [haystack rangeOfString:needle 
                                options:options
                                  range:NSMakeRange(0, haystack.length)
                                 locale:locale];

I get range.location equal to NSNotFound.

It's not to do with the diacritic on the initial Ç because I get the same result searching for alti where the only odd character is the ı. I also get a valid match searching for Cafe which contains a diacritic (the é).

The Apple docs mention this situation in the notes on the locale parameter, and I think I'm following them. I guess I'm not, though, because it's not working.

How can I get a search for 'i' to match both 'i' and 'ı'?

asked Jul 08 '13 22:07

deanWombourne

1 Answer

I don't know whether this helps as an answer, but it perhaps explains why it's happening.

I should point out I'm not an expert in this matter, but I've been looking into this for my own purposes and been doing some research.

Looking at the Unicode collation chart for Latin, the equivalent characters to ASCII "i" (\u0069) do not include "ı" (\u0131), whereas all the other letters in your example string behave as you expect, i.e.:

"c" (\u0063) does include "Ç" (\u00c7)
"e" (\u0065) does include "é" (\u00e9)

The ı character is listed separately as being of primary difference to i. That might not make sense to a Turkish speaker (I'm not one) but it's what Unicode have to say about it, and it does fit the logic of the problem you describe.
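A related fact that can be checked from code: Ç and é have canonical decompositions (a base letter plus a combining mark), while ı has none, so there is no mark to strip from it. This is only a small sketch to illustrate that point; DecomposedLength is a name I made up for it, and it says nothing definite about how the search is implemented internally.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not part of Foundation): returns the UTF-16 length of
// a string after canonical decomposition (NFD).
static NSUInteger DecomposedLength(NSString *s) {
    return s.decomposedStringWithCanonicalMapping.length;
}

// Ç (U+00C7) decomposes to C + combining cedilla, so its length grows.
// ı (U+0131) has no decomposition, so its length stays the same.
static BOOL HasCanonicalDecomposition(NSString *s) {
    return DecomposedLength(s) > s.length;
}
```

Calling HasCanonicalDecomposition on Ç or é returns YES, and on ı it returns NO.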

In Chrome you can see this in action with an in-page search. Searching in the page for ASCII i highlights all the characters in its block and does not match ı. Searching for ı does the opposite.

By contrast, MySQL's utf8_general_ci collation table maps uppercase ASCII I to ı as you want.

So, without knowing anything about iOS, I'm assuming it's using the Unicode standard and normalising all characters to Latin by this table.

As to how you match Çınaraltı with Ci - if you can't override the collation table then perhaps you can just replace i in your search strings with a regular expression, so you search on Ç[iı] instead.
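A variation on the same idea, which is not the regular expression itself: fold ı to i (and İ to I) in both strings before searching, then keep the original search options. The function names below are made up for illustration, and passing nil for the locale is an assumption on my part, since the tr locale may treat I/ı and İ/i as case pairs and undo the fold. Treat it as a sketch rather than a tested fix.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not part of Foundation): maps dotless ı (U+0131) to i
// and dotted İ (U+0130) to I. NSLiteralSearch keeps every swap at exactly one
// UTF-16 unit for one, so a range found in the folded string is also valid
// in the original string.
static NSString *FoldTurkishI(NSString *s) {
    NSString *t = [s stringByReplacingOccurrencesOfString:@"\u0131"
                                               withString:@"i"
                                                  options:NSLiteralSearch
                                                    range:NSMakeRange(0, s.length)];
    return [t stringByReplacingOccurrencesOfString:@"\u0130"
                                        withString:@"I"
                                           options:NSLiteralSearch
                                             range:NSMakeRange(0, t.length)];
}

// Hypothetical wrapper: searches the folded haystack for the folded needle
// using the same options as in the question.
static NSRange FindFoldingTurkishI(NSString *haystack, NSString *needle) {
    NSString *h = FoldTurkishI(haystack);
    NSString *n = FoldTurkishI(needle);
    NSStringCompareOptions options =
        NSCaseInsensitiveSearch |
        NSDiacriticInsensitiveSearch |
        NSWidthInsensitiveSearch;
    // Assumption: nil (system locale) instead of tr, see the note above.
    return [h rangeOfString:n
                    options:options
                      range:NSMakeRange(0, h.length)
                     locale:nil];
}
```

You would call it as FindFoldingTurkishI(haystack, needle) and use the returned range against the original haystack. The trade-off is that i and ı become indistinguishable, which is what the question asks for but may not suit every search.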

answered Nov 03 '22 02:11

Tim

Tags: ios, objective-c, nsstring, localization, turkish