Why do String.Contains and String.LastIndexOf return different results in C#?

I have this problem where String.Contains returns true and String.LastIndexOf returns -1. Could someone explain to me what happened? I am using .NET 4.5.

    static void Main(string[] args)
    {
        String wikiPageUrl = @"http://it.wikipedia.org/wiki/ʿAbd_Allāh_al-Sallāl";

        if (wikiPageUrl.Contains("wikipedia.org/wiki/"))
        {

            int i = wikiPageUrl.LastIndexOf("wikipedia.org/wiki/");

            Console.WriteLine(i);

        }
    }
user3430150 asked Sep 02 '14 00:09


1 Answer

While @sa_ddam213's answer definitely fixes the problem, it might help to understand exactly what's going on with this particular string.

If you try the example with other "special characters," the problem isn't exhibited. For example, the following strings work as expected:

    string url1 = @"http://it.wikipedia.org/wiki/»Abd_Allāh_al-Sallāl";
    Console.WriteLine(url1.LastIndexOf("it.wikipedia.org/wiki/")); // 7

    string url2 = @"http://it.wikipedia.org/wiki/~Abd_Allāh_al-Sallāl";
    Console.WriteLine(url2.LastIndexOf("it.wikipedia.org/wiki/")); // 7

The character in question, "ʿ", is called a spacing modifier letter [1]. A spacing modifier letter doesn't stand on its own; it modifies the previous character in the string, in this case a "/". Another way to put this is that it doesn't take up its own space when rendered.

LastIndexOf, when called with no StringComparison argument, compares strings using the current culture.

When strings are compared in a culture-sensitive manner, the "/" and "ʿ" characters are not seen as two distinct characters. They are processed as a single unit, and that unit does not match the trailing "/" of the search string passed to LastIndexOf.

When you pass StringComparison.Ordinal to LastIndexOf, the characters are treated as distinct, because an ordinal comparison matches the strings char by char on their numeric values and applies no linguistic rules.
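As a minimal sketch of the ordinal fix described above, the snippet below uses the URL from the question. The class name OrdinalLastIndexOfSketch and the helper FindOrdinal are made up for illustration; only String.LastIndexOf(String, StringComparison) is the real API.

```csharp
using System;

// Hypothetical helper class, named only for this example.
static class OrdinalLastIndexOfSketch
{
    // Returns the index of the last ordinal match of value in text, or -1 if there is none.
    public static int FindOrdinal(string text, string value)
    {
        return text.LastIndexOf(value, StringComparison.Ordinal);
    }

    static void Main()
    {
        string wikiPageUrl = "http://it.wikipedia.org/wiki/ʿAbd_Allāh_al-Sallāl";

        // "http://it." is 10 chars long, so the match starts at index 10.
        Console.WriteLine(FindOrdinal(wikiPageUrl, "wikipedia.org/wiki/")); // 10
    }
}
```

Naming the comparison explicitly also documents the intent at the call site, which helps when the input is a URL or another non-linguistic identifier.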

Another way to make this work would be to use CompareInfo.LastIndexOf and supply the CompareOptions.IgnoreNonSpace option:

    Console.WriteLine(
        CultureInfo.CurrentCulture.CompareInfo.LastIndexOf(
            wikiPageUrl, @"it.wikipedia.org/wiki/", CompareOptions.IgnoreNonSpace));
    // 7

Here we're saying that we don't want combining characters included in our string comparison.

As a sidenote, this means that @Partha's answer and @Noctis' answer only work because the character is being applied to a character that doesn't appear in the search string that's passed to LastIndexOf.

Contrast this with the Contains method, which by default performs an Ordinal (case-sensitive and culture-insensitive) comparison. This explains why Contains returns true and LastIndexOf returns -1.
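The contrast can be sketched side by side, assuming the URL from the question. ContainsVersusLastIndexOfSketch and ContainsOrdinal are hypothetical names introduced here. The culture-sensitive line is printed without an expected value, because its result depends on the current culture and on the globalization library the runtime uses.

```csharp
using System;

// Hypothetical helper class, named only for this example.
static class ContainsVersusLastIndexOfSketch
{
    // Spells out the ordinal search that Contains(string) performs by default.
    public static bool ContainsOrdinal(string text, string value)
    {
        return text.IndexOf(value, StringComparison.Ordinal) >= 0;
    }

    static void Main()
    {
        string wikiPageUrl = "http://it.wikipedia.org/wiki/ʿAbd_Allāh_al-Sallāl";
        string value = "wikipedia.org/wiki/";

        // Both of these are ordinal, so they agree with each other.
        Console.WriteLine(wikiPageUrl.Contains(value));         // True
        Console.WriteLine(ContainsOrdinal(wikiPageUrl, value)); // True

        // Same behaviour as LastIndexOf(value) with no StringComparison argument:
        // culture-sensitive, so the output may be -1 or 10 depending on the environment.
        Console.WriteLine(wikiPageUrl.LastIndexOf(value, StringComparison.CurrentCulture));
    }
}
```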

For a fantastic overview of how strings should be manipulated in the .NET framework, check out this article.


1: Is this different from a combining character, or is it a type of combining character? I would appreciate it if someone could clear that up for me.

Andrew Whitaker answered Oct 13 '22 12:10